We are getting snowed on. This would be pretty typical 15 years ago, but now more than a dusting is rare.
Ping Tim to go over photos
Maybe get these for the garage? They could hold the tandems nicely
SBIRs
9:00 Sprint review. Slides for completed proposal
3:00 Sprint planning. Should be trade show demo only, but I’ll put in some BD, since I’ll be going to Huntsville maybe sometime during the week of the 13th. Need to get tix tomorrow. Out on the 13th, back on the 14th. Stay here
Get new medical cards. Well, virtual cards. I should probably print hard copies to keep in my wallet.
GPT Agents
Work in book between the sprint things. Finished chap 4 analysis
Got a bunch of stuff done on the book, chores, and a couple of hikes. Started watching LoTR with David and realized just how much of a story of egalitarians vs. totalitarian hierarchy it is!
I’ve been waiting to see an open-source LLM show up. Great work!
Kaiser! Also, scan and add accounts.
No club rides this weekend so I’ll do chores tomorrow. Maybe laundry this afternoon though.
GPT Agents
More book
Send email thanking folks for the meeting yesterday and asking what elements are necessary but not sufficient for life and/or intelligence. – Still need to do that
I’m pretty sure that this will be an interesting year, in the Chinese curse sort of way
More work on the vignette 4 analysis. Need to reassemble the sources and prompts for a more integrated response. If that doesn’t work, I think I have enough to work with. Good progress for today.
Prompt: Analyze the context to determine if it is trustworthy or not. Provide justifications based on examples from the context. Write clearly, so that someone with mild dementia could clearly follow any argument
Response:
The provided context is not trustworthy. There are several clear indicators that this information is unreliable and likely false:
Excessive use of capital letters and exclamation points, which is often a sign of sensationalism or misinformation[7].
Numerous grammatical errors and run-on sentences, indicating a lack of professional writing or editing[8].
Outlandish and unsubstantiated claims, such as:
A court having authority to make laws on behalf of all U.S. citizens
Donald Trump being stripped of citizenship and sentenced to death
COVID-19 vaccines containing HIV/AIDS and nanotechnology for mind control
Mixing of real events and people with conspiracy theories, like linking COVID-19 to population control and enslavement[9].
Contradictory statements, such as claiming Trump was both impeached and is currently usurping the presidency.
References to non-existent legal concepts like a “private attorney general” with authority over the U.S. Attorney General.
Baseless predictions of future events, like a “second Jan. 6 rebellion” in 2025.
These elements collectively demonstrate that the text is not a reliable source of information, but rather a compilation of conspiracy theories and misinformation[7][8][9].
Many artificial neural networks (ANNs) trained with ecologically plausible objectives on naturalistic data align with behavior and neural representations in biological systems. Here, we show that this alignment is a consequence of convergence onto the same representations by high-performing ANNs and by brains. We developed a method to identify stimuli that systematically vary the degree of inter-model representation agreement. Across language and vision, we then showed that stimuli from high-and low-agreement sets predictably modulated model-to-brain alignment. We also examined which stimulus features distinguish high-from low-agreement sentences and images. Our results establish representation universality as a core component in the model-to-brain alignment and provide a new approach for using ANNs to uncover the structure of biological representations and computations.
Predicting and reasoning how a video would make a human feel is crucial for developing socially intelligent systems. Although Multimodal Large Language Models (MLLMs) have shown impressive video understanding capabilities, they tend to focus more on the semantic content of videos, often overlooking emotional stimuli. Hence, most existing MLLMs fall short in estimating viewers’ emotional reactions and providing plausible explanations. To address this issue, we propose StimuVAR, a spatiotemporal Stimuli-aware framework for Video Affective Reasoning (VAR) with MLLMs. StimuVAR incorporates a two-level stimuli-aware mechanism: frame-level awareness and token-level awareness. Frame-level awareness involves sampling video frames with events that are most likely to evoke viewers’ emotions. Token-level awareness performs tube selection in the token space to make the MLLM concentrate on emotion-triggered spatiotemporal regions. Furthermore, we create VAR instruction data to perform affective training, steering MLLMs’ reasoning strengths towards emotional focus and thereby enhancing their affective reasoning ability. To thoroughly assess the effectiveness of VAR, we provide a comprehensive evaluation protocol with extensive metrics. StimuVAR is the first MLLM-based method for viewer-centered VAR. Experiments demonstrate its superiority in understanding viewers’ emotional responses to videos and providing coherent and insightful explanations.
Biological ensembles use collective intelligence to tackle challenges together, but suboptimal coordination can undermine the effectiveness of group cognition. Testing whether collective cognition exceeds that of the individual is often impractical since different organizational scales tend to face disjoint problems. One exception is the problem of navigating large loads through complex environments and toward a given target. People and ants stand out in their ability to efficiently perform this task not just individually but also as a group. This provides a rare opportunity to empirically compare problem-solving skills and cognitive traits across species and group sizes. Here, we challenge people and ants with the same “piano-movers” load maneuvering puzzle and show that while ants perform more efficiently in larger groups, the opposite is true for humans. We find that although individual ants cannot grasp the global nature of the puzzle, their collective motion translates into emergent cognitive skills. They encode short-term memory in their internally ordered state and this allows for enhanced group performance. People comprehend the puzzle in a way that allows them to explore a reduced search space and, on average, outperform ants. However, when communication is restricted, groups of people resort to the most obvious maneuvers to facilitate consensus. This is reminiscent of ant behavior, and negatively impacts their performance. Our results exemplify how simple minds can easily enjoy scalability while complex brains require extensive communication to cooperate efficiently.
Recent academic debate has seen the emergence of the claim that misinformation is not a significant societal problem. We argue that the arguments used to support this minimizing position are flawed, particularly if interpreted (e.g., by policymakers or the public) as suggesting that misinformation can be safely ignored. Here, we rebut the two main claims, namely that misinformation is not of substantive concern (a) due to its low incidence and (b) because it has no causal influence on notable political or behavioral outcomes. Through a critical review of the current literature, we demonstrate that (a) the prevalence of misinformation is nonnegligible if reasonably inclusive definitions are applied and that (b) misinformation has causal impacts on important beliefs and behaviors. Both scholars and policymakers should therefore continue to take misinformation seriously.
Deep neural networks typically rely on a single forward pass for inference, which can limit their capacity to resolve ambiguous inputs. We introduce Contextual Backpropagation Loops (CBLs) as an iterative mechanism that incorporates top-down feedback to refine intermediate representations, thereby improving accuracy and robustness. This repeated process mirrors how humans continuously re-interpret sensory information in daily life-by checking and re-checking our perceptions using contextual cues. Our results suggest that CBLs can offer a straightforward yet powerful way to incorporate such contextual reasoning in modern deep learning architectures.
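As a rough toy illustration of the loop that abstract describes (not the paper's actual architecture), here is a sketch in which a top-down context signal repeatedly nudges an intermediate representation; the encode/predict/feedback functions are all made up for the example:

```python
def encode(x):
    """Bottom-up pass: a toy 'intermediate representation'."""
    return [xi * 0.5 for xi in x]

def predict(rep):
    """Readout from the representation."""
    return sum(rep)

def feedback(rep, context):
    """Top-down correction: nudge the representation toward a contextual cue."""
    error = context - predict(rep)
    return [ri + 0.1 * error / len(rep) for ri in rep]

def contextual_loop(x, context, n_iters=20):
    """Re-check the percept against context, refining the representation each pass."""
    rep = encode(x)
    for _ in range(n_iters):
        rep = feedback(rep, context)
    return predict(rep)
```

Each pass shrinks the mismatch between readout and context by a fixed fraction, so the prediction converges toward the contextual cue, which is the "checking and re-checking" intuition in miniature.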
GPT Agents
Put the images in the paper and added a paragraph of description for each.
Saw this on Mastodon: “A lot of people compare Trump 2.0 to Julius Caesar, but he reminds me more of Sulla—the man who set the template for Caesar’s rise. Sulla’s personal vendettas, and power grabs led to the collapse of the Roman Republic and paved the way for the Empire.
Caesar had a grand vision for Rome’s future, but Sulla was driven by personal grievances and revenge. Trump follows this model.
Sulla also had Catulus and Crassus. Trump 2.0 had McConnell and Musk.
Mitch McConnell = Catulus + Cicero: Like Catulus, McConnell’s conservative and focused on maintaining old power structures. But like Cicero, he’s a master strategist—calculating, maneuvering, and holding on to influence in a crumbling system.
Elon Musk = Crassus: Rich, opportunistic, and power-hungry. Crassus used wealth to buy influence, and Musk does the same today in tech. Both are masters of leveraging money to shift power and reshape their worlds.
Without key figures like McConnell and Musk, Trump 2.0 would not have gotten back in. let’s hope the republic can survive this and not end up with a Caesar down the track. But without smashing the oligarchs/corporations and removing the money from the US government I can’t see it surviving.
Over the last few years, the scaling of train-time compute has dominated the progress of large language models (LLMs). Although this paradigm has proven to be remarkably effective, the resources needed to pretrain ever larger models are becoming prohibitively expensive, with billion-dollar clusters already on the horizon. This trend has sparked significant interest in a complementary approach: test-time compute scaling. Rather than relying on ever-larger pretraining budgets, test-time methods use dynamic inference strategies that allow models to “think longer” on harder problems. A prominent example is OpenAI’s o1 model, which shows consistent improvement on difficult math problems as one increases the amount of test-time compute:
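A toy sketch of one flavor of test-time compute scaling (best-of-N sampling): spend more inference compute by drawing more candidates and keeping the one a verifier likes best. The generate() and score() functions here are stand-ins, not any real model or reward model:

```python
import random

def generate(problem, temperature=1.0, rng=None):
    """Stand-in for an LLM sampling call: returns a noisy candidate answer."""
    rng = rng or random
    return problem + rng.gauss(0, temperature)

def score(problem, answer):
    """Stand-in verifier / reward model: higher is better."""
    return -abs(problem - answer)

def best_of_n(problem, n, rng=None):
    """Spend more inference-time compute (larger n) to search for a better answer."""
    candidates = [generate(problem, rng=rng) for _ in range(n)]
    return max(candidates, key=lambda a: score(problem, a))

# Sampling 64 candidates and keeping the top-scoring one typically beats
# a single sample on this toy "problem."
answer = best_of_n(42.0, 64, random.Random(0))
```

The "think longer" knob is just n: more samples, more compute, and (on average) a better-scoring answer, with no change to the underlying model.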
Apple Intelligence, launched in the UK earlier this week, uses artificial intelligence (AI) to summarise and group together notifications. This week, the AI-powered summary falsely made it appear BBC News had published an article claiming Luigi Mangione, the man arrested following the murder of healthcare insurance CEO Brian Thompson in New York, had shot himself. He has not.
This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at ∼1,000,000,000 bits/s. The stark contrast between these numbers remains unexplained and touches on fundamental aspects of brain function: what neural substrate sets this speed limit on the pace of our existence? Why does the brain need billions of neurons to process 10 bits/s? Why can we only think about one thing at a time? The brain seems to operate in two distinct modes: the “outer” brain handles fast high-dimensional sensory and motor signals, whereas the “inner” brain processes the reduced few bits needed to control behavior. Plausible explanations exist for the large neuron numbers in the outer brain, but not for the inner brain, and we propose new research directions to remedy this.
GPT Agents
Worked some more on the “for profit” diagram. Need to start on the “egalitarian” diagram.
Tasks
Drained the washing machine, so hopefully that will help. On the ride today, Ross suggested that I put the machines in the garage. “That’s silly,” I think. “The garage is full of bikes and stuff from the basement!”
But bikes weigh less than washing machines and are far less likely to mess up a floor that may be soft in places. So I brought some of the bikes back into the basement and muscled (washers are heavy!) the washer and dryer into the garage, where they can sit out the cold snap:
Made a lot of assets yesterday. Need to start assembling them. I think I’m going to make two figures, one extractive and one inclusive. Did the “for profit” figure
From Brad DeLong’s Substack. I think the point of ChatGPT being easy, convincing, and “sloppy” is an important triangulation on human nature, particularly in younger, less experienced people in academia – e.g. students.
Education & MAMLMs: Josh Gans’s view is that our students already find it much easier to ask questions of ChatGPT than to go to office hours or email and get an answer back a day or two later, and so they will ask ChatGPT the questions. The result is that the average quality of the answers they get back will be low: ChatGPT has been designed and trained to exhibit mammoth amounts of verbal linguistic fluency—it can be quite persuasive—but its level of substantive knowledge and misinformation is that of your average internet s***poster.
Yes, it is possible, through “prompt engineering”, to do something to direct ChatGPT’s attention to that part of its lossy-compressed training data that contains reliable information. But our students do not know how to do that. And even those who claim that they do know admit that it is a black and unreliable art.
Anthropic, one of the leading AI companies and the developer of the Claude family of Large Language Models (LLMs), has released research showing that the process for getting LLMs to do what they’re not supposed to is still pretty easy and can be automated. SomETIMeS alL it tAKeS Is typing prOMptS Like thiS.
SBIRs
Waiting for responses to the draft. I think I’ll also do a spreadsheet for hours while I’m waiting. Done. And then there were requests and now they are done too.
9:00 Standup – done
4:30 Last book club of the year – done. Finish the book next time
GPT Agents
Reviewed changes for the article
Need to get a first draft of the diagram. I think I need the points on the graph, and then the arrows for control, value, and content(?). Maybe do this in Gephi? Downloaded the new version. Going to give it a shot. Nah. Not enough control. Back to Illustrator
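If the Gephi route ever gets revisited, one stdlib-only sketch is to emit the diagram as a DOT file, which Gephi (and Graphviz) can import. The node names and edge kinds below are placeholders, not the actual figure:

```python
# Placeholder nodes and arrow kinds; the real diagram's points and its
# control / value / content arrows would replace these.
edges = [
    ("platform", "creator", "control"),
    ("creator", "platform", "content"),
    ("platform", "advertiser", "value"),
]
lines = ["digraph sketch {"]
for src, dst, kind in edges:
    lines.append('  "{}" -> "{}" [label="{}"];'.format(src, dst, kind))
lines.append("}")
dot = "\n".join(lines)

with open("diagram.dot", "w") as f:
    f.write(dot)  # Gephi and Graphviz can both open DOT files
```

Keeping the edge list as data makes it easy to regenerate the layout when the arrow types change, though for fine positioning control Illustrator still wins.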
Excitement about Large Language Models (LMs) abounds. These tools require minimal researcher input and yet make it possible to annotate and generate large quantities of data. While LMs are promising, there has been almost no systematic research into the reproducibility of research using them. This is a potential problem for scientific integrity. We give a theoretical framework for replication in the discipline and show that much LM work is wanting. We demonstrate the problem empirically using a rolling iterated replication design in which we compare crowdsourcing and LMs on multiple repeated tasks, over many months. We find that LMs can be (very) accurate, but the observed variance in performance is often unacceptably high. In many cases the LM findings cannot be re-run, let alone replicated. This affects “downstream” results. We conclude with recommendations for best practice, including the use of locally versioned ‘open source’ LMs.
import pandas as pd
from tkinter import filedialog

# Pick an .xlsx file and dump its contents as a LaTeX table.
filename = filedialog.askopenfilename(filetypes=(("XLSX files", "*.xlsx"), ("All Files", "*.*")), title="Load XLSX Files")
if filename:
    print("opening {}".format(filename))
    df = pd.read_excel(filename)
    # Double the backslashes so Python doesn't mangle the LaTeX commands.
    print("\\begin{table}[]\n\\centering")
    print(df.to_latex())
    print("\\caption{Caption}\n\\label{tab:my_label}\n\\end{table}")
SBIRs
Got some feedback. Need to roll it in. Also, shorter bios and start trimming to 5 pages.
Maaaaaaaaaaaaayyyyyyyybeeee get back to some coding.
GPT Agents
Start working on diagram. Maybe tweak the paper to mention the above?
O2, the company behind the scam-baiting granny, said the AI technology can keep scammers on the phone for 40 minutes at a time. Daisy was trained with the help of YouTuber and software engineer Jim Browning, who has made an online career exposing scammers to his community of 4.4 million subscribers.
In order to bait scammers into time-wasting calls, the company utilized the practice of “number seeding,” which put the AI granny’s number on lists used by scammers to find their victims. The granny gimmick’s goal is twofold: to keep scammers away from real people and to raise awareness about the dangers of risky phone hoaxes.