Category Archives: Phil

Phil 12.25.2024

Happy Isaac Newton’s birthday for those who celebrate! He’d be 382 years old today. Calculus is old.

Comparing cooperative geometric puzzle solving in ants versus humans

Biological ensembles use collective intelligence to tackle challenges together, but suboptimal coordination can undermine the effectiveness of group cognition. Testing whether collective cognition exceeds that of the individual is often impractical since different organizational scales tend to face disjoint problems. One exception is the problem of navigating large loads through complex environments and toward a given target. People and ants stand out in their ability to efficiently perform this task not just individually but also as a group. This provides a rare opportunity to empirically compare problem-solving skills and cognitive traits across species and group sizes. Here, we challenge people and ants with the same “piano-movers” load maneuvering puzzle and show that while ants perform more efficiently in larger groups, the opposite is true for humans. We find that although individual ants cannot grasp the global nature of the puzzle, their collective motion translates into emergent cognitive skills. They encode short-term memory in their internally ordered state and this allows for enhanced group performance. People comprehend the puzzle in a way that allows them to explore a reduced search space and, on average, outperform ants. However, when communication is restricted, groups of people resort to the most obvious maneuvers to facilitate consensus. This is reminiscent of ant behavior, and negatively impacts their performance. Our results exemplify how simple minds can easily enjoy scalability while complex brains require extensive communication to cooperate efficiently.

Phil 12.24.2024

Why Misinformation Must Not Be Ignored

Recent academic debate has seen the emergence of the claim that misinformation is not a significant societal problem. We argue that the arguments used to support this minimizing position are flawed, particularly if interpreted (e.g., by policymakers or the public) as suggesting that misinformation can be safely ignored. Here, we rebut the two main claims, namely that misinformation is not of substantive concern (a) due to its low incidence and (b) because it has no causal influence on notable political or behavioral outcomes. Through a critical review of the current literature, we demonstrate that (a) the prevalence of misinformation is nonnegligible if reasonably inclusive definitions are applied and that (b) misinformation has causal impacts on important beliefs and behaviors. Both scholars and policymakers should therefore continue to take misinformation seriously.

Contextual Backpropagation Loops: Amplifying Deep Reasoning with Iterative Top-Down Feedback

Deep neural networks typically rely on a single forward pass for inference, which can limit their capacity to resolve ambiguous inputs. We introduce Contextual Backpropagation Loops (CBLs) as an iterative mechanism that incorporates top-down feedback to refine intermediate representations, thereby improving accuracy and robustness. This repeated process mirrors how humans continuously re-interpret sensory information in daily life, by checking and re-checking our perceptions using contextual cues. Our results suggest that CBLs can offer a straightforward yet powerful way to incorporate such contextual reasoning in modern deep learning architectures.

GPT Agents

Put the images in the paper and added a paragraph of description for each.

Phil 12.23.2024

Saw this on Mastodon: “A lot of people compare Trump 2.0 to Julius Caesar, but he reminds me more of Sulla—the man who set the template for Caesar’s rise. Sulla’s personal vendettas, and power grabs led to the collapse of the Roman Republic and paved the way for the Empire.

Caesar had a grand vision for Rome’s future, but Sulla was driven by personal grievances and revenge. Trump follows this model.

Sulla also had Catulus and Crassus. Trump 2.0 had McConnell and Musk.

Mitch McConnell = Catulus + Cicero: Like Catulus, McConnell’s conservative and focused on maintaining old power structures. But like Cicero, he’s a master strategist—calculating, maneuvering, and holding on to influence in a crumbling system.

Elon Musk = Crassus: Rich, opportunistic, and power-hungry. Crassus used wealth to buy influence, and Musk does the same today in tech. Both are masters of leveraging money to shift power and reshape their worlds.

Without key figures like McConnell and Musk, Trump 2.0 would not have gotten back in. let’s hope the republic can survive this and not end up with a Caesar down the track. But without smashing the oligarchs/corporations and removing the money from the US government I can’t see it surviving.

#uspol #HistoryRhymes #rome”

GPT Agents

Tweaking figures – done! (hopefully)

Saw Flow. Excellent!

Phil 12.21.2024

Winter solstice! Tomorrow will be one second longer!

Scaling test-time compute – a Hugging Face Space by HuggingFaceH4

Over the last few years, the scaling of train-time compute has dominated the progress of large language models (LLMs). Although this paradigm has proven to be remarkably effective, the resources needed to pretrain ever larger models are becoming prohibitively expensive, with billion-dollar clusters already on the horizon. This trend has sparked significant interest in a complementary approach: test-time compute scaling. Rather than relying on ever-larger pretraining budgets, test-time methods use dynamic inference strategies that allow models to “think longer” on harder problems. A prominent example is OpenAI’s o1 model, which shows consistent improvement on difficult math problems as one increases the amount of test-time compute:

Another crazy AI slop thing: BBC complains to Apple over misleading shooting headline

Apple Intelligence, launched in the UK earlier this week, uses artificial intelligence (AI) to summarise and group together notifications. This week, the AI-powered summary falsely made it appear BBC News had published an article claiming Luigi Mangione, the man arrested following the murder of healthcare insurance CEO Brian Thompson in New York, had shot himself. He has not.

The unbearable slowness of being: Why do we live at 10 bits/s? (ArXiv link)

This article is about the neural conundrum behind the slowness of human behavior. The information throughput of a human being is about 10 bits/s. In comparison, our sensory systems gather data at ∼1,000,000,000 bits/s. The stark contrast between these numbers remains unexplained and touches on fundamental aspects of brain function: what neural substrate sets this speed limit on the pace of our existence? Why does the brain need billions of neurons to process 10 bits/s? Why can we only think about one thing at a time? The brain seems to operate in two distinct modes: the “outer” brain handles fast high-dimensional sensory and motor signals, whereas the “inner” brain processes the reduced few bits needed to control behavior. Plausible explanations exist for the large neuron numbers in the outer brain, but not for the inner brain, and we propose new research directions to remedy this.
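The gap in that abstract is easier to feel as arithmetic. A quick sketch using only the two rates quoted above; the 16 waking hours per day is my own assumption, not a figure from the paper:

```python
# Figures quoted in the abstract above
SENSORY_BITS_PER_S = 1_000_000_000   # ~1e9 bits/s gathered by the senses
BEHAVIOR_BITS_PER_S = 10             # ~10 bits/s of behavioral throughput

# How many times faster the input side runs than the output side
ratio = SENSORY_BITS_PER_S // BEHAVIOR_BITS_PER_S

# Assumption (not from the paper): 16 waking hours per day
WAKING_SECONDS_PER_DAY = 16 * 60 * 60
bits_per_day = BEHAVIOR_BITS_PER_S * WAKING_SECONDS_PER_DAY
kilobytes_per_day = bits_per_day / 8 / 1000

print(f"input/output ratio: {ratio:,}")
print(f"behavioral output per waking day: {bits_per_day:,} bits (~{kilobytes_per_day:.0f} kB)")
```

So the senses run eight orders of magnitude faster than behavior, and under that waking-hours assumption a whole day of behavior fits in well under 100 kB.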

GPT Agents

Worked some more on the “for profit” diagram. Need to start on the “egalitarian” diagram.

Tasks

Drained the washing machine, so hopefully that will help. On the ride today, Ross suggested that I put the machines in the garage. “That’s silly,” I think. “The garage is full of bikes and stuff from the basement!”
But bikes weigh less than washing machines and are far less likely to mess up a floor that may be soft in places. So I brought some of the bikes back into the basement and muscled (washers are heavy!) the washer and dryer into the garage, where they can sit out the cold snap:

Laundry! Bring soap! Done!

Phil 12.20.2024

We are at the bottom of the curve:

Tasks

Bills – done
8:00 Floor – done! Now it needs to cure
Clean house – done
Dishes – done
Wrap gifts
12:00 Greg done
2:30 WSJ done. Fun!

GPT Agents

Made a lot of assets yesterday. Need to start assembling them. I think I’m going to make two figures, one extractive, and one inclusive. Did the “for profit” figure.

Phil 12.19.2024

From Brad DeLong’s Substack. I think the point of ChatGPT being easy, convincing, and “sloppy” is an important triangulation on human nature, particularly in younger, less experienced people in academia – e.g. students.

Education & MAMLMs: Josh Gans’s view is that our students already find it much easier to ask questions of ChatGPT than to go to office hours or email and get an answer back a day or two later, and so they will ask ChatGPT the questions. The result is that the average quality of the answers they get back will be low: ChatGPT has been designed and trained to exhibit mammoth amounts of verbal linguistic fluency—it can be quite persuasive—but its level of substantive knowledge and misinformation is that of your average internet s***poster.
Yes, it is possible, through “prompt engineering”, to do something to direct ChatGPT’s attention to that part of its lossy-compressed training data that contains reliable information. But our students do not know how to do that. And even those who claim that they do know admit that it is a black and unreliable art.

APpaREnTLy THiS iS hoW yoU JaIlBreAk AI

New research from Anthropic, one of the leading AI companies and the developer of the Claude family of Large Language Models (LLMs), has released research showing that the process for getting LLMs to do what they’re not supposed to is still pretty easy and can be automated. SomETIMeS alL it tAKeS Is typing prOMptS Like thiS.

SBIRs

Waiting for responses to the draft. I think I’ll also do a spreadsheet for hours while I’m waiting. Done. And then there were requests and now they are done too.
9:00 Standup – done
4:30 Last book club of the year – done. Finish the book next time

GPT Agents

Reviewed changes for the article
Need to get a first draft of the diagram. I think I need the points on the graph, and then arrows for control, value, and content(?). Maybe do this in Gephi? Downloaded the new version. Going to give it a shot. Nah. Not enough control. Back to Illustrator

Phil 12.18.2024

Johns Hopkins is still doing tracking of COVID. Here’s Maryland:

Replication for Language Models: Problems, Principles, and Best Practice for Political Science

Excitement about Large Language Models (LMs) abounds. These tools require minimal researcher input and yet make it possible to annotate and generate large quantities of data. While LMs are promising, there has been almost no systematic research into the reproducibility of research using them. This is a potential problem for scientific integrity. We give a theoretical framework for replication in the discipline and show that much LM work is wanting. We demonstrate the problem empirically using a rolling iterated replication design in which we compare crowdsourcing and LMs on multiple repeated tasks, over many months. We find that LMs can be (very) accurate, but the observed variance in performance is often unacceptably high. In many cases the LM findings cannot be re-run, let alone replicated. This affects “downstream” results. We conclude with recommendations for best practice, including the use of locally versioned ‘open source’ LMs.

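import pandas as pd
from tkinter import filedialog

# Ask the user for a spreadsheet to convert
filename = filedialog.askopenfilename(filetypes=(("XLSX files", "*.xlsx"), ("All Files", "*.*")), title="Load XLSX Files")
if filename:
    print("opening {}".format(filename))

    # Read the first sheet and wrap the generated tabular in a table float
    df = pd.read_excel(filename)
    # Backslashes are doubled so Python does not treat \c, \l, or \e as escape sequences
    print("\\begin{table}[]\n\\centering")
    print(df.to_latex())
    print("\\caption{Caption}\n\\label{tab:my_label}\n\\end{table}")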

SBIRs

Got some feedback. Need to roll it in. Also, shorter bios and start trimming to 5 pages.
Maaaaaaaaaaaaayyyyyyyybeeee get back to some coding.

GPT Agents

Start working on diagram. Maybe tweak the paper to mention the above?

Phil 12.17.2024

Learned a new word today:

Tasks

Ping Carlos – done
Check for TW Ellis – done
Call Dentist – and now I have a new filling

SBIRs

Frame out a rough draft – done

Phil 12.16.2024

Email for Carlos

SBIRs

9:00 Sprint Demos – done
3:00 Sprint Planning – stories are written
Started on the SBIR proposal. Template is done
Two unplanned meetings that might mean a trip to Huntsville again
Got together with Aaron to figure out the approach to the SBIR

Phil 12.14.2024

I Traded My News Apps for Rumble, the Right-Wing YouTube. Here’s What I Saw

Blame for any hiccups in Mr. Trump’s strategy was assigned to Democrats or even Republicans who were not sufficiently obedient.

Speaking of detecting and disrupting manipulation: A phone company developed an AI ‘granny’ to beat scammers at their own game. The number seeding is a literal counterattack/exploit in its own right

O2, the company behind the scam-baiting granny, said the AI technology can keep scammers on the phone for 40 minutes at a time. Daisy was trained with the help of YouTuber and software engineer Jim Browning, who has made an online career exposing scammers to his community of 4.4 million subscribers.
In order to bait scammers into time-wasting calls, the company utilized the practice of “number seeding,” which put the AI granny’s number on lists used by scammers to find their victims. The granny gimmick’s goal is twofold: to keep scammers away from real people and to raise awareness about the dangers of risky phone hoaxes.

Phil 12.13.2024

Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs

Natural Language Processing (NLP) research is increasingly focusing on the use of Large Language Models (LLMs), with some of the most popular ones being either fully or partially closed-source. The lack of access to model details, especially regarding training data, has repeatedly raised concerns about data contamination among researchers. Several attempts have been made to address this issue, but they are limited to anecdotal evidence and trial and error. Additionally, they overlook the problem of indirect data leaking, where models are iteratively improved by using data coming from users. In this work, we conduct the first systematic analysis of work using OpenAI’s GPT-3.5 and GPT-4, the most prominently used LLMs today, in the context of data contamination. By analysing 255 papers and considering OpenAI’s data usage policy, we extensively document the amount of data leaked to these models during the first year after the model’s release. We report that these models have been globally exposed to ∼4.7M samples from 263 benchmarks. At the same time, we document a number of evaluation malpractices emerging in the reviewed papers, such as unfair or missing baseline comparisons and reproducibility issues. We release our results as a collaborative project on https://leak-llm.github.io/, where other researchers can contribute to our efforts.

SBIRs

9:00 Meeting
12:50 USNA

Phil 12.12.2024

Atlantic circulation collapse? New clues on the fate of a crucial conveyor belt

So in summary, we have at least some reassurance from the North Atlantic data that a full-on AMOC collapse hasn’t begun. And it’s unlikely that any future collapse would reach its end point any sooner than the early to mid-2100s. Yet there’s also legitimate concern – stoked by recent work from climate modelers and statisticians – that a tipping point toward eventual collapse could arrive as soon as the next several decades, especially if fossil-fuel emissions aren’t cut sharply. In part two of this two-part post, we’ll look at some of the new research on early-warning signs of AMOC collapse, what those scientists and other assessments are telling us about the threat, and how we can help limit the odds of an AMOC collapse happening in the first place.

SBIRs

9:00 standup
11:00 drop off car / pick up car – done
2:30 “AI” Meeting – could not work it in
4:30 Book club – Rukan couldn’t make it
Good progress today, but still not quite right

GPT Agents

LLM meeting – worked on the diagram for the egalitarian AI paper

Phil 12.11.2024

Ugh:

SBIRs

See how the line segment intersection code improves things. Maybe make some test files?
Also, I want to see how fast I could calculate a 256×256 grid given a source, destination, and a spectator.
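For reference, a minimal sketch of a 2D line segment intersection test, using the standard parametric cross-product approach. This is not the project code; the function name `segment_intersection`, the `Point` alias, and the `eps` tolerance are all hypothetical, and parallel or collinear segments are simply reported as non-intersecting:

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def segment_intersection(p1: Point, p2: Point, q1: Point, q2: Point,
                         eps: float = 1e-12) -> Optional[Point]:
    """Return the point where segments p1-p2 and q1-q2 cross, or None.

    Parallel and collinear segments return None in this sketch.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]   # direction of the first segment
    sx, sy = q2[0] - q1[0], q2[1] - q1[1]   # direction of the second segment
    denom = rx * sy - ry * sx               # 2D cross product r x s
    if abs(denom) < eps:
        return None                         # parallel or collinear
    qpx, qpy = q1[0] - p1[0], q1[1] - p1[1]
    t = (qpx * sy - qpy * sx) / denom       # position along p1-p2, 0..1 if on segment
    u = (qpx * ry - qpy * rx) / denom       # position along q1-q2, 0..1 if on segment
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * rx, p1[1] + t * ry)
    return None


crossing = segment_intersection((0, 0), (2, 2), (0, 2), (2, 0))
parallel = segment_intersection((0, 0), (1, 0), (0, 1), (1, 1))
miss = segment_intersection((0, 0), (1, 0), (2, -1), (2, 1))
print(crossing, parallel, miss)
```

Each call is a handful of multiplies and one divide, so timing it over a 256×256 grid of cells would be a reasonable first test file.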

GPT Agents

3:00 Alden meeting. Bring up the GPTZero / Chess model as human result.
Work on book?

Phil 12.10.2024

Open source maintainers are drowning in junk bug reports written by AI

“Recently I’ve noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects,” he wrote, pointing to similar findings from the Curl project in January. “These reports appear at first glance to be potentially legitimate and thus require time to refute.”

SBIRs

9:00 standup
Finish refactoring and integrate with trajectories?

GPT Agents

Add a section on Salt Typhoon

Phil 12.9.2024

Write up something about the chess model fooling GPTZero:

SBIRs

9:00 tax thing?
3:00 Tradeshow demo tagup
Work on getting foms generated. Small runs first! Got some good refactoring done and then pulled into 2025 planning

GPT Agents

Good progress on the KA book! Need to bring in some content from the slide deck – nah. Didn’t really work
Reach out to Dr. Bryson about proposal – done
Ping Greg too – done

viztales

Dimension reduction, State, Orientation, and Speed
