Category Archives: Phil

Phil 10.4.2025

This is interesting: The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain. It’s a long paper – 46 pages with additional appendices including code listings.

  • The relationship between computing systems and the brain has served as motivation for pioneering theoreticians since John von Neumann and Alan Turing. Uniform, scale-free biological networks, such as the brain, have powerful properties, including generalizing over time, which is the main barrier for Machine Learning on the path to Universal Reasoning Models. We introduce ‘Dragon Hatchling’ (BDH), a new Large Language Model architecture based on a scale-free biologically inspired network of n locally-interacting neuron particles. BDH couples strong theoretical foundations and inherent interpretability without sacrificing Transformer-like performance. BDH is a practical, performant state-of-the-art attention-based state space sequence learning architecture. In addition to being a graph model, BDH admits a GPU-friendly formulation. It exhibits Transformer-like scaling laws: empirically BDH rivals GPT2 performance on language and translation tasks, at the same number of parameters (10M to 1B), for the same training data. BDH can be represented as a brain model. The working memory of BDH during inference entirely relies on synaptic plasticity with Hebbian learning using spiking neurons. We confirm empirically that specific, individual synapses strengthen connection whenever BDH hears or reasons about a specific concept while processing language inputs. The neuron interaction network of BDH is a graph of high modularity with heavy-tailed degree distribution. The BDH model is biologically plausible, explaining one possible mechanism which human neurons could use to achieve speech. BDH is designed for interpretability. Activation vectors of BDH are sparse and positive. We demonstrate monosemanticity in BDH on language tasks. Interpretability of state, which goes beyond interpretability of neurons and model parameters, is an inherent feature of the BDH architecture.
  • It really makes me think that it would be a good time to revisit lateral inhibition / hierarchical stimulation
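As a reminder of what the lateral-inhibition idea looks like in practice, here is a minimal sketch. Everything in it (the function name, the inhibition strength, the clamping at zero) is illustrative, not taken from the BDH paper:

```python
# Minimal sketch of lateral inhibition: each unit's activation is
# suppressed in proportion to the summed activity of the other units,
# sharpening the strongest responses (a soft winner-take-all).
# The function name and the `strength` parameter are illustrative.

def lateral_inhibition(activations, strength=0.5):
    """Subtract a fraction of the other units' total activity from each
    unit, clamping at zero so activations stay sparse and positive."""
    total = sum(activations)
    return [max(0.0, a - strength * (total - a)) for a in activations]

# The strongest unit survives; the weaker ones are driven to zero.
print(lateral_inhibition([0.1, 0.9, 0.3, 0.2]))
```

With the default strength, only the 0.9 unit keeps a nonzero activation, which is the sparse-and-positive behavior the paper emphasizes.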

Here’s another Sycophantic Chatbot paper: Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence

  • Both the general public and academic communities have raised concerns about sycophancy, the phenomenon of artificial intelligence (AI) excessively agreeing with or flattering users. Yet, beyond isolated media reports of severe consequences, like reinforcing delusions, little is known about the extent of sycophancy or how it affects people who use AI. Here we show the pervasiveness and harmful impacts of sycophancy when people seek advice from AI. First, across 11 state-of-the-art AI models, we find that models are highly sycophantic: they affirm users’ actions 50% more than humans do, and they do so even in cases where user queries mention manipulation, deception, or other relational harms. Second, in two preregistered experiments (N = 1604), including a live-interaction study where participants discuss a real interpersonal conflict from their life, we find that interaction with sycophantic AI models significantly reduced participants’ willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right. However, participants rated sycophantic responses as higher quality, trusted the sycophantic AI model more, and were more willing to use it again. This suggests that people are drawn to AI that unquestioningly validate, even as that validation risks eroding their judgment and reducing their inclination toward prosocial behavior. These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favor sycophancy. Our findings highlight the necessity of explicitly addressing this incentive structure to mitigate the widespread risks of AI sycophancy.

Phil 10.3.2025

PsyArXiv Preprints | Sycophantic AI increases attitude extremity and overconfidence

  • AI chatbots have been shown to be successful tools for persuasion. However, people may prefer to use chatbots that validate, rather than challenge, their pre-existing beliefs. This preference for “sycophantic” (or overly agreeable and validating) chatbots may entrench beliefs and make it challenging to deploy AI systems that open people up to new perspectives. Across three experiments (n = 3,285) involving four political topics and four large language models, we found that people consistently preferred and chose to interact with sycophantic AI models over disagreeable chatbots that challenged their beliefs. Brief conversations with sycophantic chatbots increased attitude extremity and certainty, whereas disagreeable chatbots decreased attitude extremity and certainty. Sycophantic chatbots also inflated people’s perception that they are “better than average” on a number of desirable traits (e.g., intelligence, empathy). Furthermore, people viewed sycophantic chatbots as unbiased, but viewed disagreeable chatbots as highly biased. Sycophantic chatbots’ impact on attitude extremity and certainty was driven by a one-sided presentation of facts, whereas their impact on enjoyment was driven by validation. Altogether, these results suggest that people’s preference for and blindness to sycophantic AI may risk creating AI “echo chambers” that increase attitude extremity and overconfidence.

The complexity of misinformation extends beyond virus and warfare analogies | npj Complexity

  • Debates about misinformation and countermeasures are often driven by dramatic analogies, such as “infodemic” or “information warfare”. While useful shortcuts to interference, these analogies obscure the complex system through which misinformation propagates, leaving perceptual gaps where solutions lie unseen. We present a new framework of the complex multilevel system through which misinformation propagates and show how popular analogies fail to account for this complexity. We discuss implications for policy making and future research.
  • This is quite good. It shows how attacks work at different levels: individual, social groups, social media, and states/societies. It would be good to add this to the current article or to the KA book

Why Misinformation Must Not Be Ignored

  • Recent academic debate has seen the emergence of the claim that misinformation is not a significant societal problem. We argue that the arguments used to support this minimizing position are flawed, particularly if interpreted (e.g., by policymakers or the public) as suggesting that misinformation can be safely ignored. Here, we rebut the two main claims, namely that misinformation is not of substantive concern (a) due to its low incidence and (b) because it has no causal influence on notable political or behavioral outcomes. Through a critical review of the current literature, we demonstrate that (a) the prevalence of misinformation is nonnegligible if reasonably inclusive definitions are applied and that (b) misinformation has causal impacts on important beliefs and behaviors. Both scholars and policymakers should therefore continue to take misinformation seriously.

Tasks

  • Bills – done
    • Car registration – done
  • Water plants – done
  • Chores – done
  • Dishes – done
  • Storage run

SBIRs

  • 2:00 IRAD meeting – not sure what we got out of that

LLMs

  • More work on the article, need to fold in the sycophant chatbot paper – done!

Phil 10.2.2025

Tasks

  • Storage trip? Nope, need to organize some things first

SBIRs

  • I realize that I want to make “cards” for data files and models that make loading the next part of the pipeline easier. Add that to the stories for next sprint
  • 9:00 Standup – done
  • 10:30 BP discussion – done. Need to put hours in for each phase and in the exec summary
  • 3:00 SEG – done, going to every other week until things pick up
  • 4:00 ADS – went really well! Sent off SoW, and discussed follow-on work

LLMs

  • Continued blog post

Phil 10.1.2025

Tasks

  • Water plants – done
  • 9:30 KP – done
  • 9:50 KP – done
  • Recycling – done, though I forgot some bits
  • Groceries – done

LLMs

  • Write up blog and CACM version of the soft totalitarianism article. Working on the blog post
  • Meeting with Alden. Stressed the need to come up with clear, high-level research questions that will keep him from getting stuck in the weeds.

SBIRs

  • Long chat with Aaron about IRAD and management

Phil 9.30.2025

SBIRs

  • 9:00 standup
  • More work on the index2vec model

LLMs

  • Working on the CACM soft totalitarianism section of the article. Got a rough framework and dug up some good papers. Waiting for an ILL paper

Phil 9.29.2025

Had a great Seagull, and beat the rain by just a few minutes!

Why Underachievers Dominate Secret Police Organizations: Evidence from Autocratic Argentina on JSTOR

  • Autocrats depend on a capable secret police. Anecdotal evidence, however, often characterizes agents as surprisingly mediocre in skill and intellect. To explain this puzzle, this article focuses on the career incentives underachieving individuals face in the regular security apparatus. Low-performing officials in hierarchical organizations have little chance of being promoted or filling lucrative positions. To salvage their careers, these officials are willing to undertake burdensome secret police work. Using data on all 4,287 officers who served in autocratic Argentina (1975–83), we study biographic differences between secret police agents and the entire recruitment pool. We find that low-achieving officers were stuck within the regime hierarchy, threatened with discharge, and thus more likely to join the secret police for future benefits. The study demonstrates how state bureaucracies breed mundane career concerns that produce willing enforcers and cement violent regimes. This has implications for the understanding of autocratic consolidation and democratic breakdown.
  • I would bet that this behavior shows up on belief maps. It’s also another attack vector. An AI MitM attack that looks for mediocre comms could target those individuals for exploitation. Also, this is most dangerous in organizations that are legally allowed to use lethal force.
  • And, come to think of it, if you need an army of goons, then adjusting your hiring to ensure that low-achievers are preferentially hired would be part of the plan.
  • BlueSky thread

Tasks

  • Finish laundry
  • Water plants – done
  • Start putting something together for the CACM opinion piece

SBIRs

Phil 9.26.2025

Tasks

  • Last chapter to V – done!
  • Sheets and towels – running
  • 8:30 dentist – done
  • 12:00-ish lunch with S – done
  • Bills – Pay painting and check the powerwash – done
  • Chores – done
  • Dishes – done
  • Prep for tomorrow – kinda done

Phil 9.25.2025

The Nation is Lost

  • I’ve come to realize that the far right’s fetishism over the Second Amendment was likely never about rising up in opposition to some feared socialist, gunnapping American regime. It was about recruiting and arming a disordered militia in support of the autocracy of the right

scraped all incoming bluesky posts the other day for a bit, it's somewhere north of 2m, might be interesting to compare against earlier samples for trend huggingface.co/segyges/blue…

SE Gyges (@segyges.bsky.social) 2025-09-25T00:57:28.751Z

Tasks

  • Storage run, and maybe a dump run
  • Check over last chapter and send if it’s ok
  • Groceries

SBIRs

  • 9:00 standup
  • Mail laptop
  • 4:00 SEG

Phil 9.24.2025

Tasks

  • Painting! Done!
  • Organize and pack – bed is in the truck
  • Lots of LLC work

GPT Agents

  • 2:30 LLM Meeting – good discussion on what the article should be. Need to add a first pass at my sections

Phil 9.22.2025

The forecast is improving for Saturday!

LLM-Deflate: Extracting LLMs Into Datasets

  • Large Language Models compress massive amounts of training data into their parameters. This compression is lossy but highly effective—billions of parameters can encode the essential patterns from terabytes of text. However, what’s less obvious is that this process can be reversed: we can systematically extract structured datasets from trained models that reflect their internal knowledge representation.

Tasks

  • Water plants – done
  • Mow – done
  • LLC decision – done
  • Roll in edits

SBIRs

  • Generate CSVs of:
    • Random walks – done
    • Coordinates for random walks – done
  • Write a visualizer
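The CSV-generation step above can be sketched roughly like this. The grid size, walk count, walk length, node-ID scheme, and file names are all assumptions for illustration, not the actual SBIR code:

```python
# Sketch of generating random walks on a 2D grid and writing two CSVs:
# one with the walks as node-ID sequences, one with the coordinates.
# GRID, the walk count/length, and the file names are made-up parameters.
import csv
import random

GRID = 10  # 10x10 grid; a node's ID is row * GRID + col

def random_walk(steps, grid=GRID):
    """One walk of `steps` moves, staying on the grid (4-neighbor moves)."""
    r, c = random.randrange(grid), random.randrange(grid)
    path = [(r, c)]
    for _ in range(steps):
        moves = [(dr, dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                 if 0 <= r + dr < grid and 0 <= c + dc < grid]
        dr, dc = random.choice(moves)
        r, c = r + dr, c + dc
        path.append((r, c))
    return path

walks = [random_walk(50) for _ in range(100)]

# One walk per row, as node IDs -- the sequence file for embedding training.
with open("walks.csv", "w", newline="") as f:
    w = csv.writer(f)
    for path in walks:
        w.writerow([r * GRID + c for r, c in path])

# Long-format coordinates for the visualizer.
with open("coords.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["walk", "step", "row", "col"])
    for i, path in enumerate(walks):
        for t, (r, c) in enumerate(path):
            w.writerow([i, t, r, c])
```

Keeping the IDs and the coordinates in separate files means the visualizer and the sequence-learning side can each read only what they need.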

Phil 9.19.2025

[2509.10414] Is In-Context Learning Learning?

  • In-context learning (ICL) allows some autoregressive models to solve tasks via next-token prediction and without needing further training. This has led to claims about these model’s ability to solve (learn) unseen tasks with only a few shots (exemplars) in the prompt. However, deduction does not always imply learning, as ICL does not explicitly encode a given observation. Instead, the models rely on their prior knowledge and the exemplars given, if any. We argue that, mathematically, ICL does constitute learning, but its full characterisation requires empirical work. We then carry out a large-scale analysis of ICL ablating out or accounting for memorisation, pretraining, distributional shifts, and prompting style and phrasing. We find that ICL is an effective learning paradigm, but limited in its ability to learn and generalise to unseen tasks. We note that, in the limit where exemplars become more numerous, accuracy is insensitive to exemplar distribution, model, prompt style, and the input’s linguistic features. Instead, it deduces patterns from regularities in the prompt, which leads to distributional sensitivity, especially in prompting styles such as chain-of-thought. Given the varied accuracies on formally similar tasks, we conclude that autoregression’s ad-hoc encoding is not a robust mechanism, and suggests limited all-purpose generalisability.

Tasks

  • Send chapter to V – done
  • Respond to No Starch – done
  • Call painter for fix and drywall work – done
  • Talk to Aaron about LLC
  • Bills – done
  • Chores – done
  • Dishes – done
  • Trim grasses – done
  • Weed
  • Mow!
  • Pack up bike for tomorrow – done

Found this, which is an interesting take:

When autocratization is reversed: episodes of U-Turns since 1900

  • The world is in a “wave of autocratization.” Yet, recent events in Brazil, the Maldives, and Zambia demonstrate that autocratization can be halted and reversed. This article introduces “U-Turn” as a new type of regime transformation episode in which autocratization is closely followed by and linked to subsequent democratization. Drawing on earlier literature, it provides a general conceptualization and operationalization of this type of episode, complementing the existing Episodes of Regime Transformation (ERT) framework. The accompanying database provides descriptions for all 102 U-Turn episodes from 1900 to 2023, differentiating between three types: authoritarian manipulation, democratic reaction, and international intervention. The analysis presents a systematic empirical overview of patterns and developments of U-Turns. A key finding is that 52% of all autocratization episodes become U-Turns, which increases to 73% when focusing on the last 30 years. The vast majority of U-Turns (90%) lead to restored or even improved levels of democracy. The data on U-Turn episodes opens up new avenues for research on autocratization and democratization that were previously treated as isolated processes, particularly it could help us understand why some processes of autocratization trigger a successful pro-democratic backlash – a critical question during the starkest-ever wave of autocratization.

Phil 9.18.2025

Going for a big-ish ride, since it’s been raining for the last two days. But in the meantime, this is a very cool rendering of random walks on a 2D grid:

Now I just need to start saving out the sequences to a CSV file and use those sequences to train a W2V model. The nice thing is that the data parameters are very adjustable, so it’s possible to see how much data is needed and what the minimum number of dimensions should be.

Phil 9.17.2025

[2509.11391] “My Boyfriend is AI”: A Computational Analysis of Human-AI Companionship in Reddit’s AI Community

  • Human-AI interaction researchers face an overwhelming challenge: synthesizing insights from thousands of empirical studies to understand how AI impacts people and inform effective design. Existing approach for literature reviews cluster papers by similarities, keywords or citations, missing the crucial cause-and-effect relationships that reveal how design decisions impact user outcomes. We introduce the Atlas of Human-AI Interaction, an interactive web interface that provides the first systematic mapping of empirical findings across 1,000+ HCI papers using LLM-powered knowledge extraction. Our approach identifies causal relationships, and visualizes them through an AI-enabled interactive web interface as a navigable knowledge graph. We extracted 2,037 empirical findings, revealing research topic clusters, common themes, and disconnected areas. Expert evaluation with 20 researchers revealed the system’s effectiveness for discovering research gaps. This work demonstrates how AI can transform literature synthesis itself, offering a scalable framework for evidence-based design, opening new possibilities for computational meta-science across HCI and beyond.

Tasks

  • LLC email – call, actually. Left a message
  • Need to clean up the shop
  • Contact painter
  • Register for TEDx Mid Atlantic

SBIRs

  • Document the Dash code – done
  • Generalize out to n dimensions, and maybe make the dimensions choosable – Made the dimensions ordered by Manhattan distance
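The n-dimensional, Manhattan-ordered neighborhood mentioned above can be sketched as follows; the function name, the radius parameter, and the offset enumeration are assumptions for illustration:

```python
# Sketch of enumerating a grid point's neighbors in n dimensions,
# ordered by Manhattan distance. Works for any dimension count;
# the radius bound and naming are illustrative.
from itertools import product

def neighbors_by_manhattan(point, radius):
    """Return the n-dim grid cells within `radius` of `point`
    (excluding the point itself), sorted by Manhattan distance."""
    n = len(point)
    offsets = product(range(-radius, radius + 1), repeat=n)
    cells = [tuple(p + d for p, d in zip(point, off))
             for off in offsets
             if 0 < sum(abs(d) for d in off) <= radius]
    return sorted(cells, key=lambda c: sum(abs(a - b) for a, b in zip(c, point)))

# In 2D with radius 1 this yields just the four axis neighbors.
print(neighbors_by_manhattan((0, 0), 1))
```

Because the dimension count is just `len(point)`, the same function covers the 2D grid case and the generalized one.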

Generative Agents

  • 3:00 Alden meeting

Phil 9.16.2025

  • LLC email
  • Poke at the human OS chapter. Completely reworked. Much happier.
  • Email to Nellie – done
  • The Lathe is gone! Need to clean up the shop

SBIRs

  • Training – DONE
  • Made really good progress on data generation and visualization.