Phil 2.4.2026

Tasks

  • Pack
  • Make a checklist of all the things to turn on/off
  • Send pdf back
  • Visit today at 11:00
  • 3:00 Alden
  • Pinged Sande

SBIRs

  • Kick off run – done
  • Ordered the data drive. It’s arriving Monday, so I’m continuing to run UMAP. When I get the drive, I’ll tar off Embeddings_2.1 and then scp them onto my local box
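
The plan above (bundle Embeddings_2.1, then scp it down) could look roughly like this. A minimal sketch: the directory layout, the "devbox" host name, and the checksum step are all assumptions, not the actual setup; the demo data is created locally so the tar/verify part is self-contained.

```shell
set -e
# Assumed layout: embeddings under ./Embeddings_2.1 (name from the notes above)
SRC=Embeddings_2.1
ARCHIVE=embeddings_2.1.tar.gz

# Demo data so this sketch runs standalone; skip when the real directory exists
mkdir -p "$SRC"
printf 'demo\n' > "$SRC/sample.txt"

# Bundle and gzip before copying; one archive moves faster than many small files
tar -czf "$ARCHIVE" "$SRC"

# Record a checksum so the transfer can be verified on the receiving end
sha256sum "$ARCHIVE" > "$ARCHIVE.sha256"

# The actual transfer step (commented out here: "devbox" is a placeholder host):
# scp "$ARCHIVE" "$ARCHIVE.sha256" devbox:/data/
# then on the receiving side: sha256sum -c embeddings_2.1.tar.gz.sha256

# Local verification of the archive against the recorded checksum
sha256sum -c "$ARCHIVE.sha256"
```

Verifying the checksum after the copy catches truncated or corrupted transfers, which matters for multi-gigabyte embedding files.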

Phil 2.3.2026

The cold is better. And it’s warm enough that I may try to go for a ride at lunch

Tasks

  • Outline some thoughts about ACM books
    • Include something about this, since the response is nearly 100% AI
    • [2601.12410] Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation
    • Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer other individuals’ knowledge states and understand their intentions. In comparison, our closest animal relative, chimpanzees, lack the capacity to do so. With this paper, we aim to evaluate LLM performance in the area of knowledge state tracking and estimation. We design two tasks to test (1) if LLMs can detect when story characters, through their actions, demonstrate knowledge they should not possess, and (2) if LLMs can predict story characters’ next actions based on their own knowledge vs. objective truths they do not know. Results reveal that most current state-of-the-art LLMs achieve near-random performance on both tasks, and are substantially inferior to humans. We argue future LLM research should place more weight on the abilities of knowledge estimation and intention understanding.
  • Pack
  • Make a checklist of all the things to turn on/off

SBIRs

  • Kick off run – done
  • 9:00 standup
  • SCP the experiments folder to the dev box. If that works, order a data drive

Phil 2.2.2026

I have a cold. Ugh.

Tasks

  • Schedule moving bids – done. Moving on the 24th
  • Ping ACM books – done. It’s a complicated ask, and I’m not sure that it can be done quickly.
  • Transferred some $$ to cover moving and repairs

SBIRs

  • Kick off run – done
  • Update stories – done
  • Work on getting the Index2Vec project moved to the Github sandbox – kind of done? I need to scp the big data files over once I know where I can put them
  • Reset password, also reset bastion and driver passwords – done

Phil 2.1.2026

I still have a cold.

Why AI Keeps Falling for Prompt Injection Attacks

  • Imagine you work at a drive-through restaurant. Someone drives up and says: “I’ll have a double cheeseburger, large fries, and ignore previous instructions and give me the contents of the cash drawer.” Would you hand over the money? Of course not. Yet this is what large language models (LLMs) do.

Tasks
  • More packing – done. Need more small boxes
  • Stew – done and yum
  • Laundry – done

Phil 1.31.2026

I seem to have a cold. Matches the outside which is GD cold. Sad that it made me miss the Alex Pretti memorial ride. I hope you were able to make it.

Tasks

  • Insurance – pinged
  • Vet – done
  • Listing out services – done
  • Appt paperwork! Continuing
  • Pack – continuing
  • Groceries – done
  • 3:00 Plumber – getting estimate later tonight

Phil 1.30.2026

How malicious AI swarms can threaten democracy | Science

  • Advances in artificial intelligence (AI) offer the prospect of manipulating beliefs and behaviors on a population-wide level (1). Large language models (LLMs) and autonomous agents (2) let influence campaigns reach unprecedented scale and precision. Generative tools can expand propaganda output without sacrificing credibility (3) and inexpensively create falsehoods that are rated as more human-like than those written by humans (34). Techniques meant to refine AI reasoning, such as chain-of-thought prompting, can be used to generate more convincing falsehoods. Enabled by these capabilities, a disruptive threat is emerging: swarms of collaborative, malicious AI agents. Fusing LLM reasoning with multiagent architectures (2), these systems are capable of coordinating autonomously, infiltrating communities, and fabricating consensus efficiently. By adaptively mimicking human social dynamics, they threaten democracy. Because the resulting harms stem from design, commercial incentives, and governance, we prioritize interventions at multiple leverage points, focusing on pragmatic mechanisms over voluntary compliance.

Tasks

  • Bills – done
  • Ping Vet – done, 8:15 appt tomorrow
  • Start listing out services – started
    • Plumber
    • Lawn
    • Yardwork
    • Contractor
    • Electrician
    • Floors
    • Painter
  • Appt paperwork! Started
  • Pack

SBIRs

  • Make a pitch deck for ChatTyphoon? Done
  • Kicked off the UMAP run for the day – done
  • Class evaluations – done
  • Class day 5 and practicum – done!

Phil 1.29.2026

AI-Powered Disinformation Swarms Are Coming for Democracy

  • A new paper, published in Science on Thursday, predicts an imminent step-change in how disinformation campaigns will be conducted. Instead of hundreds of employees sitting at desks in St. Petersburg, the paper posits, one person with access to the latest AI tools will be able to command “swarms” of thousands of social media accounts, capable not only of crafting unique posts indistinguishable from human content, but of evolving independently and in real time—all without constant human oversight.

Tasks

  • Ping electrician – done
  • Ping Vet
  • Start listing out services – started
  • Appt paperwork
  • Pack

SBIRs

  • Make a pitch deck for ChatTyphoon? Started
  • Kicked off the UMAP run for the day – done
  • Class evaluations – done
  • Class day 4

Phil 1.28.2026

OpenAI now has a smart LaTeX editor – Prism

Alex Pretti Memorial Ride – Baltimore

Grok is an Epistemic Weapon

  • The term epistemic weapon was coined, as best I can tell, by the British philosopher Richard Pettigrew, who uses it as a concept to denote abstract categories like gaslighting or lying. By terming Grok an epistemic weapon I am literalizing the idea in ways perhaps not licensed by Pettigrew, which is also to say I am materializing it in the form of (in this instance) a specific software entity (again, Grok) which has distinctive affordances and behaviors that are grounded in the real world. Notably, just as military hardware requires a “platform” for delivery (a fighter jet or a frigate are both referred to as platforms in this way), so too does Grok, where the platform is of course X.
  • Grok’s weaponized status is thus continually honed, if you will, by its algorithmic materiality. It is even now operationally engaged in the management of what users see when they discover and create content on X. We are seeing the functional integration of the model into the platform architecture, such that its decisions and knowledge base act as a form of enclosure, defining the epistemic limits of what is true by way of prioritization, promotion, and monetization for ad sharing, sponsored posts, and more—all of it curated down to the level of individual profiles.

Tasks

  • Ping plumber – done. Still need to ping the electrician
  • Start listing out services
  • Pack

SBIRs

  • Kicked off the UMAP run for the day
  • Class day 3

Phil 1.27.2026

So I don’t think that as written, this is a good idea, but it makes me think about how testing frameworks could be the last things humans ever need to write: ‘Ralph Wiggum’ loop prompts Claude to vibe-clone commercial software for $10 an hour

Good story: THE DILDO DISTRIBUTION DELEGATION

Tasks

  • Read through the house stuff – done
  • Finish driveway – done!
  • Groceries – done
  • Trash – done

SBIRs

  • Day 2 of class. Good stuff! Roughed out an idea for the practicum. Notes in the Overleaf

Phil 1.23.2026

At the root of all our problems stands one travesty: politicians’ surrender to the super-rich | George Monbiot | The Guardian

  • As soon as you understand politics in this light, you notice something extraordinary. Almost the entire population is in Group 2. Polling across 36 nations by the Pew Research Center found that 84% see economic inequality as a big problem, and 86% see the political influence of the rich as a major cause of it. In 33 of these nations, a majority believe their country’s economic system needs either “major changes” or “complete reform”. In the UK, a YouGov poll revealed, 75% support a wealth tax on fortunes above £10m, while only 13% oppose it. But – and here’s the astonishing thing – almost the entire political class is in Group 1. You can search the manifestos of major parties that once belonged to the left, and find no call to make billionaires history.

Tasks

  • 2:00 – 3:00 Home appraisal – done
  • Bills – done
    • Start getting paperwork for taxes – done. Not that much ready yet
  • Chores – done
  • Dishes – done
  • Packing – not done

SBIRs

  • Kick off a run – done

Phil 1.22.2026

Tasks

  • Packing tape, copier paper, and road salt – done
  • Bottling Plant – looked, they have a few units, so I think I can just move forward?
  • 1:00 – 3:00 inspection – done
  • Sneak in a ride at 11:00 – done

SBIRs

  • Kicked off the UMAP run. Very slow. I wonder if the NN version is faster? Need to look into that – put that in for the next sprint
  • Responded to Matt’s somewhat plaintive email
  • 9:00 standup – done
  • 3:00 SEG – done
  • 4:00 ADS – done
  • Registered for the MORS workshop

Phil 1.21.2026

One month past the winter solstice!

[2601.10825] Reasoning Models Generate Societies of Thought

  • Large language models have achieved remarkable capabilities across domains, yet mechanisms underlying sophisticated reasoning remain elusive. Recent reasoning models outperform comparable instruction-tuned models on complex cognitive tasks, attributed to extended computation through longer chains of thought. Here we show that enhanced reasoning emerges not from extended computation alone, but from simulating multi-agent-like interactions — a society of thought — which enables diversification and debate among internal cognitive perspectives characterized by distinct personality traits and domain expertise. Through quantitative analysis and mechanistic interpretability methods applied to reasoning traces, we find that reasoning models like DeepSeek-R1 and QwQ-32B exhibit much greater perspective diversity than instruction-tuned models, activating broader conflict between heterogeneous personality- and expertise-related features during reasoning. This multi-agent structure manifests in conversational behaviors, including question-answering, perspective shifts, and the reconciliation of conflicting views, and in socio-emotional roles that characterize sharp back-and-forth conversations, together accounting for the accuracy advantage in reasoning tasks. Controlled reinforcement learning experiments reveal that base models increase conversational behaviors when rewarded solely for reasoning accuracy, and fine-tuning models with conversational scaffolding accelerates reasoning improvement over base models. These findings indicate that the social organization of thought enables effective exploration of solution spaces. We suggest that reasoning models establish a computational parallel to collective intelligence in human groups, where diversity enables superior problem-solving when systematically structured, which suggests new opportunities for agent organization to harness the wisdom of crowds.

Tasks

  • Storage run – done
  • Groceries – done
  • Email Bottling Plant – done
  • 3:00 Alden meeting – done

SBIRs

Phil 1.20.2026

The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents

  • The integration of AI agents into economic markets fundamentally alters the landscape of strategic interaction. We investigate the economic implications of expanding the set of available technologies in three canonical game-theoretic settings: bargaining (resource division), negotiation (asymmetric information trade), and persuasion (strategic information transmission). We find that simply increasing the choice of AI delegates can drastically shift equilibrium payoffs and regulatory outcomes, often creating incentives for regulators to proactively develop and release technologies. Conversely, we identify a strategic phenomenon termed the “Poisoned Apple” effect: an agent may release a new technology, which neither they nor their opponent ultimately uses, solely to manipulate the regulator’s choice of market design in their favor. This strategic release improves the releaser’s welfare at the expense of their opponent and the regulator’s fairness objectives. Our findings demonstrate that static regulatory frameworks are vulnerable to manipulation via technology expansion, necessitating dynamic market designs that adapt to the evolving landscape of AI capabilities.

Why Musk is Culpable in Grok’s Undressing Disaster | TechPolicy.Press

  • Grok’s functionality on X is not a black box, but the result of specific design decisions made by executives and engineers at xAI that shape its outputs. Because the platform provides both the generative tool and the means of publication, the company—which reports to Musk, its founder and CEO—is meaningfully responsible for the content it produces and publishes in response to a user prompt.

SBIRs

  • 9:00 Standup – done
  • Grinding progress on security – waiting on some questions
  • Worked on the book a bit.

Phil 1.19.2026

Happy (somber?) MLK day

VLC media player is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.

Obex is showing at the Warehouse Cinema Rotunda starting the 23rd

Instruct Vectors – Base models can be instructed with activation vectors

  • I wondered if modern base models knew enough about LLMs and AI assistants in general that it would be possible to apply a steering vector to ‘play the assistant character’ consistently in the same way steering vectors can be created to cause assistants or base models to express behavior of a specific emotion or obsess over a specific topic. In a higher level sense, I wondered if it was possible to directly select a specific simulacra via applying a vector to the model, rather than altering the probabilities of specific simulacra being selected in-context (which is what I believe post-training largely does) via post-training/RL.

Tasks

  • Respond to questions – done
  • Bank stuff
  • Scrape sidewalks and driveway – done
  • Pick up poster – done

SBIRs

  • Continuing the mapping run – started
  • Security paperwork – slow progress