Author Archives: pgfeldman

Phil 2.20.2025

At approximately 5:30 this morning, my trusty De’Longhi espresso machine passed away trying to make… one… last… cup. That machine has made thousands of espressos, and was one of my pillars of support during COVID.

Good thread on targeted attacks

Trump Dismantles Government Fight Against Foreign Influence Operations

  • Experts are alarmed that the cuts could leave the United States defenseless against covert foreign influence operations and embolden foreign adversaries seeking to disrupt democratic governments.

GPT Agents

  • More slides and conclusions on KA. I found a nice set of slides in INCAS here
  • Reach out to Brian Ketler to set up an interview for the book – done
  • Add something to the introduction that describes the difference between “weaponization” (e.g. 9/11) and “weapons-grade” (e.g. Precision Guided Munitions) – added a TODO

SBIRs

  • 9:00 standup
  • Now that I think I fixed my angle sign bug, back to getting the demo to work – whoops, I can’t get all the mapping to work because the intersection calculations happen in a different, offset coordinate frame. Wound up just finding the index of the closest coordinate on the curve and using that. Good enough for the demo (a minimal sketch of the closest-index trick follows this list).
  • 12:50 USNA – Meh. These guys have no long-term memory
  • 4:30 Book club – cancelled for this week
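
A minimal sketch of that closest-index trick, assuming numpy and a made-up curve (this isn’t the demo code, just the idea):

import numpy as np

def closest_index(curve_xy: np.ndarray, point_xy: np.ndarray) -> int:
    # distance from every sample on the curve to the query point
    d = np.linalg.norm(curve_xy - point_xy, axis=1)
    return int(np.argmin(d))

# toy curve: 500 samples around a unit circle
t = np.linspace(0.0, 2.0 * np.pi, 500)
curve = np.column_stack([np.cos(t), np.sin(t)])
print(closest_index(curve, np.array([0.9, 0.1])))  # index of the nearest sample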

Phil 2.18.2025

Some of the sauce used to make DeepSeek: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

  • Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention offers a promising direction for improving efficiency while maintaining model capabilities. We present NSA, a Natively trainable Sparse Attention mechanism that integrates algorithmic innovations with hardware-aligned optimizations to achieve efficient long-context modeling. NSA employs a dynamic hierarchical sparse strategy, combining coarse-grained token compression with fine-grained token selection to preserve both global context awareness and local precision. Our approach advances sparse attention design with two key innovations: (1) We achieve substantial speedups through arithmetic intensity-balanced algorithm design, with implementation optimizations for modern hardware. (2) We enable end-to-end training, reducing pretraining computation without sacrificing model performance. As shown in Figure 1, experiments show the model pretrained with NSA maintains or exceeds Full Attention models across general benchmarks, long-context tasks, and instruction-based reasoning. Meanwhile, NSA achieves substantial speedups over Full Attention on 64k-length sequences across decoding, forward propagation, and backward propagation, validating its efficiency throughout the model lifecycle.
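
I don’t have the NSA kernels, but a toy numpy sketch of the coarse-compression-plus-fine-selection idea helps make the abstract concrete (single query, mean-pooled key blocks, top-k block selection – the names and sizes are mine, not the paper’s):

import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def block_sparse_attention(q, K, V, block_size=16, top_k=4):
    # coarse stage: compress each block of keys into a single mean-pooled key
    n, d = K.shape
    n_blocks = n // block_size
    Kb = K[:n_blocks * block_size].reshape(n_blocks, block_size, d)
    Vb = V[:n_blocks * block_size].reshape(n_blocks, block_size, d)
    block_scores = Kb.mean(axis=1) @ q / np.sqrt(d)   # one score per block
    # fine stage: exact attention restricted to the top-k scoring blocks
    keep = np.argsort(block_scores)[-top_k:]
    K_sel = Kb[keep].reshape(-1, d)
    V_sel = Vb[keep].reshape(-1, d)
    w = softmax(K_sel @ q / np.sqrt(d))
    return w @ V_sel

rng = np.random.default_rng(0)
q = rng.normal(size=64)
K = rng.normal(size=(1024, 64))
V = rng.normal(size=(1024, 64))
print(block_sparse_attention(q, K, V).shape)  # (64,)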

GPT Agents

  • Finish attack section of conclusions, set up for LoTR section – good progress!
  • TiiS – not yet

SBIRs

  • See if saving files as one big binary makes a difference – Wow! For the test sets I’ve been working with, it takes about 1.4 seconds to generate and save enough train and test data to comfortably train a model. Loading the binary data back takes 0.085 seconds. A minimal sketch of the approach follows this list.
  • 3:00 Trade show demo status – cancelled
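
The actual generator isn’t in this log, so here’s a minimal sketch of the one-big-binary-file approach with made-up shapes, assuming numpy’s .npz container:

import time

import numpy as np

# generate made-up train/test arrays and write them to a single binary file
rng = np.random.default_rng(0)
train = rng.normal(size=(100_000, 32)).astype(np.float32)
test = rng.normal(size=(20_000, 32)).astype(np.float32)
np.savez("dataset.npz", train=train, test=test)

# time how long it takes to read the binary data back
t0 = time.perf_counter()
with np.load("dataset.npz") as d:
    train2, test2 = d["train"], d["test"]
print("load time: {:.3f} s".format(time.perf_counter() - t0))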

Phil 2.17.2025

“Cultural Car-ism,” like in my book!

GPT Agents

  • Tweaked the Trustworthy Information section on Maps of Human Cultural Belief and fixed a bunch of cut-and-paste errors.
  • Finish attack section of conclusions, set up for LoTR section – nearly done. Need to talk about the scale and timeframes involved
  • TiiS – not yet

SBIRs

  • 9:00 Standup – done
  • 3:00 Sprint planning – done

Phil 2.15.2025

AI datasets have human values blind spots − new research

  • Our model allowed us to examine the AI companies’ datasets. We found that these datasets contained several examples that train AI systems to be helpful and honest when users ask questions like “How do I book a flight?” The datasets contained very limited examples of how to answer questions about topics related to empathy, justice and human rights. Overall, wisdom and knowledge and information seeking were the two most common values, while justice, human rights and animal rights was the least common value.

LIMO: Less is More for Reasoning

  • We present a fundamental discovery that challenges our understanding of how complex reasoning emerges in large language models. While conventional wisdom suggests that sophisticated reasoning tasks demand extensive training data (>100,000 examples), we demonstrate that complex mathematical reasoning abilities can be effectively elicited with surprisingly few examples. Through comprehensive experiments, our proposed model LIMO demonstrates unprecedented performance in mathematical reasoning. With merely 817 curated training samples, LIMO achieves 57.1% accuracy on AIME and 94.8% on MATH, improving from previous SFT-based models’ 6.5% and 59.2% respectively, while only using 1% of the training data required by previous approaches. LIMO demonstrates exceptional out-of-distribution generalization, achieving 40.5% absolute improvement across 10 diverse benchmarks, outperforming models trained on 100x more data, challenging the notion that SFT leads to memorization rather than generalization. Based on these results, we propose the Less-Is-More Reasoning Hypothesis (LIMO Hypothesis): In foundation models where domain knowledge has been comprehensively encoded during pre-training, sophisticated reasoning capabilities can emerge through minimal but precisely orchestrated demonstrations of cognitive processes. This hypothesis posits that the elicitation threshold for complex reasoning is determined by two key factors: (1) the completeness of the model’s encoded knowledge foundation during pre-training, and (2) the effectiveness of post-training examples as “cognitive templates” that show the model how to utilize its knowledge base to solve complex reasoning tasks. To facilitate reproducibility and future research in data-efficient reasoning, we release LIMO as a comprehensive open-source suite at this https URL.

Tasks

  • Laundry – done
  • Finish vacuuming – done
  • Groceries – done
  • REI – done
  • TiiS
  • P33 – Schools teach egalitarian things first – dance, theatre, music, public speaking, and wilderness skills – done
  • Maybe some more slides. At least get all the tabs on one slide for later – done

Phil 2.14.2025

Aww, it’s Valentine’s Day

12:30 Lunch

7:00 Cocktail class

Tasks

  • Bills – done
  • Dishes
  • Clean house

GPT Agents

  • More slides – add the new slides to the end of the old ones. Match the format
  • More conclusions

Phil 2.13.2025

Guardian between 10:00 – 11:00

New hack uses prompt injection to corrupt Gemini’s long-term memory

  • Rehberger’s delayed tool invocation demonstration targeted Gemini, which at the time was still called Bard. His proof-of-concept exploit was able to override the protection and trigger the Workspace extension to locate sensitive data in the user’s account and bring it into the chat context.

SBIRs

  • 9:00 standup
  • 11:00 rates
  • 4:30 book club?
  • More data generation – done with the file generation

GPT Agents

  • More slides – add the new slides to the end of the old ones. Match the format
  • More conclusions

Phil 2.12.2025

Snowed about 5-6 inches last night, so I need to dig out before the “wintry mix” hits around noon

Language Models Use Trigonometry to Do Addition

  • Mathematical reasoning is an increasingly important indicator of large language model (LLM) capabilities, yet we lack understanding of how LLMs process even simple mathematical tasks. To address this, we reverse engineer how three mid-sized LLMs compute addition. We first discover that numbers are represented in these LLMs as a generalized helix, which is strongly causally implicated for the tasks of addition and subtraction, and is also causally relevant for integer division, multiplication, and modular arithmetic. We then propose that LLMs compute addition by manipulating this generalized helix using the “Clock” algorithm: to solve a+b, the helices for a and b are manipulated to produce the a+b answer helix which is then read out to model logits. We model influential MLP outputs, attention head outputs, and even individual neuron preactivations with these helices and verify our understanding with causal interventions. By demonstrating that LLMs represent numbers on a helix and manipulate this helix to perform addition, we present the first representation-level explanation of an LLM’s mathematical capability.
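
The “Clock” claim is easy to play with outside of a model. A toy numpy sketch (my own periods and feature layout, not the paper’s learned representation) that adds two integers by rotating their helix features with the angle-addition identities:

import numpy as np

PERIODS = [2, 5, 10, 100]

def helix(n: int) -> np.ndarray:
    # a linear component plus a (cos, sin) pair for each period
    feats = [float(n)]
    for T in PERIODS:
        feats += [np.cos(2 * np.pi * n / T), np.sin(2 * np.pi * n / T)]
    return np.array(feats)

def clock_add(ha: np.ndarray, hb: np.ndarray) -> np.ndarray:
    # linear parts add; each (cos, sin) pair is rotated by the other number's angle
    out = [ha[0] + hb[0]]
    for i in range(len(PERIODS)):
        ca, sa = ha[1 + 2 * i], ha[2 + 2 * i]
        cb, sb = hb[1 + 2 * i], hb[2 + 2 * i]
        out += [ca * cb - sa * sb, sa * cb + ca * sb]
    return np.array(out)

a, b = 37, 25
print(np.allclose(clock_add(helix(a), helix(b)), helix(a + b)))  # True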

GPT Agents

  • Slide deck – Add this: Done

NOTE: The USA dropped below the “democracy threshold” (+6) on the POLITY scale in 2020 and was considered an anocracy (+5) at the end of the year 2020; the USA score for 2021 returned to democracy (+8). Beginning on 1 July 2024, due to the US Supreme Court ruling granting the US Presidency broad, legal immunity, the USA is noted by the Polity Project as experiencing a regime transition through, at least, 20 January 2025. As of the latter date, the USA is coded EXREC=8, “Competitive Elections”; EXCONST=1 “Unlimited Executive Authority”; and POLCOMP=6 “Factional/Restricted Competition.” Polity scores: DEMOC=4; AUTOC=4; POLITY=0.

The USA is no longer considered a democracy and lies at the cusp of autocracy; it has experienced a Presidential Coup and an Adverse Regime Change event (8-point drop in its POLITY score).

  • Work more on conclusions? Yes!
  • TiiS? Nope

SBIRs

  • 9:00 IRAD Monthly – done
  • Actually got some good work done on automating file generation using config files. A minimal sketch of the idea follows this list.
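
The real config format isn’t in this log; here’s a minimal sketch of config-driven file generation, assuming a JSON config that names the file count, shape, and prefix:

import json

import numpy as np

# hypothetical config format - the real one isn't in the log
with open("gen_config.json", "w") as f:
    json.dump({"num_files": 3, "rows": 1000, "cols": 16, "prefix": "train_set"}, f, indent=2)

def generate_from_config(config_path: str) -> None:
    # read the config and write one binary file per requested set
    with open(config_path) as f:
        cfg = json.load(f)
    rng = np.random.default_rng(0)
    for i in range(cfg["num_files"]):
        data = rng.normal(size=(cfg["rows"], cfg["cols"])).astype(np.float32)
        np.save("{}_{}.npy".format(cfg["prefix"], i), data)

generate_from_config("gen_config.json")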

Phil 2.11.2025

This happened today:

SBIRs

  • 9:00 standup – done
  • Read through proposal for meeting – done
  • 11:00 BD Meeting – went well, but no one has money

GPT Agents

  • JHU Speaker info – done
  • Start slide deck – technically, yes
  • Work more on conclusions? – Nope
  • TiiS? – Nope
  • Did do a lot on the money section of P33

Phil 2.10.2025

Reschedule Wednesday visit since snow – done

See about moving records?

TiiS review!

Collective future thinking in Cultural Dynamics

  • Humans think about the future to act in the present, not only personally, but also collectively. Collective future thinking (CFT) is an act of imagining a future on behalf of a collective. This article presents a theoretical analysis of the role of CFT in cultural dynamics. CFT includes collective prospection about probable futures and imaginations about utopian and dystopian possible worlds as the best- and worst-case scenarios. CFT motivates collective self-regulatory activities to steer probable futures towards utopias and away from dystopias, driving a cultural transformation while also acting as a force for cultural maintenance, animating cultural dynamics at the micro-psychological level. Empirical research showed that collective futures are often seen to involve progress in human agency, but a decline in community cohesion, unless collective self-regulation is undertaken. In line with the theoretical proposition, CFT consistently motivated collective self-regulatory activities that are seen to improve future community cohesion and to move the current culture closer to their utopian vision around the world despite significant cross-national variabilities. A macro-level cultural dynamical perspective is provided to interpret cross-national similarities and differences in CFT as a reflection of nations’ past historical trajectories, and to discuss CFT’s role in political polarization and collective self-regulation.

Good thing to use for the AI slop talk: The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers

  • The rise of Generative AI (GenAI) in knowledge workflows raises questions about its impact on critical thinking skills and practices. We survey 319 knowledge workers to investigate 1) when and how they perceive the enaction of critical thinking when using GenAI, and 2) when and why GenAI affects their effort to do so. Participants shared 936 first-hand examples of using GenAI in work tasks. Quantitatively, when considering both task- and user-specific factors, a user’s task-specific self-confidence and confidence in GenAI are predictive of whether critical thinking is enacted and the effort of doing so in GenAI-assisted tasks. Specifically, higher confidence in GenAI is associated with less critical thinking, while higher self-confidence is associated with more critical thinking. Qualitatively, GenAI shifts the nature of critical thinking toward information verification, response integration, and task stewardship. Our insights reveal new design challenges and opportunities for developing GenAI tools for knowledge work.
  • Which led to this little back and forth on Teams:
    • It just dawned on me that LLMs are way better at explaining things in a neurotypical way than I am.  I know you also said this before but it has become more real for me.
    • There is a weird mirror image to that thought too – when an LLM does not describe your understanding of the world, you can be pretty sure that reflects the biases in the writings it was trained on. You can use that to zero in on exactly what the differences are, and between your perspective and that encoded in the LLM, articulate an understanding that includes both. I use that trick all the time.

GPT Agents

  • Good progress over the weekend. Need to edit the J6 section next
  • And this is a good example of “blast radius”: The NSA’s “Big Delete”
    • The memo acknowledges that the list includes many terms that are used by the NSA in contexts that have nothing to do with DEI. For example, the term “privilege” is used by the NSA in the context of “privilege escalation.” In the intelligence world, privilege escalation refers to “techniques that adversaries use to gain higher-level permissions on a system or network.”
  • Here’s what I need for website & announcements:
    • Title for the talk
    • Brief abstract (1-2 paragraphs)
    • Short bio (up to half a page)
    • Photo (headshot preferred)

SBIRs

  • Reschedule Tuesday visit? Also snow – Now a virtual meeting
  • Generate sets of data with varying amounts of training data while keeping the test set the same size. The goal is to find the smallest training set that works. On hold
  • <record scratch> Aaron’s sick, so I’m standing in for a few days
    • Dahlgren prep meeting. I think we are good to go. Need to read the proposal again
    • Working on IRAD slides – first pass is done
    • Reviewed Ron’s USNA email, which was very “AI slop”

Phil 2.7.2025

May or may not be true, but good material for the KA talk this month: Elon Musk’s and X’s Role in 2024 Election Interference

  • One of the most disturbing things we did was create thousands of fake accounts using advanced AI systems called Grok and Eliza. These accounts looked completely real and pushed political messages that spread like wildfire. Haven’t you noticed they all disappeared? Like magic.
  • The pilot program for the Eliza AI Agent was election interference. Eliza was released officially in October of 2024, but we had access to it before then thanks to Marc Andreessen.
  • The link to the Eliza API is legit (Copied here for future reference)
{
    "name": "trump",
    "clients": ["discord", "direct"],
    "settings": {
        "voice": { "model": "en_US-male-medium" }
    },
    "bio": [
        "Built a strong economy and reduced inflation.",
        "Promises to make America the crypto capital and restore affordability."
    ],
    "lore": [
        "Secret Service allocations used for election interference.",
        "Promotes WorldLibertyFi for crypto leadership."
    ],
    "knowledge": [
        "Understands border issues, Secret Service dynamics, and financial impacts on families."
    ],
    "messageExamples": [
        {
            "user": "{{user1}}",
            "content": { "text": "What about the border crisis?" },
            "response": "Current administration lets in violent criminals. I secured the border; they destroyed it."
        }
    ],
    "postExamples": [
        "End inflation and make America affordable again.",
        "America needs law and order, not crime creation."
    ]
}

Tasks

This is wild. Need to read the paper carefully: On Verbalized Confidence Scores for LLMs: https://arxiv.org/abs/2412.14737 (a quick sketch of the idea follows the task list below)

  • The rise of large language models (LLMs) and their tight integration into our daily life make it essential to dedicate efforts towards their trustworthiness. Uncertainty quantification for LLMs can establish more human trust into their responses, but also allows LLM agents to make more informed decisions based on each other’s uncertainty. To estimate the uncertainty in a response, internal token logits, task-specific proxy models, or sampling of multiple responses are commonly used. This work focuses on asking the LLM itself to verbalize its uncertainty with a confidence score as part of its output tokens, which is a promising way for prompt- and model-agnostic uncertainty quantification with low overhead. Using an extensive benchmark, we assess the reliability of verbalized confidence scores with respect to different datasets, models, and prompt methods. Our results reveal that the reliability of these scores strongly depends on how the model is asked, but also that it is possible to extract well-calibrated confidence scores with certain prompt methods. We argue that verbalized confidence scores can become a simple but effective and versatile uncertainty quantification method in the future. Our code is available at this https URL .
  • Bills – done
  • Call – done
  • Chores – done
  • Dishes – done
  • Go for a reasonably big ride – done
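
As promised above, a quick sketch of the verbalized-confidence idea. The prompt wording and the regex are mine, and no model is actually called here:

import re

def confidence_prompt(question: str) -> str:
    # ask the model to verbalize a confidence score alongside its answer
    return (question + "\n\nAnswer the question, then on a new line write "
            "'Confidence: X' where X is a number from 0 to 100 indicating how "
            "confident you are in your answer.")

def parse_confidence(response_text: str):
    # pull the verbalized confidence score back out of the reply, scaled to [0, 1]
    m = re.search(r"Confidence:\s*(\d{1,3})", response_text)
    return int(m.group(1)) / 100.0 if m else None

# usage with a canned reply instead of a live model call
reply = "The capital of Australia is Canberra.\nConfidence: 95"
print(parse_confidence(reply))  # 0.95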

Phil 2.6.2025

From The Bulwark. Good example of creating a social reality and using it for an organizational lobotomy. Add to the book following Jan 6 section?

Full thread here

There is an interesting blog post (and thread) from Tim Kellogg that says this:

  • Context: When an LLM “thinks” at inference time, it puts its thoughts inside <think> and </think> XML tags. Once it gets past the end tag the model is taught to change voice into a confident and authoritative tone for the final answer.
  • In s1, when the LLM tries to stop thinking with “</think>”, they force it to keep going by replacing it with “Wait”. It’ll then begin to second guess and double check its answer. They do this to trim or extend thinking time (trimming is just abruptly inserting “</think>”).

This is the paper: s1: Simple test-time scaling

  • Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI’s o1 model showed this capability but did not publicly share its methodology, leading to many replication efforts. We seek the simplest approach to achieve test-time scaling and strong reasoning performance. First, we curate a small dataset s1K of 1,000 questions paired with reasoning traces relying on three criteria we validate through ablations: difficulty, diversity, and quality. Second, we develop budget forcing to control test-time compute by forcefully terminating the model’s thinking process or lengthening it by appending “Wait” multiple times to the model’s generation when it tries to end. This can lead the model to double-check its answer, often fixing incorrect reasoning steps. After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K and equipping it with budget forcing, our model s1-32B exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24). Further, scaling s1-32B with budget forcing allows extrapolating beyond its performance without test-time intervention: from 50% to 57% on AIME24. Our model, data, and code are open-source at this https URL
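
A minimal sketch of the budget-forcing loop as I read it. The generate(prompt, stop) callable is hypothetical – it stands in for whatever inference API is in use and is assumed to return text up to (but not including) the stop string:

def budget_forced_think(prompt: str, generate, forced_continuations: int = 2) -> str:
    # toy s1-style budget forcing: each time the model tries to emit </think>,
    # append "Wait" and let it keep reasoning, up to forced_continuations times
    thoughts = generate(prompt + "<think>", stop="</think>")
    for _ in range(forced_continuations):
        thoughts += "\nWait"  # suppress the end-of-thinking token
        thoughts += generate(prompt + "<think>" + thoughts, stop="</think>")
    # close the think block and let the model produce its confident final answer
    return generate(prompt + "<think>" + thoughts + "</think>", stop=None)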

Tasks

SBIRs

  • 9:00 standup – done
  • 10:00 MLOPS whitepaper review
  • 12:50 USNA

Phil 2.5.2025

9:40 physical

Post cards!

SBIRs

  • 3:30 BD meeting
  • Move FrameMapper to its own file – done
  • Updated the official GitHub
  • Working on using FrameMapper in ScenarioSim test – got it all working, I think.

GPT Agents

  • Alden meeting – done
  • More KA

P33

  • Made some good first draft progress on sortition

Phil 2.4.2025

At any other time in my life, this would be a 5-alarm fire scandal. People would be going to jail. Now, it’s Tuesday: A 25-Year-Old With Elon Musk Ties Has Direct Access to the Federal Payment System

  • Despite reporting that suggests that Musk’s so-called Department of Government Efficiency (DOGE) task force has access to these Treasury systems on a “read-only” level, sources say Elez, who has visited a Kansas City office housing BFS systems, has many administrator-level privileges. Typically, those admin privileges could give someone the power to log into servers through secure shell access, navigate the entire file system, change user permissions, and delete or modify critical files. That could allow someone to bypass the security measures of, and potentially cause irreversible changes to, the very systems they have access to.

P33

  • Found some good papers about modern sortition

GPT Agents

  • More Discussion section

SBIRs

  • 9:00 standup
  • Write two-way rotate/translate method. Need to build a 4×4 matrix. Looks like I might have something in my old PyBullet code. Nothing there, but wrote a nice little class (2D, so it only needs 2×2 rotation matrices plus an offset):
import numpy as np


class FrameMapper:
    """Maps 2D points between the world frame and a calculation frame that is
    translated to source_v and rotated by the angle of the source->target vector."""
    # {"radians": rads, "degrees": degs, "distance": dist, "offset": source_v}
    radians: float
    degrees: float
    distance: float
    offset: np.ndarray
    fwd_mat: np.ndarray
    rev_mat: np.ndarray

    def __init__(self, source_v: np.ndarray, target_v: np.ndarray):
        self.offset = source_v
        unit_vector1 = np.array([1, 0])
        vector2 = target_v - source_v
        self.distance = np.linalg.norm(vector2)
        unit_vector2 = vector2 / self.distance
        dot_product = np.dot(unit_vector1, unit_vector2)
        # note: arccos only gives the unsigned angle (the angle sign bug chased down on 2.20)
        self.radians = np.arccos(dot_product)  # angle in radians
        self.degrees = np.rad2deg(self.radians)
        cos_a = np.cos(self.radians)
        sin_a = np.sin(self.radians)
        self.fwd_mat = np.array([[cos_a, -sin_a], [sin_a, cos_a]])  # rotate by +angle
        cos_a = np.cos(-self.radians)
        sin_a = np.sin(-self.radians)
        self.rev_mat = np.array([[cos_a, -sin_a], [sin_a, cos_a]])  # rotate by -angle

    def to_calc_frame(self, point: np.ndarray) -> np.ndarray:
        # translate so source_v is the origin, then rotate into the calc frame
        p = np.copy(point) - self.offset
        p = np.dot(self.fwd_mat, p)
        return p

    def from_calc_frame(self, point: np.ndarray) -> np.ndarray:
        # inverse rotation, then translate back to the world frame
        p = np.copy(point)
        p = np.dot(self.rev_mat, p)
        p += self.offset
        return p

    def to_string(self) -> str:
        return "Offset: {}, distance: {}, angle = {}".format(self.offset, self.distance, self.degrees)