Digital sovereignty is a real problem that matters to real people and real businesses in the real world, it can be explained in concrete terms, and we can devise pragmatic strategies to improve it
SBIRs
9:30 Data loader discussion – done. Going with binary files for now
11:00 RTAT demo. Went well? A lot of the people who were going to come were sick.
Going to make a sim “script runner” that advances everything at the right times. Going to start with 2 paths (which winds up being 4), so I can keep track of what is happening. Maybe send the result to excel?
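A minimal sketch of what that script runner could look like in Python. The class, the path functions, and the CSV export are all assumptions for illustration, not the project's actual code; CSV is used because it opens directly in Excel:

```python
import csv


class ScriptRunner:
    """Advance every simulated path in lockstep and record each tick."""

    def __init__(self, paths, dt=1.0):
        self.paths = paths  # name -> function of time t returning a value
        self.dt = dt
        self.rows = []

    def run(self, t_end):
        t = 0.0
        while t <= t_end:
            row = {"t": t}
            for name, step in self.paths.items():
                row[name] = step(t)
            self.rows.append(row)
            t += self.dt
        return self.rows

    def to_csv(self, filename):
        # CSV opens directly in Excel, covering the "send to Excel" idea
        with open(filename, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(self.rows[0]))
            writer.writeheader()
            writer.writerows(self.rows)


# Two paths; if each tracks two quantities, 2 paths wind up as 4 traces
runner = ScriptRunner({"a_pos": lambda t: 2.0 * t,
                       "b_pos": lambda t: 100.0 - t})
rows = runner.run(t_end=3.0)
```

Keeping the clock in one loop means every path sees the same `t`, which makes it easy to keep track of what is happening at each step.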
We are entering a period of omnishambolic polycrisis. The ominous rumble of climate change, authoritarianism, genocide, xenophobia and transphobia has turned into an avalanche. The perpetrators of these crimes against humanity have weaponized the internet, colonizing the 21st century’s digital nervous system, using it to attack its host, threatening civilization itself.
SBIRs
9:00 Standup – done
3:00 Tradeshow meeting – done. We seem to be on track
GPT Agents
Walked through the presentation yesterday. The timing is good. Some other notes:
Explicitly label prompts and responses – done
Add a slide at the beginning that talks about “good fiction” experiments from 2022 – not sure. Maybe talk about it on the title slide
Add an end slide with Stampede Theory and Killer Apps (coming soon!) – done
In a Friday night massacre, Trump fired Gen. Charles Q. Brown Jr., the second African American to serve as chairman of the Joint Chiefs of Staff. Defense Secretary Pete Hegseth also fired Adm. Lisa Franchetti, the chief of naval operations, and Gen. James Slife, the vice chief of staff of the Air Force, along with the top lawyers — the judge advocates general — for the Air Force, Army and Navy. Another female officer — Adm. Linda Fagan, commandant of the Coast Guard, which is part of the Department of Homeland Security — was fired by the administration last month.
Hegseth justified this purge based on the supposed need to restore the U.S. military’s “warfighter ethos” and to stop focusing on DEI, or diversity, equity and inclusion. But the actual message the moves might send is far more chilling: namely, that the armed forces should be run by White men, and (as made clear in the selection of Brown’s replacement as Joint Chiefs chairman) that those men will be chosen more for perceived political loyalty than for professional qualifications.
In this context, we fielded parallel surveys of 520 political scientists (whom we refer to as “experts” below), 40 experts on online misinformation (whom we refer to as “misinformation experts” below), and a representative sample of 2,750 Americans (whom we refer to as “the public” below). These surveys, which we refer to as the February 2025 survey, were fielded from January 31 to February 10, 2025.
KYIV — The Trump administration has asked Ukraine to withdraw an annual resolution condemning Russia’s war, and wants to replace it with a toned-down U.S. statement that was perceived as being close to pro-Russian in Kyiv, according to an official and three European diplomats familiar with the plan, who spoke on the condition of anonymity to discuss a sensitive political situation between nations that have typically acted as partners.
At approximately 5:30 this morning, my trusty De’Longhi espresso machine passed away trying to make… one… last… cup. That machine has made thousands of espressos, and was one of my pillars of support during COVID.
Experts are alarmed that the cuts could leave the United States defenseless against covert foreign influence operations and embolden foreign adversaries seeking to disrupt democratic governments.
GPT Agents
More slides and conclusions on KA. I found a nice set of slides in INCAS here
Reach out to talk to Brian Ketler to interview for the book – done
Add something to the introduction that describes the difference between “weaponization” (e.g. 9/11) and “weapons-grade” (e.g. Precision Guided Munitions) – added a TODO
SBIRs
9:00 standup
Now that I think I’ve fixed my angle-sign bug, back to getting the demo to work – whoops, can’t get all the mapping to work because the intersection calculations happen in a different, offset coordinate frame. Wound up just finding the index of the closest coordinate on the curve and using that. Good enough for the demo.
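The closest-coordinate workaround can be sketched like this (the curve samples and query point are illustrative, and the real code may differ):

```python
import numpy as np


def closest_index(curve, point):
    """Return the index of the curve sample nearest to point.

    Sidesteps the offset-frame intersection math entirely: just pick
    the nearest sampled coordinate, which is good enough for a demo.
    """
    d2 = np.sum((curve - point) ** 2, axis=1)  # squared distances
    return int(np.argmin(d2))


curve = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0]])
idx = closest_index(curve, np.array([1.9, 0.8]))  # -> 2
```

Squared distances are enough for an argmin, so there is no need for a square root, and the whole thing is one vectorized pass over the curve.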
12:50 USNA – Meh. These guys have no long-term memory
Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses a significant challenge. Sparse attention offers a promising direction for improving efficiency while maintaining model capabilities. We present NSA, a Natively trainable Sparse Attention mechanism that integrates algorithmic innovations with hardware-aligned optimizations to achieve efficient long-context modeling. NSA employs a dynamic hierarchical sparse strategy, combining coarse-grained token compression with fine-grained token selection to preserve both global context awareness and local precision. Our approach advances sparse attention design with two key innovations: (1) We achieve substantial speedups through arithmetic intensity-balanced algorithm design, with implementation optimizations for modern hardware. (2) We enable end-to-end training, reducing pretraining computation without sacrificing model performance. As shown in Figure 1, experiments show the model pretrained with NSA maintains or exceeds Full Attention models across general benchmarks, long-context tasks, and instruction-based reasoning. Meanwhile, NSA achieves substantial speedups over Full Attention on 64k-length sequences across decoding, forward propagation, and backward propagation, validating its efficiency throughout the model lifecycle.
GPT Agents
Finish attack section of conclusions, set up for LoTR section – good progress!
TiiS – not yet
SBIRs
See if saving files as one big binary makes a difference – Wow! For the test sets I’ve been working with, it takes about 1.4 seconds to generate train and test sets and save enough data to comfortably train a model. Loading the binary data takes 0.085 seconds.
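One way the one-big-binary approach could look, assuming the data lives in NumPy arrays (the shapes, dtype, and file name here are illustrative, not the project's actual format):

```python
import os
import tempfile
import time

import numpy as np

# Synthetic stand-in for a generated training set (shape is illustrative)
train = np.random.rand(100_000, 32).astype(np.float32)

path = os.path.join(tempfile.gettempdir(), "train.npy")
np.save(path, train)  # one binary file: small header + raw array bytes

t0 = time.perf_counter()
loaded = np.load(path)  # single sequential read, no per-row parsing
elapsed = time.perf_counter() - t0
```

Loading one contiguous binary blob skips the per-line parsing cost of text formats entirely, which is consistent with sub-0.1-second load times for data that takes over a second to generate.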
Our model allowed us to examine the AI companies’ datasets. We found that these datasets contained several examples that train AI systems to be helpful and honest when users ask questions like “How do I book a flight?” The datasets contained very limited examples of how to answer questions about topics related to empathy, justice and human rights. Overall, wisdom and knowledge and information seeking were the two most common values, while justice, human rights and animal rights was the least common value.
We present a fundamental discovery that challenges our understanding of how complex reasoning emerges in large language models. While conventional wisdom suggests that sophisticated reasoning tasks demand extensive training data (>100,000 examples), we demonstrate that complex mathematical reasoning abilities can be effectively elicited with surprisingly few examples. Through comprehensive experiments, our proposed model LIMO demonstrates unprecedented performance in mathematical reasoning. With merely 817 curated training samples, LIMO achieves 57.1% accuracy on AIME and 94.8% on MATH, improving from previous SFT-based models’ 6.5% and 59.2% respectively, while only using 1% of the training data required by previous approaches. LIMO demonstrates exceptional out-of-distribution generalization, achieving 40.5% absolute improvement across 10 diverse benchmarks, outperforming models trained on 100x more data, challenging the notion that SFT leads to memorization rather than generalization. Based on these results, we propose the Less-Is-More Reasoning Hypothesis (LIMO Hypothesis): In foundation models where domain knowledge has been comprehensively encoded during pre-training, sophisticated reasoning capabilities can emerge through minimal but precisely orchestrated demonstrations of cognitive processes. This hypothesis posits that the elicitation threshold for complex reasoning is determined by two key factors: (1) the completeness of the model’s encoded knowledge foundation during pre-training, and (2) the effectiveness of post-training examples as “cognitive templates” that show the model how to utilize its knowledge base to solve complex reasoning tasks. To facilitate reproducibility and future research in data-efficient reasoning, we release LIMO as a comprehensive open-source suite at this https URL.
Tasks
Laundry – done
Finish vacuuming – done
Groceries – done
REI – done
TiiS
P33 – Schools teach egalitarian things first – dance, theatre, music, public speaking, and wilderness skills – done
Maybe some more slides. At least get all the tabs on one slide for later – done
Rehberger’s delayed tool invocation demonstration targeted Gemini, which at the time was still called Bard. His proof-of-concept exploit was able to override the protection and trigger the Workspace extension to locate sensitive data in the user’s account and bring it into the chat context.
SBIRs
9:00 standup
11:00 rates
4:30 book club?
More data generation – done with the file generation
GPT Agents
More slides – add the new slides to the end of the old ones. Match the format