Phil 5.15.2025

AI-Generated Law

  • But AI can be constrained and directed to distribute power rather than concentrate it. For Emirati residents, the most intriguing possibility of the AI plan is the promise to introduce AI “interactive platforms” where the public can provide input to legislation. In experiments across locales as diverse as Kentucky, Massachusetts, France, Scotland, Taiwan, and many others, civil society within democracies are innovating and experimenting with ways to leverage AI to help listen to constituents and construct public policy in a way that best serves diverse stakeholders.

Tasks

  • Continue with EU open calls – done enough, I think
  • Finish story section for P33
  • Dentist
  • Roof? – Started

SBIRs

  • 9:00 Standup – done
  • 3:00 Tradeshow demo meeting – no meeting, LM instead
  • Want to change the surface function to look more like this, which would be infinite:
  • StackOverflow discussion here

Phil 5.14.2025

Elon Musk’s Grok AI Can’t Stop Talking About ‘White Genocide’

  • Numerous examples of the phenomenon could be found by searching the official Grok profile for posts containing the term “boer,” a word used to refer to people from South Africa of “Dutch, German, or Huguenot descent.” It is sometimes used by Black South Africans as a pejorative against white Afrikaners, or people associated with the apartheid regime. In response to topics ranging from streaming platform HBO Max’s name change to Medicaid cuts proposed by US lawmakers, the chatbot often seemed to initially stay on topic, before veering back to white genocide in South Africa, completely unprompted.

Tasks

  • EU open calls
  • Finish story section for P33
  • SSA
  • Dentist

SBIRs

  • See if Ron added anything to Overleaf – he did! Read his notes, which were really helpful. Updated my entries and fixed some LaTeX bugs.
  • 10:00 RTAT meeting
    • Simple model with more data
    • Simple model with a bigger set of inputs
    • Add to model as needed
  • Created a better surface to explore training the model:
  • Now I need to iterate over the values and produce the data file
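
A minimal sketch of that iteration step, under assumptions: the actual surface function isn't given here, so `surface` below is a hypothetical periodic stand-in (sin(x)·cos(y)), and the CSV layout with `x`, `y`, `z` columns plus the helper names `frange` and `write_surface_csv` are illustrative choices, not what the project uses.

```python
import csv
import math


def surface(x, y):
    """Hypothetical stand-in surface: periodic, so defined for any x and y."""
    return math.sin(x) * math.cos(y)


def frange(start, stop, steps):
    """Return `steps` evenly spaced values from start to stop, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    step = (stop - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


def write_surface_csv(path, x_values, y_values):
    """Evaluate the surface on the x/y grid and write one row per sample.

    Returns the number of data rows written (header not counted).
    """
    rows = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y", "z"])
        for x in x_values:
            for y in y_values:
                writer.writerow([x, y, surface(x, y)])
                rows += 1
    return rows


if __name__ == "__main__":
    xs = frange(-math.pi, math.pi, 50)
    ys = frange(-math.pi, math.pi, 50)
    n = write_surface_csv("surface_data.csv", xs, ys)
    print(f"wrote {n} samples")
```

Swapping in the real surface should only mean replacing the body of `surface`. The grid ranges and the output filename in the usage section are placeholders.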

GPT Agents

  • 3:00 Alden meeting
  • Add a “Calls for Proposals” section to Trustworthy Information – probably a folder that contains the information on each call in a separate .tex file – started

Phil 5.13.2025

It is raining.

Cracking The Dave & Buster’s Anomaly

  • if you try to send an audio message using the Messages app to someone who’s also using the Messages app, and that message happens to include the name “Dave and Buster’s”, the message will never be received.

Tasks

  • SSA – done
  • Bank stuff – done / bills
  • Expense spreadsheet – done
  • Overleaf harness for KA – done. And sample sent!
  • Look around for EU funding opportunities – started. Yay Perplexity!

SBIRs

  • Big, in-person meeting at APL yesterday. Seemed to go well
  • Travel paperwork – done
  • Put in stories – done
  • Notify T for the next trip
  • Looks like mostly work on RTAT. Need to meet with Ron to see how to integrate – tomorrow

Phil 5.12.2025

Back from vacation. Here’s a nice picture of a church in Evora, Portugal. Just amazing tile:

Tasks

  • SSA
  • Bank stuff / bills
  • Expense spreadsheet
  • Overleaf harness for KA
  • Look around for EU funding opportunities
  • Groceries
  • Get a nice ride in because it’s going to RAIN tomorrow – done
  • Custom ChatGPTs for big memo

SBIRs

  • Catch up
  • APL meeting
  • Travel paperwork
  • Put in stories
  • Notify T for the next trip

Phil 5.9.2025

Human murmuration: Group polarisation as compression in interaction-language dynamics captured by large language models

  • New technologies enable a social psychology that sees individuals and society as co-constitutive elements of a complex system. Using the metaphor of a murmuration—a loosely organized, locally responsive flock—this paper proposes a “science of movement” focused on trajectories of individual activity within evolving social interactions and language. We illustrate human murmuration by reviewing research on group polarization, showing how conversational joint action shapes opinion and identity. Language evolves in this process, becoming a tool for differentiation through strategic bias articulation.
  • Polarization is understood as compression in the social information system—the medium of human murmuration. We explore how compression, bias and identity appear in large language models, reflecting the dynamic process of human thinking and activity. The paper concludes with a manifesto for social psychology, outlining directions for research that can leverage emerging methods to realize the discipline’s potential in the age of complex systems and computational tools.

Phil 5.4.2025

Our new AI strategy puts Wikipedia’s humans first

  • We will use AI to build features that remove technical barriers to allow the humans at the core of Wikipedia to spend their valuable time on what they want to accomplish, and not on how to technically achieve it. Our investments will be focused on specific areas where generative AI excels, all in the service of creating unique opportunities that will boost Wikipedia’s volunteers: 
    • Supporting Wikipedia’s moderators and patrollers with AI-assisted workflows that automate tedious tasks in support of knowledge integrity; 
    • Giving Wikipedia’s editors time back by improving the discoverability of information on Wikipedia to leave more time for human deliberation, judgment, and consensus building; 
    • Helping editors share local perspectives or context by automating the translation and adaptation of common topics;
    • Scaling the onboarding of new Wikipedia volunteers with guided mentorship

This is a really powerful pattern and needs to be incorporated in the White Hat AI proposal. The flag here will be that the attacks have to be emotionally manipulative or they don’t work.

Silvio Lorusso is an Italian writer, artist and designer based in Lisbon, Portugal. He published Entreprecariat (Onomatopee) in 2019 and What Design Can’t Do (Set Margins’) in 2023. Lorusso is an assistant professor at the Lusófona University in Lisbon and a tutor at the Information Design department of Design Academy Eindhoven. He holds a Ph.D. in Design Sciences from the Iuav University of Venice.

Phil 4.26.2025

ECLeKTic is a new benchmark designed to evaluate the ability of large language models (LLMs) to transfer knowledge across different languages. It uses a closed-book question answering task, where models must rely on their internal knowledge to answer questions based on information relevant to a specific language.

Tasks

  • Bills – done
  • Lawn! Done
  • Phlox! Done
  • Groceries – done
  • Spothub
  • Dentist at 1:10 – leave at 12:00? – done, and a nice ride to boot
  • Aaron M at 5:30 – fun and done

GPT Agents

  • 3:00 LLM meeting
  • P33 Communities – something about how we’ve always had communities, and that there have always been communities based on virtual elements such as family, religion, language, and physical locations. And in some cases, the virtual is stronger than the physical; gerrymandering, redlining, ghettos, etc.

Phil 4.23.2025

Towards a Trajectory-powered Foundation Model of Mobility

  • This paper advocates for a geospatial foundation model based on human mobility trajectories in the built environment. Such a model would be widely applicable across many important societal domains currently addressed independently, including transportation networks, data-driven urban planning, tourism, and sustainability. Unlike existing large vision-language models, trained primarily on text and images, this foundation model should integrate the complex spatiotemporal and multimodal data inherent to mobility. This paper motivates this challenging research agenda, outlining many downstream applications that would be significantly impacted and enabled by such a model. It then explains the critical spatial, temporal, and contextual factors that such a model must capture in trajectories. Finally, it concludes with several research questions and directions, laying the foundations for future exploration in this exciting and emerging field.

Geospatial Reasoning: Unlocking insights with generative AI and multiple foundation models

  • Last November we introduced two pre-trained, multi-purpose models to address many of the challenges of geospatial modeling: the Population Dynamics Foundation Model (PDFM), which captures the complex interplay between population behaviors and their local environment, and a new trajectory-based mobility foundation model. Since then, over two hundred organizations have tested the PDFM embeddings for the United States and we are expanding the dataset to cover the UK, Australia, Japan, Canada, and Malawi for experimental use by selected partners.
  • Social trajectories would be a straightforward adaptation of these models

Tasks

  • Delete old objects – done
  • Reach out to Chen Qifan?
  • Plant plants – beds are done. Broke a soaker hose that I have to replace. Still need to do the flower boxes
  • 4:00 Fidelity – done. Interesting!

SBIRs

  • 10:00 SAIC meeting – need to put together a slide. Nope, couldn’t agree on what to do.

Phil 4.22.2025

Tasks

SBIRs

  • Create tradeshow overleaf – done
  • 9:30 APL discussion – done
  • 11:00 NGC2 discussion – done
  • 3:00 Tradeshow demo – done
  • 3:30 code review – done

Phil 4.21.2025

Sheesh

Ugh

Warding Off Muscle Cramps As We Age

🧗🏻 CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

  • Pre-training datasets are typically collected from web content and lack inherent domain divisions. For instance, widely used datasets like Common Crawl do not include explicit domain labels, while manually curating labeled datasets such as The Pile is labor-intensive. Consequently, identifying an optimal pre-training data mixture remains a challenging problem, despite its significant benefits for pre-training performance. To address these challenges, we propose CLustering-based Iterative Data Mixture Bootstrapping (CLIMB), an automated framework that discovers, evaluates, and refines data mixtures in a pre-training setting. Specifically, CLIMB embeds and clusters large-scale datasets in a semantic space and then iteratively searches for optimal mixtures using a smaller proxy model and a predictor. This strategy enables effective domain adaptation without relying solely on curated data. When continuously trained on 400B tokens with this mixture, our 950M model exceeds the state-of-the-art Llama-3.2-1B by 2.0% averaged across 12 general reasoning tasks. Moreover, we observe that optimizing for a specific domain (e.g., Social Sciences) yields a 5% improvement over random sampling. Finally, we introduce ClimbLab, a filtered 1.3-trillion-token corpus with 20 clusters as a research playground, and ClimbMix, a compact yet powerful 400-billion-token dataset designed for efficient pre-training that delivers superior performance under an equal token budget. We analyze the final data mixture, elucidating the characteristics of an optimal data mixture.

Oh, this looks interesting: Values in the wild: Discovering and analyzing values in real-world language model interactions

  • In the latest research paper from Anthropic’s Societal Impacts team, we describe a practical way we’ve developed to observe Claude’s values—and provide the first large-scale results on how Claude expresses those values during real-world conversations. We also provide an open dataset for researchers to run further analysis of the values and how often they arise in conversations.

Need to follow up for sure

SBIRs

  • Slides
  • Stories – Just Phase II deliverables
  • 9:00 Sprint review
  • 3:00 Sprint planning

Phil 4.18.2025

Tasks

  • BSO – need to call
  • Start moving items out of the trailer – started. Mostly goodwill
  • Dishes – done
  • Mow lawn
  • Start house cleaning – done
  • Bills

SBIRs

  • See if I can finish off the saving and loading of the model for inference – DONE! Everything looks much better
  • Sprint slides and adjust the points – checked. It all seems reasonable
  • Write one story for next week that’s just “write SBIR deliverables”

Phil 4.17.2025

LLM use in the wild:

Had a great early season ride in the PA hills

SBIRs

  • Going to have to spend some time focusing on the final deliverables. I’m going to need to write the final quarterly report and a summary. Should be able to finish next week
  • Finished the first pass through KA! Need to find an editor

Phil 4.16.2025

This is pretty wild: Generate videos in Gemini and Whisk with Veo 2. Not sure if I have a good use case, but I think I’d like to play around with something more abstract.

Wise – done

BSO

Nice album: DVOŘÁK, A.: Greatest Melodies (arr. P. Breiner for piano)

SBIRs

  • Create a MinimumTrain.py and MinimumInfer.py in the experiments directory, and get those working with the debug data – done
    • Dataloader – done
    • Model – done
    • Training loop – done
    • Save out pth weights and structure – trickier than you would think if you want to load a model without prior knowledge of its structure
    • Load in and test
    • It should be possible to combine both where a model is trained, saved, and evaluated. Then we can do a grid search of some basic hyperparameters and keep track of the accuracy
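
A rough sketch of the combined train, save, load, evaluate loop with a grid search on top. Everything here is a stand-in: a tiny pure-Python perceptron replaces the real model, `DEBUG_DATA` is made-up toy data, and the checkpoint is a JSON file rather than a .pth file. The one idea carried over is storing the structure (here just `n_inputs`) next to the weights so the loader needs no prior knowledge of the model.

```python
import itertools
import json
import os
import tempfile

# Made-up debug data: label is 1 when x0 + x1 > 1, else 0.
DEBUG_DATA = [((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.9, 0.8), 1),
              ((1.0, 0.4), 1), ((0.1, 0.6), 0), ((0.7, 0.9), 1)]


def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train(lr, epochs):
    """Perceptron training loop, standing in for the real model."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in DEBUG_DATA:
            err = y - predict(weights, bias, x)
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias


def save_model(path, weights, bias, hparams):
    # Structure and hyperparameters are saved alongside the weights.
    ckpt = {"n_inputs": len(weights), "weights": weights,
            "bias": bias, "hparams": hparams}
    with open(path, "w") as f:
        json.dump(ckpt, f)


def load_model(path):
    with open(path) as f:
        ckpt = json.load(f)
    assert len(ckpt["weights"]) == ckpt["n_inputs"]
    return ckpt["weights"], ckpt["bias"]


def accuracy(weights, bias):
    hits = sum(predict(weights, bias, x) == y for x, y in DEBUG_DATA)
    return hits / len(DEBUG_DATA)


def grid_search(grid, out_dir):
    """Train, save, reload, and score one model per hyperparameter combination."""
    results = []
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        hparams = dict(zip(keys, values))
        weights, bias = train(**hparams)
        name = "model_" + "_".join(f"{k}{v}" for k, v in hparams.items())
        path = os.path.join(out_dir, name + ".json")
        save_model(path, weights, bias, hparams)
        loaded_weights, loaded_bias = load_model(path)  # score the saved copy
        results.append((accuracy(loaded_weights, loaded_bias), hparams))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        grid = {"lr": [0.1, 1.0], "epochs": [1, 5, 50]}
        for acc, hparams in grid_search(grid, out_dir):
            print(f"{acc:.2f} {hparams}")
```

Scoring the reloaded copy rather than the in-memory model means every grid point also exercises the save/load path. In the real version, `train`, `save_model`, and `load_model` would presumably wrap the code in MinimumTrain.py and MinimumInfer.py.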

GPT Agents

  • 3:00 Alden meeting. Ask about NIST people who might need jobs