Monthly Archives: February 2024

Phil 2.29.2024

Hi Feb 29! See you again in 4 years!

SBIRs

  • Rukan gave his two weeks' notice, dammit
  • 9:00 standup
  • 11:30 Touch point
  • Submit WE! Done
  • More slides. I made a Thing!

GPT Agents

  • Submit final Killer Apps paper
  • 2:00 Meeting. Fun discussion on ways to detect bias in models and provide provenance of generated material

Phil 2.28.2024

Can the “hallucination” / invention / lying problem be fixed? No. These are systems of prediction, and predictions made from insufficient data will always be random. The problem is that the same thing that makes them really useful (that they learn about culture, e.g. language, at many different levels) also ensures that they are deeply inhuman – there is no way to tell from the syntax or tone of a sentence how correct the model is. Nothing in modeling performed this way retains information about how much data underlies the predictions.

SBIRs

  • Slides for NIST talk
    • “A final example of possible Chinese disinformation came when Typhoon Jebi hit Osaka, Japan and stranded thousands of tourists at Kansai International Airport. A fabricated story spread on social media alleging that Su Chii-cherng, director of the Taipei Economic and Cultural Representatives Office did nothing to help stranded Taiwanese citizens, while the PRC Consulate in Osaka dispatched buses to help rescue stranded Taiwan citizens. Shortly after the story began circulating, Su came under intense criticism online and ultimately hung himself, with the Ministry of Foreign Affairs claiming he left a suicide note blaming the disinformation surrounding his office’s incompetence. The Taiwan government found no evidence to support the rumors of Chinese assistance during the typhoon, ostensibly illustrating that this was another case of China-linked disinformation. However, in December 2019, two Taiwanese citizens were charged with creating and spreading the rumor online. Although China might have played a role in furthering the rumors spread, it still remains unclear and again highlights the challenge of definitive attribution.” Via Geopolitical Monitor
  • Sent the above to Kyle
  • 3:00 Meeting with Rukan – nope
  • Meeting with Protima about generic madlibs JSON generator
  • SimAccel review/refactor meeting

GPT Agents

  • Poster for IUI. Going to play with generative features

Phil 2.27.2024

Lots of good stuff from Nature this morning:

SBIRs

  • Slides for NIST talk
  • WE paper – DUE THURSDAY – First pass is done and could be submitted. Waiting for Aaron’s input
  • Sat in on the beginning of Ron’s pitch
  • Sat in on a SimAccel feature planning meeting that had something to do with Tradewinds

GPT Agents

  • Paper – DUE THURSDAY. Finished all the proofing and submission form(s)
    • Need to do the submission form thing.
    • Save the paper as a word doc and proofread for grammar
  • Need to update the poster
  • Need to work on the slides

Phil 2.26.2024

Generative AI’s environmental costs are soaring — and mostly secret

  • …one assessment suggests that ChatGPT, the chatbot created by OpenAI in San Francisco, California, is already consuming the energy of 33,000 homes. It’s estimated that a search driven by generative AI uses four to five times the energy of a conventional web search. Within years, large AI systems are likely to need as much energy as entire nations.
  • And it’s not just energy. Generative AI systems need enormous amounts of fresh water to cool their processors and generate electricity. In West Des Moines, Iowa, a giant datacenter cluster serves OpenAI’s most advanced model, GPT-4. A lawsuit by local residents revealed that in July 2022, the month before OpenAI finished training the model, the cluster used about 6% of the district’s water. As Google and Microsoft prepared their Bard and Bing large language models, both had major spikes in water use — increases of 20% and 34%, respectively, in one year, according to the companies’ environmental reports. One preprint suggests that, globally, the demand for water for AI could be half that of the United Kingdom by 2027. In another, Facebook AI researchers called the environmental effects of the industry’s pursuit of scale the “elephant in the room”.

SBIRs

  • 9:00 – demos
  • 3:00 – planning

GPT Agents

  • Fill out,

Phil 2.23.2024

Asked for the quote on the house!

Chores

2:00 counseling

chess_llm_interpretability

SBIRs

  • A couple of hours of WE to close out the week. Probably Saturday or Sunday since I’ll be recovering from a root canal.
  • Added Matt’s email to the Q8 notes
  • Slides – done

GPT Agents

  • Got the HAI-GEN response back, it’s 10 pages plus references, so yay!
  • Need to update the poster
  • Need to work on the slides

Phil 2.22.2024

Human and nonhuman norms: a dimensional framework

  • Human communities teem with a variety of social norms. In order to change unjust and harmful social norms, it is crucial to identify the psychological processes that give rise to them. Most researchers take it for granted that social norms are uniquely human. By contrast, we approach this matter from a comparative perspective, leveraging recent research on animal social behaviour. While there is currently only suggestive evidence for norms in nonhuman communities, we argue that human social norms are likely produced by a wide range of mechanisms, many of which we share with nonhuman animals. Approaching this variability from a comparative perspective can help norm researchers expand and reframe the range of hypotheses they test when attempting to understand the causes of socially normative behaviours in humans. First, we diagnose some of the theoretical obstacles to developing a comparative science of social norms, and offer a few basic constructs and distinctions to help norm researchers overcome these obstacles. Then we develop a six-dimensional model of the psychological and social factors that contribute to variability in both human and potential nonhuman norms.

SBIRs

  • 9:00 Standup
  • 3:30 USNA
  • 5:15 book club!

GPT Agents

  • 2:00 LLM Meeting – go over camera ready

Phil 2.21.2024

The original data from Paul Krugman’s opinion piece today

SBIRs

  • 11:00 Meeting with Matt. Hopefully Ron can upload, otherwise Matt can present
    • Really good discussion, with plot envelopes. Directed him to write a document for his future self
  • Suspended weekly meetings until the Phase II extension
  • More work on WE. It’s due in a week
  • Some work with the USNA folks
  • RAG meeting with NASA folks – AXIS – chatbot and smart documentation
  • Send note to Doreen – done

GPT Agents

  • 3:00 Meeting with Alden
  • 7:00 LLM + graph datasets for hallucination reduction – meh. RAG with a nicer DB

Phil 2.20.2024

Call dentist!

Human languages with greater information density have higher communication speed but lower conversation breadth

  • Human languages vary widely in how they encode information within circumscribed semantic domains (for example, time, space, colour, human body parts and activities), but little is known about the global structure of semantic information and nothing about its relation to human communication. We first show that across a sample of ~1,000 languages, there is broad variation in how densely languages encode information into words. Second, we show that this language information density is associated with a denser configuration of semantic information. Finally, we trace the relationship between language information density and patterns of communication, showing that informationally denser languages tend towards faster communication but conceptually narrower conversations or expositions within which topics are discussed at greater depth. These results highlight an important source of variation across the human communicative channel, revealing that the structure of language shapes the nature and texture of human engagement, with consequences for human behavior across levels of society.

SBIRs

  • 9:00 Standup
  • AI Ethics?
  • Mostly work on WE paper.

Phil 2.19.2024

Stitches out at 4:00!

SBIRs

  • Working on WE paper
  • 11:00 SimAccel review – nope
  • 1:30 SimAccel outbrief – really confused. Marketing has a very specific idea of what they want to do and won’t lead. So instead we have a “rewrite this until we’re happy” perspective. Aaron pushed back, and I think that they will write the template first and we’ll fill in the pieces.
  • 2:00 MDA – found a problem with the ship distribution and FOM generation. It’s all too tight around the launch point. This is good for testing on the Lambda box, but not what we need for the DTA example. I asked Matt to produce an “envelope” that shows where valid FOMs are calculated with respect to a single trajectory. He should be done tomorrow. And, of course everything either stops or pauses on Thursday.

Phil 2.18.2024

“I’m not that interested in, like, the Killer Robots walking down the street direction of things going wrong. I’m much more interested in the, like, very subtle societal misalignments where we just have these systems out in society and through no particular ill intention um… things just go horribly wrong” – Sam Altman, at the World Government Summit, Feb 13, 2024

Phil 2.15.2024

Welp, today I am 0100 0000 (binary) years old. On the other hand I just hit 40 in hex!

My brain on Ozempic

  • They’re not really “weight-loss drugs” at all. They’re something far more powerful and surreal: injectable willpower.

SBIRs

  • Sent a note to T and Aaron stating I plan to retire in a year
  • 9:00 Standup
  • Made good progress on the W.E. paper yesterday but also got pulled into the Tradewinds thing and a capabilities brief prep
  • Need to send my info over to NIST
  • More W.E.
  • 5:15 Book club. Finish chapter 1!

GPT Agents

  • 2:00 Meeting – cancelled due to deadlines

Phil 2.14.2024

SBIRs

  • No meetings!
  • Going to spend most of the day working on the WE paper. My angle is that AI will begin to occupy the places that animals did in the past. AI of different capacities will be employed to solve different scales of problems. Like animals, the AI will mostly do what it’s told, but it will inevitably misbehave. We had entire careers where people managed animals as tools (Teamsters!), and we will need something like that going forward (AI Whisperers and Prompt Engineering!). There will be people who have a knack for these things, and there will be others who don’t but need the best performance out of the system that they can get. The background section will first set up the analogy that compares modern AI to animals, and then discuss how humans have managed these systems in the past, with Mahouts being the most extreme example. Which will get us to the next section, since WEs were the most sophisticated LAWS of ancient times.
  • Really good source material on AI Whisperers from Forbes: Rise Of The AI Whisperers

Phil 2.13.2024

Looks like a wet day:

Wrote up my philosophy on leaps of faith for a friend

Longitudinal analysis of sentiment and emotion in news media headlines using automated labelling with Transformer language models

  • This work describes a chronological (2000–2019) analysis of sentiment and emotion in 23 million headlines from 47 news media outlets popular in the United States. We use Transformer language models fine-tuned for detection of sentiment (positive, negative) and Ekman’s six basic emotions (anger, disgust, fear, joy, sadness, surprise) plus neutral to automatically label the headlines. Results show an increase of sentiment negativity in headlines across written news media since the year 2000. Headlines from right-leaning news media have been, on average, consistently more negative than headlines from left-leaning outlets over the entire studied time period. The chronological analysis of headlines emotionality shows a growing proportion of headlines denoting anger, fear, disgust and sadness and a decrease in the prevalence of emotionally neutral headlines across the studied outlets over the 2000–2019 interval. The prevalence of headlines denoting anger appears to be higher, on average, in right-leaning news outlets than in left-leaning news media.

SBIRs

  • 9:00 standup
  • 1:00 MDA
  • Otherwise, work on the crappy first draft for the intro and background sections of the W.E. paper. Wrote 500 words and stalled out
  • Set up NIST talk, I think
  • Got my UMBC profile information sent out

Phil 2.12.2024

Adding the hill climbing to the terrain app. I’ll see if it works later
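Not the app’s actual code, but a minimal sketch of the greedy hill-climbing step I have in mind, assuming the terrain is a 2D height grid (the `terrain` array and `climb` function here are made-up names for illustration):

```python
# Greedy hill climbing on a 2D height grid: repeatedly step to the
# highest 4-neighbor until no neighbor is higher (a local maximum).
import numpy as np

def climb(terrain, x, y, max_steps=1000):
    """Return the (row, col) of the local peak reached from (x, y)."""
    rows, cols = terrain.shape
    for _ in range(max_steps):
        best = (x, y)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows and 0 <= ny < cols and terrain[nx, ny] > terrain[best]:
                best = (nx, ny)
        if best == (x, y):  # no higher neighbor: stuck at a local max
            break
        x, y = best
    return x, y

terrain = np.array([[1, 2, 3],
                    [2, 4, 5],
                    [3, 5, 9]])
print(climb(terrain, 0, 0))  # -> (2, 2), the peak
```

The usual caveat applies: this gets stuck on the first local maximum it finds, so whether “it works” on real terrain depends on how bumpy the heightmap is.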

Gonna risk pushing doctor’s orders and go for an easy ride at lunch

SBIRs

  • IUI stuff. Permission, register, reserve hotel – done
  • Continue W.E. Some progress. Had a good chat with Aaron about rescoping and changing the framing to account for everything that’s happened in the past year in AI
  • 2:00 MDA meeting. Working on getting all the new data sent over by Feb 22
  • Discovered The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology
    • JDMS: The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology is a quarterly refereed archival journal devoted to advancing the practice, science, and art of modeling and simulation as it relates to the military and defense. The primary focus of the journal is to document, in a rigorous manner, technical lessons derived from practical experience. The journal also publishes work related to the advancement of defense systems modeling and simulation technology, methodology, and theory. The journal covers all areas of the military / defense mission, maintaining a focus on the practical side of systems simulation versus pure theoretical applications.