Category Archives: Phil

Phil 3.6.2024

Dentist

Ping Tim, Dave

SBIRs

  • Talk went well yesterday. The White Hat AI idea seems reasonable. Need to put that on the poster
  • White paper. Start to fill in the stuff that I remember the best

GPT Agents

  • 3:00 Meeting with Alden

Phil 3.5.2024

Started the day off on the wrong foot by dropping my breakfast. Grumble

SBIRs

  • Starting LM white paper
  • NIST AI COE presentation. Slides are done! Need to copy them over to PowerPoint and make copies.

GPT Agents

  • Need to make a 10-minute version of the presentation
  • Need to see how the upstairs TV could work as a monitor
  • Need to put together a new poster
  • And I really need to add an AI White Hat section to the KillerApps paper, based on the reception of the idea today
  • The paper is up on arXiv: RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots
    • Large language models (LLMs) like ChatGPT demonstrate the remarkable progress of artificial intelligence. However, their tendency to hallucinate — generate plausible but false information — poses a significant challenge. This issue is critical, as seen in recent court cases where ChatGPT’s use led to citations of non-existent legal rulings. This paper explores how Retrieval-Augmented Generation (RAG) can counter hallucinations by integrating external knowledge with prompts. We empirically evaluate RAG against standard LLMs using prompts designed to induce hallucinations. Our results show that RAG increases accuracy in some cases, but can still be misled when prompts directly contradict the model’s pre-trained understanding. These findings highlight the complex nature of hallucinations and the need for more robust solutions to ensure LLM reliability in real-world applications. We offer practical recommendations for RAG deployment and discuss implications for the development of more trustworthy LLMs.
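The abstract above describes integrating retrieved external knowledge with prompts. A minimal sketch of that pattern in Python, with all names, documents, and the overlap-based retriever invented for illustration (a real deployment would use a vector store):

```python
# Minimal sketch of the Retrieval-Augmented Generation (RAG) pattern:
# retrieve relevant text, then prepend it to the prompt so the model
# answers from supplied context rather than hallucinating.

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the query (a stand-in
    for a real embedding/vector-store lookup) and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Assemble a grounded prompt; the instruction to answer only from
    the context is what pushes back against hallucination."""
    context = "\n".join(retrieve(query, documents))
    return (
        "Answer using only the context below. If the answer is not in "
        "the context, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

docs = [
    "The HAI-GEN workshop is co-located with IUI.",
    "RAG integrates external knowledge with prompts.",
]
prompt = build_rag_prompt("What does RAG integrate with prompts?", docs)
```

The paper's finding that RAG "can still be misled when prompts directly contradict the model's pre-trained understanding" applies to the generation step, which this sketch leaves to whatever LLM consumes the prompt.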

Phil 3.4.2024

Tasks

  • Fix CEUR
  • Jim Donnies
  • Nathan
  • Power Wash
  • Bank

This looks interesting for building maps? Manifold Diffusion Fields

  • We present Manifold Diffusion Fields (MDF), an approach that unlocks learning of diffusion models of data in general non-Euclidean geometries. Leveraging insights from spectral geometry analysis, we define an intrinsic coordinate system on the manifold via the eigen-functions of the Laplace-Beltrami Operator. MDF represents functions using an explicit parametrization formed by a set of multiple input-output pairs. Our approach allows to sample continuous functions on manifolds and is invariant with respect to rigid and isometric transformations of the manifold. In addition, we show that MDF generalizes to the case where the training set contains functions on different manifolds. Empirical results on multiple datasets and manifolds including challenging scientific problems like weather prediction or molecular conformation show that MDF can capture distributions of such functions with better diversity and fidelity than previous approaches.

SBIRs

  • Walk through slides with Aaron
  • 11:00 Manifold diffusion fields
  • Ping Amy about room and building – done

Phil 3.1.2024

Submitted the HAI-GEN paper!

Got rejected on the SIGCHI late-breaking work. Put the RAG paper up on arXiv. Should be live Monday.

Undermining Ukraine: How Russia widened its global information war in 2023

  • As the full-scale war in Ukraine enters its third year, Russia has doubled down on its worldwide efforts to undermine Kyiv’s international standing in an attempt to erode Western support and domestic Ukrainian morale. Years of close monitoring of not only state-sponsored media such as Russia Today (RT) and Sputnik, but also Russian activity on Telegram, TikTok, X, and other social platforms, points to one conclusion: In the propaganda war, Russia remains fully committed to conducting information operations around the globe, playing the long game to outlast any unity among Ukraine’s allies and persist until Ukraine loses its will to fight.

Going to work on the slides for the NIST talk – good progress

Phil 2.29.2024

Hi Feb 29! See you again in 4 years!

SBIRs

  • Rukan gave his 2 weeks notice, dammit
  • 9:00 standup
  • 11:30 Touch point
  • Submit WE! Done
  • More slides. I made a Thing!

GPT Agents

  • Submit final Killer Apps paper
  • 2:00 Meeting. Fun discussion on ways to detect bias in models and provide provenance of generated material

Phil 2.28.2024

Can the “hallucination” / invention / lying problem be fixed? No. These are systems of prediction. Predictions made from insufficient data will always be random. The problem is that the same thing that makes them really useful (that they learn about culture, e.g. language, at many different levels) also ensures that they are deeply inhuman – there is no way to tell from the syntax or tone of a sentence how correct the model is. Nothing in modelling performed this way retains information about how much data underlies the predictions.
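The point that the output retains no trace of how much data supports it shows up even in a toy bigram model: a context observed exactly once yields a prediction just as confident-looking as one observed a thousand times. A small illustrative sketch (the corpus is invented):

```python
from collections import Counter, defaultdict

# Toy bigram "language model": the predicted distribution for a context
# looks identical whether that context appeared once or a thousand times,
# so the output alone cannot reveal how much data underlies it.

def train_bigram(tokens: list[str]) -> dict:
    counts = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    return counts

def predict(counts: dict, context: str) -> dict[str, float]:
    c = counts[context]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

corpus = ("the cat sat " * 1000).split() + "zyx qqq".split()
model = train_bigram(corpus)

well_supported = predict(model, "cat")  # context seen 1000 times
unsupported = predict(model, "zyx")     # context seen exactly once
# Both are maximally confident: {"sat": 1.0} and {"qqq": 1.0}.
```

Nothing in either output distribution distinguishes the well-supported prediction from the one resting on a single observation, which is the inhuman part.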

SBIRs

  • Slides for NIST talk
    • “A final example of possible Chinese disinformation came when Typhoon Jebi hit Osaka, Japan and stranded thousands of tourists at Kansai International Airport. A fabricated story spread on social media alleging that Su Chii-cherng, director of the Taipei Economic and Cultural Representatives Office did nothing to help stranded Taiwanese citizens, while the PRC Consulate in Osaka dispatched buses to help rescue stranded Taiwan citizens. Shortly after the story began circulating, Su came under intense criticism online and ultimately hung himself, with the Ministry of Foreign Affairs claiming he left a suicide note blaming the disinformation surrounding his office’s incompetence. The Taiwan government found no evidence to support the rumors of Chinese assistance during the typhoon, ostensibly illustrating that this was another case of China-linked disinformation. However, in December 2019, two Taiwanese citizens were charged with creating and spreading the rumor online. Although China might have played a role in furthering the rumors spread, it still remains unclear and again highlights the challenge of definitive attribution.” Via Geopolitical Monitor
  • Sent the above to Kyle
  • 3:00 Meeting with Rukan – nope
  • Meeting with Protima about generic madlibs JSON generator
  • SimAccel review/refactor meeting
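The "generic madlibs JSON generator" mentioned above presumably means filling named slots in a JSON template. A minimal sketch under that assumption, with the template fields invented for illustration:

```python
import json
from string import Template

# "Madlibs"-style JSON generator: a template with named slots that get
# substituted, then parsed to verify the result is valid JSON. The
# field names here are hypothetical.

TEMPLATE = Template(
    '{"scenario": "$scenario", "actor": "$actor", "count": $count}'
)

def generate(slots: dict) -> dict:
    """Fill the slots and return the parsed JSON object; parsing fails
    loudly if a substitution produced malformed JSON."""
    return json.loads(TEMPLATE.substitute(slots))

record = generate({"scenario": "launch", "actor": "ship", "count": 3})
```

Making the template itself a parameter of `generate` would give the "generic" part: one function serving any slot schema.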

GPT Agents

  • Poster for IUI. Going to play with generative features

Phil 2.27.2024

Lots of good stuff from Nature this morning.

SBIRs

  • Slides for NIST talk
  • WE paper – DUE THURSDAY – First pass is done and could be submitted. Waiting for Aaron’s input
  • Sat in on the beginning of Ron’s pitch
  • Sat in on a SimAccel feature planning meeting that had something to do with Tradewinds

GPT Agents

  • Paper – DUE THURSDAY. Finished all the proofing and submission form(s)
    • Need to do the submission form thing.
    • Save the paper as a word doc and proofread for grammar
  • Need to update the poster
  • Need to work on the slides

Phil 2.26.2024

Generative AI’s environmental costs are soaring — and mostly secret

  • …one assessment suggests that ChatGPT, the chatbot created by OpenAI in San Francisco, California, is already consuming the energy of 33,000 homes. It’s estimated that a search driven by generative AI uses four to five times the energy of a conventional web search. Within years, large AI systems are likely to need as much energy as entire nations.
  • And it’s not just energy. Generative AI systems need enormous amounts of fresh water to cool their processors and generate electricity. In West Des Moines, Iowa, a giant datacenter cluster serves OpenAI’s most advanced model, GPT-4. A lawsuit by local residents revealed that in July 2022, the month before OpenAI finished training the model, the cluster used about 6% of the district’s water. As Google and Microsoft prepared their Bard and Bing large language models, both had major spikes in water use — increases of 20% and 34%, respectively, in one year, according to the companies’ environmental reports. One preprint suggests that, globally, the demand for water for AI could be half that of the United Kingdom by 2027. In another, Facebook AI researchers called the environmental effects of the industry’s pursuit of scale the “elephant in the room”.

SBIRs

  • 9:00 – demos
  • 3:00 – planning

GPT Agents

  • Fill out,

Phil 2.23.2024

Asked for the quote on the house!

Chores

2:00 counseling

chess_llm_interpretability

SBIRs

  • A couple of hours of WE to close out the week. Probably Saturday or Sunday since I’ll be recovering from a root canal.
  • Added Matt’s email to the Q8 notes
  • Slides – done

GPT Agents

  • Got the HAI-GEN response back, it’s 10 pages plus references, so yay!
  • Need to update the poster
  • Need to work on the slides

Phil 2.22.2024

Human and nonhuman norms: a dimensional framework

  • Human communities teem with a variety of social norms. In order to change unjust and harmful social norms, it is crucial to identify the psychological processes that give rise to them. Most researchers take it for granted that social norms are uniquely human. By contrast, we approach this matter from a comparative perspective, leveraging recent research on animal social behaviour. While there is currently only suggestive evidence for norms in nonhuman communities, we argue that human social norms are likely produced by a wide range of mechanisms, many of which we share with nonhuman animals. Approaching this variability from a comparative perspective can help norm researchers expand and reframe the range of hypotheses they test when attempting to understand the causes of socially normative behaviours in humans. First, we diagnose some of the theoretical obstacles to developing a comparative science of social norms, and offer a few basic constructs and distinctions to help norm researchers overcome these obstacles. Then we develop a six-dimensional model of the psychological and social factors that contribute to variability in both human and potential nonhuman norms.

SBIRs

  • 9:00 Standup
  • 3:30 USNA
  • 5:15 book club!

GPT Agents

  • 2:00 LLM Meeting – go over camera ready

Phil 2.21.2024

The original data from Paul Krugman’s opinion piece today

SBIRs

  • 11:00 Meeting with Matt. Hopefully Ron can upload, otherwise Matt can present
    • Really good discussion, with plot envelopes. Directed him to write a document for his future self
  • Suspended weekly meetings until the Phase II extension
  • More work on WE. It’s due in a week
  • Some work with the USNA folks
  • RAG meeting with NASA folks – AXIS – chatbot and smart documentation
  • Send note to Doreen – done

GPT Agents

  • 3:00 Meeting with Alden
  • 7:00 LLM + graph datasets for hallucination reduction – meh. RAG with a nicer DB

Phil 2.20.2024

Call dentist!

Human languages with greater information density have higher communication speed but lower conversation breadth

  • Human languages vary widely in how they encode information within circumscribed semantic domains (for example, time, space, colour, human body parts and activities), but little is known about the global structure of semantic information and nothing about its relation to human communication. We first show that across a sample of ~1,000 languages, there is broad variation in how densely languages encode information into words. Second, we show that this language information density is associated with a denser configuration of semantic information. Finally, we trace the relationship between language information density and patterns of communication, showing that informationally denser languages tend towards faster communication but conceptually narrower conversations or expositions within which topics are discussed at greater depth. These results highlight an important source of variation across the human communicative channel, revealing that the structure of language shapes the nature and texture of human engagement, with consequences for human behavior across levels of society.

SBIRs

  • 9:00 Standup
  • AI Ethics?
  • Mostly work on WE paper.

Phil 2.19.2024

Stitches out at 4:00!

SBIRs

  • Working on WE paper
  • 11:00 SimAccel review – nope
  • 1:30 SimAccel outbrief – really confused. Marketing has a very specific idea of what they want to do and won’t lead. So instead we have a “rewrite this until we’re happy” perspective. Aaron pushed back, and I think that they will write the template first and we’ll fill in the pieces.
  • 2:00 MDA – found a problem with the ship distribution and FOM generation. It’s all too tight around the launch point. This is good for testing on the Lambda box, but not what we need for the DTA example. I asked Matt to produce an “envelope” that shows where valid FOMs are calculated with respect to a single trajectory. He should be done tomorrow. And, of course everything either stops or pauses on Thursday.

Phil 2.18.2024

“I’m not that interested in, like, the Killer Robots walking down the street direction of things going wrong. I’m much more interested in, like, very subtle societal misalignments where we just have these systems out in society and, through no particular ill intention, um… things just go horribly wrong” – Sam Altman, at the World Government Summit, Feb 13, 2024