Phil 10.23.2023

Got a nice hike in with a bald eagle sighting 🙂

3:00 podcast meeting

SBIRs

9:00 Sprint review
2:00 MDA meeting
Wrote another post on AI societal weapons
William Marcellino – Senior Behavioral and Social Scientist; Professor, Pardee RAND Graduate School

GPT Agents

The IRB is submitted? Waiting for a response.
Apparently this technique is now called retrieval-augmented generation?
Start on slides, and make sure all the software works on the laptop

Phil 10.20.2023

I have a new dishwasher. It works so well!

SBIRs

Need to send an email stating I intend to submit a paper for the RHB prize – done
Register for the ETF – done
Added all the reviewer comments to the “venues” section of the Overleaf doc
3:00 meeting to go over RC slides

GPT Agents

Need to start an outline for JHU and UMBC Guest lecture
JHU
- Trust and coordination at scale
- Stories
- Dimension reduction
- Network density and stiffness
- Diversity for intelligence, hierarchy for speed
- Student activities
UMBC
- Why does conflict between Nation States happen?
- Why do we have combat?
- Why attack/invade/occupy?
- Why defend?
- What does AI bring to combat?
- What can go wrong, and how do/can we fix that?
- What happens when communication is denied?
- What are the implications of massive, patient, reasonably smart weapons in the information and physical domain?
- What if AI is better (safer, more targeted, less confused) at combat than people? What does that mean for other jobs?

Phil 10.19.2023

GPT Agents

Make IRB changes. I think that it’s basically this: data will be stored in a password-protected Dropbox folder. There will be two files. One is a list of names, email addresses, and dates with an associated anonymous string (e.g. “P1”, “E1”, or “D1”). The other file will contain all experiment data with all names and dates replaced with the substitutes from the other list. Mostly done. Need to change the website text to not mention Supabase and convert items to pdf. – Done
Read IUI paper # 1. Good! Need to write the review
2:00 LLM meeting
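
A minimal sketch of the two-file scheme from the IRB note above, assuming plain string substitution is enough. The function names, the sample participants, and the use of a single prefix per list are all hypothetical; the note does not say how the substitution is actually done.

```python
def build_id_map(values, prefix):
    """Assign each distinct value an anonymous ID like 'P1', in first-seen order."""
    id_map = {}
    for value in values:
        if value not in id_map:
            id_map[value] = f"{prefix}{len(id_map) + 1}"
    return id_map


def pseudonymize(text, id_map):
    """Replace every identifying value in text with its anonymous ID."""
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for value in sorted(id_map, key=len, reverse=True):
        text = text.replace(value, id_map[value])
    return text


# Hypothetical participants and record, purely for illustration.
names = build_id_map(["Alice Smith", "Bob Jones"], prefix="P")
dates = build_id_map(["10.19.2023"], prefix="D")
record = "Alice Smith and Bob Jones ran the session on 10.19.2023"
clean = pseudonymize(pseudonymize(record, names), dates)
print(clean)  # P1 and P2 ran the session on D1
```

The two dictionaries play the role of the first file (the key list), and `clean` is what would go into the second file.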

SBIRs

9:00 standup- done!
Research Council slide review – done!
Incorporate Clay’s comments into IPT deck- done!
Fill out forms for Clay- done!

Phil 10.18.2023

SBIRs

Start on the other slide deck due this week. Get a meeting with Aaron for more context
The War Elephants presentation got nominated for best presentation at MORS. I need to submit “A complete paper (in PDF format), not to exceed 40 pages or 10,000 words including appendices. Please see accompanying formatting guidelines for additional information.” Need to put in the reviewer suggestions and submit by Feb 29 2024.
Roll in changes for the research council slides and distribute.

GPT Agents

Make IRB changes
Alden meeting
Review IUI paper # 1

Phil 10.17.2023

This is the key to making trustworthy models:

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

If you reduce the parameter count in an LLM, it tends to lose recall of facts before it gets worse at learning from examples in the prompt. This holds for parameter count reductions via both pruning and using a smaller dense model.
How does scaling the number of parameters in large language models (LLMs) affect their core capabilities? We study two natural scaling techniques (weight pruning and simply training a smaller or larger model, which we refer to as dense scaling) and their effects on two core capabilities of LLMs: (a) recalling facts presented during pre-training and (b) processing information presented in-context during inference. By curating a suite of tasks that help disentangle these two capabilities, we find a striking difference in how these two abilities evolve due to scaling. Reducing the model size by more than 30% (via either scaling approach) significantly decreases the ability to recall facts seen in pre-training. Yet, a 60–70% reduction largely preserves the various ways the model can process in-context information, ranging from retrieving answers from a long context to learning parameterized functions from in-context exemplars. The fact that both dense scaling and weight pruning exhibit this behavior suggests that scaling model size has an inherently disparate effect on fact recall and in-context learning.

The thing is that for sociology, the large pretrained (not finetuned) models will probably be best.

SBIRs

Add a 3 point Research Council story – done
9:00 standup – done
1:00 Dr. Banerjee – done. Fun!
2:00 BMD – done. Did a slide walkthrough and got some action items
~~2:30 AI Ethics~~
~~3:00 AIMSS?~~

GPT Agents

Thinking more about how to watch the changes of the model under prompting. I think a ring buffer prompt, where the oldest tokens drop off while new ones are added, makes the most sense. I checked, and the Llama-2 models do come in pretrained and finetuned (chat) flavors.
Put in a request for Llama-2 access – got it! That was quick. Yep pretrained and chat
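
A rough sketch of the ring buffer prompt idea above, using only the standard library. The class name and the whitespace "tokenizer" are assumptions for illustration; a real run would use the model's own tokenizer and token IDs.

```python
from collections import deque


class RingBufferPrompt:
    """Fixed-size token window: appending past capacity drops the oldest tokens."""

    def __init__(self, max_tokens: int):
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        # A deque with maxlen discards items from the left once it is full.
        self.tokens = deque(maxlen=max_tokens)

    def add(self, new_tokens):
        """Append newly generated (or supplied) tokens to the window."""
        self.tokens.extend(new_tokens)

    def prompt(self):
        """Return the current window, oldest token first."""
        return list(self.tokens)


# Whitespace splitting stands in for a real tokenizer here (assumption).
window = RingBufferPrompt(max_tokens=4)
window.add("1. e4 e5 2.".split())
window.add("Nf3 Nc6".split())
print(window.prompt())  # ['e5', '2.', 'Nf3', 'Nc6']
```

Feeding each generated token back through `add` keeps the prompt length constant, so any change in the model's behavior can be tied to the sliding content rather than to a growing context.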

My talk is back at its original time!
The atproto sdk looks very nice!

from atproto import Client, models


def main():
    # Log in with a handle and password (placeholders here)
    client = Client()
    profile = client.login('my-handle', 'my-password')
    print('Welcome,', profile.display_name)

    # Publish a post, then like it using a strong reference to the new record
    response = client.send_post(text='Hello World from Python!')
    client.like(models.create_strong_ref(response))


if __name__ == '__main__':
    main()

Phil 10.16.2023

Spamming where the skies are blue

Includes python code that uses the atproto package for consuming public data. All you need is a login and password!

Internet Archive Scholar

public web content as preserved in The Wayback Machine and Archive-It partner collections
digitized print materials from paper and microform collections
general materials from archive.org collections, including collaborations with partners

SBIRs

Start research council slide deck – Friday 20th!
Start Futures IPT slide deck – Wednesday 25th!
2:00 MDA meeting. Offer updated report

GPT Agents:

The chess model has had 23 downloads in 4 days!
Slide deck for AI Ethics class – Wednesday 18th!

Phil 10.13.2023

I have papers to review by November 17

Speaking of reviews: Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Recently, using a powerful proprietary Large Language Model (LLM) (e.g., GPT-4) as an evaluator for long-form responses has become the de facto standard. However, for practitioners with large-scale evaluation tasks and custom criteria in consideration (e.g., child-readability), using proprietary LLMs as an evaluator is unreliable due to the closed-source nature, uncontrolled versioning, and prohibitive costs. In this work, we propose Prometheus, a fully open-source LLM that is on par with GPT-4’s evaluation capabilities when the appropriate reference materials (reference answer, score rubric) are accompanied. We first construct the Feedback Collection, a new dataset that consists of 1K fine-grained score rubrics, 20K instructions, and 100K responses and language feedback generated by GPT-4. Using the Feedback Collection, we train Prometheus, a 13B evaluator LLM that can assess any given long-form text based on customized score rubric provided by the user. Experimental results show that Prometheus scores a Pearson correlation of 0.897 with human evaluators when evaluating with 45 customized score rubrics, which is on par with GPT-4 (0.882), and greatly outperforms ChatGPT (0.392). Furthermore, measuring correlation with GPT-4 with 1222 customized score rubrics across four benchmarks (MT Bench, Vicuna Bench, Feedback Bench, Flask Eval) shows similar trends, bolstering Prometheus’s capability as an evaluator LLM. Lastly, Prometheus achieves the highest accuracy on two human preference benchmarks (HHH Alignment & MT Bench Human Judgment) compared to open-sourced reward models explicitly trained on human preference datasets, highlighting its potential as an universal reward model. We open-source our code, dataset, and model at this https URL.

GPT Agents

Had a good discussion with Jimmy and Shimei yesterday about bias and the chess model. In chess, white always moves first. That’s bias. Trying to get the model to have black move first is hard and maybe impossible. That, along with comparing more and less common chess moves, might be a good way to evaluate how successfully bias in a model can be treated without destroying the model.
My personal thought is that there may need to be either “mapping functions” that are attached to the prompt vector that steer the machine in certain ways, or even entire models whose purpose is to detect and mitigate bias.
Started on getting ecco data to build maps. I need to install the project so that it can be edited, since I’m going to have to tweak. Here’s how: pip.pypa.io/en/latest/topics/local-project-installs

SBIRs

Need to add references to figures in the white paper – done
10:30 IPT meeting – just getting ducks in a row

Phil 10.12.2023

SBIRs

Review fixes to white paper
9:00 standup – done
11:30 CSC touch point – done
12:00 SEG ROM discussion – done
1:00 JSC meeting
3:30 M30 meeting

GPT Agents

2:00 Weekly meeting
Created a Huggingface model card for the chess model to use with map research. I think I’m going to try to build color maps for each layer as tokens are generated and see how they change as a game is generated
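
One possible shape for the per-layer color maps mentioned above, as a sketch only. It assumes each layer's activations are already available as a flat list of floats; how they are pulled out of the model is not shown, and the blue-to-red scale and function names are arbitrary choices for illustration.

```python
def to_rgb(value, lo, hi):
    """Map a value in [lo, hi] onto a blue-to-red color as an (r, g, b) tuple."""
    span = hi - lo
    # A flat layer (all values equal) gets the midpoint color.
    t = 0.5 if span == 0 else (value - lo) / span
    t = min(1.0, max(0.0, t))
    return (round(255 * t), 0, round(255 * (1 - t)))


def layer_color_map(layer_activations):
    """Color each layer's activations, normalizing within that layer."""
    rows = []
    for activations in layer_activations:
        lo, hi = min(activations), max(activations)
        rows.append([to_rgb(a, lo, hi) for a in activations])
    return rows


# Made-up activations for two layers, one row of colors per layer.
example = layer_color_map([[0.0, 1.0], [-2.0, 2.0]])
print(example)
```

Calling `layer_color_map` once per generated token would give a sequence of images, which is one way to watch the layers change over the course of a game.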

Phil 10.11.2023

Clean bathroom!

SBIRs

2:00 Weekly MDA meeting
Had a good discussion yesterday with Rukan about the demo app. Made the cutest wireframe you have ever seen using emojis
Got the Organizational Lobotomy story accepted into Future Tense!

GPT Agents

Finished my IUI bidding

Phil 10.10.2023

This weekend’s fun:

SBIRs

9:00 Demos – need to make slides!
11:30 Rocket review
2:30 AI Ethics
3:00 M30 meeting
4:00 Sprint planning

GPT Agents

Started bidding on the IUI papers. Need to see if there are any more “reluctant”

Phil 10.6.2023

Had a good discussion with Shimei and Jimmy yesterday about Language Models Represent Space and Time. Basically the idea is that the model itself should have the relative representation of information in it, and that could be made available. The token embeddings are a kind of direction, after all.

Tasks

Call Jim Donnie’s – done
Call Nathan – done
Chores
Load Garmin (done) and laptop (on thumb drive)
Pack!
Bennie note

SBIRs

Cancelled group dinner

GPT Agents

Wrote up some thoughts about mapping using the LLM itself here

Phil 10.5.2023

Take a look at the IUI abstracts and maybe put together a sortable spreadsheet?

SBIRs

9:00 standup
See if the (relative?) ship position data Loren used to create his FOM curves can be incorporated as input data in our app
First read of Language Models Represent Space and Time – done. Boy is there a backlash on Xitter
- Found this in the citations: Mapping Language Models to Grounded Conceptual Spaces
  - A fundamental criticism of text-only language models (LMs) is their lack of grounding—that is, the ability to tie a word for which they have learned a representation, to its actual use in the world. However, despite this limitation, large pre-trained LMs have been shown to have a remarkable grasp of the conceptual structure of language, as demonstrated by their ability to answer questions, generate fluent text, or make inferences about entities, objects, and properties that they have never physically observed. In this work we investigate the extent to which the rich conceptual structure that LMs learn indeed reflects the conceptual structure of the non-linguistic world—which is something that LMs have never observed. We do this by testing whether the LMs can learn to map an entire conceptual domain (e.g., direction or colour) onto a grounded world representation given only a small number of examples. For example, we show a model what the word “left” means using a textual depiction of a grid world, and assess how well it can generalise to related concepts, for example, the word “right”, in a similar grid world. We investigate a range of generative language models of varying sizes (including GPT-2 and GPT-3), and see that although the smaller models struggle to perform this mapping, the largest model can not only learn to ground the concepts that it is explicitly taught, but appears to generalise to several instances of unseen concepts as well. Our results suggest an alternative means of building grounded language models: rather than learning grounded representations “from scratch”, it is possible that large text-only models learn a sufficiently rich conceptual structure that could allow them to be grounded in a data-efficient way.
- Understanding intermediate layers using linear classifier probes
  - Neural network models have a reputation for being black boxes. We propose to monitor the features at every layer of a model and measure how suitable they are for classification. We use linear classifiers, which we refer to as “probes”, trained entirely independently of the model itself. This helps us better understand the roles and dynamics of the intermediate layers. We demonstrate how this can be used to develop a better intuition about models and to diagnose potential problems. We apply this technique to the popular models Inception v3 and Resnet-50. Among other things, we observe experimentally that the linear separability of features increase monotonically along the depth of the model.
Slides for demos

GPT Agents

2:00 Meeting
Send story to CACM and see if they would like to pursue and what the lead times are – done
Worked a bit on Neema’s Senate testimony

Phil 10.4.2023

The bidding phase of IUI 2024 is now open. Now my present/future self has to live up to the commitments made by me in the past.

Just got back from the excellent Digital Platforms and Societal Harms IEEE event at American University. Some of the significant points that were discussed over the past two days:

Moderation is hard. Determining, for example, what is hate speech in the ten seconds or so allocated to moderators is mostly straightforward but often complicated and very dependent on locale and culture. I get the feeling that – based on examining content alone – machine learning could easily take care of 50% or so, particularly if you just decide to lump in satire and mockery. Add network analysis and you could probably be more sophisticated and get up to 70%? Handling the remaining 30% is a crushing job that would send most normal people running. Which means that the job of moderating for unacceptable content is its own form of exploitation.
Governments that were well positioned to detect and disrupt organizations like ISIS are no better prepared than a company like Meta when it comes to handling radical extremists from within the dominant culture that produced the company. In the US, that’s largely white and some variant of Christian. I’d assume that in China the same pattern exists for their dominant group.
There is a sense that all of our systems are reactive. That they only come into play when something has happened, not before something happens. Intervention for someone who is radicalizing requires human intervention. Which means it’s expensive and hard to scale. Moonshot is working to solve this problem, and has made surprisingly good progress, so there may be ways to make this work.
Militant accelerationism, or hastening societal collapse, is a thing. The exploitation of vulnerable people to become expendable munitions is being attempted by online actors. Generative AI will be a tool for these people, if it isn’t already.
There are quite a few good databases, but they are so toxic that they are largely kept in servers that are isolated from the internet to a greater or lesser degree. Public repositories are quite rare.
The transformation of Twitter to X is a new, very difficult problem. Twitter built up so much social utility as, for example, early warning, or reports from disaster areas that it can’t be removed from an App Store in the same way that an app that permits similar toxic behavior but only has 25 users can be. No one seems to have a good answer for this.
The Fediverse also appears to complicate harm tracking and prevention. Since there is no single source, how do you pull your Mastodon App if some people are accessing (possibly blacklisted) servers hosting hate speech? Most people are using the app for productive reasons. Now what?
Removing content doesn’t remove the person making the content. Even without any ability to post, or even with full bans from a platform, they can still search for targets and buy items that can enable them to cause harm in the real world. This is why moderation is only the lowest bar. Detection and treatment should be a goal.
Of course, all these technologies are two-edged swords. Detection and treatment in an authoritarian situation might mean finding reporters or human rights activists and imprisoning them.
The organizers are going to make this a full conference next year, with a call for papers and publication, so keep an eye on this space if you’re interested: https://tech-forum.computer.org/societal-harms-2023/

SBIRs

The War Elephants paper got a hard reject. Need to talk to Aaron to see how to proceed. Done
Add ASRC to letterhead – Done
Expense report! Done
Had a good chat with Rukan about using the SimAccel for interactive analysis of trajectories and FOM curves
Work on Senate story

GPT Agents

3:00 Alden meeting Nope
Gotta get back to maps. Found this:
Language Models Represent Space and Time
The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a coherent model of the data generating process — a world model. We find evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g. cities and landmarks). In addition, we identify individual “space neurons” and “time neurons” that reliably encode spatial and temporal coordinates. Our analysis demonstrates that modern LLMs acquire structured knowledge about fundamental dimensions such as space and time, supporting the view that they learn not merely superficial statistics, but literal world models.
Because the world is mean, the paper cites two papers from 2022 on reconstructing the game board from knowledge in the model with Chess and Othello. My paper did this in 2020. Grumble

Phil 10.3.2023

Day 2 of the Digital Platforms and Societal Harms IEEE event

My poster:

Phil 10.2.2023

Logically combines advanced AI with one of the world’s largest dedicated fact-checking teams. We help governments, businesses, and enterprise organizations uncover and address harmful misinformation and deliberate disinformation online.

viztales

Dimension reduction, State, Orientation, and Speed
