
Phil 12.28.2023

HuggingFace now has HuggingChat, which can be linked to multiple models. I wonder if there is an API? Looks like there is this unofficial one

SBIRs

  • 9:00 standup
  • Set up the HAI Overleaf using their template. The directions for preparing a submission file seem… unusual. Done! Pasted in a few sections from the book as placeholders

GPT Agents

  • Back up the SIGCHI paper – done!
  • 2:00 Meeting – couldn’t get in at first, but a good discussion about the paper.

Phil 12.21.2023

Winter solstice. Going to see the lights at Longwood Gardens

GPT Agents

  • Had a good (quick) chat with Jimmy and Shimei about the SIGCHI paper. We are in agreement that focusing on the RAG errors makes good sense
  • Rough out the methods and results sections

SBIRs

  • Had a good exchange with Julie about meeting with her AI/Cyber folks at NIST

Phil 12.20.2023

40 hours until the days start getting longer!

Carbon offsets from the UN. Looks like I use about 18 tons according to their calculator. So get 75 for F&F. Done!

NewsGuard provides transparent tools to counter misinformation for readers, brands, and democracies. Since launching in 2018, its global staff of trained journalists and information specialists has collected, updated, and deployed more than 6.9 million data points on more than 35,000 news and information sources, and cataloged and tracked all of the top false narratives spreading online.

The rise of AI fake news is creating a ‘misinformation superspreader’

SBIRs

  • Nothing scheduled?
  • Pinged Julie to discuss AI & Cyber

GPT Agents

Phil 12.19.2023

Carbon credits!

SBIRs

GPT Agents

  • Write a query that pulls all examples of “with context” hallucinations and see what’s going on. It’s not many, and I suspect user problems. Done (a sketch of the kind of query is below this list). It seems to be associated with unusual formatting. Working on the unhelpful and partial responses. I think the angle is going to be that “context helps a lot,” but it’s not a panacea. Errors leak through when the model can’t navigate the context effectively. At scale, this is millions of responses. Mitigation is going to be a version of Zeno’s paradox.
  • Start framing out the paper. Created the appropriate Overleaf doc.
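A minimal sketch of the pull I have in mind, with hypothetical table and column names (the real schema lives in the ContextPromptAccuracy project):

import mysql.connector

# Hypothetical schema: a responses table with a context flag and a label column
db = mysql.connector.connect(host="localhost", user="phil", database="context_prompt_accuracy")
cur = db.cursor(dictionary=True)
cur.execute(
    "SELECT prompt, response FROM responses "
    "WHERE with_context = 1 AND label = 'hallucination'"
)
for row in cur.fetchall():
    print(row["prompt"], "->", row["response"])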

Phil 12.18.2023

Supported Radio Paradise. Now I need to get some carbon credits

3:00 Podcast

Lawn mower, dammit – done! Also door screen – done!

GPT Agents

SBIRs

Phil 12.13.2023

SBIRs

GPT Agents

  • Created a ContextPromptAccuracy project and loaded it up with the code for the Wikipedia experiments and the Supabase data. Need to set up the MySQL schema so I can start making queries, tables, and charts.
  • Ok, really happy with this bit of code:
from typing import Dict, List

import MSI  # the project's local MySQL wrapper module

def to_db(msi: MSI.MySqlInterface, table_name: str, dict_list: List[Dict]):
    """Insert each dict in dict_list as a row in table_name. Column names
    come from the dict keys; values are passed as a parameterized tuple
    so the driver handles escaping."""
    for d in dict_list:
        keys = d.keys()
        vals = d.values()
        s1 = "INSERT INTO {} (".format(table_name)
        s2 = " VALUES ("
        for k in keys:
            s1 += "{}, ".format(k)
            s2 += "%s, "
        # trim the trailing ", " from each fragment and close the parens
        sql = "{}) {});".format(s1[:-2], s2[:-2])
        print(sql)
        msi.write_sql_values_get_row(sql, tuple(vals))
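A minimal usage sketch, assuming a connected MySqlInterface and a hypothetical responses table whose columns match the dict keys:

msi = MSI.MySqlInterface()  # constructor args depend on the local setup
rows = [
    {"prompt": "What is RAG?", "response": "Retrieval-augmented generation.", "with_context": 1},
    {"prompt": "Who said it?", "response": "No idea.", "with_context": 0},
]
to_db(msi, "responses", rows)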

Phil 12.12.2023

Run “The Swim” through the GPT as a New Yorker editor and get some feedback.

SBIRs

  • 9:00 Standup
  • 2:00 BMD SBIR Meeting
  • 2:30 AI ethics?
  • Going to continue reading the papers I found yesterday.
  • Eliciting Latent Predictions from Transformers with the Tuned Lens (a minimal probe sketch follows this list)
    • We analyze transformers from the perspective of iterative inference, seeking to understand how model predictions are refined layer by layer. To do so, we train an affine probe for each block in a frozen pretrained model, making it possible to decode every hidden state into a distribution over the vocabulary. Our method, the tuned lens, is a refinement of the earlier “logit lens” technique, which yielded useful insights but is often brittle.
      We test our method on various autoregressive language models with up to 20B parameters, showing it to be more predictive, reliable and unbiased than the logit lens. With causal experiments, we show the tuned lens uses similar features to the model itself. We also find the trajectory of latent predictions can be used to detect malicious inputs with high accuracy. All code needed to reproduce our results can be found at this https URL.
  • Toy Models of Superposition
    • Neural networks often pack many unrelated concepts into a single neuron – a puzzling phenomenon known as ‘polysemanticity’ which makes interpretability much more challenging. This paper provides a toy model where polysemanticity can be fully understood, arising as a result of models storing additional sparse features in “superposition.” We demonstrate the existence of a phase change, a surprising connection to the geometry of uniform polytopes, and evidence of a link to adversarial examples. We also discuss potential implications for mechanistic interpretability.
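A minimal sketch of the tuned-lens idea from the first paper above: one affine translator per layer maps an intermediate hidden state into final-layer space, then decodes it through the frozen unembedding. Dimensions and training details here are assumptions for illustration; the paper trains each translator against the model’s final logits (e.g., with a KL loss).

import torch
import torch.nn as nn

class AffineLens(nn.Module):
    """One translator per layer, initialized near the identity so it
    starts out equivalent to the plain logit lens."""
    def __init__(self, d_model: int):
        super().__init__()
        self.translator = nn.Linear(d_model, d_model)
        nn.init.eye_(self.translator.weight)
        nn.init.zeros_(self.translator.bias)

    def forward(self, hidden: torch.Tensor, unembed: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model); unembed: (vocab, d_model)
        # returns vocabulary logits for every position
        return self.translator(hidden) @ unembed.T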

GPT Agents

  • At 54 responses, so yay! If nothing comes in today, I’m going to download the CSV files, put them in Box, and start creating graphs.
  • Create a project that stores the files, loads them into the DB, and produces Excel files. Use the private SVN repo (a spreadsheet-building sketch follows).
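A minimal sketch of the spreadsheet step, assuming the downloaded CSVs land in a local downloads/ directory (all names hypothetical):

import glob
import os
import pandas as pd

# one tab per test file, plus a combined rollup tab
frames = {}
for path in glob.glob("downloads/*.csv"):
    name = os.path.splitext(os.path.basename(path))[0][:31]  # Excel caps sheet names at 31 chars
    frames[name] = pd.read_csv(path)

with pd.ExcelWriter("results.xlsx") as writer:
    for name, df in frames.items():
        df.to_excel(writer, sheet_name=name, index=False)
    pd.concat(frames.values()).to_excel(writer, sheet_name="rollup", index=False)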

Phil 12.11.2023

Whoops, didn’t get to the lawn mower on Sunday. Or the bills, for that matter.

Write up the “air pocket” essay? Or maybe write it as a short story and don’t explain the metaphor.

3:00 Podcast!

SBIRs

  • 2:00 MDA
  • Got the Perplexity.ai API running their hello world. To make this example work, you have to have your API token attached to an environment variable named “PERPLEXITY_API_KEY”. There isn’t much else to the API. Hoping that things improve over time.
# From examples at https://docs.perplexity.ai/reference/post_chat_completions

import requests
import os
import json

url = "https://api.perplexity.ai/chat/completions"

# the token must live in the PERPLEXITY_API_KEY environment variable
api_key = os.environ.get("PERPLEXITY_API_KEY")
print("API key = {}".format(api_key))

payload = {
    "model": "mistral-7b-instruct",
    "messages": [
        {
            "role": "system",
            "content": "Be precise and concise."
        },
        {
            "role": "user",
            "content": "How many stars are there in our galaxy?"
        }
    ]
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": "Bearer {}".format(api_key)
}

if api_key is None:
    print("No API key")
else:
    response = requests.post(url, json=payload, headers=headers)
    # pretty-print the full OpenAI-style chat completion response
    jobj = json.loads(response.text)
    s = json.dumps(jobj, sort_keys=True, indent=4)
    print(s)
  • Figuring out whether I want to work on NNM or War Elephants first. Both, I guess?
  • Setting up the NNM Overleaf so I have a place to add assets. Done
  • Also, this is for the W.E. paper:
  • Exposed Hugging Face API tokens offered full access to Meta’s Llama 2
    • The API tokens of tech giants Meta, Microsoft, Google, VMware, and more have been found exposed on Hugging Face, opening them up to potential supply chain attacks. 
    • Researchers at Lasso Security found more than 1,500 exposed API tokens on the open source data science and machine learning platform – which allowed them to gain access to 723 organizations’ accounts.
    • In the vast majority of cases (655), the exposed tokens had write permissions granting the ability to modify files in account repositories. A total of 77 organizations were exposed in this way, including Meta, EleutherAI, and BigScience Workshop – which run the Llama, Pythia, and Bloom projects respectively.
  • Reading Future Lens: Anticipating Subsequent Tokens from a Single Hidden State for hints about using activation layers.
    • Found a few papers in the references that look interesting
    • Locating and Editing Factual Associations in GPT
      • We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model’s factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate factual predictions while processing subject tokens. To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME). We find that ROME is effective on a standard zero-shot relation extraction (zsRE) model-editing task, comparable to existing methods. To perform a more sensitive evaluation, we also evaluate ROME on a new dataset of counterfactual assertions, on which it simultaneously maintains both specificity and generalization, whereas other methods sacrifice one or another. Our results confirm an important role for mid-layer feed-forward modules in storing factual associations and suggest that direct manipulation of computational mechanisms may be a feasible approach for model editing. The code, dataset, visualizations, and an interactive demo notebook are available in the supplemental materials.
    • Transformer Feed-Forward Layers Are Key-Value Memories – ACL Anthology
      • Feed-forward layers constitute two-thirds of a transformer model’s parameters, yet their role in the network remains under-explored. We show that feed-forward layers in transformer-based language models operate as key-value memories, where each key correlates with textual patterns in the training examples, and each value induces a distribution over the output vocabulary. Our experiments show that the learned patterns are human-interpretable, and that lower layers tend to capture shallow patterns, while upper layers learn more semantic ones. The values complement the keys’ input patterns by inducing output distributions that concentrate probability mass on tokens likely to appear immediately after each pattern, particularly in the upper layers. Finally, we demonstrate that the output of a feed-forward layer is a composition of its memories, which is subsequently refined throughout the model’s layers via residual connections to produce the final output distribution.
    • All Roads Lead to Rome? Exploring the Invariance of Transformers’ Representations
      • Transformer models bring propelling advances in various NLP tasks, thus inducing lots of interpretability research on the learned representations of the models. However, we raise a fundamental question regarding the reliability of the representations. Specifically, we investigate whether transformers learn essentially isomorphic representation spaces, or those that are sensitive to the random seeds in their pretraining process. In this work, we formulate the Bijection Hypothesis, which suggests the use of bijective methods to align different models’ representation spaces. We propose a model based on invertible neural networks, BERT-INN, to learn the bijection more effectively than other existing bijective methods such as the canonical correlation analysis (CCA). We show the advantage of BERT-INN both theoretically and through extensive experiments, and apply it to align the reproduced BERT embeddings to draw insights that are meaningful to the interpretability research. Our code is at this https URL.
    • Jump to Conclusions: Short-Cutting Transformers With Linear Transformations
      • Transformer-based language models (LMs) create hidden representations of their inputs at every layer, but only use final-layer representations for prediction. This obscures the internal decision-making process of the model and the utility of its intermediate representations. One way to elucidate this is to cast the hidden representations as final representations, bypassing the transformer computation in-between. In this work, we suggest a simple method for such casting, by using linear transformations. We show that our approach produces more accurate approximations than the prevailing practice of inspecting hidden representations from all layers in the space of the final layer. Moreover, in the context of language modeling, our method allows “peeking” into early layer representations of GPT-2 and BERT, showing that often LMs already predict the final output in early layers. We then demonstrate the practicality of our method to recent early exit strategies, showing that when aiming, for example, at retention of 95% accuracy, our approach saves additional 7.9% layers for GPT-2 and 5.4% layers for BERT, on top of the savings of the original approach. Last, we extend our method to linearly approximate sub-modules, finding that attention is most tolerant to this change.
    • Visualizing and Interpreting the Semantic Information Flow of Transformers
      • Recent advances in interpretability suggest we can project weights and hidden states of transformer-based language models (LMs) to their vocabulary, a transformation that makes them more human interpretable. In this paper, we investigate LM attention heads and memory values, the vectors the models dynamically create and recall while processing a given input. By analyzing the tokens they represent through this projection, we identify patterns in the information flow inside the attention mechanism. Based on our discoveries, we create a tool to visualize a forward pass of Generative Pre-trained Transformers (GPTs) as an interactive flow graph, with nodes representing neurons or hidden states and edges representing the interactions between them. Our visualization simplifies huge amounts of data into easy-to-read plots that can reflect the models’ internal processing, uncovering the contribution of each component to the models’ final prediction. Our visualization also unveils new insights about the role of layer norms as semantic filters that influence the models’ output, and about neurons that are always activated during forward passes and act as regularization vectors.
  • It turns out that there is a thing called AlignedUMAP
    • It may happen that it would be beneficial to have different UMAP embeddings aligned with each other. There are several ways to go about doing this. One simple approach is to simply embed each dataset with UMAP independently and then solve for a Procrustes transformation on shared points. An alternative approach is to embed the first dataset and then construct an initial embedding for the second dataset based on locations of shared points in the first embedding and then go from there. A third approach, which will provide better alignments in general, is to optimize both embeddings at the same time with some form of constraint as to how far shared points can take different locations in different embeddings during the optimization. This last option is possible, but is not easily tractable to implement yourself (unlike the first two options). To remedy this issue it has been implemented as a separate model class in umap-learn called AlignedUMAP. The resulting class is quite flexible, but here we will walk through simple usage on some basic (and somewhat contrived) data just to demonstrate how to get it running on data.
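A minimal AlignedUMAP usage sketch, assuming two same-sized datasets where row i in the first corresponds to row i in the second; the data and parameter values are illustrative:

import numpy as np
from umap import AlignedUMAP

# two related datasets (hypothetical stand-ins for real data)
rng = np.random.default_rng(42)
X1 = rng.random((100, 10))
X2 = rng.random((100, 10))

# relations: one dict per consecutive pair of datasets, mapping row
# indices in the earlier dataset to corresponding rows in the later one
relations = [{i: i for i in range(100)}]

mapper = AlignedUMAP(n_neighbors=15, alignment_regularisation=0.01)
mapper.fit([X1, X2], relations=relations)
emb1, emb2 = mapper.embeddings_  # one aligned 2D embedding per dataset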

GPT Agents

  • Send out some reminders – done
  • Frame out the new paper. Need to discuss RAG now. And maybe invert the order of the methods and findings?

Phil 12.8.2023

Tasks

  • Winterize the lawnmower on Sunday!
  • Bills
  • Chores

SBIRs

  • Expense report for ETF
  • 3:00 MDA meeting
  • Good meeting with Aaron about prompt swarms yesterday. A real problem will be how each agent maintains its “identity”, since the stories produced by all the agents become an influencing force that pushes every agent’s output back toward the mean. Each agent will probably need multiple internal threads covering different aspects of its behavior, with only one thread being the one that the “world” sees (a rough sketch follows).
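A rough data-structure sketch of that idea (all names hypothetical): each agent keeps several internal threads, and only the designated public thread is ever exposed to the shared world.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ThreadState:
    aspect: str                 # e.g. "goals", "backstory", "relationships"
    history: List[str] = field(default_factory=list)

@dataclass
class SwarmAgent:
    name: str
    threads: List[ThreadState] = field(default_factory=list)
    public_aspect: str = "story"  # the one thread the world sees

    def public_output(self) -> str:
        # expose only the latest entry from the public thread
        for t in self.threads:
            if t.aspect == self.public_aspect and t.history:
                return t.history[-1]
        return ""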

GPT Agents

  • Send out the reminder email

Phil 12.7.2023

Tasks

  • Schedule car recall, seat lever, and socket cover service – done
  • Schedule follow up visit – done

SBIRs

  • Conference from 1:30 – 3:00
  • 9:00 standup – done
  • Slide prep for Friday. Use the SEG document as the basis and add Rukan’s Nov 29 results – done
  • Add sections to the Overleaf project for the Q8 notes and the Final report – done

GPT Agents

  • Start a “Human Study” section in the paper. Note that tagging was not used because of the human evaluation and the complexity of interpreting a result that would include tags
  • Download the latest CSV, and create a spreadsheet with different tabs for the different tests, then the rollup
  • See if it’s possible to connect PgAdmin to my Supabase instance and download. Success! Can’t see how to download just one schema, though (a pg_dump sketch follows this list)
  • If no new submissions, send out the reminder email
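One possible workaround, sketched with a hypothetical Supabase connection string and assuming pg_dump is installed locally; its -n flag restricts the dump to a single schema:

import subprocess

# connection values come from the Supabase dashboard (hypothetical here)
conn = "postgresql://postgres:PASSWORD@db.PROJECT.supabase.co:5432/postgres"

# dump only the named schema to a local .sql file
subprocess.run(
    ["pg_dump", "--dbname", conn, "-n", "my_schema", "-f", "my_schema.sql"],
    check=True,
)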

Book

  • Realized that the cover picture should be a grid of very attractive AI people, like dating profile pix.
  • Take the presentation and turn it into the text for the final chapter.