Phil 7.2.2024

Tasks

  • Get recumbent to Aaron? Better yet, have him try it here.
  • Call Rhena

SBIRs

  • Need to reach out to Iain with the pointer to the Anthropic paper and the snippet of code that shows how to get to layers. Also send the link to DataMapPlot – done, though I forgot to mention the nifty plotting
  • Finish poster and send to Staples – done. Should be ready tonight
  • Start on 5 slide provocation deck. Well, it’s 11 slides, but made as a rapid sequence
  • Identrust thing – tomorrow?

Phil 7.1.2024

This happened today:

Tasks:

  • Plumber – scheduled – done!
  • Bank – done

SBIRs

  • 9:00 sprint demos – done
  • 3:00 sprint planning – done
  • Travel expenses – done. And approved!
  • Poster – first pass is done
  • Put business cards in contacts

Phil 6.28.2024

Back from the MORS 92nd Symposium. Monterey is lovely. I got to ride my bike along the shore and into the hills. Good presentations. Some particularly good stuff from Sandia on finding markers for when online activity moves into the real world. In this case the data was about the GameStop short squeeze, but it might be more generalizable. Need to keep in touch.

Another, less interesting talk had a really good pointer: MITRE’s ATT&CK knowledge base of adversary tactics and techniques based on real-world observations. I think it makes sense to start putting together an AI-based social hacking equivalent that covers theoretical and actual hacks and defenses. Some of these would still be human active measures, but could be scaled.

Phil 6.20.2024

These are some loooooong daylight hours here near the 39th parallel.

SBIRs

  • Received a notification from the CUI folks to prepare a video presentation of about 5 minutes if I wasn’t attending, which is the same as a poster. So I think the wise move is slides and a poster. Work on that today and maybe some tomorrow. Otherwise while in CA.
  • 9:00 Standup. Go over the layer image maybe and then go for a ride. I’ll need to work on the poster & slides next week. I’ll try to get started on UMAP today and have enough done so I can pick it up in two weeks
  • 1:00 Overmatch call – might have gone well. More later?
  • Got UMAP working with Plotly! Here’s the code. It’s based on this UMAP example here and the plotly scatterplot examples here:
from dash import Dash, dcc, html, Input, Output, callback
import plotly.express as px
from NNMs.utils.DashBaseClass import DashBaseClass

import numpy as np
from sklearn.preprocessing import StandardScaler
import pandas as pd

import umap

class UmapPenguins(DashBaseClass):
    df:pd.DataFrame

    def initialize(self) -> None:
        # Load the Palmer Penguins data (pinned to a specific commit) and drop rows with missing values
        penguins = pd.read_csv("https://raw.githubusercontent.com/allisonhorst/palmerpenguins/c19a904462482430170bfe2c718775ddb7dbb885/inst/extdata/penguins.csv")
        penguins = penguins.dropna()
        print(penguins.species.value_counts())

        print("scaling data")
        reducer = umap.UMAP()
        penguin_data = penguins[
            [
                "bill_length_mm",
                "bill_depth_mm",
                "flipper_length_mm",
                "body_mass_g",
            ]
        ].values
        scaled_penguin_data = StandardScaler().fit_transform(penguin_data)

        print("finished scaling data")

        print("calculating embedding")
        embedding = reducer.fit_transform(scaled_penguin_data)
        print("embedding.shape = {}".format(embedding.shape))
        self.df = pd.DataFrame(embedding, columns=['x', 'y'])

        # nda = np.random.random(size=(333, 3))
        # self.df = pd.DataFrame(nda, columns=['x', 'y', 's'])

    def setup_layout(self) -> None:
        self.add_div(html.H2("UMAP scatterplot", style={'textAlign': 'center'}))
        fig = px.scatter(self.df, x='x', y='y')
        self.add_div(dcc.Graph(figure=fig))
        self.app.layout = html.Div(self.div_list)

if __name__ == "__main__":
    ump = UmapPenguins(True)
  • And here’s the result.
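
The listing above scales the four measurements before handing them to UMAP. A minimal sketch of what that scaling step does, in plain Python: each column is shifted to zero mean and divided by its population standard deviation, which is how StandardScaler behaves by default. Without it, body_mass_g (thousands of grams) would dominate the distances over bill_depth_mm (tens of millimeters). The function name and the three sample rows are made up for illustration.

```python
import math

def standardize_columns(rows):
    """Scale each column to zero mean and unit (population) variance."""
    n = len(rows)
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(n_cols)
    ]
    # A constant column has zero spread, so map it to 0.0 instead of dividing by zero
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(n_cols)]
        for r in rows
    ]

# Hypothetical rows: (bill_length_mm, body_mass_g)
rows = [[39.0, 3700.0], [46.0, 4200.0], [50.0, 5600.0]]
scaled = standardize_columns(rows)
for row in scaled:
    print(row)
```

After scaling, both columns live on the same unitless scale, so neither measurement outweighs the other when neighbors are computed.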

Phil 6.18.2024

Interesting piece from Robin Berjon: The Public Interest Internet

  • What if the internet were public interest technology? I mean “internet” the way most people understand it, which is to say our whole digital sphere, and by “public interest” I don’t mean tinkering at the margins to reduce harm from some bad actors or painting some glossy ethics principles atop a pile of exploitative rent-seeking — I mean through and through, warts and all, an internet that works in support of a credible, pragmatic definition of the common good.

Tasks

  • Carlos email

SBIRs

  • 9:00 standup
  • Write up what I’ve discovered about the hidden layer info in output vs the activation info. Done
  • Connect the heatmap to the running model. Done
  • Add the running code to the documentation. Done
  • Start figuring out UMAP. Not even started! I blame meetings

Phil 6.17.2024

Tasks

  • Call Judith – done
  • Letter to Carlos
  • ICTAI-2024

Creativity Has Left the Chat: The Price of Debiasing Language Models

  • Large Language Models (LLMs) have revolutionized natural language processing but can exhibit biases and may generate toxic content. While alignment techniques like Reinforcement Learning from Human Feedback (RLHF) reduce these issues, their impact on creativity, defined as syntactic and semantic diversity, remains unexplored. We investigate the unintended consequences of RLHF on the creativity of LLMs through three experiments focusing on the Llama-2 series. Our findings reveal that aligned models exhibit lower entropy in token predictions, form distinct clusters in the embedding space, and gravitate towards “attractor states”, indicating limited output diversity. Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation. The trade-off between consistency and creativity in aligned models should be carefully considered when selecting the appropriate model for a given application. We also discuss the importance of prompt engineering in harnessing the creative potential of base models.
  • They were able to do this by comparing the chat vs completion models. There are all kinds of implications here
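
For reference, "lower entropy in token predictions" can be made concrete with a small sketch. This is not the paper's code, just the standard Shannon entropy applied to two made-up next-token distributions over a four-token vocabulary. The function name and the numbers are assumptions for illustration.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

# Hypothetical next-token distributions over a four-token vocabulary
diverse = [0.25, 0.25, 0.25, 0.25]  # many plausible continuations
peaked = [0.97, 0.01, 0.01, 0.01]   # collapsed onto one continuation

print(token_entropy(diverse))
print(token_entropy(peaked))
```

The uniform distribution reaches the maximum of 2 bits for four tokens, while the peaked one sits far below that, which is the kind of gap the abstract describes between base and aligned models.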

SBIRs

  • Letter to Anthropic – done
  • DARPA meeting – cancelled?
  • Get the cookiecutter environment finished

Got the heatmap working. Calling it a day

Phil 6.14.2024

Finally able to get to chores

Via Mastodon

Pentagon ran secret anti-vax campaign to undermine China during pandemic

  • At the height of the COVID-19 pandemic, the U.S. military launched a secret campaign to counter what it perceived as China’s growing influence in the Philippines, a nation hit especially hard by the deadly virus.
    The clandestine operation has not been previously reported. It aimed to sow doubt about the safety and efficacy of vaccines and other life-saving aid that was being supplied by China, a Reuters investigation found. Through phony internet accounts meant to impersonate Filipinos, the military’s propaganda efforts morphed into an anti-vax campaign. Social media posts decried the quality of face masks, test kits and the first vaccine that would become available in the Philippines – China’s Sinovac inoculation.

SBIRs

  • 12:00 AI Ethics meeting

Phil 6.13.2024

I’ve been wondering about how to map regions of a model that are reached through an extensive prompt, like the kind you see with RAG. The problem is that the prompt gets very large, and it may be difficult to see how the trajectory works. There appear to be several approaches to dealing with this, so here’s an ongoing list of things to try:

  • Just plot the whole prompt including context. This assumes that the model is big enough and public, like Llama or Gemma. I assume that as the prompt grows, the head will go to different regions. But the vector that we’re trying to plot will continue to grow, and I’m not sure how that gets plotted.
  • Just save off the N vectors that are closest to the head. The full text can also be saved, so there is some correlation.
  • Use a prompt to fire off N responses and use that to finetune a small (e.g. GPT-2) model. Then create a “small” window of tokens that travel through the finetuned space. The nice thing is that this lets us indirectly explore closed-source models in a narrative context.
  • I’d also like to see if there is any way to use dictionary learning on these narrative elements. It seems that there is no mathematical reason that you can’t have “narrative features” like tropes, possibly.
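
A minimal sketch of the second idea in the list above (keep only the N vectors closest to the head), assuming that "head" means the most recent vector in the trajectory and that closeness is cosine similarity. Both are assumptions, as are the function names and the toy 2D vectors. Each vector is stored alongside its text so the two stay correlated.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_to_head(head, history, n):
    """Return the n (text, vector) pairs from history most similar to head."""
    ranked = sorted(
        history,
        key=lambda item: cosine_similarity(head, item[1]),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical (text, vector) pairs saved while the prompt was processed
history = [
    ("token A", [1.0, 0.0]),
    ("token B", [0.9, 0.1]),
    ("token C", [0.0, 1.0]),
    ("token D", [-1.0, 0.0]),
]
head = [1.0, 0.0]
kept = nearest_to_head(head, history, n=2)
print([text for text, _ in kept])
```

Real hidden-state vectors would have hundreds or thousands of dimensions, but the bookkeeping is the same: rank by similarity to the head, keep the top N, and discard the rest.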

SBIRs

  • 9:00 standup
  • 4:30 Book club. Finish chapter 8!

GPT Agents

  • Meet maybe at 3:00?

Phil 6.12.2024

Well, the vacation is fading, and I’m back to what I’ve been calling “lockdown lite.” Going from a steady interaction with people in the real world to work-from-home where my interaction with people is a few online meetings… isn’t healthy.

Mamba Explained

  • Mamba, however, is one of an alternative class of models called State Space Models (SSMs). Importantly, for the first time, Mamba promises similar performance (and crucially similar scaling laws) as the Transformer whilst being feasible at long sequence lengths (say 1 million tokens). To achieve this long context, the Mamba authors remove the “quadratic bottleneck” in the Attention Mechanism. Mamba also runs fast – like “up to 5x faster than Transformer fast.”
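
To make the contrast with attention concrete, here is a minimal sketch of the plain linear recurrence that state space models build on. It is not Mamba itself, which uses vector states, input-dependent ("selective") parameters, and a hardware-aware scan. The scalar parameters a, b, c and the function name are assumptions for illustration. The point is that the sequence is processed in one pass with a fixed-size state, so the cost grows linearly with length instead of quadratically.

```python
def ssm_scan(inputs, a, b, c):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x       # update the fixed-size hidden state
        outputs.append(c * h)   # read the output from the state
    return outputs

# Feed in a single impulse and watch the state decay
impulse = [1.0, 0.0, 0.0, 0.0]
response = ssm_scan(impulse, a=0.5, b=1.0, c=1.0)
print(response)
```

Each step touches only the previous state and the current input, so a million-token sequence needs a million updates rather than a million-by-million table of pairwise scores.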

SBIRs

  • Write email to Anthropic – done
  • Write up notes on the Scaling Monosemanticity paper and put that in the NNM documentation. Done
  • Update the Overleaf book content – done! I even expanded the Senate testimony. Look at me go!
  • Got my slot for MORS – 6/26 at 8:30-9:00 in GL113. That should give me some time for riding around Monterey 🙂
  • Lunch with Aaron – fun!

Phil 6.11.2024

PORTUGAL FROM NORTH TO SOUTH ALONG THE MYTHICAL ESTRADA NACIONAL 2 – 5TH EDITION <- ordered!

Tasks:

  • Collect receipts and notes
  • Spreadsheet
  • Follow up with Carlos. Maybe discuss with Shimei & Jimmy first?

SBIRs

  • Performance goals – done
  • Letter to Anthropic
    • I spent the whole day reading Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. It’s very good and very interesting. The Anthropic folks are looking at features, not sequences. That being said, their feature work is really good, and the UMAP relationships they show are very map-like, in kind of the same way that text embedding is. Which makes sense as those embeddings are also coming from LLMs (OpenAI’s text-embedding-ada-002, frequently). There is also some really interesting work in using the features to help the model detect manipulative content, which aligns with my White Hat AI concept.
    • I was thinking that it might make sense to wait to contact Anthropic until after getting some layer mapping done, but I now think it makes sense to reach out as planned. Particularly since they have a concept of features that are “smeared” across layers, which I hadn’t thought about before but makes sense. They call this Cross-Layer Superposition.
    • Anyway, I’ll write up an email tomorrow. Note – Include a picture from the conspiracy theory map!
    • I really wonder if dictionary learning of features using sparse autoencoders can be used for sequences rather than features.
  • Change out images in presentation and resubmit – done. Decided to leave the LLM embeddings
  • Work on book – nope
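
For my own reference, a minimal sketch of the sparse autoencoder objective behind dictionary learning, in plain Python: encode an activation vector into a wider, non-negative feature vector, decode it back, and penalize both reconstruction error and the L1 size of the features. This is the generic form, not the paper's exact loss or training setup, and the hand-picked weights, the input vector, and the function names are all assumptions. It says nothing yet about whether the same trick works on sequences.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sae_loss(x, w_enc, b_enc, w_dec, b_dec, l1_coeff):
    """Reconstruction error plus an L1 sparsity penalty for one activation vector."""
    features = relu([z + b for z, b in zip(matvec(w_enc, x), b_enc)])
    x_hat = [z + b for z, b in zip(matvec(w_dec, features), b_dec)]
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = l1_coeff * sum(features)  # features are non-negative after ReLU
    return recon + sparsity, features, x_hat

# Hypothetical 2D activation expanded into 4 features (one per signed direction)
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
b_dec = [0.0, 0.0]

x = [2.0, -3.0]
loss, features, x_hat = sae_loss(x, w_enc, b_enc, w_dec, b_dec, l1_coeff=0.1)
print(features, x_hat, loss)
```

Only two of the four features fire for this input, and the reconstruction is exact, so the whole loss comes from the sparsity term. In training, the weights would be learned so that most features stay at zero for any given input.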

GPT-Agents

Phil 6.10.2024

Back from a great vacation in Portugal. We did non-touristy things too, but the National Palace of Pena in Sintra is an amazing 19th century Disneyland:

Here’s where it is: maps.app.goo.gl/11pEQ8TunMRW69qx6

And back to work:

GPT-2 from scratch

SBIRs

  • Performance goals
  • 9:00 Sprint demos
  • 1:30 D2A Tech Summit Review
  • 3:00 Sprint planning
    • MORS conference
    • CUI presentation prep
    • What else?

Phil 5.31.2024

OpenAI Says Russia and China Used Its A.I. in Covert Campaigns

  • OpenAI said on Thursday that it had identified and disrupted five online campaigns that used its generative artificial intelligence technologies to deceptively manipulate public opinion around the world and influence geopolitics.
  • The efforts were run by state actors and private companies in Russia, China, Iran and Israel, OpenAI said in a report about covert influence campaigns. The operations used OpenAI’s technology to generate social media posts, translate and edit articles, write headlines and debug computer programs, typically to win support for political campaigns or to swing public opinion in geopolitical conflicts.

Supervision and truth

Once a Sheriff’s Deputy in Florida, Now a Source of Disinformation From Russia (AI tools)

  • With the help of commercially available artificial intelligence tools, including OpenAI’s ChatGPT and DALL-E 3, he has filled the sites with tens of thousands of articles, many based on actual news events. Interspersed among them are also bespoke fabrications that officials in the United States and European Union have attributed to Russian intelligence agencies or the administration of President Vladimir V. Putin.

Fake News Reports and Videos Seek to Undermine the Paris Olympics (More traditional active measures)

  • Microsoft estimates that Storm-1679 produces three to eight faked videos a week, in English and French, with many impersonating the BBC, Al Jazeera and other broadcasters. The group appears to respond quickly to news events, like protests in New Caledonia, a French territory in the Pacific. Others focus on the prospect of a terrorist attack in Paris.

Why this year’s election interference could make 2016 look cute

  • For more than a year, FBI Director Christopher A. Wray has warned about a wave of election interference that could make 2016 look cute. No respectable foreign adversary needs an army of human trolls in 2024. AI can belch out literally billions of pieces of realistic-looking and sounding misinformation about when, where and how to vote. It can just as easily customize political propaganda for any individual target. In 2016, Brad Parscale, Donald Trump’s digital campaign director, spent endless hours customizing tiny thumbnail campaign ads for groups of 20 to 50 people on Facebook. It was miserable work but an incredibly effective way to make people feel seen by a campaign. In 2024, Brad Parscale is software, available to any chaos agent for pennies. There are more legal restrictions on ads, but AI can create fake social profiles and aim squarely for your individual feed. Deepfakes of candidates have been here for months, and the AI companies keep releasing tools that make all of this material faster and more convincing.

Mapping the Increasing Use of LLMs in Scientific Papers

  • Scientific publishing lays the foundation of science by disseminating research findings, fostering collaboration, encouraging reproducibility, and ensuring that scientific knowledge is accessible, verifiable, and built upon over time. Recently, there has been immense speculation about how many people are using large language models (LLMs) like ChatGPT in their academic writing, and to what extent this tool might have an effect on global scientific practices. However, we lack a precise measure of the proportion of academic writing substantially modified or produced by LLMs. To address this gap, we conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals, using a population-level statistical framework to measure the prevalence of LLM-modified content over time. Our statistical estimation operates on the corpus level and is more robust than inference on individual instances. Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers (up to 17.5%). In comparison, Mathematics papers and the Nature portfolio showed the least LLM modification (up to 6.3%). Moreover, at an aggregate level, our analysis reveals that higher levels of LLM-modification are associated with papers whose first authors post preprints more frequently, papers in more crowded research areas, and papers of shorter lengths. Our findings suggests that LLMs are being broadly used in scientific writings.

SBIRs

  • Submitted the MORS presentation

Phil 5.26.2024

On vacation, but need to work on the slide deck. Also, some interesting finds:

SBIRs

  • Slides

Phil 5.16.2024

Had an interesting talk with Tim Ellis yesterday, which makes me want to write about growing up in the shadow of the Holocaust, how it affected me then, and how things now look as an older adult.

Hmm. Just learned about Kagi, which is a paid search engine. May have to check that out.

SBIRs

  • CUI registration
  • Callbacks today – done!
  • Need to put together slides for Sprint Demos and Stories that I’ll squeeze in on vacation.
    • Need to start putting the MORS deck together. It’s due at the end of this month.
    • Finalize and submit CUI paper and start on slides
    • Work on the book

GPT Agents

  • 2:00 Meeting