Phil 8.15.2024

Tasks

9:40 Dentist
Bank
Schedule to get the door fixed

SBIRs

Trawled the swamp and found some good NSF and Army possibilities from grants.gov. You can export the results as a csv file and search through those, either by link (which can be wrong) or by googling project name, which works. Found some very good opportunities for NNMs, and some others for White Hat AI. The earliest opportunity closes Sept 30, which is enough time to write a reasonable proposal. Nothing specifically about spearphishing, which is kind of interesting. That seems to be acceptable in some way.
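A minimal sketch of the CSV search described above, assuming the export has title, agency, and close-date columns. The column names, the date format, and the sample rows are all hypothetical; check them against the header row of the real export.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical column names; adjust to match the actual export header.
TITLE_COL = "OPPORTUNITY TITLE"
AGENCY_COL = "AGENCY CODE"
CLOSE_COL = "CLOSE DATE"


def find_opportunities(csv_file, keywords, earliest_close):
    """Return sorted (close_date, agency, title) tuples for matching rows."""
    hits = []
    for row in csv.DictReader(csv_file):
        title = row.get(TITLE_COL) or ""
        # Plain substring match; short keywords like "AI" would also match
        # inside longer words, so prefer distinctive terms.
        if not any(k.lower() in title.lower() for k in keywords):
            continue
        try:
            close = datetime.strptime(row.get(CLOSE_COL) or "", "%m/%d/%Y").date()
        except ValueError:
            continue  # skip rows with a missing or oddly formatted close date
        if close >= earliest_close:
            hits.append((close, row.get(AGENCY_COL) or "", title))
    return sorted(hits)


# Made-up rows standing in for the exported file.
sample = io.StringIO(
    "OPPORTUNITY TITLE,AGENCY CODE,CLOSE DATE\n"
    "Trustworthy AI for Cybersecurity,NSF,09/30/2024\n"
    "Bridge Maintenance Study,DOT,10/15/2024\n"
    "Cybersecurity Training Tools,ARMY,08/01/2024\n"
)
results = find_opportunities(sample, ["cybersecurity", "misinformation"], date(2024, 8, 15))
for close, agency, title in results:
    print(close, agency, title)
```

Searching by title and then googling the project name avoids depending on the exported links, which can be wrong.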
9:00 Standup. Will have to leave early
2:00 Thunderbolt meeting
4:30 Book club

GPT Agents

No meeting today, Jimmy’s at a wedding. Add more content?

Phil 8.14.2024

More justification for WH-AI: Hackers may have stolen the Social Security numbers of every American. How to protect yourself

“These bad guys, this is what they do for a living,” Murray said. They might send out tens of thousands of queries and get only one response, but that response could net them $10,000 from an unwitting victim. “Ten thousand dollars in one day for having one hit with one victim, that’s a pretty good return on investment,” she said. “That’s what motivates them.”

More stuff for consumer-first AI: Cosmos Magazine publishes AI-generated articles, drawing criticism from journalists, co-founders

Schedule to get the door fixed

SBIRs

Ping John to see if we can schedule WH-AI architecture planning
Draw up some diagrams for the architecture that we can go over
- Information flows
- Main browser extension – just Chrome for now
- Maybe three buttons for the popup? Avoid, disregard, this is an error?
- Adjustment knobs for target user – also notification settings for guardians (parents, adult children, etc.)
- Private User database of issued warnings, so that users don’t see the same “introductory” warning. This DB could also have sender information
- Some other kind of warning if the user is repeatedly interacting with the sender of manipulative email, particularly if it matches one of the scam patterns.
- Spaced repetition of older warnings
- Public database of manipulative posts, recording whether warnings were disregarded or heeded. This can feed back to the Chrome extension as well, in case there are multiple adjacent embeddings that are, for example, increasing in a viral way.
- A UMAP display of the embedding space that lets users navigate and understand what’s going on. Areas of high activity should be indicated. Clicking on a point or dragging across an area should provide specific and/or summary information
- Reactive design for Chrome on mobile?
UMAP-JS
- This is a JavaScript reimplementation of UMAP from the python implementation found at https://github.com/lmcinnes/umap.
Nope, strike that. Everything has to have a potential target before it gets worked on. So I’ve gone from belief space maps to white hat AI PoC to looking through SBIRs and BAAs. Ah, well, only 132 working days until 15 February 2025.
Put together a good-sized list of people to reach out to. Still need to trawl the BAA/SBIR swamp

GPT Agents

Read what I wrote yesterday. Look for sources. I’m particularly interested in whether there is anything on the creation of guardrails. There might be a perspective here that comes from labor relations, like having to buy GMO seeds rather than being able to re-plant based on the harvest? Extracting maximum value while providing minimum utility. Also worth re-reading the Stochastic Parrots paper and looking for interesting papers that cite it. It strikes me that the whole idea of “public” GenAI may also reference the “right to repair” movement, and Robin Berjon’s The Public Interest Internet.

Phil 8.13.2024

Tasks

Schedule to get the truck door fixed
BoA closes at 4:00. Go over at 3:00?

SBIRs

Need to talk to Aaron about switching from NNM to WH-AI and see if I can get some Fabric time
Getting started on Chrome Extensions
Good chat with Zach. He agrees that starting with a chrome extension makes sense.

GPT Agents

Need to get started on the “A Consumer-First Approach to GenAI?” paper, now that the reviews are done. Good start

Phil 8.12.2024

Tasks

1:00: Get truck – done. Why is everything $1,000?
Call bank? Probably tomorrow
Call Guardian – try to reposition the cover first

SBIRs

9:00 Standup – done
11:00 AI C of I – done. Maybe that went well?
12:00 Steve’s brownbag – Could have been 20 minutes
Wrote up a proposal for White Hat AI that will “be the D2A of NNM.” Which is ok, I guess. Otherwise progress is going to be imperceptible. And Aaron has a good point in that no one knows the difference between topic embedding space and narrative space. Which is annoying, but makes it easier to make visible progress
3:00 Sprint planning
Do I need to write a SimAccel/D2A white paper?

GPT Agents

Finish and submit ICTAI reviews – DONE!

Phil 8.9.2024

Tasks

Sign things in blue for Wolfram
Pick up Stacey’s car and get the van if it’s not too wet

SBIRs

Scan cards
Expense report
Put together slides for sprint demo
USNA interns final presentation
Conference debrief to Orest
- The conference was almost exclusively props and videos. No one could really show anything unless you were a manufacturer (“look at this beautiful planetary gear!”)
  - The golf idea was universally well received, and there was one other booth with an interceptor game. I just don’t think it went far enough, and was too easy to defeat

- We need videos that work with no audio. Also the lighting on Aaron makes his head distractingly shiny.
- The booths that had demos had more people as a rule (Blue Halo and Axient were the only ones I saw)
- A monitor in the conference-facing desk would be better than one on the desk
- Chairs need to work on deeply-padded tradeshow carpets. The disc-based chairs we had wobbled. A lot.
- Sequentially numbered/barcoded batches of bears that when scanned link to ASRC branded polar bear cams (or even better an updated curated feed)

GPT Agents

Write next review and read a paper

Phil 8.8.2024

Anointed with Oil

Anointed with Oil places religion and oil at the center of American history. As prize-winning historian Darren Dochuk reveals, from the earliest discovery of oil in America during the Civil War, citizens saw oil as the nation’s special blessing and its peculiar burden, the source of its prophetic mission in the world. Over the century that followed and down to the present day, the oil industry’s leaders and its ordinary workers together fundamentally transformed American religion, business, and politics — boosting America’s ascent as the preeminent global power, giving shape to modern evangelical Christianity, fueling the rise of the Republican Right, and setting the terms for today’s political and environmental debates.

Finishing the conference and heading home:

GPT Agents

Finished one more paper review and started the fourth one. Pretty good so far!
Jimmy put in a big chunk of writing on the Consumer-First Approach to AI paper!

Phil 8.6.2024

Transformer Layers as Painters

Despite their nearly universal adoption for large language models, the internal workings of transformers are not well understood. We aim to better understand the impact of removing or reorganizing information throughout the layers of a pretrained transformer. Such an understanding could both yield better usage of existing models as well as to make architectural improvements to produce new variants. We present a series of empirical studies on frozen models that show that the lower and final layers of pretrained transformers differ from middle layers, but that middle layers have a surprising amount of uniformity. We further show that some classes of problems have robustness to skipping layers, running the layers in an order different from how they were trained, or running the layers in parallel. Our observations suggest that even frozen pretrained models may gracefully trade accuracy for latency by skipping layers or running layers in parallel.

SBIRs

At a dumb conference and trade show that I just don’t need to be at.

GPT Agents

Getting some progress on paper reviews though

Phil 8.5.2024

Good ride on Saturday and a nice party on Sunday.

GPT Agents

Finished two reviews! Printed out the other three for the flight

SBIRs

Going down to Huntsville for a few days to be a booth babe. Hot but dry there, rain here.

Phil 8.2.2024

From BlueSky

No interns today, so reviewing papers instead. One down!

Tweaked the Overleaf document.

Applied for the Reddit Research API

Liars know they are lying: differentiating disinformation from disagreement

Mis- and disinformation pose substantial societal challenges, and have thus become the focus of a substantive field of research. However, the field of misinformation research has recently come under scrutiny on two fronts. First, a political response has emerged, claiming that misinformation research aims to censor conservative voices. Second, some scholars have questioned the utility of misinformation research altogether, arguing that misinformation is not sufficiently identifiable or widespread to warrant much concern or action. Here, we rebut these claims. We contend that the spread of misinformation—and in particular willful disinformation—is demonstrably harmful to public health, evidence-informed policymaking, and democratic processes. We also show that disinformation and outright lies can often be identified and differ from good-faith political contestation. We conclude by showing how misinformation and disinformation can be at least partially mitigated using a variety of empirically validated, rights-preserving methods that do not involve censorship.

Phil 8.1.2024

Dog days of summer for sure

Need to start on the tasks Wolfram asked for

SBIRs

Really interesting discussion with Aaron on CI Agents. It might be possible for token trajectories to maintain their “identity” by looking at the distance between a set of agent responses, and selecting those that have the greatest distance.
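A rough sketch of the "select the responses with the greatest distance" idea, assuming each agent response has already been embedded as a vector. The greedy farthest-point strategy, the Euclidean metric, and the toy 2D vectors are my assumptions here, not something settled in the discussion.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_distant_subset(vectors, k):
    """Greedy farthest-point selection: indices of k mutually distant vectors."""
    if k < 2 or len(vectors) < k:
        raise ValueError("need 2 <= k <= len(vectors)")
    # Seed with the single most distant pair.
    i, j = max(
        combinations(range(len(vectors)), 2),
        key=lambda p: euclidean(vectors[p[0]], vectors[p[1]]),
    )
    chosen = [i, j]
    while len(chosen) < k:
        remaining = [n for n in range(len(vectors)) if n not in chosen]
        # Add the vector whose nearest already-chosen neighbor is farthest away.
        nxt = max(
            remaining,
            key=lambda n: min(euclidean(vectors[n], vectors[c]) for c in chosen),
        )
        chosen.append(nxt)
    return chosen


# Toy stand-ins for embedded agent responses (hypothetical values).
responses = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [0.0, 5.0], [0.2, 0.1]]
chosen = most_distant_subset(responses, 3)
print(sorted(chosen))
```

Greedy selection is not guaranteed to find the optimal spread, but it is cheap and keeps near-duplicate responses (indices 0, 1, and 4 in the toy data) from all being selected.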
Got some good NNM work done yesterday. I’m almost ready to generate ring buffer text along with activations by token at each layer of the model. And in case I haven’t really mentioned it anywhere else: I think training ring-buffer GPT-2 models (or bigger local models that can be finetuned quickly; you need to be able to access the layers) via finetuning from a big model with extensive prompt tuning might be a very good way to create local maps.
9:00 standup
Lunchtime ride in this gap!
11:00 M30
2:00 Conference prep
2:30 Hall research
4:30 Book club

GPT Agents

Finish TiiS review – done!
ICTAI – Downloaded papers. Deadline is August 18, so that’s a bit over 3 days per paper
3:00 Meeting
Can Artificial Intelligence be Open Sourced?
- This paper explores the potential of open source models to match or even surpass proprietary models in the future. The dialogue reflects a broader debate on the implications of open sourcing AI, weighing the benefits of democratization against the risks of misuse.
The Public Interest Internet
- Allow me to open with a wildly speculative question: What if the internet were public interest technology? I mean “internet” the way most people understand it, which is to say our whole digital sphere, and by “public interest” I don’t mean tinkering at the margins to reduce harm from some bad actors or painting some glossy ethics principles atop a pile of exploitative rent-seeking — I mean through and through, warts and all, an internet that works in support of a credible, pragmatic definition of the common good.¹

Phil 7.31.2024

And so we say goodbye to July. That was fast

11:00 meeting with Wolfram – went well!

SBIRs

Back to setting up the NNM project codebase. Here’s what I was planning before getting pulled off:
- First, a program that reads in a config file and generates a file of activations for each layer. This will be a fixed size ring buffer of tokens so that the number of vectors for each layer is fixed for the “evolution”
- Then code that reads the same config file to produce a set of UMAP files, again one for each layer.
- Last, the visualization code using Plotly. It either reads the same config or the user can navigate to it. All the files (12 layers of embeddings, 12 layers of UMAP) will be managed by extension. Might want to set up folders though.
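A small sketch of two pieces of the plan above: a fixed-size ring buffer of tokens, and one activation file plus one UMAP file per layer distinguished by extension. The class name, the file-naming scheme, and the folder name are hypothetical, and no model is run here.

```python
from collections import deque
from pathlib import Path

NUM_LAYERS = 12  # from the notes: 12 layers of embeddings, 12 layers of UMAP


class TokenRingBuffer:
    """Fixed-size window of the most recent tokens; oldest tokens fall off."""

    def __init__(self, size):
        self.tokens = deque(maxlen=size)

    def push(self, token):
        self.tokens.append(token)

    def contents(self):
        return list(self.tokens)


def layer_paths(root, stem, num_layers=NUM_LAYERS):
    """Hypothetical naming scheme: files per layer, told apart by extension."""
    root = Path(root)
    return {
        layer: {
            "activations": root / f"{stem}_layer{layer:02d}.act.csv",
            "umap": root / f"{stem}_layer{layer:02d}.umap.csv",
        }
        for layer in range(num_layers)
    }


buf = TokenRingBuffer(4)
for tok in "the quick brown fox jumps over".split():
    buf.push(tok)
window = buf.contents()
print(window)

paths = layer_paths("runs", "demo")
print(paths[11]["umap"].name)
```

Because the buffer has a fixed length, every step contributes the same number of token vectors per layer, which is what keeps the per-layer files comparable across the "evolution".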
Started up the instance and getting back to work

Phil 7.30.2024

SBIRs

9:00 Standup
10:00 AI Ethics talk – good!
Maybe some NNM – nope. Couldn’t get settled into a coding groove
Moved the CwoC Overleaf project to the company repo.

GPT Agents

More reviews. I have been pummeled by too many words

Phil 7.29.2024

SBIRs

Looks like I’m going to Huntsville for a conference
9:00 Intern presentation. Good stuff 🙂
Got all the flight, hotel, and car rental handled
Looks like book club is on for Thursday

GPT Agents

Start journal review – started

Phil 7.25.2024

AC gets its checkup: 7:30 – Noon

SBIRs

Finish the whitepaper(s). One is more marketing, one is more proposal. Done. Not done, refining.
2:00 ML discussion? Kind of. Important no shows
2:30 USNA proposal discussion
Maybe do more on NNM?
Cancelled the book club for the week

GPT Agents

3:00 meeting. I think we’re discussing the invited paper. One of the things I think I’ll suggest is the HST thing that I worked out yesterday. It’s 639 words. And this is also a good thing to look at: From Burnout to Balance: AI-Enhanced Work Models

Phil 7.24.2024

RE tax! – done, and scheduled for Dec 23, too

Put up my first Bluesky thread. Not sure if I did it right

I also wrote another thing! Need to submit this somewhere.

SBIRs

Set up the project folder structure and put together an initial config JSON file. Beginning prompt, token buffer size, and the number of steps. Maybe the number of UMAP dimensions? Or do we always produce 2d and 3d versions?
Write processing tests to CSV files. That way we can look at them in the IDE
Got the structure set up and the config file read in
Need to do some BD as high priority
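For reference, a sketch of what the initial config JSON and its loader might look like, using the fields listed above (beginning prompt, token buffer size, number of steps). The key names are hypothetical, and defaulting to both 2D and 3D UMAP output is just one possible answer to the open question.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RunConfig:
    prompt: str              # beginning prompt
    token_buffer_size: int   # size of the token ring buffer
    num_steps: int           # number of generation steps
    # Assumption: produce both 2D and 3D UMAP versions unless told otherwise.
    umap_dimensions: list = field(default_factory=lambda: [2, 3])


def load_config(text):
    """Parse and sanity-check a config given as a JSON string."""
    cfg = RunConfig(**json.loads(text))
    if cfg.token_buffer_size <= 0 or cfg.num_steps <= 0:
        raise ValueError("token_buffer_size and num_steps must be positive")
    if any(d not in (2, 3) for d in cfg.umap_dimensions):
        raise ValueError("umap_dimensions may only contain 2 and 3")
    return cfg


example = """
{
  "prompt": "Once upon a time",
  "token_buffer_size": 32,
  "num_steps": 100
}
"""
config = load_config(example)
print(config)
```

Keeping every parameter in one file means the activation, UMAP, and visualization programs can all read the same config.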

GPT Agents

Based on the chat with Jimmy, I loaded the (for me anyway) unreadable versions of Das Kapital into Notebook LM and used the following prompt:
- Extrapolate to concept of workers and machines to generative AI which is trained on the output of writers (the workers) and can be used to produce mediocre writing without workers, and faster, high-quality writing with them. On one side, generative AI can replace workers, but will produce largely unoriginal content that will lose appeal over time. On the other, workers have access to a mechanical co-worker that can accelerate and often improve what they produce. Is the technology of generative AI inherently capitalist? If so, why? Provide alternative, egalitarian perspectives as well. Use the form of a white paper
Here’s the response:
- Is the technology of generative AI inherently capitalist?
  - To address this question, this paper extrapolates from the sources’ concepts of workers and machines to the context of generative AI, which, analogous to a machine, is trained on the output of writers (the workers)
  - Similar to the machines described in the sources, generative AI can produce output (in this case, writing) without the involvement of writers once it is trained on their work [1, 2].
  - However, also similar to the sources’ depiction of machines, the output from generative AI is likely to be of higher quality when it is used as a tool by human workers [1, 3, 4].
- Capitalist Perspective:
  - The sources, focusing primarily on the dynamics of capitalist production, offer compelling arguments for why the technology of generative AI, within the framework of a capitalist system, could be considered inherently capitalist:
  - Exploitation of Labor: A core tenet of the sources is that capitalist production inherently seeks to extract surplus value from labor. In the context of generative AI, this dynamic could manifest through:
  - Replacement of Workers: Generative AI could potentially replace writers, particularly in producing low-quality, formulaic content, mirroring how machinery replaced skilled artisans in the sources [4, 5, 6]. This replacement allows capitalists to reduce labor costs and increase profits.
  - Intensification of Labor: Generative AI, even when used as a tool by writers, can lead to an intensification of work [7, 8]. This intensification could take the form of pressure to produce more content in less time or to compete with the output of AI, ultimately benefiting the capitalist through increased productivity.
  - Control Over Production: The sources emphasize how machinery, under capitalist control, consolidates the power of capitalists over the production process [4, 5, 9]. This control extends to generative AI:
  - Ownership of AI: Capitalists who own the AI technology hold significant power. They control the data used to train AI, influence the types of content produced, and determine how the technology is implemented—all of which can be leveraged to maximize profits.
- Alternative, Egalitarian Perspectives:
  - While the sources primarily focus on capitalist applications of machinery and labor, they also offer insights that can be extrapolated to imagine more egalitarian applications of generative AI:
  - AI as a Tool for Empowerment: Generative AI, when viewed not as a replacement but as a powerful tool, has the potential to empower writers [10, 11]. It can:
  - Increase Productivity: AI can handle tedious and repetitive writing tasks, freeing human writers to focus on more creative and engaging work.
  - Improve Quality: AI can offer suggestions, refine language, and identify errors, ultimately contributing to the creation of higher-quality content.
  - Democratization of Writing: One could envision a scenario where access to generative AI is not limited to capitalists. Open-source AI models and collaborative platforms could allow a wider range of individuals to participate in the creation and dissemination of written content, potentially fostering greater creative diversity and reducing the dominance of large corporations in media and publishing.
- Conclusion:
  - While the capitalist application of technology is a central theme in the sources, it is crucial to remember that technology itself is not inherently capitalist. The sources highlight that the capitalist exploitation of labor arises from social relations, not from the tools themselves [12, 13]. Similarly, the impact of generative AI will ultimately depend on how it is developed, implemented, and controlled. If these processes are driven by a focus on collective benefit, shared ownership, and the democratization of knowledge and resources, then generative AI can become a powerful force for a more egalitarian future.

