Monthly Archives: February 2023

Phil 2.8.2023

SambaNova Systems, the company that was first to market with domain-specific, pre-trained foundation models to underpin generative AI, has announced a new program for startups to leverage these capabilities. SambaNova is offering up to $1M in free compute credits to selected companies that apply to the program to build generative AI applications running on SambaNova’s platform.

Generative AI: The Next Consumer Platform

  • We’ve entered the age of generative AI. The use cases are everywhere—from writing essays to creating comics to editing films—and adoption has outpaced every consumer tech trend of the past decade. Text generator ChatGPT surpassed 1 million users in just five days, and tens of millions of consumers have created AI avatars.
  • Whenever new technology captures consumer attention so quickly, it raises the question: is there real value here? We believe that the answer is undoubtedly yes. Generative AI will be the next major platform upon which founders build category-defining products.
  • Much as the iPhone revolutionized our daily interaction with technology—spawning products like Uber, DoorDash, and Airbnb—generative AI will change everyday life. 
  • I think we’re entering the steep part of the singularity curve, and the paperclip function is “maximize revenue,” part of which is getting first mover advantage. So it’s going to be a centaur singularity.

Tasks

  • Schedule physical

GPT Agents

  • 2:00 Alden Dima
  • 4:00 UMBC Meeting
  • Add cites and a GPT ethics statement, then send to Jimmy

Phil 2.7.2023

Tasks

  • Schedule physical

GPT Agents

SBIRs

  • Get Mors abstract submitted by Feb 10
  • Got storing and loading of reduced embeddings and parameters working in NarrativeExplorer
  • 9:15 standup – done
  • 1:00 Biweekly meeting – canceled
  • 3:00 New SBIR meeting – meh
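The storing/loading step above could be as simple as pairing a NumPy archive for the reduced matrix with a JSON file for the reduction parameters. This is just a minimal sketch of that pattern; `save_reduced` and `load_reduced` are hypothetical helpers, not NarrativeExplorer's actual API:

```python
import json
import numpy as np

def save_reduced(path_prefix, embeddings, params):
    """Save a reduced embedding matrix plus the parameters used to produce it."""
    np.savez(path_prefix + "_embeddings.npz", reduced=embeddings)
    with open(path_prefix + "_params.json", "w") as f:
        json.dump(params, f)

def load_reduced(path_prefix):
    """Round-trip: load the matrix and its parameters back."""
    data = np.load(path_prefix + "_embeddings.npz")
    with open(path_prefix + "_params.json") as f:
        params = json.load(f)
    return data["reduced"], params
```

Keeping the parameters next to the matrix means a saved reduction can be reproduced or re-run with different settings later.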

Book

  • More proofing – done!

Phil 2.4.2023

OpenAI has been busy. First, they have some tutorials on interfacing with document collections using embeddings. Looks like a simpler version of GPT-Index
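The core of the pattern in those tutorials is: embed the documents once, embed the query, and rank by cosine similarity. A minimal sketch of the ranking step, assuming the embedding vectors have already been computed (e.g. via OpenAI's embeddings endpoint); `rank_documents` is a hypothetical helper:

```python
import numpy as np

def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered by cosine similarity to the query, best first."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity of each normalized doc vector with the query
    return np.argsort(-sims)
```

The top-ranked documents then get pasted into the prompt as context for the model to answer from.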

Second, they wrote up a report on using LLMs for misinformation and what to do about that:

Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations

  • Generative language models have improved drastically, and can now produce realistic text outputs that are difficult to distinguish from human-written content. For malicious actors, these language models bring the promise of automating the creation of convincing and misleading text for use in influence operations. This report assesses how language models might change influence operations in the future, and what steps can be taken to mitigate this threat. We lay out possible changes to the actors, behaviors, and content of online influence operations, and provide a framework for stages of the language model-to-influence operations pipeline that mitigations could target (model construction, model access, content dissemination, and belief formation). While no reasonable mitigation can be expected to fully prevent the threat of AI-enabled influence operations, a combination of multiple mitigations may make an important difference.

Journalistic Lessons for the Algorithmic Age

  • At The Markup we pioneered an array of scientifically inspired methods that used automation and computational power to supercharge our journalism. Reflecting on our work, I came up with 10 of the most important lessons I’ve learned using this approach.

Book

  • Proofing chapters. Finished up to chapter 10. Minor tweaks

Phil 2.3.2023

Brr.

SBIRs

  • Meeting at 10:30 to discuss GPT with Isaac. Wide ranging and fun. He’s going to add some slides
  • Afternoon chat with Aaron. Also wide ranging and fun. 1) We are probably in the Singularity, and 2) The universe is probably not a simulation
  • After some struggling, got the dev branch of the binary encoding project set up with Rukan

Book

  • Working on the proofs

GPT Agents

  • The demo got accepted at IUI! I may be going to Australia?
  • Getting the clustering and embedding working

Phil 2.2.2023

Return glasses for less powerful prescription. I’ll do that after my 2:00 meeting

Looks like the end of academic access. Ah well, it was a nice run. Trained language models are more fun anyway

Extracting Training Data from Diffusion Models

  • Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.

And I found the Trump campaign trip I’ve been looking for!

SBIRs

  • Finished the second draft! Need to send it out for some external sanity check. The SLT would like to see it too.
  • 9:15 standup – done
  • 11:30 CSC touch point
  • 2:00 MORS meeting with Aaron – done! Sent off to SLT
  • Send draft! Done!
  • Check out GPT-Index (github.com/jerryjliu/gpt_index) – done! Need to see if it will work with Python 3.7.4
  • Talk to Rukan and Aaron about making a separate repo for binary encoding project, notebooks, and results – done. Set up tomorrow maybe?

GPT-Agents

  • Copy over and wire up PCA, TSNE, and DBSCAN.
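The usual wiring for that bullet is PCA to cut dimensionality and noise, t-SNE to project to 2D, then DBSCAN on the 2D coordinates. A sketch using scikit-learn, with guessed-at default parameters (`pca_dim`, `eps`, etc. are placeholders, not the values used in the tool):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.cluster import DBSCAN

def reduce_and_cluster(embeddings, pca_dim=50, perplexity=10.0, eps=3.0, min_samples=5):
    """PCA -> t-SNE -> DBSCAN: reduce high-dimensional embeddings and cluster them in 2D."""
    # PCA first speeds up t-SNE and strips noise dimensions
    n_components = min(pca_dim, embeddings.shape[0], embeddings.shape[1])
    reduced = PCA(n_components=n_components).fit_transform(embeddings)
    # t-SNE down to 2D for plotting; fixed seed for repeatable layouts
    coords = TSNE(n_components=2, perplexity=perplexity, init="pca",
                  random_state=42).fit_transform(reduced)
    # DBSCAN on the 2D points; eps is in t-SNE coordinate units
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    return coords, labels
```

One note on the design: DBSCAN's `eps` is scale-dependent, and t-SNE output scale varies run to run, so `eps` usually needs tuning per dataset (which is a good reason to store those parameters alongside the reduced embeddings).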

Book

  • Start proofing. I think downloading chapters to Word for grammar and spell checks is probably the way to go

Phil 2.1.2023

This is true! I’ve put together a spreadsheet so you can see for yourself

SBIRs

  • More FOM stuff. Maybe a meeting at 2:00?
  • MORS paper with Aaron. Nope, but did finish the second draft.

GPT Agents

  • 4:00 Meeting
  • Went on a bit of a tangent discussing Bostrom’s paperclip conjecture and how recommender algorithms could be that, but from a human/AI source, not AGI. The problem is that at the scales these systems operate, it is not clear what the objective function means, and whether we are in fact destroying the world by creating algorithms that optimize for one thing but do so in ways that are ultimately destructive to humans. A possible venue is the 5th AAAI/ACM Conference on AI, Ethics, and Society; papers are due March 5.

Book