
Phil 6.14.2023

Sequels are lower-effort, but have sufficiently high value to be profitable

People + AI Research (PAIR) is a multidisciplinary team at Google that explores the human side of AI by doing fundamental research, building tools, creating design frameworks, and working with diverse communities. We believe that for machine learning to achieve its positive potential, it needs to be participatory, involving the communities it affects and guided by a diverse set of citizens, policy-makers, activists, artists and more.

Center for Accelerating Operational Efficiency (CAOE) Fact Sheet

Found this example of Elephant Armor today

Phil 6.13.2023

The Curse of Recursion: Training on Generated Data Makes Models Forget

  • Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper we consider what the future might hold. What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as Model Collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs. We build theoretical intuition behind the phenomenon and portray its ubiquity amongst all learned generative models. We demonstrate that it has to be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of content generated by LLMs in data crawled from the Internet.

SBIRs

  • Out at a conference, so not much posting.

GPT Agents

Phil 6.9.2023

Better air today. Sure is dry though:

GPT Agents

  • More work on the hallucination paper – done! Decided not to include the discussion section since it was getting long. Screwed up the title a bit, so I’ll have to fix that later

SBIRs

  • Trip prep
  • 11:30 overleaf meeting – delayed

Phil 6.8.2023

The smoke from the fires in Canada is much worse today:

Generative AI learning path

  • This learning path guides you through a curated collection of content on generative AI products and technologies, from the fundamentals of Large Language Models to how to create and deploy generative AI solutions on Google Cloud.

SBIRs

  • 9:15 standup
  • 11:30 CSC touchpoint
  • Put slides on thumb drive and laptop

GPT Agents

  • Try to finish the first pass of the hallucination paper and get it up on ArXiv

Phil 6.7.2023

The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

  • Evolution provides a creative fount of complex and subtle adaptations that often surprise the scientists who discover them. However, the creativity of evolution is not limited to the natural world: artificial organisms evolving in computational environments have also elicited surprise and wonder from the researchers studying them. The process of evolution is an algorithmic process that transcends the substrate in which it occurs. Indeed, many researchers in the field of digital evolution can provide examples of how their evolving algorithms and organisms have creatively subverted their expectations or intentions, exposed unrecognized bugs in their code, produced unexpected adaptations, or engaged in behaviors and outcomes uncannily convergent with ones found in nature. Such stories routinely reveal surprise and creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. Bugs are fixed, experiments are refocused, and one-off surprises are collapsed into a single data point. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This paper is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems.

SBIRs

  • Took the mid-morning off: https://www.strava.com/activities/9221033759
  • 2:00 MDA slides meeting with Matt. Bring slides on thumb drive. Done. We discussed, and I sent him a copy of the slides as well as an update. We’ll get together again to discuss the presentation on the 19th
  • 3:00 AI ethics tagup. Read Eric’s sidebar. He gets the idea, but he can’t write worth a damn. Cleaned up and added a lot of text. Need to read it aloud tomorrow

GPT Agents

  • Good progress on the Methods section yesterday. Fill out the Results section.
  • 4:00 GPT meeting? Nope.

Phil 6.6.2023

More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models

  • In this work, we show that augmenting LLMs with retrieval and API calling capabilities (so-called Application-Integrated LLMs) induces a whole new set of attack vectors. These LLMs might process poisoned content retrieved from the Web that contains malicious prompts pre-injected and selected by adversaries. We demonstrate that an attacker can indirectly perform such PI attacks. Based on this key insight, we systematically analyze the resulting threat landscape of Application-Integrated LLMs and discuss a variety of new attack vectors. To demonstrate the practical viability of our attacks, we implemented specific demonstrations of the proposed attacks within synthetic applications. In summary, our work calls for an urgent evaluation of current mitigation techniques and an investigation of whether new techniques are needed to defend LLMs against these threats.

AI chatbots lose money every time you use them. That is a problem.

  • Those costs may also be one reason Google has yet to build an AI chatbot into its flagship search engine, which fields billions of queries every day. When Google released its Bard chatbot in March, it opted not to use its largest language model. Dylan Patel, chief analyst at the semiconductor research firm SemiAnalysis, estimated that a single chat with ChatGPT could cost up to 1,000 times as much as a simple Google search.

GPT Agents

  • Working on getting the results into the paper

SBIRs

  • Sprint planning. Need to add MORS symposium and MDA management – done
  • Send MORS stuff – Done!

Phil 6.5.2023

Bringing Open Large Language Models to Consumer Devices

  • This post describes our effort on streamlining the deployment of Open LLMs through a versatile machine learning compilation infrastructure. We bring RedPajama, a permissive open language model to WebGPU, iOS, GPUs, and various other platforms. Furthermore, the workflow we have established can be easily adapted to support a wide range of models with fine-tuned (personalized) weights, promoting flexibility and customization in LLM deployment.

The overparameterized, paralyzed generation

SBIRs

  • Sprint demos. Need to make slides – done
  • Sent off the Q5 report

GPT Agents

  • Got a lot done in reading the json files and making spreadsheets
  • Created a rollup spreadsheet that I think I’ll use for the paper

Phil 6.1.2023

June already! Probably taking tomorrow off since it looks like rain on Saturday

Book

  • Pinged On the Record and Midday. Probably screwed up one email. Nothing back from the Pratt

GPT Agents

  • Modified wikipedia_search.py to get page text that I will use to read into ContextExplorer. That will let me debug the loader and create context prompts for the hallucination project
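
A minimal sketch of that kind of page-text fetch (not the actual wikipedia_search.py; it assumes the third-party wikipedia package, and get_page_text() is just an illustrative name):

# pip install wikipedia
import wikipedia

def get_page_text(title: str, summary_only: bool = False) -> str:
    """Return the plain text of a Wikipedia page for loading into ContextExplorer."""
    hits = wikipedia.search(title)  # resolve a loose title to an actual page name
    if not hits:
        raise ValueError(f"No Wikipedia page found for '{title}'")
    page = wikipedia.page(hits[0], auto_suggest=False)
    # page.summary is the lead section; page.content is the full article text
    return page.summary if summary_only else page.content

if __name__ == "__main__":
    text = get_page_text("Large language model")
    print(text[:500])  # eyeball the first few hundred characters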

SBIRs

  • More Q4-5 slides
  • Talk about paper with Aaron? Rather than reworking current content, I’d prefer he work on new until we have a first pass.

Phil 5.31.2023

Democratic Inputs to AI: Our nonprofit organization, OpenAI, Inc., is launching a program to award ten $100,000 grants to fund experiments in setting up a democratic process for deciding what rules AI systems should follow, within the bounds defined by the law.

Book

  • Sent another thread into the void on Twitter and Mastodon
  • Contact On The Record
  • Contact Midday

GPT Agents

  • Got the first pass of the context tagging paper done except for the results section. Need to talk to Shimei and Jimmy about what to put in. Then rewriting and cleanup. Not sure what the venue would be, but literally all the references are on ArXiv, which really says something.
  • 2:30 Alden – interesting. Got some good thinking on prompts
  • 4:00 GPT meeting

SBIRs

  • Start on MDA slides. Looks like it’s going to be a combination of Q4 and Q5. Set up the templates.
  • 3:00 AI Ethics tagup? Oddly, I wrote up an email and sent it out to the team that tried to work out what ethical autonomous systems might look like viewed through the Inupiat lens. Crickets. Colonialism dies hard, I guess.

Phil 5.30.2023

Book

SBIRs

  • 9:15 Standup
  • Talked to Rukan about the idea of using a mixed NN/Markov sim. He’s going to see if he can find a preexisting model (e.g. a multi-link pendulum) to try training the model with. Also set up an overleaf doc for the methods and results
  • 1:00 BMD status
  • Need to get slides submitted for approval (and maybe reformatting?)

GPT Agents

  • Made good progress on the paper and got a lot of good results over the long weekend. I’d like to get the methods section done before the Wednesday meeting
  • Starting the LangChain tutorial, which looks good. Got my Huggingface token so I can try models on the HF hub without downloading. Should make it nice to try context prompting with smaller, more servable models
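
For reference, a sketch of what that first hosted-model call might look like (the repo_id, prompt, and parameters are placeholders, and this follows the mid-2023 LangChain API, so newer versions may differ):

# pip install langchain huggingface_hub
# expects the token in the HUGGINGFACEHUB_API_TOKEN environment variable
from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFaceHub

# run a model hosted on the HF Hub without downloading the weights locally
llm = HuggingFaceHub(
    repo_id="google/flan-t5-xl",  # placeholder model; swap in whatever is being tested
    model_kwargs={"temperature": 0.5, "max_length": 128},
)

template = PromptTemplate(
    input_variables=["context", "question"],
    template="Use only the context to answer.\nContext: {context}\nQuestion: {question}\nAnswer:",
)
chain = LLMChain(llm=llm, prompt=template)

print(chain.run(context="(source 1) The sky appears blue because of Rayleigh scattering.",
                question="Why is the sky blue?"))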

Phil 5.29.2023

GPT Agents

  • Running a bunch of different models. I’ll need to write some code to produce Excel spreadsheets from the JSON result files so I can make some tables and charts (a sketch follows this list).
  • Start putting the paper together. I think it’s fair to explicitly say this is about OpenAI’s models because a) they are exemplary, and the approach should be broadly applicable, and b) ChatGPT is causing more problems than any of the other models at the moment.
  • Will need to mention Perplexity.ai, which cites sources well, and is very current. I tried “What are the potential benefits of breaking off small segments from a belief stampede rather than attempting to change everyone’s behavior?” And Perplexity came back with my blog post, which went up three days ago! Search and context really beats more training. Wild.
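
One way the JSON-to-spreadsheet step could go (a sketch, assuming pandas and openpyxl, a results/ directory with one JSON file per model run, and made-up field names):

# pip install pandas openpyxl
import json
import pathlib
import pandas as pd

rows = []
for path in sorted(pathlib.Path("results").glob("*.json")):  # one file per model run
    with open(path, encoding="utf-8") as f:
        record = json.load(f)
    record["run_file"] = path.name  # remember which file each row came from
    rows.append(record)

# flatten any nested fields into columns and write everything to one workbook
df = pd.json_normalize(rows)
with pd.ExcelWriter("rollup.xlsx") as writer:
    df.to_excel(writer, sheet_name="all_runs", index=False)
    # a small per-model rollup for charting, if the hypothetical fields exist
    if {"model", "accuracy"}.issubset(df.columns):
        df.groupby("model")["accuracy"].mean().to_frame().to_excel(writer, sheet_name="rollup")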

Phil 5.26.2023

GPT Agents

  • Started on the experiment code to insert (source x) into a context prompt. It works nicely! When I try the prompt in the Playground with the GPT-4-0314 model, asking “How can diversity injection disrupt belief stampedes and nudge individuals off the trajectory of a stampede?”, I get the following very good answer, with numbers that are in the context prompt:
Diversity injection disrupts belief stampedes by targeting individuals with latent interests that may fill the same needs as belonging to a cult (source 2958). It works one person at a time, nudging them off the stampede trajectory into a world where individuals have more freedom to move (source 5214). Instead of directly confronting or interacting with their conspiracy beliefs, it aims to broaden perspectives and reduce polarization by introducing diversity into social networks (source 1313).

This method removes individuals from misinformation streams and exposes them to reality-based information and social groups they would not encounter otherwise (source 4648). Over time, the belief distance may grow so significant that it no longer makes sense to them (source 3995). The overall goal is not to change everyone's behavior around dangerous misinformation but rather to break off small segments from an existing belief stampede, making it less dangerous (source 4406). By promoting diverse information and interactions, it helps create resilient ecosystems that resist belief stampedes (source 7545).
  • If I instead ask an uber question such as “How can I find a Girlfriend?” with the same context, I get:
Finding a girlfriend involves building connections, developing social skills, and being open to new experiences. Here are some steps to guide you:

1. Be yourself: Authenticity is key in attracting potential partners. Be honest about your interests, values, and personality so you can attract someone who appreciates you for who you are.

2. Develop confidence: Confidence is attractive but can take time to build. Focus on self-improvement and self-awareness, including exercise, dressing well, and engaging in hobbies or activities that make you happy.

3. Expand your social circle: Attend social events or activities where you can meet new people, such as joining clubs, taking up a new hobby, or volunteering. These experiences enrich your life and make you more interesting, increasing the chances of finding a girlfriend who shares your interests.

4. Practice good communication skills: Listen actively to others, show empathy, and be genuinely interested in their lives. Engage in meaningful conversations and express your thoughts and opinions respectfully.

5. Be patient: Relationships develop over time. Don't rush into a relationship or try to force one to happen. Focus on building genuine connections and enjoying the present moment.

Sources:
- Breines, J. (2015). "7 Research-Based Reasons Internet Dating Doesn't Work
  • GPT-3.5-turbo is pickier and will often refuse to answer the question for lack of context. Regardless, the (source x) markers seem to work fine and can actually be used for detailed traceback to the source document!
  • Wrote the experiment code, and now the models are not cooperating in generating responses to out-of-context questions. Sigh. Still, the index list matches great. There are some issues with getting the (source x) marker to appear in all the responses, but that’s probably something that can be addressed with better prompting or more sophisticated parsing, something like looking for parens with numbers that match. Cleaned up the regex a bit (a sketch of the full index check follows this list):
pattern = r"\(source\s+\d+(,\s+\d+)*\)\."
  • The results are much better! I think I can write a paper now
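
A sketch of that index check (not the actual experiment class; the index set and answer below are stand-ins):

import re

def cited_sources(response: str) -> set:
    """Pull every index that appears in a '(source 1234)' or '(source 12, 34)' marker."""
    found = set()
    for group in re.findall(r"\(source\s+([\d,\s]+)\)", response):
        found.update(int(n) for n in re.findall(r"\d+", group))
    return found

def check_response(response: str, valid_indices: set) -> dict:
    cited = cited_sources(response)
    return {
        "cited": sorted(cited),
        "unknown": sorted(cited - valid_indices),  # indices not in the context: hallucination flag
        "unverifiable": len(cited) == 0,           # no markers at all, so nothing to trace back
    }

# toy usage with the indices from the context prompt above
context_indices = {1313, 2958, 3995, 4406, 4648, 5214, 7545}
answer = "It works one person at a time (source 5214). See also (source 9999)."
print(check_response(answer, context_indices))
# {'cited': [5214, 9999], 'unknown': [9999], 'unverifiable': False}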

SBIRs

  • Got my slides trimmed down. Thanks Dr. Peterson!
  • Spent some time with Ron getting the ContextExplorer running. Success!

Phil 5.25.2023

The Global Conversation. Truth, Trust and Hope – Nobel Prize Summit 2023

GPT Agents

  • Write a first draft of a paper that talks about verifiable context prompting, where source indexes are listed after each end punctuation in the context so that they can be searched for in the response and checked. Hallucinations should have non-existent indices. Verify against some book part, since the GPT hasn’t read that, and see how it does with the “how do I find a boyfriend/girlfriend?” uber-prompt. Need to write a small experiment class and put it in a new experiment folder.
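
A sketch of the tagging step (illustrative only, not the planned experiment class; whether the marker lands before or after the period is a formatting choice, and here it follows the end punctuation as described above):

import re

def tag_context(chunks: dict) -> str:
    """Append '(source N)' after each sentence-ending punctuation mark in every chunk."""
    tagged = []
    for idx, text in chunks.items():
        # insert the marker after ., !, or ? so it can survive into the model's answer
        tagged.append(re.sub(r"([.!?])(\s|$)", rf"\1 (source {idx})\2", text))
    return "\n".join(tagged)

chunks = {
    2958: "Diversity injection targets individuals with latent interests.",
    5214: "It works one person at a time, nudging them off the stampede trajectory.",
}
print(tag_context(chunks))
# Diversity injection targets individuals with latent interests. (source 2958)
# It works one person at a time, nudging them off the stampede trajectory. (source 5214)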

SBIRs

  • 9:15 standup – no Aaron today, so I wound up leading.
  • Add enhanced Monte Carlo and Stable Diffusion stories to the backlog – done
  • 11:30 CSC touchpoint
  • More Q5 report. Maybe first pass today? Done!
  • Got a lot done on the scale paper, which is now at 45 pages. Oops. Maybe it really is a book

Phil 5.24.2023

Users choose to engage with more partisan news than they are exposed to on Google Search

  • If popular online platforms systematically expose their users to partisan and unreliable news, they could potentially contribute to societal issues such as rising political polarization [1,2]. This concern is central to the ‘echo chamber’ [3,4,5] and ‘filter bubble’ [6,7] debates, which critique the roles that user choice and algorithmic curation play in guiding users to different online information sources [8,9,10]. These roles can be measured as exposure, defined as the URLs shown to users by online platforms, and engagement, defined as the URLs selected by users. However, owing to the challenges of obtaining ecologically valid exposure data—what real users were shown during their typical platform use—research in this vein typically relies on engagement data [4,8,11,12,13,14,15,16] or estimates of hypothetical exposure [17,18,19,20,21,22,23]. Studies involving ecological exposure have therefore been rare, and largely limited to social media platforms [7,24], leaving open questions about web search engines. To address these gaps, we conducted a two-wave study pairing surveys with ecologically valid measures of both exposure and engagement on Google Search during the 2018 and 2020 US elections. In both waves, we found more identity-congruent and unreliable news sources in participants’ engagement choices, both within Google Search and overall, than they were exposed to in their Google Search results. These results indicate that exposure to and engagement with partisan or unreliable news on Google Search are driven not primarily by algorithmic curation but by users’ own choices.

SBIRs

  • The meeting went well yesterday, I think? Need to write up some thoughts on Stable Diffusion and general meeting notes.
  • Travel reimbursement – done
  • Slides!
  • Q5 Report – good progress
  • JSC kickoff – done

GPT Agents

  • 4:00 meeting
    • Good discussion. I was convinced to write a first draft of a paper that talks about verifiable context prompting, where source indexes are listed after each end punctuation in the context so that they can be searched for in the response and checked. Hallucinations should have non-existent indices. Verify against some book part, since the GPT hasn’t read that, and see how it does with the “how do I find a boyfriend/girlfriend?” uber-prompt. Need to write a small experiment class and put it in a new experiment folder.
  • Gene Set Summarization using Large Language Models
    • Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB.
      We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval.
      We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology. Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.
    • This is an interesting idea. If a language model can *do* biochemistry, then it is sophisticated enough to be biochemistry

Book