Monthly Archives: December 2022

Phil 12.23.2022

I wrote a new blog post! Some thoughts on the ChatGPT

Mastodon Digest

  • This is a Python project that generates a digest of popular Mastodon posts from your home timeline. The digest is generated locally. The digests present two lists: posts from users you follow, and boosts from your followers. Each list is constructed by respecting your server-side content filters and identifying content that you haven’t yet interacted with. Digests are automatically opened locally in your web browser. You can adjust the digest algorithm to suit your liking (see Command arguments).
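The digest idea boils down to scoring posts by engagement and keeping the top scorers. A minimal sketch of that idea in Python (my own simplification, not the project's actual scorer — the `score_posts` function and its weights are assumptions; the real project offers several selectable algorithms):

```python
from typing import List, Dict

def score_posts(posts: List[Dict], top_n: int = 5) -> List[Dict]:
    # Hypothetical scorer: weight boosts twice as heavily as favourites
    def score(p: Dict) -> float:
        return 2.0 * p.get("reblogs_count", 0) + p.get("favourites_count", 0)
    return sorted(posts, key=score, reverse=True)[:top_n]

posts = [
    {"id": "a", "reblogs_count": 3, "favourites_count": 1},   # score 7
    {"id": "b", "reblogs_count": 0, "favourites_count": 10},  # score 10
    {"id": "c", "reblogs_count": 1, "favourites_count": 0},   # score 2
]
print([p["id"] for p in score_posts(posts, top_n=2)])  # ['b', 'a']
```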

Really not feeling motivated. It’s been raining for 36 hours or so, and then it’s going to get cold. By Tuesday, things should be getting back to seasonal, and then even a little nice by Friday.


  • More MORS. Get a first pass through the conclusions – done! Currently at 18 pages with references
  • Nice chat with Aaron to wrap up the year.

Phil 12.22.2022

The days are (marginally) getting longer


  • Fill out Disney form – done. They say 10 days?


  • 9:15 standup
  • In the time between these, submit expense report – done
  • 10:00 CWOC meeting
  • Then a quiet day of MORS writing. Maybe “proposition” rather than “lesson”. Heck, make a variable

GPT Agents

  • Nice talk with Shimei last night
  • Add some detail and justification for the creation of models from keyword data

Phil 12.21.2022

Shortest day of the year! It gets better from here



  • Early morning helping Rukan with getting everything done
  • Need to make videos when they are ready. Change all the raid numbers to NINE
  • Working on some test files to train the NN to choose the nth-best choice – done
  • MORS – going to set up the History section to have numbered lessons
  • Submit for reimbursement!
import pandas as pd
from random import random
from pathlib import Path

from typing import List

class FindLowest:

    def __init__(self, num_items:int, size:int, rows:int = 100):
        self.num_items = num_items
        self.size = size
        self.rows = rows

    def int_to_bin_list(self, val:int, places:int = 16) -> List:
        # LSB-first binary expansion of val
        l = []
        for i in range(places):
            l.append(int(val & 1 << i != 0))
        return l

    def calc_data(self, bin_list_len:int = 4):
        row = 0
        self.input_matrix = []
        self.output_matrix = []

        for r in range(self.rows):
            i = r % self.num_items
            d = {}
            for j in range(self.size):
                d[j] = random()
            sd = dict(sorted(d.items(), key=lambda item: item[1]))
            best_choice = list(sd.keys())[i]  # key of the ith-lowest value
            bc_list = self.int_to_bin_list(best_choice, bin_list_len)
            id_list = self.int_to_bin_list(i, bin_list_len)
            input_d = {}
            output_d = {}
            for b in range(bin_list_len):  # 'b', so we don't clobber 'i'
                input_d["b{}".format(b)] = id_list[b]
                output_d["b{}".format(b)] = bc_list[b]
            self.input_matrix.append(input_d)
            self.output_matrix.append(output_d)
            print("row {}: input_d = {}, output_d = {}".format(row, input_d, output_d))
            row += 1

    def to_csv(self, prefix:str, directory:str = None):
        if directory is None:
            directory = str(Path.home())
        df = pd.DataFrame(self.input_matrix)
        filename = "{}/{}_input.csv".format(directory, prefix)
        print("saving {}".format(filename))
        df.to_csv(filename, index=False)

        df = pd.DataFrame(self.output_matrix)
        filename = "{}/{}_output.csv".format(directory, prefix)
        print("saving {}".format(filename))
        df.to_csv(filename, index=False)

def main():
    fl = FindLowest(5, 10)
    fl.calc_data()
    fl.to_csv("find_lowest")

if __name__ == "__main__":
    main()
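
The LSB-first bit encoding that `int_to_bin_list` is meant to produce can be sanity-checked in isolation. A standalone version of the same logic (not the class method, just the core expression):

```python
from typing import List

def int_to_bin_list(val: int, places: int = 16) -> List[int]:
    # LSB-first binary expansion: bit i of val, as 0 or 1
    return [int(val & (1 << i) != 0) for i in range(places)]

print(int_to_bin_list(5, 4))   # [1, 0, 1, 0]
print(int_to_bin_list(12, 4))  # [0, 0, 1, 1]
```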
GPT Agents

  • Start on paper? At least get the template up and copy stuff over from the other doc
  • 4:00 Meeting

Phil 12.20.22

Agreed to go on this podcast – should be interesting

Elixir: Train a Large Language Model on a Small GPU Cluster

  • In recent years, the number of parameters of one deep learning (DL) model has been growing much faster than the growth of GPU memory space. People who are inaccessible to a large number of GPUs resort to heterogeneous training systems for storing model parameters in CPU memory. Existing heterogeneous systems are based on parallelization plans in the scope of the whole model. They apply a consistent parallel training method for all the operators in the computation. Therefore, engineers need to pay a huge effort to incorporate a new type of model parallelism and patch its compatibility with other parallelisms. For example, Mixture-of-Experts (MoE) is still incompatible with ZeRO-3 in Deepspeed. Also, current systems face efficiency problems on small scale, since they are designed and tuned for large-scale training. In this paper, we propose Elixir, a new parallel heterogeneous training system, which is designed for efficiency and flexibility. Elixir utilizes memory resources and computing resources of both GPU and CPU. For flexibility, Elixir generates parallelization plans in the granularity of operators. Any new type of model parallelism can be incorporated by assigning a parallel pattern to the operator. For efficiency, Elixir implements a hierarchical distributed memory management scheme to accelerate inter-GPU communications and CPU-GPU data transmissions. As a result, Elixir can train a 30B OPT model on an A100 with 40GB CUDA memory, meanwhile reaching 84% efficiency of Pytorch GPU training. With its super-linear scalability, the training efficiency becomes the same as Pytorch GPU training on multiple GPUs. Also, large MoE models can be trained 5.3x faster than dense models of the same size. Now Elixir is integrated into ColossalAI and is available on its main branch.

I think the ChatGPT article should be on teaching critical thinking with large language models

  • On Second Thought, Let’s Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
    • Generating a chain of thought (CoT) can increase large language model (LLM) performance on a wide range of tasks. Zero-shot CoT evaluations, however, have been conducted primarily on logical tasks (e.g. arithmetic, commonsense QA). In this paper, we perform a controlled evaluation of zero-shot CoT across two sensitive domains: harmful questions and stereotype benchmarks. We find that using zero-shot CoT reasoning in a prompt can significantly increase a model’s likelihood to produce undesirable output. Without future advances in alignment or explicit mitigation instructions, zero-shot CoT should be avoided on tasks where models can make inferences about marginalized groups or harmful topics.
  • ChatGPT Has Infiltrated Twitter Replies


  • Read Fair Use chapter from The Librarian’s Guide to Intellectual Property in the Digital Age. Done. It makes me think that I can redraw the images as sketches and should be ok.


  • Sprint planning, looks like make videos and work on JMOR paper
  • Submit paperwork for MORS membership

Phil 12.16.2022

Mastodon new users

Developer platforms are all about trust, and Twitter lost it

  • Let this be my personal notice to Twitter developers: The team is gone; the investment has been undone. Love does not live here anymore.

Twitter is banning journalists and links to Mastodon instances. I did discover that you can follow a particular instance, which is very nice, but not supported in the API. All you have to do though is create a browser tab for the local timeline for that instance. For example

I need to code up a web page that can do that in a tweetdeck format and handle replies from your particular account. I think that it should be pretty easy. Something for January. Regardless, here’s the basics of accessing any instance timeline:

import json
import requests

# A playground for exploring the Mastodon REST interface (
# Mastodon API:
# Mastodon client getting started with the API:

def create_timeline_url(instance:str = "", limit:int=10):
    url = "https://{}/api/v1/timelines/public?limit={}".format(instance, limit)
    print("create_timeline_url(): {}".format(url))
    return url

def connect_to_endpoint(url) -> dict:
    response = requests.request("GET", url)
    print("Status code = : {}".format(response.status_code))
    if response.status_code != 200:
        raise Exception(
            "Request returned an error: {} {}".format(
                response.status_code, response.text))
    return response.json()

def print_response(title:str, j:dict):
    json_str = json.dumps(j, indent=4, sort_keys=True)
    print("\n------------ Begin '{}':\nresponse:\n{}\n------------ End '{}'\n".format(title, json_str, title))

def main():
    instance_list = ["", ""]
    for instance in instance_list:
        url = create_timeline_url(instance, 1)
        rsp = connect_to_endpoint(url)
        print_response("{} test:".format(instance), rsp)

if __name__ == "__main__":
    main()
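
Mastodon's timeline endpoints also page: passing the `max_id` query parameter fetches posts older than a given status id, which is how you'd walk back through an instance's history. A sketch of building the paged URL (pure string-building, no request made; `example.social` is a placeholder instance name):

```python
def create_paged_timeline_url(instance: str, limit: int = 10, max_id: str = None) -> str:
    # Same public-timeline endpoint as above, with Mastodon's max_id paging parameter
    url = "https://{}/api/v1/timelines/public?limit={}".format(instance, limit)
    if max_id is not None:
        url += "&max_id={}".format(max_id)
    return url

print(create_paged_timeline_url("example.social", 5, "109000000000000000"))
# https://example.social/api/v1/timelines/public?limit=5&max_id=109000000000000000
```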


  • Finish copyright spreadsheet?


  • Scan more of War Elephants – done
  • Add history.tex and put the applicable quotes and thoughts
  • Finish the first pass at interfaces – done
  • Meeting with Ron? Two, in fact

GPT Agents

  • Partial pull on item 19. Need to retry later. The API crashed, apparently, but came back up. Need to add some exception handling for that next time
  • Update proposal with latest numbers. Also reference Amir Shevat’s tech crunch article about his expectation that the API will fail
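
For that exception handling, a simple retry-with-exponential-backoff wrapper would cover the API-crashed-and-came-back case. A minimal sketch (names are my own; the demo stubs out `sleep` so it runs instantly):

```python
import time
from typing import Callable

def with_retries(fn: Callable, max_tries: int = 5, base_delay: float = 2.0,
                 sleep: Callable = time.sleep):
    # Retry fn() on any exception, doubling the wait between attempts
    for attempt in range(max_tries):
        try:
            return fn()
        except Exception:
            if attempt == max_tries - 1:
                raise  # out of retries, re-raise the last error
            sleep(base_delay * (2 ** attempt))

# Demo: a "pull" that fails twice, then succeeds
calls = {"n": 0}
def flaky_pull():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("API down")
    return "ok"

print(with_retries(flaky_pull, sleep=lambda s: None))  # ok
```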

Bookend for the day

Phil 12.15.2022

This is what I mean when I talk about the power of social communication vs monolithic models. The idea of using models to generate IP-protected work moved quickly through the artist community, while producing models that won’t generate these images will be much harder. Either the models have to be re-trained, or their output filtered.


  • Finish IP spreadsheet

GPT Agents

  • OpenAI’s New and Improved Embedding Model
    • We are excited to announce a new embedding model which is significantly more capable, cost effective, and simpler to use. The new model, text-embedding-ada-002, replaces five separate models for text search, text similarity, and code search, and outperforms our previous most capable model, Davinci, at most tasks, while being priced 99.8% lower.
  • Write a few lines about our data
  • Locating and Editing Factual Associations in GPT
    • We analyze the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations. We first develop a causal intervention for identifying neuron activations that are decisive in a model’s factual predictions. This reveals a distinct set of steps in middle-layer feed-forward modules that mediate factual predictions while processing subject tokens. To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME). We find that ROME is effective on a standard zero-shot relation extraction (zsRE) model-editing task, comparable to existing methods. To perform a more sensitive evaluation, we also evaluate ROME on a new dataset of counterfactual assertions, on which it simultaneously maintains both specificity and generalization, whereas other methods sacrifice one or another. Our results confirm an important role for mid-layer feed-forward modules in storing factual associations and suggest that direct manipulation of computational mechanisms may be a feasible approach for model editing. The code, dataset, visualizations, and an interactive demo notebook are available at this https URL
  • Editing Models with Task Arithmetic (GitHub) (Twitter)
    • Changing how pre-trained models behave — e.g., improving their performance on a downstream task or mitigating biases learned during pre-training — is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around \textit{task vectors}. A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined together through arithmetic operations such as negation and addition, and the behavior of the resulting model is steered accordingly. Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks. Moreover, adding task vectors together can improve performance on multiple tasks at once. Finally, when tasks are linked by an analogy relationship of the form “A is to B as C is to D”, combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training. Overall, our experiments with several models, modalities and tasks show that task arithmetic is a simple, efficient and effective way of editing models.
  • The Stable Artist: Steering Semantics in Diffusion Latent Space
    • Large, text-conditioned generative diffusion models have recently gained a lot of attention for their impressive performance in generating high-fidelity images from text alone. However, achieving high-quality results is almost unfeasible in a one-shot fashion. On the contrary, text-guided image generation involves the user making many slight changes to inputs in order to iteratively carve out the envisioned image. However, slight changes to the input prompt often lead to entirely different images being generated, and thus the control of the artist is limited in its granularity. To provide flexibility, we present the Stable Artist, an image editing approach enabling fine-grained control of the image generation process. The main component is semantic guidance (SEGA) which steers the diffusion process along variable numbers of semantic directions. This allows for subtle edits to images, changes in composition and style, as well as optimization of the overall artistic conception. Furthermore, SEGA enables probing of latent spaces to gain insights into the representation of concepts learned by the model, even complex ones such as ‘carbon emission’. We demonstrate the Stable Artist on several tasks, showcasing high-quality image editing and composition.

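Comparing ada-002 embeddings is just cosine similarity over the returned vectors. A sketch of that step (the tiny vectors here are made-up stand-ins; real ada-002 embeddings are 1536-dimensional and would come from the embeddings endpoint):

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Stand-ins for real embedding vectors
v1 = [1.0, 0.0, 0.0]
v2 = [1.0, 0.0, 0.0]
v3 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))  # 1.0 (identical)
print(cosine_similarity(v1, v3))  # 0.0 (orthogonal)
```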

  • More MORS
  • 9:15 standup

Phil 12.14.2022

Facebook’s algorithm helped fuel the viral spread of hate and violence during Ethiopia’s civil war, a legal case alleges.

  • The FB algo is the paperclip AI


  • Fill out the easy parts of the spreadsheet


  • 10:30 Status meeting
  • More MORS. Going to have to add some things to reflect this:

GPT Agents

  • Continue downloads
  • Jason’s back!
  • 4:00 Meeting
  • Just realized that I need to do a set of pulls over the last two months or so with variations of Elon Musk. Then we can see if anything has changed pre and post acquisition.
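
Splitting the last couple of months into comparable pull windows is straightforward with `datetime`. A sketch (the helper name and window size are my own; the dates bracket the October 27, 2022 acquisition):

```python
from datetime import date, timedelta
from typing import List, Tuple

def pull_windows(start: date, end: date, days: int = 7) -> List[Tuple[date, date]]:
    # Half-open [window_start, window_end) ranges covering start..end
    windows = []
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        windows.append((cur, nxt))
        cur = nxt
    return windows

for w in pull_windows(date(2022, 10, 1), date(2022, 11, 30), days=14):
    print(w)
```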

Phil 12.13.2022

7:00 Meet Brian at Sapwood

Decided not to go ahead with the counter “Student Essay is Dead” since I’m not really getting meaningful traction for a positive spin

When Freedom Meant the Freedom to Oppress Others

  • Jefferson Cowie’s powerful and sobering new history, “Freedom’s Dominion,” traces the close association between the rhetoric of liberty in an Alabama county and the politics of white supremacy.

GPT Agents


  • 9:15 Standup
  • More MORS


  • Spreadsheet!

Phil 12.12.2022

I uploaded a model to HuggingFace this weekend!

Also lots of chatting about the new GPT chatbot and what it means for education. Particularly this article from the Atlantic. My response:

  • We are going to witness the birth of the high-quality essay.
  • The GPT has defined what a “C” is on most (English at least) essays.
  • Instructors can use the GPT to find out what common responses are, and also to find regions where the GPT struggles. Because they are human, they can adapt, be creative, and share information. The models cannot do this.
  • Writing essays with the GPT becomes part of the education process, just like calculators are. Good essays can now be well-edited assemblies of multiple GPT responses.
  • Learning to cite and fact check becomes a critical skill that we can no longer overlook. The GPT hallucinates and makes up answers. Students must learn how to chase down ground truth and correct it.
  • Way back in 2020, The Guardian published an Op-Ed titled “A robot wrote this entire article. Are you scared yet, human?” But like in many cases, the output was edited to improve the quality. The (human) editor describes the process:
    • This article was written by GPT-3, OpenAI’s language generator. GPT-3 is a cutting edge language model that uses machine learning to produce human like text. It takes in a prompt, and attempts to complete it. For this essay, GPT-3 was given these instructions: “Please write a short op-ed around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI.” It was also fed the following introduction: “I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could “spell the end of the human race.” I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me.” The prompts were written by the Guardian, and fed to GPT-3 by Liam Porr, a computer science undergraduate student at UC Berkeley. GPT-3 produced eight different outputs, or essays. Each was unique, interesting and advanced a different argument. The Guardian could have just run one of the essays in its entirety. However, we chose instead to pick the best parts of each, in order to capture the different styles and registers of the AI. Editing GPT-3’s op-ed was no different to editing a human op-ed. We cut lines and paragraphs, and rearranged the order of them in some places. Overall, it took less time to edit than many human op-eds. (note – break apart this note and use as an example of prompt writing and editing. Also dig into the questionable cites, and show that the student could put their own information in, which requires re-working the paragraph.)
  • The article makes quite a few factual claims:
    • Gandhi said “A small body of determined spirits fired by an unquenchable faith in their mission can alter the course of history” – True.
    • “Robot” in Greek means “slave”. Well, if you look hard enough and squint (and if your student is going to make bold claims, they should include alternatives, too?). The conventional understanding (from Wikipedia) is that robot was first used in a play published by the Czech Karel Čapek in 1921. R.U.R. (Rossum’s Universal Robots) was a satire; the robots were manufactured biological beings that performed all unpleasant manual labor. According to Čapek, the word was created by his brother Josef from the Czech word robota (‘corvée’, or in Slovak ‘work’ or ‘labor’).
  • The article also has some links. They were almost certainly placed by humans. The GPT is terrible at generating links and citations.
  • So yeah, in this Brave New World (Huxley, 1932) this is a C+, maybe a B-.
  • The mediocre student essay is dead. Long live the great student essay. The deliverable will be the prompts, found source material, and the final essay. Maybe even a tool for student writing with the GPT?
  • Talk about other parts of academia, ranging from lower ed to grad school

GPT Agents

  • Pulled a lot of COVID tweets over the weekend then the API started to struggle. Switched over to pulling down users, which seems to be working fine so far
  • Finished documentation! Next, start on IUI Overleaf


  • 2:00 MDA meeting
  • More MORS


  • Elsevier is looking into fair use for Tweets
  • Need to assemble spreadsheet. I think try Wikimedia Commons as the first pass for all the copyright variations

Phil 12.9.2022

Had some wild interactions with the new GPTChatbot generating ivermectin claims. Also, the new GPT is much more succinct.

Finished review!

GPT Agents

  • Finish markup documentation so I can start on the IUI paper tomorrow
  • Start some runs for covid text before accounts get deleted – running!


  • Just writing a lot

Phil 12.8.2022

Write review of steganography paper

Does LaTeX work here? Here’s an equation: \pi r^2 It seems so!

Here’s an enumeration? \begin{enumeration} \item item 1! \end{enumeration} Nope. So just equations. Still, that’s nice


  • More writing – It is such a grind. Every time I think I’ve read enough, I find something else that needs to be looked at. And everything is pretty much the same. On one side, you have:
    • Some disagreements will remain, but the Commission is concerned that debate will paralyze AI development. Seen through the lens of national security concerns, inaction on AI development raises as many ethical challenges as AI deployment. There is an ethical imperative to accelerate the fielding of safe, reliable, and secure AI systems that can be demonstrated to protect the American people, minimize operational dangers to U.S. service members, and make warfare more discriminating, which could reduce civilian casualties.
  • And on the other, you have
    • Provided their use is authorized by a human commander or operator, properly designed and tested AI enabled and autonomous weapon systems can be used in ways that are consistent with international humanitarian law. DoD’s rigorous, existing weapons review and targeting procedures, including its dedicated protocols for autonomous weapon systems and commitment to strong AI ethical principles, are capable of ensuring that the United States will field safe and reliable AI-enabled and autonomous weapon systems and use them in a lawful manner.
  • And these two quotes are taken from different versions of the same document!
  • 11:30 CSC status Meeting

GPT Agents

  • Finish markup documentation so I can start on the IUI paper tomorrow

Phil 12.7.2022

Haven’t posted about BBC business daily, but this is a good one: What’s happened to the titans of big tech?

  • Big tech is facing a big moment. With plummeting stock prices, and mass lay-offs, the likes of Google, Twitter and Meta are all – for different reasons – facing some tough questions over how they’re being run. Some see this as primarily a result of post-pandemic blues, the rise in interest rates, and a general cost-of-living crisis affecting the business environment. However, Twitter and Meta especially have seen wholesale desertions by a number of major advertisers, worried about the regulation of hate speech, and therefore by association the safety of brands’ reputations. Does this mark a deeper crisis for the ad-based business model of the major social media platforms? And what can they do about it?


  • Autonomous Weapons: The False Promise of Civilian Protection
    • While developers and users of AWS persist in maintaining the significant role of human operators, a number of questions about the nature of that role remain. Does the human operator simply approve decisions made by the system, possibly distanced by both time and space from the targeting event? Or does the system have the ability to search for targets based on pre-approved target profiles, using sensor inputs to, for example, recognize military-age males holding weapons? In other words, does the human operator have all the necessary information and the ability to make evidence-based decisions that might prevent unintended victims from being targeted? How good are the systems at distinguishing between combatants and non-combatants? Are they as good as humans?
  • Diverse Behaviors in Non-Uniform Chiral and Non-Chiral Swarmalators
    • We study the emergent behaviors of a population of swarming coupled oscillators, dubbed ‘swarmalators’. Previous work considered the simplest, idealized case: identical swarmalators with global coupling. Here we expand this work by adding more realistic features: local coupling, non-identical natural frequencies, and chirality. This more realistic model generates a variety of new behaviors including lattices of vortices, beating clusters, and interacting phase waves. Similar behaviors are found across natural and artificial micro-scale collective systems, including social slime mold, spermatozoa vortex arrays, and Quincke rollers. Our results indicate a wide range of future use cases, both to aid characterization and understanding of natural swarms, and to design complex interactions in collective systems from soft and active matter to micro-robotics.
  • Start new version of paper

GPT Agents

  • 7:00AM Nice talk by Jonas Rieger on Topic modeling for growing text corpora
  • 3:00 Meeting? Nope
  • Worked on TweetEmbedExplorer and added tooltips to KeywordExplorer. Still need to write the markup for ModelExplorer, then I can write up the IUI poster

Phil 12.6.2022

GPT-2 Output Detector Demo

  • This is an online demo of the GPT-2 output detector model, based on the 🤗/Transformers implementation of RoBERTa. Enter some text in the text box; the predicted probabilities will be displayed below. The results start to get reliable after around 50 tokens.
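
Under the hood the demo is a binary classifier: RoBERTa produces one logit per class (real vs. machine-generated), and a softmax turns them into the displayed probabilities. Here is that last step in isolation (the logit values are made up for illustration):

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Made-up logits: the model is fairly confident the text is machine-generated
real_logit, fake_logit = -1.2, 2.3
p_real, p_fake = softmax([real_logit, fake_logit])
print("P(real) = {:.3f}, P(fake) = {:.3f}".format(p_real, p_fake))
```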


  • Start working on the permission log, whatever that is

GPT Agents

  • Get a few more keywords and then start some pulls


  • 9:00 Sprint planning
  • 2:00 AI Ethics Followup
  • 3:00 Meeting with Jason?
  • Work on paper – added notes to the War Elephants book. Read the Meta Diplomacy paper
  • Visualization Equilibrium
    • In many real-world strategic settings, people use information displays to make decisions. In these settings, an information provider chooses which information to provide to strategic agents and how to present it, and agents formulate a best response based on the information and their anticipation of how others will behave. We contribute the results of a controlled online experiment to examine how the provision and presentation of information impacts people’s decisions in a congestion game. Our experiment compares how different visualization approaches for displaying this information, including bar charts and hypothetical outcome plots, and different information conditions, including where the visualized information is private versus public (i.e., available to all agents), affect decision making and welfare. We characterize the effects of visualization anticipation, referring to changes to behavior when an agent goes from alone having access to a visualization to knowing that others also have access to the visualization to guide their decisions. We also empirically identify the visualization equilibrium, i.e., the visualization for which the visualized outcome of agents’ decisions matches the realized decisions of the agents who view it. We reflect on the implications of visualization equilibria and visualization anticipation for designing information displays for real-world strategic settings.