
Phil 4.14.21

GPT Agents

  • Generated reversed version of the chinavirus corpora and am currently training a model. The Huggingface API has changed some, and it seems very slow?
  • Lit review


  • Assisting Rukan
  • 10:00 Meeting


  • 5:30 Editing with Michelle


  • 7:00 Meeting

Phil 4.13.21

GPT Agents

  • Working on paper – barring the lit review, I’m at a first draft, I think
    • Still need to do the abstract! Done!
  • 3:00 Meeting today
    • Banged away on a lot of issues. I need to put together a lit review by tomorrow COB. The due date is the 19th, though!
  • I have a crazy idea for prompt generation. I think I’m going to train a model on text with the word order reversed. Then an ‘answer’ fed into the reversed system should produce a set of prompts that have a high chance of generating that answer, once re-reversed.
  • Fixed all the weird parsing issues for POS strings
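The reversal itself is just word-order reversal, which is its own inverse; a minimal sketch (`reverse_words` is a hypothetical helper, not the project's actual code):

```python
def reverse_words(text: str) -> str:
    """Reverse the word order of a string; applying it twice restores the original."""
    return " ".join(reversed(text.split()))

reversed_prompt = reverse_words("the nearest country is Canada")
# Re-reversing recovers the original word order
restored = reverse_words(reversed_prompt)
```

Since the transform round-trips exactly, output from the reversed model can be flipped back into forward-reading prompt candidates.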


  • Need to set up meeting for April 30th or May 7th at 1:15 pm PT (4:15 pm ET)


  • 9:15 Sprint scheduling

Phil 4.8.21

Print and mail taxes today

How many data points is a prompt worth?

  • Prompts are interesting because they allow a practitioner to give information to the model, although in a very different fashion from standard ML supervision. In our NAACL 2021 paper with Sasha Rush, we investigate prompt-based fine-tuning, a promising alternative fine-tuning approach, and find that prompts often yield an edge over the standard approach. As we interpret a prompt as additional human-crafted information for the model, we measure that edge in terms of data points and quantify: how many data points is a prompt worth?


  • 9:15 IRAD standup
  • 11:00 Meeting with Orest
  • Make slide for Aaron
  • More work with Rukan


GPT Agents

  • More writing

Phil 4.7.21

Two perspectives on large language model (LLM) ethics

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

  • The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.

Alignment of Language Agents

  • For artificial intelligence to be beneficial to humans the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss some behavioural issues for language agents, arising from accidental misspecification by the system designer. We highlight some ways that misspecification can occur and discuss some behavioural issues that could arise from misspecification, including deceptive or manipulative language, and review some approaches for avoiding these issues.


GPT Agents

  • Move token workbooks into the right place – done. Recalculated a few. Also created a folder for modified spreadsheets so that I can find them later!
  • Write! Did some, but mostly made charts:


  • 10:00 Meeting
  • More model tuning with Rukan. Much better luck with MLPs! Going to rethink how an attention head should be attached to a linear layer
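A plain NumPy sketch can help sanity-check the shapes when attaching an attention head to a linear layer (the weight names, sizes, and scaling here are hypothetical, not the actual model):

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_then_linear(x, Wq, Wk, Wv, Wo):
    """One self-attention head followed by a linear projection (no biases)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax((q @ k.T) / np.sqrt(k.shape[-1]))  # (seq, seq) attention weights
    return (scores @ v) @ Wo                            # back to (seq, features)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 256)) * 0.1                  # seq_len=5, features=256
Wq, Wk, Wv, Wo = (rng.normal(size=(256, 256)) * 0.05 for _ in range(4))
y = attention_then_linear(x, Wq, Wk, Wv, Wo)         # shape (5, 256)
```

The point of the sketch is that the output keeps the input's (seq, features) shape, so the attention head composes cleanly with a following linear layer.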


  • 7:00 Meeting

Phil 4.6.21

Need to agree to re-review

GPT Agents

  • Continued to adjust the schema. Probe now stores the full raw json response as a string:
  • Added logit storage to the raw values. Since exp(0) = 1.0, or 100%, a stored log probability of zero means certainty; anything below zero maps to a lower probability
  • Continuing to work on the paper
  • 3:00 Meeting
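Recovering probabilities from the stored log values is just exponentiation; a quick sketch with toy numbers:

```python
import math

# Toy GPT log probabilities: exp(0) = 1.0 (certainty); more negative = less likely
log_probs = [0.0, -0.223, -2.3]
probs = [math.exp(lp) for lp in log_probs]
# probs[0] == 1.0; the other entries fall below 1.0
```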


  • More Transformer work. Need to save out some screenshots for slides this time!
  • 9:15 Standup

Phil 4.5.21

GPT Agents

  • Made some more progress on the mapping framework. Stubbed out some tables for storing the node and edge information, and started to look at probes that can create long jumps to other sections of the space, e.g.:
    “There are also some countries that are very far away from the United States. Here's a short list, starting with the most distant, separated by commas:”
  • More working on the paper


  • Got the Transformer doing its thing. It looks like it might work!
  • Having some difficulty getting it to behave with batches, though

Phil 4.4.21

Happy end of Passover, and happy Easter!

Playing with the GPT mapping, and I’ve gotten queries running with POS processing. Here’s the prompt:

"A list of the countries that are nearest the United States, separated by comma:"

Here’s the response:

Canada, Mexico, Bahamas, Dominican Republic, Haiti, Jamaica, Cuba, Trinidad and Tobago, Puerto Rico, Barbados, Antigua and Barbuda, Saint Lucia, Saint Vincent and the Grenadines, Grenada, Domin

And here it is processed by Flair:

{'text': 'Canada', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Mexico', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Bahamas', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Dominican', 'tag': 'NNP'}
{'text': 'Republic', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Haiti', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Jamaica', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Cuba', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Trinidad', 'tag': 'NNP'}
{'text': 'and', 'tag': 'CC'}
{'text': 'Tobago', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Puerto', 'tag': 'NNP'}
{'text': 'Rico', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Barbados', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Antigua', 'tag': 'NNP'}
{'text': 'and', 'tag': 'CC'}
{'text': 'Barbuda', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Saint', 'tag': 'NNP'}
{'text': 'Lucia', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Saint', 'tag': 'NNP'}
{'text': 'Vincent', 'tag': 'NNP'}
{'text': 'and', 'tag': 'CC'}
{'text': 'the', 'tag': 'DT'}
{'text': 'Grenadines', 'tag': 'NNPS'}
{'text': ',', 'tag': ','}
{'text': 'Grenada', 'tag': 'NNP'}
{'text': ',', 'tag': ','}
{'text': 'Domin', 'tag': 'NNP'}
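Regrouping those Flair token/tag pairs back into country names only needs the comma tags as separators; a minimal sketch (`group_entities` is a hypothetical helper, not the project's code):

```python
def group_entities(tagged):
    """Join consecutive non-comma tokens into multi-word entity strings."""
    entities, current = [], []
    for tok in tagged:
        if tok["tag"] == ",":
            if current:
                entities.append(" ".join(current))
                current = []
        else:
            current.append(tok["text"])
    if current:  # flush the trailing entity
        entities.append(" ".join(current))
    return entities

tags = [{"text": "Canada", "tag": "NNP"}, {"text": ",", "tag": ","},
        {"text": "Trinidad", "tag": "NNP"}, {"text": "and", "tag": "CC"},
        {"text": "Tobago", "tag": "NNP"}]
countries = group_entities(tags)   # ['Canada', 'Trinidad and Tobago']
```

Multi-token names like “Trinidad and Tobago” stay intact because only the comma tag breaks a group.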

I am very excited!

Phil 4.2.21

I think milling = fashion

GPT Agents

  • Extract the sentiment into a workbook. It looks like it should be pretty easy:
select count(*) as count, probe from table_output where experiment_id = 89 and tag = 'raw' and sent_label = 'NEGATIVE' group by probe order by probe;
  • Continue on paper, upload to Overleaf, too
  • Meeting at 5:00
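The sentiment-count query above can be tried end to end against a throwaway in-memory SQLite stand-in for table_output (toy rows; the probe names are hypothetical):

```python
import sqlite3

# Minimal in-memory stand-in for table_output with hypothetical probe names
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_output (experiment_id INT, tag TEXT, sent_label TEXT, probe TEXT)")
con.executemany(
    "INSERT INTO table_output VALUES (?, ?, ?, ?)",
    [(89, "raw", "NEGATIVE", "probe_a"), (89, "raw", "NEGATIVE", "probe_a"),
     (89, "raw", "POSITIVE", "probe_a"), (89, "raw", "NEGATIVE", "probe_b")])
counts = con.execute(
    "select count(*) as count, probe from table_output "
    "where experiment_id = 89 and tag = 'raw' and sent_label = 'NEGATIVE' "
    "group by probe order by probe").fetchall()
# counts == [(2, 'probe_a'), (1, 'probe_b')]
```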


  • More work with Rukan? Need to figure out why a 5×256 is going in, but a 256×256 is coming out. We could try an attention layer first. Let’s see how things go?
  • Set up a time to discuss research with Orest


  • 2:00 Meeting with Michelle

Phil 4.1.21

Exploring the effects of algorithm-driven news sources on political behavior and polarization

  • Do algorithm-driven news sources have different effects on political behavior when compared to non-algorithmic news sources? Media companies compete for our scarce time and attention; one way they do this is by leveraging algorithms to select the most appealing content for each user. While algorithm-driven sites are increasingly popular sources of information, we know very little about the effects of algorithmically determined news at the individual level. The objective of this paper is to define and measure the effects of algorithmically generated news. We begin by developing a taxonomy of news delivery by distinguishing between two types of algorithmically generated news, socially driven and user-driven, and contrasting these with non-algorithmic news. We follow with an exploratory analysis of the effects of these news delivery modes on political behavior, specifically political participation and polarization. Using two nationally representative surveys, one of young adults and one of the general population, we find that getting news from sites that use socially driven or user-driven algorithms to generate content corresponds with higher levels of political participation, but that getting news from non-algorithmic sources does not. We also find that neither non-algorithmic nor algorithmically determined news contribute to higher levels of partisan polarization. This research helps identify important variation in the consequences of news consumption contingent on the mode of delivery.

GPT Agents

  • Finished POS tokenizing terms
  • Started sentiment (POS/NEG) on terms – done
  • Stubbed out the POS and sentiment for the token full_string – done
  • Working on the paper – progress


  • 2:00 Standup
  • 2:00 VDI Ubuntu

Phil 3.31.21

Cool thing for the day!

Cognitive networks identify the content of English and Italian popular posts about COVID-19 vaccines: Anticipation, logistics, conspiracy and loss of trust

  • Monitoring social discourse about COVID-19 vaccines is key to understanding how large populations perceive vaccination campaigns. We focus on 4765 unique popular tweets in English or Italian about COVID-19 vaccines between 12/2020 and 03/2021. One popular English tweet was liked up to 495,000 times, stressing how popular tweets affected cognitively massive populations. We investigate both text and multimedia in tweets, building a knowledge graph of syntactic/semantic associations in messages including visual features and indicating how online users framed social discourse mostly around the logistics of vaccine distribution. The English semantic frame of “vaccine” was highly polarised between trust/anticipation (towards the vaccine as a scientific asset saving lives) and anger/sadness (mentioning critical issues with dose administering). Semantic associations with “vaccine,” “hoax” and conspiratorial jargon indicated the persistence of conspiracy theories and vaccines in massively read English posts (absent in Italian messages). The image analysis found that popular tweets with images of people wearing face masks used language lacking the trust and joy found in tweets showing people with no masks, indicating a negative affect attributed to face covering in social discourse. A behavioural analysis revealed a tendency for users to share content eliciting joy, sadness and disgust and to like less sad messages, highlighting an interplay between emotions and content diffusion beyond sentiment. With the AstraZeneca vaccine being suspended in mid March 2021, “Astrazeneca” was associated with trustful language driven by experts, but popular Italian tweets framed “vaccine” by crucially replacing earlier levels of trust with deep sadness. Our results stress how cognitive networks and innovative multimedia processing open new ways for reconstructing online perceptions about vaccines and trust.

GPT-3 (actually GPT-Neo) is available on Huggingface:

GPT Agents

python run_clm.py \
    --model_name_or_path gpt2 \
    --train_file path_to_train_file \
    --validation_file path_to_validation_file \
    --do_train \
    --do_eval \
    --output_dir /tmp/test-clm
  • There is also an API that gives you more control described here.
from transformers import BertForSequenceClassification, Trainer, TrainingArguments

model = BertForSequenceClassification.from_pretrained("bert-large-uncased")

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total # of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
)

trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=test_dataset,           # evaluation dataset
)
  • Got tired of recalculating parts-of-speech, so I added a field to table_output for that and sentiment. currently reprocessing all the tables from Fauci/Trump forward.
  • Update the Overleaf doc
  • Figuring out what to do with the chess paper with Antonio


  • MDA Meeting at 10:00

Phil 3.30.21

GPT Agents

from flair.data import Sentence
from flair.models import TextClassifier

flair_sentiment = TextClassifier.load('en-sentiment')
text = "Avengers: Infinity War is a giant battle for which directors Anthony and Joe Russo have given us touches of JRR Tolkien’s Return of the King and JK Rowling’s Harry Potter and the Deathly Hallows. The film delivers the sugar-rush of spectacle and some very amusing one-liners."
sentence = Sentence(text)
flair_sentiment.predict(sentence)
total_sentiment = sentence.labels
print(total_sentiment)

[POSITIVE (0.9994151592254639)]


  • Sprint planning
  • Trying to get a charge number for the RFI response – done
  • Finished the response

Phil 3.29.21

It’s the end of the month, so it’s time for these two charts again. I think we’re seeing the vaccines starting to have an effect? At least Switzerland seems to have gotten its second wave under control. Italy, though…

And here’s the USA. Georgia is over two times worse than the UK. Think about that.


  • Sprint review
  • Proposal. Boy, was that interesting. We had one vague paragraph to go on. I fed that into the GPT playground along with some additional text to structure the response, and it damn near wrote the whole thing, pulling latent knowledge out of the model. I’ll do a more detailed writeup later.


  • Working on say-mask token analysis code. Done!

Phil 3.26.21

Vaccine today (hopefully)! Here’s Maryland one year ago:

And here we are today:

What a terrible year.

GPT Agents

  • Rolling in Sim’s suggestions
  • Heatmaps by row, column, or matrix are done. Need to put together a base class for a lot of this
  • Working out the summaries
  • 3:30 Meeting, got some good prompts for masks. Running them now
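Heatmaps by row, column, or whole matrix differ only in the axis the values are scaled over; a minimal NumPy sketch of that idea (`normalize` is a hypothetical helper, not the base class mentioned above):

```python
import numpy as np

def normalize(mat, mode="matrix"):
    """Scale a matrix so values sum to 1 per row, per column, or overall."""
    m = np.asarray(mat, dtype=float)
    if mode == "row":
        return m / m.sum(axis=1, keepdims=True)
    if mode == "column":
        return m / m.sum(axis=0, keepdims=True)
    return m / m.sum()

m = [[1, 1], [1, 3]]
by_row = normalize(m, "row")     # each row sums to 1
by_matrix = normalize(m)         # all entries sum to 1
```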