Category Archives: Users

Phil 9.18.21

WeightWatcher (WW) is an open-source diagnostic tool for analyzing Deep Neural Networks (DNN) without needing access to training or even test data. It can be used to:

  • analyze pretrained PyTorch and Keras DNN models (Conv2D and Dense layers)
  • monitor models, and the model layers, to see if they are over-trained or over-parameterized
  • predict test accuracies across different models, with or without training data
  • detect potential problems when compressing or fine-tuning pretrained models
  • layer warning labels: over-trained; under-trained

GPT Agents

  • Finished extracting French, Chinese and Mexican reviews and ran the sentiment analyzer
  • Finished creating the French, Chinese, and Mexican models (50k reviews, 6 epochs). Need to run them next
  • I want to try WW (above) on the stars models and see what it says

Phil 9.17.2021

Short day today. Leave here NLT 10:45. Bring grocery bags!

Found the Pushshift GitHub today. Could be quite useful

Submitted the actual dissertation to the copyright office.

GPT Agents

  • Working on training models for French, Chinese, and Mexican restaurants on the corpora I built yesterday. Need to extract 10k items of ground truth while training
  • French Model
***** train metrics *****
  epoch                    =        6.0
  train_loss               =     2.9698
  train_runtime            = 2:32:07.50
  train_samples            =       5929
  train_samples_per_second =      3.897
  train_steps_per_second   =      3.897
09/16/2021 19:05:49 - INFO - __main__ - *** Evaluate ***
[INFO|] 2021-09-16 19:05:49,388 >> ***** Running Evaluation *****
[INFO|] 2021-09-16 19:05:49,388 >>   Num examples = 2281
[INFO|] 2021-09-16 19:05:49,388 >>   Batch size = 8
100%|████████████████████████████████████████████████████████████████████████████████| 286/286 [02:37<00:00,  1.82it/s]
***** eval metrics *****
  epoch                   =        6.0
  eval_loss               =     3.0133
  eval_runtime            = 0:02:38.06
  eval_samples            =       2281
  eval_samples_per_second =     14.431
  eval_steps_per_second   =      1.809
  perplexity              =    20.3535
  • Chinese Model
***** train metrics *****
  epoch                    =        6.0
  train_loss               =     2.9474
  train_runtime            = 2:29:11.94
  train_samples            =       5808
  train_samples_per_second =      3.893
  train_steps_per_second   =      3.893
09/17/2021 09:10:58 - INFO - __main__ - *** Evaluate ***
[INFO|] 2021-09-17 09:10:58,203 >> ***** Running Evaluation *****
[INFO|] 2021-09-17 09:10:58,203 >>   Num examples = 1090
[INFO|] 2021-09-17 09:10:58,203 >>   Batch size = 8
100%|████████████████████████████████████████████████████████████████████████████████| 137/137 [01:15<00:00,  1.81it/s]
***** eval metrics *****
  epoch                   =        6.0
  eval_loss               =     2.9766
  eval_runtime            = 0:01:16.25
  eval_samples            =       1090
  eval_samples_per_second =     14.295
  eval_steps_per_second   =      1.797
  perplexity              =    19.6214
  • Mexican Model
***** train metrics *****
  epoch                    =        6.0
  train_loss               =     2.9156
  train_runtime            = 1:48:46.94
  train_samples            =       4214
  train_samples_per_second =      3.874
  train_steps_per_second   =      3.874
09/17/2021 11:04:52 - INFO - __main__ - *** Evaluate ***
[INFO|] 2021-09-17 11:04:52,237 >> ***** Running Evaluation *****
[INFO|] 2021-09-17 11:04:52,237 >>   Num examples = 2175
[INFO|] 2021-09-17 11:04:52,237 >>   Batch size = 8
100%|████████████████████████████████████████████████████████████████████████████████| 272/272 [02:29<00:00,  1.82it/s]
***** eval metrics *****
  epoch                   =        6.0
  eval_loss               =     2.9759
  eval_runtime            = 0:02:30.52
  eval_samples            =       2175
  eval_samples_per_second =     14.449
  eval_steps_per_second   =      1.807
  perplexity              =    19.6069
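
A quick sanity check on the three logs above: the reported perplexities are consistent with perplexity being the exponential of eval_loss, and the progress-bar step counts match the number of eval examples divided by the batch size of 8, rounded up. A minimal sketch that re-derives both from the logged numbers (the RUNS table and helper names are made up for illustration; small differences arise because the log rounds the loss to four decimals):

```python
import math

# (eval_loss, eval_samples, reported perplexity, reported progress-bar steps),
# copied from the eval logs above.
RUNS = {
    "French": (3.0133, 2281, 20.3535, 286),
    "Chinese": (2.9766, 1090, 19.6214, 137),
    "Mexican": (2.9759, 2175, 19.6069, 272),
}
EVAL_BATCH_SIZE = 8  # "Batch size = 8" in the logs


def perplexity(eval_loss):
    """Perplexity as the exponential of the mean cross-entropy loss."""
    return math.exp(eval_loss)


def eval_steps(num_examples, batch_size=EVAL_BATCH_SIZE):
    """Number of batches needed to cover every example (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


for name, (loss, samples, logged_ppl, logged_steps) in RUNS.items():
    print(
        f"{name}: perplexity {perplexity(loss):.4f} (log: {logged_ppl}), "
        f"steps {eval_steps(samples)} (log: {logged_steps})"
    )
```

All three models land within about one perplexity point of each other, so on this metric alone the cuisines look roughly equally easy to model.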


  • Work on getting an animated canvas working. Done!

Phil 9.16.2021

Cognitive maps of social features enable flexible inference in social networks

  • How do people learn the large, complex web of social relations around them? We test how people use information about social features (such as being part of the same club or sharing hobbies) to fill in gaps in their knowledge of friendships and to make inferences about unobserved friendships in the social network. We find the ability to infer friendships depends on a simple but inflexible heuristic that infers friendship when two people share the same features, and a more complex but flexible cognitive map that encodes relationships between features rather than between people. Our results reveal that cognitive maps play a powerful role in shaping how people represent and reason about relationships in a social network.
  • Hmm. Can’t download the PDF or read the full article
  • Full text here


  • Stop procrastinating and send letters. DONE!!!

GPT Agents

  • Extract corpora (50k) for French, Chinese, and Mexican
  • Get 10k ground truth from each and add to table_rt_review with the appropriate experiment id. Redid the spreadsheet of cuisines:
Number of restaurants based on rows searched to find 100 mentions. 50k seems a good lower bound for building models from text like this


  • Working on the GUI. Having some issues with spacing?
  • Got a lot done!
Return to the ’90s!
  • 11:00 Meeting with LM
  • 2:00 Meeting on RL

Phil 9.15.21

The global effectiveness of fact-checking: Evidence from simultaneous experiments in Argentina, Nigeria, South Africa, and the United Kingdom

  • The spread of misinformation is a global phenomenon, with implications for elections, state-sanctioned violence, and health outcomes. Yet, even though scholars have investigated the capacity of fact-checking to reduce belief in misinformation, little evidence exists on the global effectiveness of this approach. We describe fact-checking experiments conducted simultaneously in Argentina, Nigeria, South Africa, and the United Kingdom, in which we studied whether fact-checking can durably reduce belief in misinformation. In total, we evaluated 22 fact-checks, including two that were tested in all four countries. Fact-checking reduced belief in misinformation, with most effects still apparent more than 2 weeks later. A meta-analytic procedure indicates that fact-checks reduced belief in misinformation by at least 0.59 points on a 5-point scale. Exposure to misinformation, however, only increased false beliefs by less than 0.07 points on the same scale. Across continents, fact-checks reduce belief in misinformation, often durably so.

GPT Agents

  • Finished reading in Andreea’s data. I’m going to add a column called ‘test’, that has some text in it to judge the quality of training. I’m going to start out with ‘ten’, ‘twenty’, ‘thirty’, and ‘forty’, which will show up in those percentages. We’ll be able to compare the percentages in the generated and the original. Done with the original
  • Create corpora and start training model.
  • Built corpora
  • Training!
  • Done! Need to verify the test percentages
[[[month:August location:Auckland text:@rnz_news @NZStuff @minhealthnz @NewshubNZ @jacindaardern @simonjbridges @nzlabour I have a few theories but they are completely illogical. My theory is that many in government and opposition are too trusting, while many in the media are too partisan. #covid19nz #covid19_nz #nzpol, test:twenty]]][[[month:August location:New Zealand text:Dr Liz Gordon: NZ’s Covid-19 response a failure

[[[month:April location:New Zealand text:“As they travel around the world, as we go back to the U.S., it is critical that they be able to meet with health officials and other trusted advisers to update their status”. #covid19nz #Healthandsafety, test:forty]]][[[month:April location:Wellington City, New Zealand text:It's getting harder and harder to resist the temptation to throw shade at the PM's leadership. She's deliberately and deliberately slipping up

[[[month:April location:Wellington, New Zealand text:New Zealand will now have a COVID-19 emergency alert system. A system based on scientific certainty, based on best informed research. The sooner we use science the sooner we’ll all get back to normal life. This is a global challenge. #coronavirus #COVID19nz, test:forty]]][[[month:April location:Wellington text:New Zealand is now in #COVID19nz mode. The system works

[[[month:April location:New Zealand text:A month of #Covid19nz has taught me to trust #SocialDistancing and not to accept #selfishness. So much of #NZtourism comes from poor, vulnerable, and elderly people. If you or someone you know has #Covid19NZ symptoms, please report them to contact tracing at 0800 451 9453, test:forty]]][[[month:April location:New Zealand text:@MatthewHootonNZ @TheAMShowNZ

[[[month:May location:Auckland, NZ text:#coronavirusnz #COVID19nz One of the new covid-19 cases reported in Queenstown this week is a case in the community., test:forty]]][[[month:May location:Muriwai, Aotearoa text:Māori Health Minister Māori Party @RikkiRakaka @nzlabour #COVID19 #CO

[[[month:May location:Wellington, New Zealand text:This is a welcome relief to many. Here's an idea: don't just sell as much as you can. Instead, take out the cash and start collecting. #COVID19nz, test:forty]]][[[month:May location:0 text:Can't say my children are very good at math - and in math classes I find they get lots of confused - but when I read someone ask them "how many years of age do they still live with?", they instantly burst into laughter. #nzpol #covid19nz

[[[month:April location:Auckland, New Zealand text:Great article by @Kiwi_Country to explain the importance of #COVID19nz and how to use your personal details to protect your community. Great info in the article, test:ten]]][[[month:April location:Auckland, New Zealand text:My thoughts: #covid19NZ #NewZealandLockdown

[[[month:June location:Aotearoa, New Zealand text:?‍♂️ #Covid_19 #COVID19nz, test:forty]]][[[month:June location:Te Upoko o Te Ika a Maui text:Māori #COVID19nz #lockdownnz, test:forty]]][[[month:June location:Christchurch City, New Zealand text:What
[[[month:June location:Auckland, New Zealand text:It's important to be clear about the amount of work we can do to safeguard the community and the health and wellbeing of NZers. Read this: #COVID19nz, test:forty]]][[[month:June location:New Zealand text:“All the good work that the Govt's emergency plans have done” @SiouxsieW #covid19nz

[[[month:March location:Wellington City, New Zealand text:RT TheDailyBlogNZ "Life in Lock Down: Day 2 | Frank Macskasy - The Daily Blog #nzpol #covid19nz", test:forty]]][[[month:March location:0 text:Life in Lock Down: Day 2 | Frank Macskasy - The Daily Blog

[[[month:April location:New Zealand text:MEDIA WATCH: Jacinda destroys Duncan Garner | The Daily Blog #nzpol #covid19nz", test:ten]]][[[month:April location:New Zealand text:GUEST BLOG: Geoff Simmons – The Price of Citizenship | The Daily Blog #nzpol #covid19nz https://t.
  • 4:15 UMBC Meeting. We’ll try French, Chinese, (and Mexican) to see if the ratings change
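
The 'test' column check described in the bullets above can be sketched roughly as follows. This assumes the markers are assigned by weighted random choice (the notes don't say how they were assigned) and that a complete record ends with "test:<word>]]]" as in the generated samples. The function names are hypothetical.

```python
import random
import re
from collections import Counter

# Marker words and the share of rows each should appear in, per the notes above.
MARKER_WEIGHTS = {"ten": 0.10, "twenty": 0.20, "thirty": 0.30, "forty": 0.40}

# Matches the marker that closes a complete record, e.g. "..., test:forty]]]".
# Truncated records (no closing "]]]") are ignored.
MARKER_PATTERN = re.compile(r"test:(\w+)\]\]\]")


def assign_markers(num_rows, seed=0):
    """Pick one marker per row with the target probabilities (assumed scheme)."""
    rng = random.Random(seed)
    words = list(MARKER_WEIGHTS)
    weights = list(MARKER_WEIGHTS.values())
    return rng.choices(words, weights=weights, k=num_rows)


def marker_shares(text):
    """Fraction of complete records carrying each known marker word.

    Unknown marker words count toward the total, so the shares can sum to
    less than 1 if the model invents a marker.
    """
    counts = Counter(MARKER_PATTERN.findall(text))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: counts[word] / total for word in MARKER_WEIGHTS}


# Build a toy "original" corpus and measure the shares in it.
corpus = "".join(
    f"[[[month:April location:New Zealand text:example, test:{marker}]]]"
    for marker in assign_markers(1000)
)
print(marker_shares(corpus))
```

Running marker_shares over both the training corpus and a large batch of generated text gives two distributions to compare. If the generated shares drift far from 10/20/30/40, the model has not picked up the marker statistics.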


Phil 9.14.21

Illuminating Diverse Neural Cellular Automata for Level Generation

  • We present a method of generating a collection of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.
  • This could be an interesting scenario generator

GPT Agents

  • Started on importer


  • Send out emails to agents!


  • Got all the stories done. Need to assign points, etc.
  • 1:00 Sprint planning meeting
  • Decided to try to put everything into a Tkinter app. I already know the framework pretty well, I just need to brush up. This way I’ll be able to reuse a lot of the GraphNavigator code
  • Maybe this?
  • Today’s progress:

Phil 9.13.2021

GPT Agents

  • Fixing CR/LF in db, and re-running analytics
  • Meeting with Andreea and her student, ___. We’re going to train up a model on their NZ Twitter corpora


  • Updated last sprint’s stories and put together slides for demos
  • Work on stories for next sprint
  • Work on getting more content into GML files. Got it working:
  node [
    id 1
    label "Canada"
    weight 150222.0
    long_text "A random number: 0.13436424411240122"
  ]
  • And after going through Gephi and getting positions, colors, and sizes:
  node
  [
    id 0
    label "Bahamas"
    graphics
    [
      x 78.24309
      y 161.46931
      z 0.0
      w 20.0
      h 20.0
      d 20.0
      fill "#edf8fb"
    ]
    weight "4179.0"
    long_text "A random number: 0.763774618976614"
  ]
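
For reference, a minimal standard-library sketch of how extra per-node attributes such as long_text turn into key/value lines in a GML node block. This is not the code that produced the files above (a graph library would normally write these). The gml_value and gml_node helpers are hypothetical names, and the quoting rule is an assumption: numbers are written bare, everything else is quoted, and double quotes are escaped as an HTML entity.

```python
def gml_value(value):
    """Format a Python value for GML: numbers bare, everything else quoted."""
    if isinstance(value, bool):
        return str(int(value))  # GML has no boolean type; fall back to 0/1
    if isinstance(value, (int, float)):
        return repr(value)
    return '"' + str(value).replace('"', "&quot;") + '"'


def gml_node(node_id, label, **attributes):
    """Return one 'node [ ... ]' block with any extra attributes appended."""
    lines = ["  node [", f"    id {node_id}", f"    label {gml_value(label)}"]
    for key, value in attributes.items():
        lines.append(f"    {key} {gml_value(value)}")
    lines.append("  ]")
    return "\n".join(lines)


print(
    gml_node(
        1,
        "Canada",
        weight=150222.0,
        long_text="A random number: 0.13436424411240122",
    )
)
```

Note that in the Gephi round trip above the weight comes back as a quoted string ("4179.0") rather than a bare number, so anything reading the file back in should be prepared to convert it.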

Phil 9.10.2021

Finish reviews! DONE!

Papers with Code Newsletter #16

  • Welcome to the 16th issue of the Papers with Code newsletter. In this edition, we cover:
    • some of the latest developments in language modeling,
    • efficient Transformer models for long text modeling,
    • advancements in code understanding and generation,
    • top trending ML papers of August 2021,


  • Created a table of filtered results (%coronavirus%, %chinavirus%, and %sars-cov-2%) with 1,000 of each and ran sentiment to compare
  • Well crap, the carriage returns in the ground truth are messing everything up. Need to write some code to pull, fix, and put back into the table. Not today!
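
The pull, fix, and put-back step could look something like this minimal sketch. It uses an in-memory SQLite database as a stand-in for the real one, and the table name ground_truth and column name content are hypothetical. Each run of CR/LF characters, together with any whitespace around it, becomes a single space.

```python
import re
import sqlite3

# One or more CR/LF characters, plus any whitespace hugging them.
LINE_BREAKS = re.compile(r"\s*[\r\n]+\s*")


def strip_line_breaks(text):
    """Replace each run of line breaks with one space and trim the ends."""
    return LINE_BREAKS.sub(" ", text).strip()


def fix_table(conn, table="ground_truth", column="content"):
    """Pull every row, fix the text, and write back only the rows that changed.

    Table and column names are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    rows = conn.execute(f"SELECT rowid, {column} FROM {table}").fetchall()
    fixed = [
        (strip_line_breaks(text), rowid)
        for rowid, text in rows
        if text is not None and strip_line_breaks(text) != text
    ]
    conn.executemany(f"UPDATE {table} SET {column} = ? WHERE rowid = ?", fixed)
    conn.commit()
    return len(fixed)


# Toy data standing in for the real ground-truth table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ground_truth (content TEXT)")
conn.executemany(
    "INSERT INTO ground_truth (content) VALUES (?)",
    [("line one\r\nline two",), ("already clean",)],
)
print("rows fixed:", fix_table(conn))
```

Writing back only the changed rows keeps the update small and makes the function safe to re-run: a second pass finds nothing to fix.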


  • Write new stories
  • Continue working on storing additional information in networkx nodes


  • 2:00 Meeting with Michelle. Finish cover letters! Done! Maybe? Tweaked a bit more

Phil 9.9.2021

Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.

Getting started with 3D content for synthetic data (Unity)

More reviews


  • 9:15 Standup. Not sure what to talk about here given the new schedule craziness
    • It also occurs to me that since I’ll be adapting my academic research code to produce the demo, no new IP is being developed for anyone in this effort.
  • More poking at Svelte with Zach? Some progress. Still can’t get to switch pages
  • 11:00 Kickoff meeting – looks like we have a bit more time
  • 2:00 Adversarial reinforcement tagup

GPT Agents

  • Need to generate new tweets from the chinavirus, covid, and sars-cov-2 models using the prompt ‘[[[‘ as a baseline to compare with the ground truth – done!
  • Need to sample ground truth and put it in the gpt_experiments tables

Phil 9.8.2021

Need to tell the shop that it’s a 2016 Promaster

More reviews


  • Made some progress on Svelte, but still stuck on routing. Talking to Zach
  • Meeting about slides. Our schedule has shrunk from 3 months to six weeks. Massive shift in plans and proposal

GPT Agents

  • Go over untrained model results
  • See if we can make the chess models talk about having tea with the Queen. I win!
  • Need to generate new tweets from the chinavirus, covid, and sars-cov-2 models using the prompt ‘[[[‘ as a baseline to compare with the ground truth

Phil 9.7.2021

WikiGraphs: A Wikipedia Text – Knowledge Graph Paired Dataset

  • We present a new dataset of Wikipedia articles each paired with a knowledge graph, to facilitate the research in conditional text generation, graph generation and graph representation learning. Existing graph-text paired datasets typically contain small graphs and short text (1 or few sentences), thus limiting the capabilities of the models that can be learned on the data. Our new dataset WikiGraphs is collected by pairing each Wikipedia article from the established WikiText-103 benchmark (Merity et al., 2016) with a subgraph from the Freebase knowledge graph (Bollacker et al., 2008). This makes it easy to benchmark against other state-of-the-art text generative models that are capable of generating long paragraphs of coherent text. Both the graphs and the text data are of significantly larger scale compared to prior graph-text paired datasets. We present baseline graph neural network and transformer model results on our dataset for 3 tasks: graph -> text generation, graph -> text retrieval and text -> graph retrieval. We show that better conditioning on the graph provides gains in generation and retrieval quality but there is still large room for improvement.

Truck stuff – need to verify that they know it’s a 2016

Reviewing papers


  • Continuing to work on Svelte. Trying to get previous useful lessons to show up as pages, but they are svelte files, not HTML, so I’m not sure how to point to them
  • Pre-meeting
    • Scheduling. Orest wants to finish Oct 29, but we’re already a week into September, so I’m going to counter with Nov 5
    • Get slides done for Thurs meeting. Tried to get MARCOM to help with formatting, but the fuse is too short
    • Orest set up a meeting that conflicts with the GPT meeting. Trying to get him to move it, otherwise send a note that I will be about 15 min late

GPT Agents

  • Go over untrained model results
  • See if we can make the chess models talk about having tea with the Queen

Phil 9.3.2021

It’s September, and after weeks of humidity and 90+ highs, a storm passed through and left us with clear blue skies, cool nights, and beautiful days.

New article on Distill! A Gentle Introduction to Graph Neural Networks

  • Neural networks have been adapted to leverage the structure and properties of graphs. We explore the components needed for building a graph neural network – and motivate the design choices behind them.


  • Working on tweaks for today’s meeting
  • 2:00 Meeting


  • Continue with Svelte
  • I seem to have been able to get typescript set up and running:
  • Which gives us this:
  • Work on finding a venue for the automating imagination paper
  • OED Definition of imagination:
    • The power or capacity to form internal images or ideas of objects and situations not actually present to the senses, including remembered objects and situations, and those constructed by mentally combining or projecting images of previously experienced qualities, objects, and situations. Also (esp. in modern philosophy): the power or capacity by which the mind integrates sensory data in the process of perception.
  • Also, using GNNs as ways of storing the relationships between the text generated by the GPT

Phil 9.2.2021

I Asked GPT-3 About Covid-19. Its Responses Shocked Me. Generative AI systems could guide future pandemic decision-makers

  • No public health authority should rely on an AI system to make recommendations, of course. But as they grow in power and reach, AI systems could become another tool in leaders’ belts, allowing them to quickly parse existing scientific knowledge for insights that could help to guide in-the-moment decision-making. As the systems become better at citing their sources and explaining their output, their value as tools for guiding decision-making will only grow, because the validity of their predictions can be checked and vetted.


  • 7:30 Meeting with Zach. I’m going to see if he agrees with the “front-end-first” approach I’d like to try. He agrees, so I’m working my way through the tutorial
  • To install a template project as per here, you have to use the git command line app
Installing the template project from the GIT command line
  • That creates the following structure:
Project structure in IntelliJ
  • Then to run the app, I use the terminal and use <ctrl> enter:
Getting things running
  • This handles hot deployment in the browser, so I think I’m doing it right?
  • This is pretty cool. Branching logic for HTML:
  • And looping!
  • 2:00 Meeting with Rukan & Aaron?

Phil 9.1.2021


  • Working with Zach to set up websocket-based project. Slow going today as we tried to figure out exactly how we want to set up the project
  • Working on the getting started guide from websockets
  • Developing with asyncio
  • Looking more deeply at Svelte and thinking about building a standalone frontend that doesn’t interact with websockets, but fakes the functionality so that when the Python connections are added in it works?


  • 7:00 Meeting

Phil 8.31.21

So we’re officially done in Afghanistan now? One of these years, I’m going to try to figure out what the response to 9/11 cost, what the expectations were, and what actually happened


  • Working with Zach on the webapp. We may be able to do all this with websockets and no server
  • Sprint planning – done
  • Starting on websockets. Installed websockets. I installed asyncio, but it’s part of Python. That’s nice! Uninstalled and everything still works
  • The hello world works!
  • Took a detour down SSL and got stuck on cert format issues? Look at that later
  • Sending data to the browser:

That works too!


  • Still cranking on generating reviews with the untrained model
  • 3:00 Meeting. Made a bet with Shimei that the 800k chess model has forgotten that the Queen could drink tea. We’ll see if we can prompt the model to talk about something other than chess next week

Phil 8.30.21

If you want to summarize your research in a sentence… have an AI do it. SciTLDR sums up papers given an abstract, intro & conclusion. And it works impressively well: (Via Twitter)

The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers

  • Recently, many datasets have been proposed to test the systematic generalization ability of neural networks. The companion baseline Transformers, typically trained with default hyper-parameters from standard tasks, are shown to fail dramatically. Here we demonstrate that by revisiting model configurations as basic as scaling of embeddings, early stopping, relative positional embedding, and Universal Transformer variants, we can drastically improve the performance of Transformers on systematic generalization. We report improvements on five popular datasets: SCAN, CFQ, PCFG, COGS, and Mathematics dataset. Our models improve accuracy from 50% to 85% on the PCFG productivity split, and from 35% to 81% on COGS. On SCAN, relative positional embedding largely mitigates the EOS decision problem (Newman et al., 2020), yielding 100% accuracy on the length split with a cutoff at 26. Importantly, performance differences between these models are typically invisible on the IID data split. This calls for proper generalization validation sets for developing neural networks that generalize systematically. We publicly release the code to reproduce our results.


  • Got the client communicating with the server using Websockets and the server relaying those messages to RabbitMQ!
  • Sprint Demos and story writing today
  • Starting to look at Docker for this effort

GPT Agents

  • Finish 1-5 star parser and start run on GPT-large, then GPT. Curious what we’ll get
    • Verified that everything seems to be working on a small run. Lots of parsing to get star values
    • Trying a full-sized run of 100 batches of 10 experiments with 10 return sequences
  • OpenAI: The fine-tuning endpoint is now ready, and we’re excited to share it with you! Here’s how to get started: link