Category Archives: Paper

Phil 8.13.20

Ride through the park today and ask about pavilion rental – done

Māori Pronunciation

Iñupiaq (Inupiatun)

GPT-2 Agents

Rewrite intro, including the finding that these texts seem to be matched in some way. Done
Uploaded the new version to ArXiv. Should be live by tomorrow
Read Language Models as Knowledge Bases, and added to the lit review.
Discovered Antoine Bosselut, who was lead author on the following papers. Need to add them to the future work section
- Dynamic Knowledge Graph Construction for Zero-shot Commonsense Question Answering
  - Understanding narratives requires dynamically reasoning about the implicit causes, effects, and states of the situations described in text, which in turn requires understanding rich background knowledge about how the social and physical world works. At the core of this challenge is how to access contextually relevant knowledge on demand and reason over it.
    In this paper, we present initial studies toward zero-shot commonsense QA by formulating the task as probabilistic inference over dynamically generated commonsense knowledge graphs. In contrast to previous studies for knowledge integration that rely on retrieval of existing knowledge from static knowledge graphs, our study requires commonsense knowledge integration where contextually relevant knowledge is often not present in existing knowledge bases. Therefore, we present a novel approach that generates contextually relevant knowledge on demand using generative neural commonsense knowledge models.
    Empirical results on the SocialIQa and StoryCommonsense datasets in a zero-shot setting demonstrate that using commonsense knowledge models to dynamically construct and reason over knowledge graphs achieves performance boosts over pre-trained language models and using knowledge models to directly evaluate answers.
- COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
  - We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.

GOES

10:00 sim status meeting – planning to fully evaluate off-axis rotation by Monday, then characterize Rwheel contribution, adjust the control system and start commanding vehicle rotations by the end of the week? Seems ambitious, but what the hell.
2:00 status meeting
Anything about GVSETS? Yup: Meeting Wed 9/16/2020 9:00 AM – 10:00 AM

JuryRoom

5:30 meeting. Discuss proposal and additional meetings

Book

Transfer more content

Phil 8.11.20

Zero-Shot Learning in Modern NLP

In this post, I will present a few techniques, both from published research and our own experiments at Hugging Face, for using state-of-the-art NLP models for sequence classification without large annotated training sets.

Found a really good dashboard for US economic indicators:

GOES

I think I realize my problem about the second axis. It’s not rotating around the origin, so the vectors that I’m using to create the rotation vectors are not right.
Fixed! Here are some rotations (180 around Z, 90 around x, and 360 around z, 180 around x)
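For reference, one common way to rotate around a point that is not the origin is to shift the pivot to the origin, rotate, and shift back. The sketch below is only an illustration of that idea for a Z-parallel axis, not necessarily the fix used here; `rotate_about_z` and its arguments are hypothetical names.

```python
import math

def rotate_about_z(point, pivot, degrees):
    """Rotate a 3D point around a Z-parallel axis passing through `pivot`."""
    a = math.radians(degrees)
    # Shift so the pivot sits at the origin
    x, y = point[0] - pivot[0], point[1] - pivot[1]
    # Standard rotation about Z
    xr = x * math.cos(a) - y * math.sin(a)
    yr = x * math.sin(a) + y * math.cos(a)
    # Shift back; the Z coordinate is unchanged
    return (xr + pivot[0], yr + pivot[1], point[2])

# A point one unit to the right of the pivot, rotated 180 degrees around it
moved = rotate_about_z((2.0, 1.0, 0.0), (1.0, 1.0, 0.0), 180)
```

Skipping the two shifts rotates the point around the world origin instead, which puts it somewhere else entirely.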

GPT-2 Agents

I did 11 runs of S/He walked into the room and made word clouds:

I’m going to re-run this on my GPT-2 so I can have a larger N. Just need to do some things to the test code to output to a file

ICTAI

Finished the last review. The last paper was an ontological model with no computation in it
Uploaded and finished!

ML seminar

I have access to the Twitter data now. Need to download and store it in the db
Presentation next week

Phil 8.10.20

Really good weekend. I feel almost recharged to this time last week 🙂

Language Models as Knowledge Bases (Via Shimei)

Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as “fill-in-the-blank” cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at this https URL.

#COVID

Currently at 103,951 tweets translated

JuryRoom

Write reference section – done

GOES

I need to do an incremental rotation to track the reference points from last week
Still having problems with the secondary rotation. I’m clearly doing something basic wrong
Meeting with Vadim

GPT-2 Agents

Create a word cloud for multiple passes of “She came into the room”
Add something about the place for qualitative research in a Language model sociology. Outliers are the places that the models learn to ignore. So traditional research will be the way that these marginalized populations are not forgotten.
Screwed up the ArXiv bibliography submission. Fixed

ICTAI

Started reading the last paper, which is on <shudder> ontologies

Phil 8.6.20

Coronavirus: The viral rumours that were completely wrong (BBC)

An ocean of Books (Google Arts & Culture Experiments)

bookocean

Hopfield Networks is All You Need

We show that the transformer attention mechanism is the update rule of a modern Hopfield network with continuous states. This new Hopfield network can store exponentially (with the dimension) many patterns, converges with one update, and has exponentially small retrieval errors. The number of stored patterns is traded off against convergence speed and retrieval error. The new Hopfield network has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. Transformer and BERT models operate in their first layers preferably in the global averaging regime, while they operate in higher layers in metastable states. The gradient in transformers is maximal for metastable states, is uniformly distributed for global averaging, and vanishes for a fixed point near a stored pattern. Using the Hopfield network interpretation, we analyzed learning of transformer and BERT models. Learning starts with attention heads that average and then most of them switch to metastable states. However, the majority of heads in the first layers still averages and can be replaced by averaging, e.g. our proposed Gaussian weighting. In contrast, heads in the last layers steadily learn and seem to use metastable states to collect information created in lower layers. These heads seem to be a promising target for improving transformers. Neural networks with Hopfield networks outperform other methods on immune repertoire classification, where the Hopfield net stores several hundreds of thousands of patterns. We provide a new PyTorch layer called “Hopfield”, which allows to equip deep learning architectures with modern Hopfield networks as a new powerful concept comprising pooling, memory, and attention. GitHub: this https URL

Can GPT-3 Make Analogies? By Melanie Mitchell (Medium, Aug 2020)

#COVID

Going to try to get the translator working and inserting best effort into the DB. Then we can make queries for the good results. Done! Here’s a shot of it chunking away. About one translation a second:

GOES

Work on quaternion frame tracking
This might help with visualization: matplotlib.org/3.1.1/api/animation_api
Updating my work box. Had a weird experience upgrading pip. It hit a permissions issue and failed out without rolling back. I had to use get-pip.py to get it back
Looking good:

rotate_to_point

JuryRoom

5:30(?) meeting
Project grant application

ICTAI

Write review – done. One to go!

Phil 8.5.20

Wajanat’s defense at 10:00!

Train your TensorFlow model on Google Cloud using TensorFlow Cloud

import

How QAnon Creates a Dangerous Alternate Reality

Game designer Adrian Hon says the conspiracy theory parallels the immersive worlds of alternate reality games.

GPT-2 Agents

Finish the results section – done! Need to do Discussion (done!), Future Work (done!), and Conclusions (done!)
Looked on Scholar for “language model sociology GPT” and didn’t find anything, so I’m hopeful that this is still a pretty novel idea

Book

Add in more content to the Overleaf project

GOES

2:00 Meeting

#COVID group 4:30

Write translator code for tomorrow and get that running

Read paper 5 – done. Started great but no results section!

Phil 8.3.20

I found Knuth’s version of “how to write a paper”!

knuth

GPT-2 Agents

Writing paper

GOES

Status report – done
More quaternions. Got the reference frame doing what I want:

ref_rotation

Here it’s starting at -45 (rotated around the Y axis) and 0, rotated around the Z. The Z axis is rotated 10 degrees per step. When Z is between 90 and 180, Y is rotated to 0. When Z > 180, Y is set to 45
I’ve started to add the tracking, and it’s close-ish:

ref_vehicle_rotation_bug

ICTAI 2020

Starting next paper – finished reading. It’s pretty bad…

Phil 7.31.20

Maryland Vote Centers & Ballot Drop Off Locations

Maryland election page, includes mail in directions

Ping ProPublica for setting up a dashboard for mail delivery

GPT-2 Agents

More writing
Got an account with InferKit, which is the successor to talktotransformer
GPT-3 talking about itself: twitter.com/raphamilliere/status/1289129723310886912

Book

2:00 Meeting

ICTAI

Write review

Phil 7.30.20

GPT-2 Agents

Writing up graph creation

GOES

More coordinate transforms. I think a good way to do this is to determine the desired target vector, take the cross product of that with the current vector and use that as the axis of rotation. Recalculate each frame to keep the rotations on track
Think I got the hang of the quaternions for the target frame. I need to put things together in a nice class:

helix
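The cross-product idea above can be sketched in plain Python. This is a minimal illustration under my own assumptions, not the class mentioned above: quaternions are `(w, x, y, z)` tuples, and `step_quaternion`, `rotate`, and the 10 degree step cap are hypothetical choices.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def step_quaternion(current, target, max_step_deg=10.0):
    """Unit quaternion turning `current` toward `target` by at most max_step_deg."""
    c, t = normalize(current), normalize(target)
    axis = cross(c, t)                   # rotation axis: current x target
    sin_a = math.sqrt(dot(axis, axis))
    if sin_a < 1e-9:
        # Parallel or anti-parallel vectors give no usable axis.
        # The anti-parallel case is not handled in this sketch.
        return (1.0, 0.0, 0.0, 0.0)
    angle = min(math.atan2(sin_a, dot(c, t)), math.radians(max_step_deg))
    s = math.sin(angle / 2) / sin_a      # scales the axis to unit length times sin(angle/2)
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """Rotate vector v by unit quaternion q: v' = v + 2w(u x v) + 2 u x (u x v)."""
    w, u = q[0], q[1:]
    uv = cross(u, v)
    uuv = cross(u, uv)
    return tuple(v[i] + 2 * (w * uv[i] + uuv[i]) for i in range(3))

# Recalculate the axis every frame so the rotation stays on track
vec = (1.0, 0.0, 0.0)
goal = (0.0, 1.0, 0.0)
for _ in range(9):
    vec = rotate(step_quaternion(vec, goal), vec)
```

Because the axis is recomputed from the current vector each frame, small errors do not accumulate; the last step simply shrinks to whatever angle remains.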

ICTAI

Sent Shimei a note about online CHI experiences in the current issue of Interactions
Start writing review

JuryRoom

5:30 Meeting tonight

Phil 7.29.20

Call bank – 1-800-399-5919 Opt.2

Mindfulness is the intentional use of attention – Stanford business school professor Dr. Laurie Weiss (maybe this?). From Commonwealth Club podcast, second half

I think the difference between intentional and unintentional attention is an important part of AI and collective thought. Machine learning is starting to exploit unintentional attention. It’s reflexive. A population which is not being intentional in their attention is more easily herded.

SimplE Embedding for Link Prediction in Knowledge Graphs

Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE’s code is available on GitHub at https://github.com/Mehran-k/SimplE.

Knowledge base construction

Knowledge base construction (KBC) is the process of populating a knowledge base (KB) with facts (or assertions) extracted from data (e.g., text, audio, video, tables, diagrams, …). For example, one may want to build a medical knowledge base of interactions between drugs and diseases, a Paleobiology knowledge base to understand when and where did dinosaurs live, or a knowledge base of people’s relationships such as spouse, parents or sibling. DeepDive can be used to facilitate KBC.

GPT-2 Agents

More writing
Make a version of the chessboard with the coarse and granular trajectories

Book

Adjust chapters

GOES

Continue working on mapping transitions between coordinate frames
Plotting rotations – it’s working, though not exactly in the way I was expecting:
Duh. That’s the overlap of the positive X and Z points, which are in the same plane and 90 degrees out of phase
2:00 Meeting
- Status and schedules

ICTAI

Finished reading the next paper. Time to write up

Phil 7.27.20

I had a good weekend. Got to ride in the mountains. Actually finished my chores, though I didn’t get to paying bills. Saw my sister – outside, 8′ apart, much more careful than last time. Went on a date.

Translating Embeddings for Modeling Multi-relational Data

We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose, TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large scale data set with 1M entities, 25k relationships and more than 17M training samples.

GPT-2

Back to draft zero – grinding along

GOES

Check out Vadim’s rwheel results today.
Work on calculating the contributions from the rwheels to rotation around an arbitrary vector

ICTAI

Write up second review – done!
Started on third paper

Phil 7.24.20

I had home-grown tomatoes this morning!

And I hung up my shiny new diploma!

GPT-2 Agents

I think it’s time to start writing the paper. Something like Synthetic Agents in Language Models: Navigating belief
- Using the IEEE(ACSOS) template
- Set up the paper with authors and dummy text. Starting to fill in the pieces

Writing the methods section and needed to count the number of games (#draw + #resigns). The easiest way to do this was just to count all the word frequencies. Here are the top terms:

to : 1474559
from : 1472081
moves : 1472071
white : 1062561
black : 1056840
pawn : 392494
in : 330044
move : 307701
takes : 307166
rook : 258476
knight : 250998
bishop : 225442
queen : 175254
king : 173837
pawn. : 145164
check. : 91512

The list goes on a while. The most mentioned squares are d4 (56,224), d5 (53,986), and f6 (48,772)
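A count like the one above can be produced with a few lines of Python. This is a sketch, not the script that generated these numbers: `term_counts` and the sample text are hypothetical, and it assumes tokens are split on whitespace with punctuation left attached, which would explain why "pawn" and "pawn." show up as separate terms.

```python
from collections import Counter

def term_counts(text):
    """Count whitespace-separated, lower-cased tokens, leaving punctuation attached."""
    return Counter(text.lower().split())

# Hypothetical sample in the style of the move descriptions
sample = ("White moves pawn from e2 to e4. Black moves pawn from e7 to e5. "
          "White moves knight from g1 to f3.")
counts = term_counts(sample)

# Print the most frequent terms in the same "term : count" layout
for term, n in counts.most_common(5):
    print(f"{term} : {n}")
```

Stripping trailing periods before counting would merge "pawn." into "pawn", at the cost of losing sentence-final markers such as "check.".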

God help me, I’m updating my IDE

GOES

Asked Vadim to exercise the satellite through +/- 90
Need to start working on the mapping of rwheels to inertial(?) frame. The thing is, the yaw axis rotates 360 degrees every day, so what frame do we use? My thinking is that the inertial frame (as defined by the star tracker) is unchanging, but we have a rotating frame inside that. The satellite’s moves are relative to that rotating frame plus the inertial frame. So the satellite’s first task is to keep its orientation relative to the rotating frame, then execute commands with respect to that frame. So a stacked matrix of inertial frame, Earth frame, vehicle matrix and then a matrix for each of the rwheels?

Phil 6.19.20

12:00 – Sy’s defense at noon!

GPT-2 Agents

Fixed the regex in ChessMovesToDb
More work on finding closest neighbors.
- Maybe keep a record of the number and type of pieces that are used?
- Looks like the basics are working. Here’s the test graph:

known_nearest

- And here are the results. I made the code so that it only shows each neighbor once, but it may be useful to keep track of the number of times a neighbor shows up in a list. This might not be important in chess, but in less structured text environments (RPGs to Reddit threads), it may be valuable:
```
find_closest_neighbors(): nodes = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
{'node': 'a', 'known_nearest': ['f', 'd']}
{'node': 'b', 'known_nearest': ['f', 'd']}
{'node': 'c', 'known_nearest': []}
{'node': 'd', 'known_nearest': ['f', 'a', 'b', 'g']}
{'node': 'e', 'known_nearest': []}
{'node': 'f', 'known_nearest': ['a', 'g', 'd', 'b']}
{'node': 'g', 'known_nearest': ['f', 'd']}
```
- At this point it’s not recursive, but it could be. I’m worried about combinatorial explosion though
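The idea of tracking how often a neighbor shows up could look something like the sketch below. Everything here is an assumption for illustration: `neighbor_counts` is a hypothetical helper, the graph is a made-up example rather than the test graph above, and "shows up" is taken to mean appearing in the neighbor list of one of the node's own neighbors.

```python
from collections import Counter

def neighbor_counts(adjacency, node):
    """Count how often each other node appears in the neighbor lists of `node`'s neighbors."""
    counts = Counter()
    for nbr in adjacency.get(node, ()):
        for candidate in adjacency.get(nbr, ()):
            if candidate != node:       # don't count the starting node itself
                counts[candidate] += 1
    return counts

# Hypothetical undirected graph stored as adjacency lists
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
from_a = neighbor_counts(adjacency, "a")
```

This looks only one level past the direct neighbors, so the work per node is bounded by the sum of its neighbors' degrees. A recursive version would multiply that at every extra level, which is where the combinatorial explosion would come from.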

GOES

Submit GVSETS paper – done!
Meeting with Vadim and Issac at 11:00
- Goal is to move all the RW code out of the sim class and into its own and call methods from the sim class

Phil 6.18.20

Hotel reservations!

Sent a ping to Don about a paper to review

GPT-2 Agents

Started on common neighbor algorithm. Definitely a good place for recursion
Generating larger file

adjacency

moves

If you look at the center of the plot and squint a bit, you can see a bit of the grid:

There is an error: The string ‘, White moves pawn from h3 to g4. White takes black pawn. LCZero v0.24-sv-t60-3010 moves black knight from h5 to g7. White moves pawn from g4 to h5. LCZero v0.24‘ is parsing incorrectly due to the truly bizarre name (The little known Grand Master LCZero v0.24-sv-t60-3010). Need to fix the regex. I think I just need to make it so that there has to be a space in front and a space/period after.
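A sketch of the boundary idea, assuming the regex is meant to pull out board squares. The actual pattern in ChessMovesToDb isn't shown here, so `SQUARE` is a hypothetical stand-in; restricting it to files a-h and ranks 1-8 is an extra assumption on my part.

```python
import re

# Hypothetical pattern: a square (file a-h, rank 1-8) that must have
# whitespace in front and whitespace or a period after it.
SQUARE = re.compile(r"(?<=\s)[a-h][1-8](?=[\s.])")

text = (", White moves pawn from h3 to g4. White takes black pawn. "
        "LCZero v0.24-sv-t60-3010 moves black knight from h5 to g7. "
        "White moves pawn from g4 to h5. LCZero v0.24")

# Fragments of the engine name such as "v0" or "t60" are skipped
squares = SQUARE.findall(text)
```

The boundaries alone would not be enough for a looser pattern such as any letter followed by a digit, because " v0." has a space in front and a period after. The narrower character classes are what reject it.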

GOES

Readthrough of GVSETS paper
2:00 Meeting

Waikato

Alex had a really good insight in that groups that are working at coming to consensus use terms to discuss their level of agreement that are independent of the points being argued. That could really be important in text analysis.

Phil 6.17.20

Listened to a fantastic interview with Nell Irvin Painter (White Supremacy at Home and Abroad):

GPT-2 Agents

Working on finding the connections between nodes

Now that I know how to add weights to edges, I think I want to add the piece that made the move. It needs to be a list, since multiple types of pieces can connect two squares. Added a dict_array per edge:

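# Assumes self.G is a networkx graph and nlist holds the current neighbors of source
if target not in nlist:
    # First time this edge is seen: create it with an empty list of move records
    self.G.add_edge(source, target, weight=0, dict_array=[])
edge = self.G[source][target]  # edge attributes are keyed by both endpoints
edge['weight'] += 1
for key, val in data_dict.items():
    edge['dict_array'].append({key: val})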

I also realize that moves that repeatedly connect squares are more likely to be close, simply because the available squares of more distant moves increase in a geometric fashion. I added a method that writes out moves to Excel where I can play with them. Here are some moves:

moves

Looking at these moves, the majority seem to be short (e.g. b6-b7, b6-a7, b6-b5). The only exception is the knight (b6-d7). So I think there is a confidence value that I can calculate for the ‘physical’ adjacency of nodes in a network. This could also apply to belief spaces. Most consensus requires coordination and common orientation (pos, heading, speed), so commonly connected topics can be said to be ‘closer’
Good chat with Aaron about CVPR and algorithms

GOES

Finish revisions and send to T and Aaron for review. Last thing is to tie back to ground vehicles in the discussion. Done! I think… Need to read the whole thing and see if it still hangs together
2:00 – Meeting

Phil 6/16/20

Started an interesting conversation with Matthew Remski about my research. Curious where it will go

GPT-2 Agents

Working on fixing the adjacency matrix code. Fixed!

GOES

Continuing on revisions

ML seminar

Dao’s proposal defense walkthrough. Boy, has she done a lot

viztales

Dimension reduction, State, Orientation, and Speed
