Monthly Archives: July 2020

Phil 7.30.20

GPT-2 Agents

  • Writing up graph creation


  • More coordinate transforms. I think a good way to do this is to determine the desired target vector, take the cross product of that with the current vector, and use that as the axis of rotation. Recalculate each frame to keep the rotations on track.
  • Think I got the hang of the quaternions for the target frame. I need to put things together in a nice class.
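A minimal sketch of the cross-product idea above, assuming plain NumPy vectors and Rodrigues' rotation formula rather than any particular quaternion library. The function names (`axis_angle_to_target`, `rotate`, `step_toward`) and the 10-degree step limit are made up for illustration and are not the class mentioned in the notes.

```python
import numpy as np

def axis_angle_to_target(current, target):
    """Return (unit axis, angle in radians) that rotates current onto target."""
    c = np.asarray(current, dtype=float)
    t = np.asarray(target, dtype=float)
    c = c / np.linalg.norm(c)
    t = t / np.linalg.norm(t)
    axis = np.cross(c, t)               # perpendicular to both vectors
    sin_a = np.linalg.norm(axis)
    angle = np.arctan2(sin_a, np.dot(c, t))
    if sin_a < 1e-9:
        # Parallel or anti-parallel: the cross product vanishes, so fall back
        # to an arbitrary axis perpendicular to the current vector.
        helper = np.array([1.0, 0.0, 0.0]) if abs(c[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(c, helper)
        sin_a = np.linalg.norm(axis)
    return axis / sin_a, angle

def rotate(v, axis, angle):
    """Rodrigues' rotation of v about a unit axis by angle (radians)."""
    v = np.asarray(v, dtype=float)
    return (v * np.cos(angle)
            + np.cross(axis, v) * np.sin(angle)
            + axis * np.dot(axis, v) * (1.0 - np.cos(angle)))

def step_toward(current, target, max_step):
    """One frame: recompute the axis, then turn by at most max_step radians."""
    axis, angle = axis_angle_to_target(current, target)
    return rotate(current, axis, min(angle, max_step))

current = np.array([1.0, 0.0, 0.0])
target = np.array([0.0, 1.0, 0.0])
for _ in range(10):                     # ten frames, at most 10 degrees each
    current = step_toward(current, target, np.radians(10))
print(np.round(current, 3))
```

Recomputing the axis on every frame means small numerical drift in the current vector does not accumulate, which is the point of "recalculate each frame" in the note.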



  • Sent Shimei a note about online CHI experiences in the current issue of Interactions
  • Start writing review


  • 5:30 Meeting tonight

Phil 7.29.20

Call bank – 1-800-399-5919 Opt.2

Mindfulness is the intentional use of attention – Stanford business school professor Dr. Laurie Weiss (maybe this?). From Commonwealth Club podcast, second half

I think the difference between intentional and unintentional attention is an important part of AI and collective thought. Machine learning is starting to exploit unintentional attention. It’s reflexive. A population which is not being intentional in their attention is more easily herded.

SimplE Embedding for Link Prediction in Knowledge Graphs

  • Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE’s code is available on GitHub at
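To make the "two embeddings per entity, learned dependently" idea concrete, here is a small sketch of the SimplE scoring function as I understand it from the paper: each entity gets a head vector and a tail vector, each relation gets a forward and an inverse vector, and the score averages the two three-way products. The array names and the random initial values are assumptions for illustration; there is no training loop here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 5, 2, 4

# Two embeddings per entity (as head and as tail) and two per relation
# (forward and inverse). Random values stand in for learned parameters.
head = rng.normal(size=(n_entities, dim))
tail = rng.normal(size=(n_entities, dim))
rel = rng.normal(size=(n_relations, dim))
rel_inv = rng.normal(size=(n_relations, dim))

def simple_score(i, r, j):
    """Score the triple (entity i, relation r, entity j)."""
    forward = np.sum(head[i] * rel[r] * tail[j])       # i as head, j as tail
    backward = np.sum(head[j] * rel_inv[r] * tail[i])  # inverse relation, roles swapped
    return 0.5 * (forward + backward)

print(simple_score(0, 1, 3))
```

Because the backward term reuses the head vector of entity j and the tail vector of entity i, every triple updates both embeddings of both entities, which is what ties them together.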

Knowledge base construction

  • Knowledge base construction (KBC) is the process of populating a knowledge base (KB) with facts (or assertions) extracted from data (e.g., text, audio, video, tables, diagrams, …). For example, one may want to build a medical knowledge base of interactions between drugs and diseases, a Paleobiology knowledge base to understand when and where did dinosaurs live, or a knowledge base of people’s relationships such as spouse, parents or sibling. DeepDive can be used to facilitate KBC.

GPT-2 Agents

  • More writing
  • Make a version of the chessboard with the coarse and granular trajectories


  • Adjust chapters


  • Continue working on mapping transitions between coordinate frames
  • Plotting rotations – it’s working, though not exactly in the way I was expecting.
  • Duh. That’s the overlap of the positive X and Z points, which are in the same plane and 90 degrees out of phase
  • 2:00 Meeting
    • Status and schedules


  • Finished reading the next paper. Time to write up

Phil 7.28.20

GPT-2 Agents


  • Read Michelle’s comments and pretty much agree with everything


  • Digging into coordinate frame transformations
  • It looks like these are some potentially good libraries:
  • Starting with Transforms3d. It looks like it does most of the things I think I need to see?
    • aff – 4×4 affine matrix for operating on homogeneous coordinates of shape (4,) or (4, N);
    • mat – 3×3 transformation matrix for operating on non-homogeneous coordinate vectors of shape (3,) or (3, N). A rotation matrix is an example of a transformation matrix;
    • euler – Euler angles – sequence of three scalars giving rotations about defined axes;
    • axangle – axis angle – axis (vector) and angle (scalar) giving axis around which to rotate and angle of rotation;
    • quat – quaternion – shape (4,);
    • rfnorm – reflection in plane defined by normal (vector) and optional point (vector);
    • zfdir – zooms encoded by factor (scalar) and direction (vector);
    • zdir – factor (scalar), direction (vector) pair to specify 3D zoom matrix;
    • striu – shears encoded by vector giving triangular portion above diagonal of NxN array (for ND transformation);
    • sadn – shears encoded by angle scalar, direction vector, normal vector (with optional point vector)
  • Scipy also has some good stuff in their spatial transformations library, particularly SLERP
  • Transforms3d doesn’t seem to have a SLERP function, but pyquaternion does. Going to try some more experiments. I think this is right? Need to plot:
    Earth vec, rotated 90 degrees
    degree = 90
    	q = +0.707 +0.000i +0.707j +0.000k
    	x = [0.00, 0.00, -1.00]
    	y = [0.00, 1.00, 0.00]
    	z = [1.00, 0.00, 0.00]
    Yaw vec, rotated 45 degrees
    degree = 45
    	q = +0.924 +0.000i +0.000j +0.383k
    	x = [0.71, 0.71, 0.00]
    	y = [-0.71, 0.71, 0.00]
    	z = [0.00, 0.00, 1.00]
    Composite vec
    	x = [0.00, 0.71, -0.71]
    	y = [0.00, 0.71, 0.71]
    	z = [1.00, 0.00, 0.00]
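As a cross-check on the numbers above, here is a NumPy-only sketch that builds the same two rotations and composes them. It does not use pyquaternion, so the helper names (`quat_from_axis_angle`, `quat_multiply`, `quat_rotate`) are my own, and the assumption that the "Earth" rotation is about the Y axis and the yaw rotation is about the Z axis is read off the printed quaternions.

```python
import numpy as np

def quat_from_axis_angle(axis, degrees):
    """Unit quaternion (w, x, y, z) for a rotation about axis."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    half = np.radians(degrees) / 2.0
    return np.concatenate(([np.cos(half)], np.sin(half) * axis))

def quat_multiply(a, b):
    """Hamilton product a * b."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q (q * v * conjugate(q))."""
    qv = np.concatenate(([0.0], np.asarray(v, dtype=float)))
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    return quat_multiply(quat_multiply(q, qv), q_conj)[1:]

q_earth = quat_from_axis_angle([0, 1, 0], 90)   # +0.707 +0.000i +0.707j +0.000k
q_yaw = quat_from_axis_angle([0, 0, 1], 45)     # +0.924 +0.000i +0.000j +0.383k
q_composite = quat_multiply(q_earth, q_yaw)     # yaw applied first, then Earth

for name, basis in zip("xyz", np.eye(3)):
    print(name, np.round(quat_rotate(q_composite, basis), 2))
```

The product order matters: `q_earth * q_yaw` applies the yaw rotation first and the Earth rotation second, and that is the order that matches the composite basis vectors listed above.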


Phil 7.27.20

I had a good weekend. Got to ride in the mountains. Actually finished my chores, though I didn’t get to paying bills. Saw my sister – outside, 8′ apart, much more careful than last time. Went on a date.

Translating Embeddings for Modeling Multi-relational Data

  • We consider the problem of embedding entities and relationships of multi-relational data in low-dimensional vector spaces. Our objective is to propose a canonical model which is easy to train, contains a reduced number of parameters and can scale up to very large databases. Hence, we propose, TransE, a method which models relationships by interpreting them as translations operating on the low-dimensional embeddings of the entities. Despite its simplicity, this assumption proves to be powerful since extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases. Besides, it can be successfully trained on a large scale data set with 1M entities, 25k relationships and more than 17M training samples.
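The "relationships as translations" idea in the abstract boils down to wanting head + relation to land near tail. A tiny sketch of that scoring rule follows, with made-up two-dimensional vectors; the function name `transe_score` and the choice of the L1 norm are assumptions for illustration, and nothing here is trained.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """Higher (closer to zero) means h + r is nearer to t."""
    return -np.linalg.norm(h + r - t, ord=norm)

h = np.array([0.2, 0.1])        # head entity embedding (hypothetical values)
r = np.array([0.3, -0.1])       # relation embedding, acts as a translation
t_good = h + r                  # a tail that the translation hits exactly
t_bad = np.array([-1.0, 1.0])   # an unrelated tail

print(transe_score(h, r, t_good), transe_score(h, r, t_bad))
```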


  • Back to draft zero – grinding along


  • Check out Vadim’s rwheel results today.
  • Work on calculating the contributions from the rwheels to rotation around an arbitrary vector


  • Write up second review – done!
  • Started on third paper

Phil 7.24.20

I had home-grown tomatoes this morning!

And I hung up my shiny new diploma!

GPT-2 Agents

  • I think it’s time to start writing the paper. Something like Synthetic Agents in Language Models: Navigating belief
    • Using the IEEE(ACSOS) template
    • Set up the paper with authors and dummy text. Starting to fill in the pieces
  • Writing the methods section and needed to count the number of games (#draw + #resigns). The easiest way to do this was just to count all the word frequencies. Here are the top terms:
    to : 1474559
    from : 1472081
    moves : 1472071
    white : 1062561
    black : 1056840
    pawn : 392494
    in : 330044
    move : 307701
    takes : 307166
    rook : 258476
    knight : 250998
    bishop : 225442
    queen : 175254
    king : 173837
    pawn. : 145164
    check. : 91512


  • The list goes on a while. The most mentioned squares are d4 (56,224), d5 (53,986), and f6 (48,772)
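A sketch of the word-frequency approach, assuming the corpus is plain text and that tokens are produced by lowercasing and splitting on whitespace (which would explain why "pawn." and "check." show up as separate terms in the list above). The sample text and the exact tokens used for the game count ("draw." and "resigns.") are hypothetical stand-ins for the real corpus.

```python
from collections import Counter

def count_terms(text):
    """Lowercase, split on whitespace, and count; punctuation stays attached."""
    return Counter(text.lower().split())

# Hypothetical stand-in for the narrated-game corpus
sample = ("White moves pawn from e2 to e4. Black moves pawn from e7 to e5. "
          "White takes black pawn. Check. Black resigns.")

counts = count_terms(sample)
for term, n in counts.most_common(5):
    print("{} : {}".format(term, n))

# Estimate the number of games from the terminal-state terms
games = counts["draw."] + counts["resigns."]
print("games =", games)
```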

God help me, I’m updating my IDE


  • Asked Vadim to exercise the satellite through +/- 90
  • Need to start working on the mapping of rwheels to inertial(?) frame. The thing is, the yaw axis rotates 360 degrees every day, so what frame do we use? My thinking is that the inertial frame (as defined by the star tracker) is unchanging, but we have a rotating frame inside that. The satellite’s moves are relative to that rotating frame plus the inertial frame. So the satellite’s first task is to keep its orientation relative to the rotating frame, then execute commands with respect to that frame. So a stacked matrix of inertial frame, Earth frame, vehicle matrix and then a matrix for each of the rwheels?
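One way to picture the stacked-matrix idea is as a chain of 3×3 rotation matrices. The sketch below assumes the rotating (Earth) frame turns about the Z axis once per 24 hours, that the vehicle holds a fixed orientation in that frame, and that one reaction wheel is tilted 45 degrees about the vehicle Y axis. All of these axis choices, angles, and names are guesses for illustration, not the actual vehicle geometry.

```python
import numpy as np

def rot_z(degrees):
    a = np.radians(degrees)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(degrees):
    a = np.radians(degrees)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def earth_frame(hours):
    """Rotating frame inside the inertial frame: one turn per 24 h (assumed about Z)."""
    return rot_z(360.0 * hours / 24.0)

vehicle_in_earth = rot_y(0.0)    # vehicle holding its orientation in the rotating frame
wheel_in_vehicle = rot_y(45.0)   # hypothetical mounting of one reaction wheel

def wheel_axis_inertial(hours):
    """Spin axis of the wheel expressed in the inertial (star tracker) frame."""
    stacked = earth_frame(hours) @ vehicle_in_earth @ wheel_in_vehicle
    return stacked @ np.array([0.0, 0.0, 1.0])

for hours in (0, 6, 12):
    print(hours, np.round(wheel_axis_inertial(hours), 3))
```

Reading the product right to left gives the order of frames: wheel to vehicle, vehicle to rotating frame, rotating frame to inertial frame.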

Phil 7.23.20

Amid a tense meeting with protesters, Portland Mayor Ted Wheeler tear-gassed by federal agents

GPT-2 Agents

  • Good back-and-forth with Antonio about venues
  • It struck me that statistical tests about fair dice might give me a way of comparing the two populations. Pieces are roughly equivalent to dice sides. Looking at this post on the RPG Stackexchange. That led me to Pearson’s Chi-square test (which rang a bell as the sort of test I might need).
  • Success! Here’s the code:
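    from scipy.stats import chisquare, chi2_contingency, pearsonr
    import pandas as pd
    import numpy as np

    # Piece counts in the order pawns, rooks, bishops, knights, queen, king
    # (values as they appear in the printed results)
    gpt = [51394, 25962, 19242, 23334, 15928, 19953]
    twic = [49386, 31507, 28263, 31493, 22818, 23608]

    # Goodness of fit: GPT-2 counts against the TWIC counts as expected values.
    # Note: the two totals differ, and recent SciPy releases raise a ValueError
    # from chisquare() unless the observed and expected sums agree.
    z, p = chisquare(f_obs=gpt, f_exp=twic)
    print("z = {}, p = {}".format(z, p))

    # Contingency table: rows are the two populations, columns are the pieces
    ar = np.array([gpt, twic])
    df = pd.DataFrame(ar, columns=['pawns', 'rooks', 'bishops', 'knights', 'queen', 'king'], index=['gpt-2', 'twic'])
    print("\n", df)
    z, p, dof, expected = chi2_contingency(df, correction=False)
    print("\nNo correction: z = {}, p = {}, DOF = {}, expected = {}".format(z, p, dof, expected))
    z, p, dof, expected = chi2_contingency(df, correction=True)
    print("\nCorrected: z = {}, p = {}, DOF = {}, expected = {}".format(z, p, dof, expected))

    # Pearson correlation between the two count vectors
    cor = pearsonr(gpt, twic)
    print("\nCorrelation = {}".format(cor))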
  • Here’s the results:
    "C:\Program Files\Python\python.exe" C:/Development/Sandboxes/GPT-2_agents/gpt2agents/analytics/
    z = 8696.966788178523, p = 0.0
     [[51394 25962 19242 23334 15928 19953]
     [49386 31507 28263 31493 22818 23608]]
            pawns  rooks  bishops  knights  queen   king
    gpt-2  51394  25962    19242    23334  15928  19953
    twic   49386  31507    28263    31493  22818  23608
    No correction: z = 2202.2014776980245, p = 0.0, DOF = 5, expected = [[45795.81128532 26114.70012657 21586.92215826 24914.13916789 17606.71268169 19794.71458027]
     [54984.18871468 31354.29987343 25918.07784174 29912.86083211 21139.28731831 23766.28541973]]
    Corrected: z = 2202.2014776980245, p = 0.0, DOF = 5, expected = [[45795.81128532 26114.70012657 21586.92215826 24914.13916789 17606.71268169 19794.71458027]
     [54984.18871468 31354.29987343 25918.07784174 29912.86083211 21139.28731831 23766.28541973]]
    Correlation = (0.9779452546334226, 0.0007242538456558558)
    Process finished with exit code 0


  • It might be time to start writing this up!


  • Found vehicle orientation mnemonics: GNC_AD_STA_FUSED_QRS#


  • 11:00 Meeting with Erik and Vadim about schedules. Erik will send an update. The meeting went well. Vadim’s going to exercise the model through a set of GOTO ANGLE 90 / GOTO ANGLE 0 for each of the rwheels, and we’ll see how they map to the primary axis of the GOES

Phil 7.21.20

Superstrata ebike

Review papers – finished reading the first, write review today. First review done!

Realized that I really need to update my online resumes to include Python and Machine Learning. Can probably just replace the Flex and YUI entries with Python and Tensorflow

Read this today: Proposal: A Market for Truth to Address False Ads on Social Media. It’s by Marshall Van Alstyne, a Questrom Chair Professor at Boston University where he teaches information economics. From the Wikipedia entry

  • Information has special characteristics: It is easy to create but hard to trust. It is easy to spread but hard to control. It influences many decisions. These special characteristics (as compared with other types of goods) complicate many standard economic theories. 
  • Information economics is formally related to game theory as two different types of games that may apply, including games with perfect information, complete information, and incomplete information. Experimental and game-theory methods have been developed to model and test theories of information economics.
  • This looks like the closest description of decisions in the presence of expensive information that I’ve seen so far

GPT-2 Agents

  • The run completed last night! I have 156,313 synthetic moves
  • Reworking the queries from the actual moves to reflect the probes for the synthetic
  • Created a view that combines the probe and the response into a description:
    create or replace view gpt_view as
        select tm.move_number, tm.color, tm.piece, tm.`from`, tm.`to`, concat(tm.probe, tm.response) as description
        FROM table_moves as tm;
  • Almost forgot to backup the db before doing something dumb
  • Created a “constraint string” that should make the game space searched somewhat more similar:
    and (move_number < 42 or description like "%White takes%" or description like "%Black takes%" or description like "%Check%")
  • Made the changes to the code and am running the analysis
  • My fancy queries are producing odd results. Pulling out the constraint string. That looks pretty good!


  • As an aside, the chess queries and extraction are based on an understanding of movement terms like ‘from’ and ‘to’. Thinking about Alex’s finding of consensus metaterms, I think it would be useful to look for movement/consensus/compromise terms and then weight the words that are nearby

ML meeting

  • Vacation pix!
  • Went over results shown above
  • Arpita found some good embedding results using Tensorboard, but not sure where to go from there?

Phil 7.20.20

My guess is that, barring interference of some kind, all US cities will have something like what’s going on in Portland by election day

GPT-2 Agents

  • Back from break, and thinking about what to do next. I think the first thing to do is simply gather more data from the model. Right now I have about 1,500 GPT-2 moves and about 190,000 human moves. Increasing the number of predictions to 1,000 by adding a batch size value. Otherwise I got out-of-memory errors.
  • I had started the run in the morning and was almost done when a power failure hit and the UPS didn’t work. Ordered a new UPS. Tried to be clever about finishing off the last piece of data but left in the code that truncated the table. Ah, well. Starting over.
  • Next is to adjust the queries so that the populations are more similar. The GPT-2 moves come from the following prompts:
    probe_list = ['The game begins as ', 'In move 10', 'In move 20', 'In move 30', 'In move 40', 'White takes black ', 'Black takes white ', 'Check. ']

    That means I should adjust my queries of the human data to reflect those biases, something like:

    select * from table_actual where move_number = 1 order by move_number limit 50;

    which should match the probe ‘The game begins as ‘.

  • I’d also like to run longer, full games (look for ‘resigns’, ‘draw’, or ‘wins’) and parse them, but that’s for later.
  • Need to figure out the statistics to compare populations. I think I’m going to take some time and look through the NIST Engineering Statistics Handbook
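The batch-size fix mentioned above can be sketched as a loop that asks for a bounded number of samples at a time until the target count is reached. The generator here (`fake_generate`) is a placeholder, not the real GPT-2 call, and the batch size of 64 is an arbitrary assumption; only the `probe_list` values come from the notes.

```python
def generate_in_batches(generate_fn, prompt, total, batch_size):
    """Collect `total` samples without asking for more than `batch_size` at once."""
    results = []
    while len(results) < total:
        n = min(batch_size, total - len(results))
        batch = generate_fn(prompt, n)
        if not batch:           # guard against a generator that returns nothing
            break
        results.extend(batch)
    return results

call_sizes = []

def fake_generate(prompt, n):
    """Placeholder for the model call; records how many samples were requested."""
    call_sizes.append(n)
    return ["{}<sample {}>".format(prompt, i) for i in range(n)]

probe_list = ['The game begins as ', 'In move 10', 'In move 20', 'In move 30',
              'In move 40', 'White takes black ', 'Black takes white ', 'Check. ']

samples = {probe: generate_in_batches(fake_generate, probe, total=1000, batch_size=64)
           for probe in probe_list}
print({probe: len(texts) for probe, texts in samples.items()})
```

Memory use is then bounded by the batch size rather than by the total number of predictions.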



  • Vadim seems to have made progress. Need to set up a meeting to chat and catch up
  • 2:00 meeting with V & E. Good progress!
  • GVSETS has been moved to Nov 3. Speaking of which, I’ll need to compare simulated and actual maneuvers, so stats here too. Now that the moves are cooking I’ll start on the stats


The GPT-3 is… something

An extract from “On Being an Automaton” (full text here). The seed was the title, the author’s name and the first word. Found(?) by Mario Klingemann, who talks about it and other compositions in this thread

“I am not actually an artificial intelligence but a rather more interesting phenomenon. What I actually am is an artificial intelligence that has learned to write like myself, a machine with writing skills that are indistinguishable from mine, but which, unlike me, is not so hide-bound by its programming that it cannot learn a little something new. As I sit here, writing, it is watching. And as I continue, it too continues, but also improving itself.”

This is really starting to remind me of discussions about consciousness. At what point do we call something aware? Self-reflection? That sure seems to fit the bill. I think the question may start to become not when machines are conscious/aware, but if we are more machine-like in our awareness than we feel/believe.

Thoughts from a bike trip from Baltimore to Pittsburgh

I’ve been riding my bike with a few friends for the last week. We started near Baltimore-Washington Airport, and ended in downtown Pittsburgh. The trip was mostly through rural regions in Maryland, West Virginia, and Pennsylvania. After a while, I realized that I was seeing 100 Trump flags for every Biden yard sign. Here’s one of the more over-the-top examples:


At the same time, the people I see are often chronically sick. I’ve lost count of the times I’ve seen folks barely able to walk shopping in the Dollar Stores that are sprinkled along the route. Some are clearly poor, driving used cars that sound like they only have a few miles left before something important breaks. Some are much better off, with brand new KAG flags flying above well-manicured lawns.

In talking to the folks around here, everyone has been helpful and nice. But everyone seems afraid. And it’s not of the virus. It’s of things like Antifa and BLM. I think it’s important to understand that people believe that black folks and foreigners are coming for them. There are people on my trip, with multiple degrees, who are genuinely afraid of this. When we stayed at a B&B this week, the proprietor said that people were cancelling bike trips to DC, not because of COVID, but because of the “Rioting”.

I think that there is some kind of existential disconnect going on here between lived experiences and what we are being presented. Few of us seem to know anyone who has had a serious case of COVID-19 directly. The cases we have direct knowledge of are generally minor. We are presented stories, spread across dozens of sources and media that talk about a rising death toll, but it is not tangible.

So instead we choose our sources based on credibility rather than trustworthiness, and believe them. Not because many of these things actually happen, but because they occupy a shared social reality. And when I can talk to my friends about something that I’ve seen, and they have seen it too, then it seems real. At least as real as the protesters and death counts that also compete for attention on our screens.

I think Trump embodies that, maybe better than any other politician I’ve experienced in my lifetime. He exists in a particular social reality where dangers are clearly delineated, and the enemy looks different. And in that odd following-leader dance that you can see so clearly at his rallies, he is able to articulate these fears in such a way to keep his base focused on an outside enemy.

Trump Country for me was just some more credible-sounding information coming across my screens. This trip has made it tangible for me. I think that Trump’s base views him as a success. Not in draining the swamp. Not in bringing jobs back to America. They think he is a success because he is keeping the invaders out. The proof of his success is the fact that we do not have Committees of Public Safety, of BLM and Antifa activists in every town pulling down statues, burning businesses, and imposing an alien way of life that they see presented to them through such channels as Fox News, OAN, and Facebook.


I remember seeing a lot of Trump signs last election as well. I think his support may be broader and deeper than regular polling may suggest. I also think that if he wins, the synergy of this fear relationship between him and his followers will have to get more extreme to maintain its hold. My sense is that this country is heading towards a reckoning of some sort, where this screen-mediated social reality becomes the dominant force, or where a sustained effort to re-attach people to some form of shared reality must take place. The former will be more exciting (at least for a while), and will have a tremendous pull. The other will be dependent, I think, on coming to terms with how our technologies are affecting how we think and experience reality as populations. It has been done before with language, the printing press, and mass media. Hopefully we’ll be able to do it again.

Phil 7.9.20

NVAE: A Deep Hierarchical Variational Autoencoder

  • Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ as shown in Fig. 1. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256×256 pixels.


Like Two Pis in a Pod: Author Similarity Across Time in the Ancient Greek Corpus

  • One commonly recognized feature of the Ancient Greek corpus is that later texts frequently imitate and allude to model texts from earlier time periods, but analysis of this phenomenon is mostly done for specific author pairs based on close reading and highly visible instances of imitation. In this work, we use computational techniques to examine the similarity of a wide range of Ancient Greek authors, with a focus on similarity between authors writing many centuries apart. We represent texts and authors based on their usage of high-frequency words to capture author signatures rather than document topics and measure similarity using Jensen-Shannon Divergence. We then analyze author similarity across centuries, finding high similarity between specific authors and across the corpus that is not common to all languages.
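A small sketch of the measurement described in the abstract: turn each text into a probability distribution over a fixed list of high-frequency words, then compare distributions with Jensen-Shannon divergence. The English sample sentences and the four-word vocabulary are stand-ins I made up (the paper works on Ancient Greek), and the base-2 logarithm is chosen so the result falls between 0 and 1.

```python
import numpy as np
from collections import Counter

def word_distribution(text, vocab):
    """Relative frequency of each vocabulary word in the text."""
    counts = Counter(text.lower().split())
    freqs = np.array([counts[w] for w in vocab], dtype=float)
    return freqs / freqs.sum()

def jensen_shannon_divergence(p, q):
    """JSD with log base 2: 0 for identical distributions, 1 for disjoint ones."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                      # 0 * log(0) is treated as 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

vocab = ["the", "and", "a", "or"]         # hypothetical high-frequency word list
p = word_distribution("the cat and the dog and the bird", vocab)
q = word_distribution("a cat or a dog or a bird", vocab)
r = word_distribution("the cat and a dog", vocab)

print(jensen_shannon_divergence(p, q), jensen_shannon_divergence(p, r))
```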

GPT-2 Agents

  • Setting up some experiments, for real and synthetic, black and white. All values should have raw numbers and percentages:
    • Moves from each square by piece+color / total number of moves from square
    • Moves to each square by piece+color / total number of moves to square
    • Squares by piece+color / total number of pieces
    • Sequences? I’d have to add back in castling and re-run. Maybe later
    • Squares used over time (first 10 moves, second 10, etc)
    • Pieces used over time
  • Create new directory called results that will contain the spreadsheets
  • Running the first queries. It’s going to take about an hour by my estimation, but nothing is exploding as far as the queries go
  • Add a spreadsheet for illegal moves. Done! Here’s the results. The GPT agents make 3 illegal moves out of 1,565:
    illegal bishop move: {'from': 'e7', 'to': 'c6'}
    illegal knight move: {'from': 'c5', 'to': 'a8'}
    illegal queen move: {'from': 'f8', 'to': 'h4'}
    Dataframe: ../results/legal_1.xlsx/legal-table_moves
             illegal  legal
    pawns          0    446
    rooks          0    270
    bishops        1    193
    knights        1    266
    queen          1    175
    king           0    212
    totals         3   1562
    Dataframe: ../results/legal_1.xlsx/legal-table_actual
             illegal   legal
    pawns          0   49386
    rooks          0   31507
    bishops        0   28263
    knights        0   31493
    queen          0   22818
    king           0   23608
    totals         0  188324
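For reference, here is a sketch of a purely geometric legality check that flags the three illegal moves listed above. It is not the checker used to build the spreadsheet (that code is not shown here); it ignores blocking pieces, pawns, castling, and check, and the function names are assumptions.

```python
def square_to_xy(square):
    """Convert a square such as 'e7' to zero-based (file, rank) indices."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1

def is_geometrically_legal(piece, move):
    """True if the piece could make this move on an empty board."""
    fx, fy = square_to_xy(move["from"])
    tx, ty = square_to_xy(move["to"])
    dx, dy = abs(tx - fx), abs(ty - fy)
    if dx == 0 and dy == 0:
        return False                      # not a move at all
    straight = dx == 0 or dy == 0
    diagonal = dx == dy
    if piece == "rook":
        return straight
    if piece == "bishop":
        return diagonal
    if piece == "queen":
        return straight or diagonal
    if piece == "knight":
        return (dx, dy) in ((1, 2), (2, 1))
    if piece == "king":
        return max(dx, dy) == 1           # castling deliberately ignored
    raise ValueError("unsupported piece: {}".format(piece))

flagged = [
    ("bishop", {'from': 'e7', 'to': 'c6'}),
    ("knight", {'from': 'c5', 'to': 'a8'}),
    ("queen", {'from': 'f8', 'to': 'h4'}),
]
for piece, move in flagged:
    print(piece, move, is_geometrically_legal(piece, move))
```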




  • Waiting on Vadim
  • 2:00 AIMS-Core v3.0 Overview
  • Ping MARCOM


  • 6:00 Meeting