Category Archives: research

Phil 8.7.20

#COVID

The Arabic translation program is chunking along. It’s translated over 27,000 tweets so far. I think I’m seeing the power and risks of AI/ML in this tiny example. See, I’ve been programming since the late 1970s, in many, many languages and environments, and the common thread in everything I’ve done was the idea of deterministic execution. That’s the idea that you can, if you have the time and skills, step through a program line by line in a debugger and figure out what’s going on. It wasn’t always true in practice, but the idea was conceptually sound.
This translation program is entirely different. To understand why, it helps to look at the code:

translator

This is the core of the code. It looks a lot like code I’ve written over the years. I open a database, get some lines, manipulate them, and put them back. Rinse, lather, repeat.
That manipulation, though…
The six lines in yellow are the Huggingface API, which allow me to access Microsoft’s Marian Neural Machine Translation models, and have them use the pretrained models generated by the University of Helsinki. The one I’m using translates Arabic (src = ‘ar’) to English (trg = ‘en’). The lines that do the work are in the inner loop:
```
batch = tok.prepare_translation_batch(src_texts=[d['contents']])
gen = model.generate(**batch)  # for forward pass: model(**batch)
words: List[str] = tok.batch_decode(gen, skip_special_tokens=True)
```
The first line is straightforward. It converts the Arabic words to tokens (numbers) that the language model works in. The last line does the reverse, converting result tokens to English.
The middle line is the new part. The input vector of tokens goes to the input layer of the model, where it gets sent through a 12-layer, 512-hidden, 8-head, ~74M parameter model. Tokens that can be converted to English pop out the other side. I know (roughly) how it works at the neuron and layer level, but the idea of stepping through the execution of such a model to understand the translation process is meaningless.
In the time it took to write this, it’s translated about 1,000 more tweets. I can have my Arabic-speaking friends do a sanity check on a sample of these words, but we’re going to have to trust the overall behavior of the model to do our research, because some of these systems only work on English text.
So we’re trusting a system that we cannot verify to do research at a scale that would otherwise be impossible. If the model is good enough, the results should be valid. If the model behaves poorly, then we have bad science. The problem is that right now there is only one Arabic to English translation model available, so there is no way to statistically examine the results for validity.
And I guess that’s really how we’ll have to proceed in this new world where ML becomes just another API. Validity of results will depend on diversity of model architectures and training sets. That may occur naturally in some areas, but in others, there may only be one model, and we may never know the influences that it has on us.

GOES

More quaternions. Need to do multiple axis movement properly. Can you average two quaternions and have something meaningful?
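On the averaging question, here is a minimal sketch rather than what the GOES code does. It assumes unit quaternions stored as [w, x, y, z], and the function name is made up for illustration. For exactly two rotations, the normalized sum gives the halfway rotation (the same result as slerp at t = 0.5), as long as the sign ambiguity (q and -q are the same rotation) is handled first:

```python
import numpy as np

def average_two_quaternions(q1, q2):
    """Midpoint of two rotations, assuming [w, x, y, z] quaternions (hypothetical helper)."""
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    q1 = q1 / np.linalg.norm(q1)
    q2 = q2 / np.linalg.norm(q2)
    # q and -q encode the same rotation, so flip one if they point "away" from each other
    if np.dot(q1, q2) < 0.0:
        q2 = -q2
    mid = q1 + q2
    # after the flip the sum cannot be zero, so normalizing is safe
    return mid / np.linalg.norm(mid)

identity = np.array([1.0, 0.0, 0.0, 0.0])
# 90 degree rotation about z
rot_z_90 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
mid = average_two_quaternions(identity, rot_z_90)
print(mid)  # a 45 degree rotation about z
```

Without the sign flip, the plain sum can cancel out or land on the long way around, which is where a naive average stops being meaningful.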
Here’s the reference frame with two rotations based off of the origin, so no drift. Now I need to do an incremental rotation to track these points:

reference_frame

GPT-2 Agents

Start digging into knowledge graphs

Phil 8.6.20

Coronavirus: The viral rumours that were completely wrong (BBC)

An ocean of Books (Google Arts & Culture Experiments)

bookocean

Hopfield Networks is All You Need

We show that the transformer attention mechanism is the update rule of a modern Hopfield network with continuous states. This new Hopfield network can store exponentially (with the dimension) many patterns, converges with one update, and has exponentially small retrieval errors. The number of stored patterns is traded off against convergence speed and retrieval error. The new Hopfield network has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. Transformer and BERT models operate in their first layers preferably in the global averaging regime, while they operate in higher layers in metastable states. The gradient in transformers is maximal for metastable states, is uniformly distributed for global averaging, and vanishes for a fixed point near a stored pattern. Using the Hopfield network interpretation, we analyzed learning of transformer and BERT models. Learning starts with attention heads that average and then most of them switch to metastable states. However, the majority of heads in the first layers still averages and can be replaced by averaging, e.g. our proposed Gaussian weighting. In contrast, heads in the last layers steadily learn and seem to use metastable states to collect information created in lower layers. These heads seem to be a promising target for improving transformers. Neural networks with Hopfield networks outperform other methods on immune repertoire classification, where the Hopfield net stores several hundreds of thousands of patterns. We provide a new PyTorch layer called “Hopfield”, which allows to equip deep learning architectures with modern Hopfield networks as a new powerful concept comprising pooling, memory, and attention. GitHub: this https URL

Can GPT-3 Make Analogies? by Melanie Mitchell (Medium, Aug 2020)

#COVID

Going to try to get the translator working and inserting best effort into the DB. Then we can make queries for the good results. Done! Here’s a shot of it chunking away. About one translation a second:

GOES

Work on quaternion frame tracking
This might help with visualization: matplotlib.org/3.1.1/api/animation_api
Updating my work box. Had a weird experience upgrading pip. It hit a permissions issue and failed out without rolling back. I had to use get-pip.py to get it back
Looking good:

rotate_to_point

JuryRoom

5:30(?) meeting
Project grant application

ICTAI

Write review – done. One to go!

Phil 7.23.20

Amid a tense meeting with protesters, Portland Mayor Ted Wheeler tear-gassed by federal agents

GPT-2 Agents

Good back-and-forth with Antonio about venues
It struck me that statistical tests about fair dice might give me a way of comparing the two populations. Pieces are roughly equivalent to dice sides. Looking at this post on the RPG Stackexchange. That led me to Pearson’s Chi-square test (which rang a bell as the sort of test I might need).

Success! Here’s the code:

from scipy.stats import chisquare, chi2_contingency, pearsonr
import pandas as pd
import numpy as np

gpt = [51394,
       25962,
       19242,
       23334,
       15928,
       19953]

twic = [49386,
        31507,
        28263,
        31493,
        22818,
        23608]

# Note: recent SciPy releases raise a ValueError here unless sum(f_obs) matches sum(f_exp)
z, p = chisquare(f_obs=gpt,f_exp=twic)
print("z = {}, p = {}".format(z, p))

ar = np.array([gpt, twic])
print("\n",ar)

df = pd.DataFrame(ar, columns=['pawns', 'rooks', 'bishops', 'knights', 'queen', 'king'], index=['gpt-2', 'twic'])
print("\n", df)

z,p,dof,expected=chi2_contingency(df, correction=False)
print("\nNo correction: z = {}, p = {}, DOF = {}, expected = {}".format(z, p, dof, expected))

z,p,dof,expected=chi2_contingency(df, correction=True)
print("\nCorrected: z = {}, p = {}, DOF = {}, expected = {}".format(z, p, dof, expected))

cor = pearsonr(gpt, twic)
print("\nCorrelation = {}".format(cor))

Here’s the results:

"C:\Program Files\Python\python.exe" C:/Development/Sandboxes/GPT-2_agents/gpt2agents/analytics/pearsons.py
z = 8696.966788178523, p = 0.0

 [[51394 25962 19242 23334 15928 19953]
 [49386 31507 28263 31493 22818 23608]]

        pawns  rooks  bishops  knights  queen   king
gpt-2  51394  25962    19242    23334  15928  19953
twic   49386  31507    28263    31493  22818  23608

No correction: z = 2202.2014776980245, p = 0.0, DOF = 5, expected = [[45795.81128532 26114.70012657 21586.92215826 24914.13916789 17606.71268169 19794.71458027]
 [54984.18871468 31354.29987343 25918.07784174 29912.86083211 21139.28731831 23766.28541973]]

Corrected: z = 2202.2014776980245, p = 0.0, DOF = 5, expected = [[45795.81128532 26114.70012657 21586.92215826 24914.13916789 17606.71268169 19794.71458027]
 [54984.18871468 31354.29987343 25918.07784174 29912.86083211 21139.28731831 23766.28541973]]

Correlation = (0.9779452546334226, 0.0007242538456558558)

Process finished with exit code 0

It might be time to start writing this up!
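One caveat on the first test above: the two populations have different totals, and a goodness-of-fit test is meant to compare counts that sum to the same number. A minimal sketch of one way to handle that, assuming only the proportions per piece matter, is to rescale the TWIC counts to the GPT-2 total before calling chisquare. The variable names here are for illustration only:

```python
import numpy as np
from scipy.stats import chisquare

pieces = ['pawns', 'rooks', 'bishops', 'knights', 'queen', 'king']
gpt = np.array([51394, 25962, 19242, 23334, 15928, 19953], dtype=float)
twic = np.array([49386, 31507, 28263, 31493, 22818, 23608], dtype=float)

# Rescale the human (TWIC) counts so they sum to the synthetic (GPT-2) total.
# This keeps the TWIC proportions but makes the totals comparable.
expected = twic * gpt.sum() / twic.sum()

stat, p = chisquare(f_obs=gpt, f_exp=expected)
print("chi-square = {:.2f}, p = {}".format(stat, p))

# The per-piece proportions are often easier to read than the raw statistic
for name, g, t in zip(pieces, gpt / gpt.sum(), twic / twic.sum()):
    print("{:8s} gpt-2 = {:.3f}, twic = {:.3f}".format(name, g, t))
```

With samples this large, almost any difference will come out statistically significant, so the proportions (and the correlation) may say more about similarity than the p-value does.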

GOES

Found vehicle orientation mnemonics: GNC_AD_STA_FUSED_QRS#

2020-07-23

11:00 Meeting with Erik and Vadim about schedules. Erik will send an update. The meeting went well. Vadim’s going to exercise the model through a set of GOTO ANGLE 90 / GOTO ANGLE 0 for each of the rwheels, and we’ll see how they map to the primary axis of the GOES

Phil 7.21.20

Superstrata ebike

Review papers – finished reading the first, write review today. First review done!

Realized that I really need to update my online resumes to include Python and Machine Learning. Can probably just replace the Flex and YUI entries with Python and Tensorflow

Read this today: Proposal: A Market for Truth to Address False Ads on Social Media. It’s by Marshall Van Alstyne, a Questrom Chair Professor at Boston University where he teaches information economics. From the Wikipedia entry

Information has special characteristics: It is easy to create but hard to trust. It is easy to spread but hard to control. It influences many decisions. These special characteristics (as compared with other types of goods) complicate many standard economic theories.
Information economics is formally related to game theory as two different types of games that may apply, including games with perfect information, complete information, and incomplete information. Experimental and game-theory methods have been developed to model and test theories of information economics.
This is as close to a description of decisions in the presence of expensive information as I’ve seen so far

GPT-2 Agents

The run completed last night! I have 156,313 synthetic moves
Reworking the queries from the actual moves to reflect the probes for the synthetic

Created a view that combines the probe and the response into a description:

create or replace view gpt_view as
    select tm.move_number, tm.color, tm.piece, tm.`from`, tm.`to`, concat(tm.probe, tm.response) as description
    FROM table_moves as tm;

Almost forgot to backup the db before doing something dumb

Created a “constraint string” that should make the game space searched somewhat more similar:

and (move_number < 42 or description like "%White takes%" or description like "%Black takes%" or description like "%Check%")

Made the changes to the code and am running the analysis
My fancy queries are producing odd results. Pulling out the constraint string. That looks pretty good!

GPT-2-TWIC

As an aside, the chess queries and extraction are based on an understanding of movement terms like ‘from’ and ‘to’. Thinking about Alex’s finding of consensus metaterms, I think it would be useful to look for movement/consensus/compromise terms and then weight the words that are nearby

ML meeting

Vacation pix!
Went over results shown above
Arpita found some good embedding results using Tensorboard, but not sure where to go from there?

Phil 7.9.20

NVAE: A Deep Hierarchical Variational Autoencoder

Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ as shown in Fig. 1. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256×256 pixels.

VAEsNotGANs

Like Two Pis in a Pod: Author Similarity Across Time in the Ancient Greek Corpus

One commonly recognized feature of the Ancient Greek corpus is that later texts frequently imitate and allude to model texts from earlier time periods, but analysis of this phenomenon is mostly done for specific author pairs based on close reading and highly visible instances of imitation. In this work, we use computational techniques to examine the similarity of a wide range of Ancient Greek authors, with a focus on similarity between authors writing many centuries apart. We represent texts and authors based on their usage of high-frequency words to capture author signatures rather than document topics and measure similarity using Jensen-Shannon Divergence. We then analyze author similarity across centuries, finding high similarity between specific authors and across the corpus that is not common to all languages.

GPT-2 Agents

Setting up some experiments, for real and synthetic, black and white. All values should have raw numbers and percentages:
- Moves from each square by piece+color / total number of moves from square
- Moves to each square by piece+color / total number of moves to square
- Squares by piece+color / total number of pieces
- Sequences? I’d have to add back in castling and re-run. Maybe later
- Squares used over time (first 10 moves, second 10, etc)
- Pieces used over time
Create new directory called results that will contain the spreadsheets
Running the first queries. It’s going to take about an hour by my estimation, but nothing is exploding as far as the queries go

Add a spreadsheet for illegal moves. Done! Here’s the results. The GPT agents make 3 illegal moves out of 1,565:

illegal bishop move: {'from': 'e7', 'to': 'c6'}
illegal knight move: {'from': 'c5', 'to': 'a8'}
illegal queen move: {'from': 'f8', 'to': 'h4'}
Dataframe: ../results/legal_1.xlsx/legal-table_moves
         illegal  legal
pawns          0    446
rooks          0    270
bishops        1    193
knights        1    266
queen          1    175
king           0    212
totals         3   1562
Dataframe: ../results/legal_1.xlsx/legal-table_actual
         illegal   legal
pawns          0   49386
rooks          0   31507
bishops        0   28263
knights        0   31493
queen          0   22818
king           0   23608
totals         0  188324
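For reference, the three flagged moves can be reproduced with a geometry-only check. This is a minimal sketch and not the checker that produced the spreadsheet above. It only asks whether the piece could ever travel between two squares on an empty board, so it ignores blocking pieces, captures, castling, and pawns. The function names are made up for illustration:

```python
from typing import Tuple

def square_to_xy(square: str) -> Tuple[int, int]:
    """Convert algebraic notation such as 'e7' to zero-based (file, rank)."""
    return ord(square[0]) - ord('a'), int(square[1]) - 1

def is_legal_shape(piece: str, frm: str, to: str) -> bool:
    """Geometry-only legality test on an empty board (hypothetical helper)."""
    fx, fy = square_to_xy(frm)
    tx, ty = square_to_xy(to)
    dx, dy = abs(tx - fx), abs(ty - fy)
    if dx == 0 and dy == 0:
        return False  # not a move at all
    straight = dx == 0 or dy == 0
    diagonal = dx == dy
    if piece == 'rook':
        return straight
    if piece == 'bishop':
        return diagonal
    if piece == 'queen':
        return straight or diagonal
    if piece == 'knight':
        return sorted((dx, dy)) == [1, 2]
    if piece == 'king':
        return max(dx, dy) == 1
    raise ValueError("unsupported piece: {}".format(piece))

for piece, frm, to in [('bishop', 'e7', 'c6'), ('knight', 'c5', 'a8'), ('queen', 'f8', 'h4')]:
    print("{} {} -> {}: legal shape = {}".format(piece, frm, to, is_legal_shape(piece, frm, to)))
```

All three of the GPT moves fail this test, which means they are wrong in shape and not just blocked by another piece.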

GOES

Waiting on Vadim
2:00 AIMS-Core v3.0 Overview
Ping MARCOM

Waikato

6:00 Meeting

Phil 7.8.20

A brief history of high-speed trading (via the Museum of American Finance)

In the late 1830s, Philadelphia broker William C. Bridges operated a private signal station between New York and Philadelphia which disseminated stock market news to him and his backers (and to no one else). The signals were transmitted through an “optical telegraph,” which consisted of a series of boards on a pole, mounted on hills that could be seen by a telescope.

DtZ

The IHME site has improved to the point that we should pull down our site

GPT-2 Agents

Need to think about how to show that interrogating a language model is sufficiently similar to interrogating actual data.
- At this point, I know that the language model comes up with legal moves
- I need to compare the statistics of actual moves to synthetic moves to see if the populations are sufficiently similar. This means that I need to get the training and evaluation data into the database. Once that’s done, I can compare the frequency of move types (e.g. “At move 10, White moves pawn from a2 to a4”), and the moves from a particular location (e.g. “e2” can have moves to “e3” and “e4” with the pawn, or diagonals with the “f1” bishop or the white queen).
- The level of similarity should indicate if the biases of the players are represented in the language model.
  - There should be a way of determining a lower bound of data?
  - Once this is shown, then the idea of generalizing to other human interactions can be justified.
Started PGNtoDB, which will populate table_actual
- Ignoring castling for now
- Chunking into the database! And by chunking, I refer to the sound of the drive 🙂
- And now I have a catalog of 188,324 human chess moves

GOES

10:00 Meeting with Vadim
2:00 Status
Last training for a while!

Phil 7.7.20

The opportunity cost of this is going to be so steep. I wonder what country will set up an effective, open, online university?

GPT-2 Agents

Working through the texthero examples. Spent a lot of time figuring out how to print elements from a row in a Dataframe, which was ridiculously hard. Instead, I just turned it into a dict and worked with that

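import pandas as pd
from typing import Dict, List

# print the first num_rows rows of a dataframe using the specified columns. Use a -1 for printing all rows
def print_df(df:pd.DataFrame, headers:List, num_rows:int = 4, max_chars:int = 80):
    rows = 0

    # convert to {index: {column: value}} so each row can be handled as a plain dict
    d:Dict = df.to_dict('index')
    rd:Dict
    for index, rd in d.items():
        # stop once the requested number of rows has been printed
        if num_rows != -1 and rows >= num_rows:
            break
        st = ""
        for key in headers:
            if key in rd:
                # str() so that non-string values (numbers, NaN) can be truncated too
                st += "{}: {}, ".format(key, str(rd[key])[:max_chars])
        print(st)
        rows += 1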

The scatterplot appears to use plotly, since it’s presented in the browser. That’s kind of cool, since it implies that the plotting functions of plotly are free somehow? After going to the plotly.com website, I see that “Plotly.py is free and open source and you can view the source, report issues or contribute on GitHub.” That would be worth digging into some more then. Here’s the PCA plot:

pca

You can make word clouds easily, too

WordCloud

GOES

Finish training? Ooops, forgot
Some discussion with Vadim about the structure of the control

ML Seminar

Good discussion on topic extraction over time. Basically, create k topics from the entire corpus. Each topic is a ranking of all the words in the corpus. Behavior over time is the amount of the top n words from each topic k in each time sample t.
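To make sure I understood the idea, here is a minimal sketch of the counting step only. It assumes the k topics have already been produced by some topic model and are available as ranked word lists, and that each time sample is just a list of tokens. The topic names, words, and function name are made up for illustration:

```python
from collections import Counter
from typing import Dict, List

def topic_strength_over_time(topics: Dict[str, List[str]],
                             samples: List[List[str]],
                             top_n: int = 3) -> List[Dict[str, float]]:
    """For each time sample t, the fraction of tokens that are in the top n words of each topic k."""
    results = []
    for tokens in samples:
        counts = Counter(tokens)
        total = max(len(tokens), 1)  # avoid dividing by zero on an empty sample
        row = {}
        for name, ranked_words in topics.items():
            hits = sum(counts[w] for w in ranked_words[:top_n])
            row[name] = hits / total
        results.append(row)
    return results

# Hypothetical topics (ranked, best word first) and two time samples
topics = {"chess": ["pawn", "rook", "king"], "covid": ["virus", "mask", "vaccine"]}
samples = [["pawn", "rook", "pawn", "mask"],
           ["virus", "mask", "mask", "king"]]
strengths = topic_strength_over_time(topics, samples)
for t, row in enumerate(strengths):
    print("t = {}: {}".format(t, row))
```

Normalizing by the sample size is a choice on my part so that busy time periods don't dominate. Raw counts would work too.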

Phil 7.6.20

GPT-2 Agents

Search the db for the appropriate “from to” text snippet (e.g. “Black moves pawn from e2 to e3”), with a count of the number of times this move was done using that piece. Done!
Add a “fewest hops” (A* – traditional network approach), closest (each step finds the closest node to the target) in addition to the line following algorithm. There will have to be some user testing to see what makes the most sense, if any
Played around a bit with Summarization, but it didn’t work that well
TextHero came across my Twitter feed. It might be good for topic extraction? Trying it out, but the documentation is… sparse. Checking out
- Installing many things:
  - unidecode
  - spacy (which installs many other things)
  - plotly (I thought you had to pay to use this?)
  - wordcloud
- Working through the example, which is broken. Trying to fix based on the Getting Started

GOES

11:00 Meeting with Vadim
Got the DataDictionary streaming to InfluxDB

More Satern – one more course down

Phil 7.3.20

Today is a federal holiday, so no rocket science

Huggingface has a pipeline interface now that is pretty abstract. This works:

from transformers import pipeline

translator = pipeline("translation_en_to_fr")
print(translator("Hugging Face is a technology company based in New York and Paris", max_length=40))

[{'translation_text': 'Hugging Face est une entreprise technologique basée à New York et à Paris.'}]

Wow: GPT-3 writes code!

DtZ is back up! Too many countries have the disease and the histories had to be cropped to stay under the data cap for the free service

GPT-2 Agents

Work on more granular path finding
- Going to try the hypotenuse of distance to source and line first – nope
- Trying looking for the distances of each and doing a nested sort
- I had a problem where I was checking to see whether a point was between the current node and the target node using the original line between the source and target nodes. Except that I was checking on a line from the current node to the target, and failing the test. Oops! Fixed
- I went back to the hypotenuse version now that the in_between test isn’t broken and look at that!

granular

- Added the option for coarse or granular paths
Start thinking about topic extraction for a given corpus

#COVID

Evaluate Arabic to English translation. Got it working!

from transformers import MarianTokenizer, MarianMTModel
from typing import List
src = 'ar'  # source language
trg = 'en'  # target language
sample_text = "لم يسافر أبي إلى الخارج من قبل"
sample_text2 = "الصحة_السعودية تعلن إصابة أربعيني بفيروس كورونا بالمدينة المنورة حيث صنفت عدواه بحالة أولية مخالطة الإبل مشيرة إلى أن حماية الفرد من(كورونا)تكون باتباع الإرشادات الوقائية والمحافظة على النظافة والتعامل مع #الإبل والمواشي بحرص شديد من خلال ارتداء الكمامة "
mname = f'Helsinki-NLP/opus-mt-{src}-{trg}'

model = MarianMTModel.from_pretrained(mname)
tok = MarianTokenizer.from_pretrained(mname)
batch = tok.prepare_translation_batch(src_texts=[sample_text2])  # don't need tgt_text for inference
gen = model.generate(**batch)  # for forward pass: model(**batch)
words: List[str] = tok.batch_decode(gen, skip_special_tokens=True) 
print(words)

It took a few tries to find the right model. The naming here is very haphazard.

Asked for a sanity check from the group

This:

الصحة_السعودية تعلن إصابة أربعيني بفيروس كورونا بالمدينة المنورة حيث صنفت عدواه بحالة أولية مخالطة الإبل مشيرة إلى أن حماية الفرد من(كورونا)تكون باتباع الإرشادات الوقائية والمحافظة على النظافة والتعامل مع #الإبل والمواشي بحرص شديد من خلال ارتداء الكمامة

Translates to this:

Saudi health announces a 40-year-old corona virus in the city of Manora, where his enemy was classified as a primary camel conglomerate, indicating that the protection of the individual from Corona would be through preventive guidance, hygiene, and careful handling of the Apple and the cattle by wearing the gag.

Write script that takes a batch of rows and adds translations until all the rows in the table are complete
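A minimal sketch of what that batch loop could look like. To keep it runnable anywhere, it uses an in-memory sqlite3 database and a stand-in translate function in place of the Marian model. The table name, the id and translation columns, and the function names are all assumptions for illustration. Only the contents column name comes from the real code:

```python
import sqlite3
from typing import Callable, List

def translate_pending(conn: sqlite3.Connection,
                      translate: Callable[[List[str]], List[str]],
                      batch_size: int = 2) -> int:
    """Translate untranslated rows in batches until none remain. Returns rows processed."""
    done = 0
    while True:
        rows = conn.execute(
            "SELECT id, contents FROM tweets WHERE translation IS NULL LIMIT ?",
            (batch_size,)).fetchall()
        if not rows:
            break
        results = translate([contents for _, contents in rows])
        # Best effort: store an empty string on failure so the row is not fetched again
        updates = [(text if text is not None else "", row_id)
                   for text, (row_id, _) in zip(results, rows)]
        conn.executemany("UPDATE tweets SET translation = ? WHERE id = ?", updates)
        conn.commit()
        done += len(updates)
    return done

# Demo with a fake translator (upper-casing) standing in for the model
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, contents TEXT, translation TEXT)")
conn.executemany("INSERT INTO tweets (contents) VALUES (?)", [("a",), ("b",), ("c",)])
count = translate_pending(conn, lambda texts: [t.upper() for t in texts])
print("translated {} rows".format(count))
```

The stand-in translator is expected to return one result per input text. Committing after every batch means the script can be stopped and restarted without losing work.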

Book chat

List of folks who would be interesting to interview
- Stuart Russell
- Stuart Kauffman
- Alex (Sandy) Pentland
- Kate Starbird
- Joanna Bryson
- Daniel DeNicola
- Margaret Gilbert
- Joseph Liechty / Cecelia Clegg
- Rebecca Solnit
- Zeynep Tufekci
- Christian Jacob (directeur de recherche at the Centre national de la recherche scientifique in Paris)
- Ezra Klein

Phil 7.2.20

Emergence of polarized ideological opinions in multidimensional topic spaces

Opinion polarization is on the rise, causing concerns for the openness of public debates. Additionally, extreme opinions on different topics often show significant correlations. The dynamics leading to these polarized ideological opinions pose a challenge: How can such correlations emerge, without assuming them a priori in the individual preferences or in a preexisting social structure? Here we propose a simple model that reproduces ideological opinion states found in survey data, even between rather unrelated, but sufficiently controversial, topics. Inspired by skew coordinate systems recently proposed in natural language processing models, we solidify these intuitions in a formalism where opinions evolve in a multidimensional space where topics form a non-orthogonal basis. The model features a phase transition between consensus, opinion polarization, and ideological states, which we analytically characterize as a function of the controversialness and overlap of the topics. Our findings shed light upon the mechanisms driving the emergence of ideology in the formation of opinions.

DtZ has broken

dtzfail

GPT2-Agents

Continue working on the trajectory. I think that a plot that works entirely on distance to target can result in spirals, so there needs to be some kind of system that looks at the distance to the center line first, and if there is a fail, moves the last node from the trajectory list to a dirty list. Then the search restores the cur node to the previous one, and continues the search with the trajectory and dirty list nodes ignored?
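A minimal sketch of that idea, written as a standalone function on a toy graph rather than the actual class. It assumes the graph is a dict of neighbor lists and that node centers are available as (x, y) pairs. All of the names and the toy data are made up for illustration:

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def dist_to_line(a: Point, b: Point, p: Point) -> float:
    """Perpendicular distance from p to the infinite line through a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / math.hypot(bx - ax, by - ay)

def find_trajectory(graph: Dict[str, List[str]], pos: Dict[str, Point],
                    source: str, target: str) -> List[str]:
    """Greedy line following with backtracking. Returns [] if the target is unreachable."""
    trajectory = [source]
    dirty = set()
    while trajectory and trajectory[-1] != target:
        cur = trajectory[-1]
        candidates = [n for n in graph[cur] if n not in trajectory and n not in dirty]
        if not candidates:
            # Dead end: mark the node dirty and fall back to the previous node
            dirty.add(trajectory.pop())
            continue
        # Prefer the node closest to the center line, then the one closest to the target
        nxt = min(candidates, key=lambda n: (dist_to_line(pos[source], pos[target], pos[n]),
                                             math.dist(pos[n], pos[target])))
        trajectory.append(nxt)
    return trajectory

# Toy graph: 'b' sits on the line but is a dead end, so the search has to back out of it
graph = {'a': ['b', 'c'], 'b': ['a'], 'c': ['a', 'd'], 'd': ['c']}
pos = {'a': (0.0, 0.0), 'b': (1.0, 0.0), 'c': (1.0, 1.0), 'd': (3.0, 0.0)}
path = find_trajectory(graph, pos, 'a', 'd')
print(path)
```

Since a node never comes back out of the dirty list, the loop has to terminate: every pass either extends the trajectory or retires a node.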

Found an example to fix: A6 – H7

get_closest_node() line = [337.0, 44.0, 581.0, 499.0], cur_node = h1, node_list = ['a6', 'b6', 'c7', 'd7', 'e6', 'c5', 'b7', 'g7', 'h6', 'g6', 'c6', 'e7', 'f7', 'g8', 'f6', 'd8', 'a8', 'e8', 'd6', 'b4', 'b8', 'c8', 'c4', 'e5', 'd5', 'd4', 'b5', 'c3', 'e4', 'f5', 'f8', 'f4', 'g5', 'g4', 'h5', 'h4', 'f3', 'd3', 'c2', 'e3', 'd2', 'e2', 'b2', 'b1', 'c1', 'e1', 'd1', 'a1', 'f1', 'g3', 'h3', 'g2', 'f2', 'g1', 'h2', 'h1']
It does fine until it gets to E6, where it chooses c5

Adding a target distance-based search if the distance to line search fails seems to have fixed it:

nlist = list(nx.all_neighbors(self.gml_model, cur_node))
print("\tneighbors = {}".format(nlist))
dist_dict = {}
sx, sy = self.get_center(cur_node)

for n in nlist:
    if n not in node_list:
        newx, newy = self.get_center(n)
        newa = [newx, newy]
        print("\tline dist checking {} at {}".format(n, newa))
        x, y = self.point_to_line([l[0], l[1]], [l[2], l[3]], newa)
        ca = [x, y]
        ib = self.is_between([sx, sy], [l[2], l[3]], [x, y])
        if ib:
            # option 1: Find the closest to the line
            dist = np.linalg.norm(np.array(newa)-np.array(ca))
            dist_dict[n] = dist
            print("\tis BETWEEN = {}, dist = {}".format(ib, dist))
if len(dist_dict) == 0:
    ta = [self.get_center(self.target_node)]
    for n in nlist:
        if n not in node_list:
            newx, newy = self.get_center(n)
            newa = [newx, newy]
            print("\ttarget dist checking {} at {}".format(n, newa))
            # option 2: Find the closest to the target node
            dist = np.linalg.norm(np.array(newa)-np.array(ta))
            dist_dict[n] = dist
            print("\tis CLOSEST: dist = {}".format(dist))

Got legal trajectories working. Below is a set of jumps that are legal (rook to c1, bishop to e3 and then h6, then rook the rest of the way). I think I want to also sort based on closest distance to the current node.

legal_moves

GOES

Add InfluxDB streaming to DD
10:00 Sim meeting
2:00 Status meeting

Phil 6.30.20

(Re)Discovering Protein Structure and Function Through Language Modeling (ArXiv)(Code)

In our study, we show how a language model, trained simply to predict a masked (hidden) amino acid in a protein sequence, recovers high-level structural and functional properties of proteins. In particular, we show how the Transformer language model uses attention (1) to capture the folding structure of proteins, connecting regions that are apart in the underlying sequence but spatially close in the protein structure, and (2) targets binding sites, a key functional component of proteins. We also introduce a three-dimensional visualization of the interaction between attention and protein structure. Our findings align with biological processes and provide a tool to aid scientific discovery. The code for the visualization tool and experiments is available at https://github.com/salesforce/provis.
TL;DR: Trained solely on language modeling, the Transformer’s attention mechanism recovers high-level structural and functional properties of proteins.
We explored the degree to which attention captures these contact relationships by analyzing the attention patterns of 5,000 protein sequences and comparing them to ground-truth contact maps. Our analysis revealed that one particular head — the 12th layer’s 4th head, denoted as head 12-4 — aligned remarkably well with the contact map. For “high confidence” attention (> .9 ), 76% of this head’s total attention connected amino acids that were in contact. In contrast, the background frequency of contacts among all amino acid pairs in the dataset is just 1.3%.

GPT-2 Agents

Add a menu that writes node spatial information to the DB
Add a “Graph from DB” menu that assembles the edge information from the move table and the node information from the new table, above.

Continue on path finding

Distance between a point and a line using numpy (stackoverflow). Not exactly what I need, which is the point of intersection and the distance. There is a stackoverflow post that is close, but here’s a version that tests the results and plots it:

import numpy as np
import math
import matplotlib.pyplot as plt

p1 = np.array([1.0, 1.0])
l1 = np.array([0.0, 1.0])
l2 = np.array([1.0, 0.0])

lvec = l2 - l1
lvec /= np.linalg.norm(lvec, 2)

p2 = l1 + lvec * np.dot(p1 - l1, lvec)
print("intersection = {}".format(p2))  # [0.5 0.5]

pvec = p2 - p1
dist = np.linalg.norm(pvec, 2)
pvec /= dist
det = np.linalg.det([lvec, pvec])
dot = np.dot(lvec, pvec)
rads = math.atan2(det, dot)
print("distance = {}, angle = {}".format(dist, math.degrees(rads)))

plt.plot([l1[0], l2[0]],[l1[1], l2[1]])
plt.plot([p1[0], p2[0]],[p1[1], p2[1]])
plt.show()

Here’s the test for seeing if a point is on a line. Again, loosely based on a stackoverflow post:

def is_between(self, l1:[int, int], l2:[int, int], p1:[int, int], epsilon:float = .1) -> bool:
    # np.float was removed in NumPy 1.24; the builtin float is equivalent here
    p1 = np.array(p1).astype(float)
    l1 = np.array(l1).astype(float)
    l2 = np.array(l2).astype(float)
    
    s1 = np.linalg.norm(l1-p1)
    s2 = np.linalg.norm(l2-p1)
    d = np.linalg.norm(l2-l1)
    # print("d = {}, s1 + s2 = {}".format(d, s1+s2))
    if abs(d - (s1+s2)) < epsilon:
        return True
    return False
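Putting the two pieces together, here is a minimal standalone sketch of projecting a point onto a line and then checking whether the projection falls between the endpoints. These are plain functions rather than the class methods, and the point_to_line body is my reconstruction from the script above, not a copy of the class code:

```python
import numpy as np

def point_to_line(l1, l2, p):
    """Closest point to p on the infinite line through l1 and l2."""
    l1, l2, p = (np.asarray(v, dtype=float) for v in (l1, l2, p))
    lvec = (l2 - l1) / np.linalg.norm(l2 - l1)
    return l1 + lvec * np.dot(p - l1, lvec)

def is_between(l1, l2, p, epsilon=0.1):
    """True if p lies (within epsilon) on the segment from l1 to l2."""
    l1, l2, p = (np.asarray(v, dtype=float) for v in (l1, l2, p))
    d = np.linalg.norm(l2 - l1)
    s = np.linalg.norm(l1 - p) + np.linalg.norm(l2 - p)
    return bool(abs(d - s) < epsilon)

l1, l2 = [0.0, 1.0], [1.0, 0.0]
foot = point_to_line(l1, l2, [1.0, 1.0])
print("intersection = {}, between = {}".format(foot, is_between(l1, l2, foot)))

# A point on the same infinite line but past the l2 endpoint is not "between"
print("beyond the end: between = {}".format(is_between(l1, l2, [2.0, -1.0])))
```

Note that epsilon is in the same units as the coordinates, so the 0.1 default is tight for pixel-scale canvas coordinates like the ones in the board layout.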

Got graphical node selection working. Need to tie that back into the menus for start and stop

Proposal

Looks like no writing today. Done, maybe?

GOES

10:00 CASSIE demo – really good
12:00 All Hands – need to catch up on my training. Something for the afternoons?

ML Seminar

Status report
Participated in some triage on Arpita’s and Fatima’s paper

Phil 6.29.20

ACM IUI 2021 is the 26th annual premier international forum for reporting
outstanding research and development on intelligent user interfaces.

ACM IUI is where the Human-Computer Interaction (HCI) and Artificial
Intelligence (AI) communities meet, with contributions from related fields
such as psychology, behavioral science, cognitive science, computer
graphics, design, the arts, and more. Our focus is on improving the
interaction between humans and digital technology, by leveraging both HCI
approaches and state-of-the art AI techniques from machine learning,
natural language processing, data mining, knowledge representation and
reasoning.

GOES:

Ping Erik about collaborative VR coding environments. Done
2:00 Meeting with Vadim
- Walked through the deep hierarchy example
- He’s now running 4 wheels and starting to get close, though the RW speed plots are not close to the actuals. It makes me think that there is more feedback control in the satellite implementation than there is implied in the documentation.

Proposal

After digging into the existing text, we realized that a lot of the technical sections were flat wrong, and depended on a kind of “magical ML thinking” that should have been in our phase III. So, lots of writing.

GPT-2 Agents

Working on trajectory plotting
Fix the listbox select. I was using the wrong event. It should be like this.

ListBoxSelect

Aaaand then there were a bunch of weird errors. For some reason, the call to a new ListBox also calls the previous ListBox with no args(?) so I get an error. Chased down and fixed.
Plot main line. Done!

Plot legal connections of closest lines
- I think this can be done by looking at the nodes that are connected to the start (current) node, then looking at the coordinates of all the children. The one that is closest to the line and between the current and the target gets added to the list
- Plotting all the node connections so there can be a sanity check:

Use the weight of the lines to choose the lines
Build a narrative rutter that describes the route (Here be there stampedes!)
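
The closest-child idea in the list above can be sketched like this. It's a minimal version that assumes every node has 2D canvas coordinates. The names `pick_next_node`, `coords`, and `neighbors` are hypothetical and not taken from the navigator code.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def project_onto_line(p: Point, a: Point, b: Point) -> Tuple[float, float]:
    """Return (t, dist) for point p relative to the line through a and b.

    t is where p's projection falls along a->b (0 at a, 1 at b).
    dist is the perpendicular distance from p to that line.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # a and b coincide, so fall back to plain point distance
        return 0.0, math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    foot = (a[0] + t * dx, a[1] + t * dy)
    return t, math.hypot(p[0] - foot[0], p[1] - foot[1])


def pick_next_node(current: str, target: str,
                   coords: Dict[str, Point],
                   neighbors: Dict[str, List[str]]) -> Optional[str]:
    """Pick the connected child closest to the current->target line.

    Only children whose projection lies between current and target
    (0 < t <= 1) are considered. Returns None if there are none.
    """
    best, best_dist = None, math.inf
    for child in neighbors.get(current, []):
        t, dist = project_onto_line(coords[child], coords[current], coords[target])
        if 0.0 < t <= 1.0 and dist < best_dist:
            best, best_dist = child, dist
    return best


# Made-up layout: b hugs the line from a to z, c is far off it, d is behind a
coords = {"a": (0.0, 0.0), "b": (1.0, 0.1), "c": (1.0, 2.0),
          "d": (-1.0, 0.0), "z": (4.0, 0.0)}
neighbors = {"a": ["b", "c", "d"]}
next_node = pick_next_node("a", "z", coords, neighbors)
```

Calling this repeatedly from each chosen node would build up the list of nodes along the route. Weighting by edge weight, as in the next to-do item, is not handled here.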

Phil 6.25.20

Latent Embeddings of Point Process Excitations

When specific events seem to spur others in their wake, marked Hawkes processes enable us to reckon with their statistics. The underdetermined empirical nature of these event-triggering mechanisms hinders estimation in the multivariate setting. Spatiotemporal applications alleviate this obstacle by allowing relationships to depend only on relative distances in real Euclidean space; we employ the framework as a vessel for embedding arbitrary event types in a new latent space. By performing synthetic experiments on short records as well as an investigation into options markets and pathogens, we demonstrate that learning the embedding alongside a point process model uncovers the coherent, rather than spurious, interactions.

Misinformation, Crisis, and Public Health—Reviewing the Literature

The Covid-19 pandemic has been accompanied by a parallel “infodemic” (Rothkopf 2003; WHO 2020a), a term used by the World Health Organization (WHO) to describe the widespread sharing of false and misleading information about the novel coronavirus. Misleading information about the disease has been a problem in diverse societies around the globe. It has been blamed for fatal poisonings in Iran (Forrest 2020), racial hatred and violence against people of Asian descent (Kozlowska 2020), and the use of unproven and potentially dangerous drugs (Rogers et al. 2020). A video promoting a range of false claims and conspiracy theories about the disease, including an antivaccine message, spread widely (Alba 2020) across social media platforms and around the world. Those spreading misinformation include friends and relatives with the best intentions, opportunists with books and nutritional supplements to sell, and world leaders trying to consolidate political power.

GPT-2 Agents

Well, networkx can write a gexf file that Gephi can read, but not the other way around.
Networkx CAN read and write gml files, though. Switching to that.
That seems to be working well:

Now let’s see if we can draw it in the app
Things are starting to get very specific. Creating a subclass

Pulling attributes is not obvious. Here’s how you do it for the nodes read in from gml:

attrs = nx.get_node_attributes(self.gml_model, 'graphics')
for key, val in attrs.items():
    print("{} = {}".format(key, val))

Loading and displaying the nodes! Next, I need to get piece data from the database. Also, since the graphics attribute can be a dictionary, it may be possible to add attributes like that to the edge data? Then I won’t need to re-access the db. Conversely, another way to do this might be to update the table in the db with positions, etc. Hmmmm
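
Here's a small sketch of the gml round trip, including a dictionary attribute on an edge, to check that the idea above is at least plausible. The node names, the `moves` attribute, and the file name are made up for illustration. Only `nx.write_gml`, `nx.read_gml`, and `nx.get_node_attributes` are real networkx calls.

```python
import os
import tempfile

import networkx as nx


def roundtrip_gml(graph: nx.Graph) -> nx.Graph:
    """Write the graph to a temporary gml file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "board.gml")  # hypothetical file name
        nx.write_gml(graph, path)
        return nx.read_gml(path)


g = nx.Graph()
# 'graphics' is the dictionary-valued node attribute that Gephi-style gml uses
g.add_node("d2", graphics={"x": 10.0, "y": 20.0})
g.add_node("d7", graphics={"x": 10.0, "y": 70.0})
# Hypothetical dictionary attribute on an edge, stored the same way
g.add_edge("d2", "d7", moves={"white": 3, "black": 1})

loaded = roundtrip_gml(g)
node_attrs = nx.get_node_attributes(loaded, "graphics")
edge_attrs = loaded["d2"]["d7"]["moves"]
for key, val in node_attrs.items():
    print("{} = {}".format(key, val))
```

If this holds up on the real data, the piece information could ride along in the gml file, at the cost of keeping the file and the db in sync.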

GOES

10:00 Meeting with Vadim. Nope, he broke his code. Rescheduled for tomorrow

Proposal

Work on technical section with Aaron?

Phil 6.24.20

GPT-2 Agents

Starting work on the navigator app
Today’s progress:

TkCanvasBase

I think that this can be the core of the initial navigation capability for any corpus. You should be able to identify a topic on the map or in the list, and the system will figure out the most direct route (linear distance).
I think there also needs to be an ability to see the directly connected neighbors as well, since they might be farther away due to mapping constraints. For example, we can see that d2 is linked directly to d7, which is almost completely across the board. This is the result of the white queen making a pretty aggressive move. It’s not common, but it does happen. It might be interesting for someone working their way from arithmetic to calculus to see, for example, how Johann Carl Friedrich Gauss did it:

nearest
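
One way to sketch the "most direct route (linear distance)" idea is a shortest-path search where each edge costs the straight-line distance between its endpoints. Dijkstra's algorithm is my choice for the sketch, not necessarily what the navigator app does, and `shortest_route`, `coords`, and `neighbors` are hypothetical names.

```python
import heapq
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def shortest_route(start: str, goal: str,
                   coords: Dict[str, Point],
                   neighbors: Dict[str, List[str]]) -> List[str]:
    """Dijkstra's algorithm with Euclidean edge lengths.

    Returns the list of nodes from start to goal, or [] if goal is unreachable.
    """
    best_cost = {start: 0.0}
    came_from: Dict[str, str] = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            break
        if cost > best_cost.get(node, math.inf):
            continue  # stale queue entry, a cheaper path was already found
        for nxt in neighbors.get(node, []):
            new_cost = cost + math.dist(coords[node], coords[nxt])
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                came_from[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    if goal not in best_cost:
        return []
    # Walk the predecessor links back from the goal
    route = [goal]
    while route[-1] != start:
        route.append(came_from[route[-1]])
    return route[::-1]


# Made-up board coordinates (file, rank) and directed links
coords = {"d2": (3, 1), "d3": (3, 2), "d4": (3, 3), "e4": (4, 3), "d7": (3, 6)}
neighbors = {"d2": ["d3", "d7"], "d3": ["d4"], "d4": ["e4"], "d7": ["e4"]}
route = shortest_route("d2", "e4", coords, neighbors)
```

Here the long d2 to d7 link exists but is not used, because going through d3 and d4 is shorter in linear distance. Showing directly connected neighbors would be a separate lookup in `neighbors`.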

GOES

10:00 Meeting with Vadim
- We’re going to try to get a single RW to move the vehicle through two successive 90-degree maneuvers, then verify that everything is working correctly on the other RWs, then go to RW sets
2:00 Status meeting

Proposal

LMN returned the top 3 results from iacr.org, which turns out to be the International Association for Cryptologic Research
Found and downloaded some papers
Long chat with Aaron

Phil 6.22.20

Cornell University was having a sale, so I got a book:

Mental Territories

Rarely recognized outside its boundaries today, the Pacific Northwest region known at the turn of the century as the Inland Empire included portions of the states of Washington and Idaho, as well as British Columbia. Katherine G. Morrissey traces the history of this self-proclaimed region from its origins through its heyday. In doing so, she challenges the characterization of regions as fixed places defined by their geography, economy, and demographics. Regions, she argues, are best understood as mental constructs, internally defined through conflicts and debates among different groups of people seeking to control a particular area’s identity and direction. She tells the story of the Inland Empire as a complex narrative of competing perceptions and interests.

DtZ:

Changed the code so that there is a 30-day prediction based on the current rates regardless of trend. I think it tells the story of second waves better:

GPT-2 Agents

The ACSOS paper was rejected, so this is now the only path going forward for mapmaking research.
Used the known_nearest to produce a graph:

The graph on the left is the full graph, and the right is culled. First, note that node c is not in the second graph. There is no confirming link, so we don’t know if it’s an accident. Node e is also not on the chart, because it has no confirming link back through any 2-edge path.
Ok, I tried it for the first time on the chess data. There is a bug where [a-h] and [1-8] are showing up as nodes that I have to figure out. But they show up in the right way! Orthogonal and in order!

The bug seems to be in the way that List.extend() works. It seems to be splitting the string (which is a List, duh), and adding those elements as well? Nope, just doing one nesting too many
Ok, here are the first results. The first image is of all neighbors. The second is of only verified nearest neighbors (at least one edge chain of 2 that leads back to the original node)
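
A minimal sketch of the culling step, under one reading of "edge chain of 2": a link a -> b is kept only if b also links back to a, so the chain a -> b -> a has two edges. That reading is an assumption, and `verified_neighbors` and the toy `links` data are hypothetical rather than the known_nearest code.

```python
from typing import Dict, Set


def verified_neighbors(links: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Keep a -> b only when b also links back to a.

    Nodes left with no confirmed links are dropped from the result.
    """
    verified: Dict[str, Set[str]] = {}
    for node, targets in links.items():
        confirmed = {t for t in targets if node in links.get(t, set())}
        if confirmed:
            verified[node] = confirmed
    return verified


# Toy directed links: c is only pointed at, and e only points out
links = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "d": {"b"},
    "e": {"a"},
}
culled = verified_neighbors(links)
```

In the toy data, c and e both drop out because nothing confirms them, which is the same kind of behavior as the culled graph described above.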

In both cases, the large-scale features of the chessboard are visible. There is a progression from 1 to 8, and a to h. It seems clearer to me in the lower image, and the grid-like nature is more visible. I think I need to get the interactive manipulation working, because some of this could be drawing artifacts
Trying out the networkx_viewer. A little worried about this though:

networkxviewer

And rightly so:

kablooee

Going to try cloning and fixing. Nope. It is waaaaaaayyyyyy broken, and depends on an earlier version of networkx
Networkx suggests Gephi, and there is a way to export graphs from networkx. Trying that
Seems usable?

GOES

Kind of stuck. Waiting on Vadim
Probably will be working on a couple of SBIRs for the next few weeks

