Monthly Archives: November 2020

Phil 11.30.20

Call Verizon

Call stove repair – done

GPT-2 Agents

  • Upload db and corpora – done

GOES

  • Slides and presentation – done
  • 2:00 meeting with Vadim – delayed. I’m getting worried that he’s flailing again

SBIR

  • 10:00 meeting – done, waiting for names and emails

COE Meeting

  • I have my doubts that this is going to work. It seems more like an attempt by BD to get us to create a body of text that they can mine for proposals

Meeting with Aaron

  • Caught up on a lot of things

Phil 11.25.20

IRS stuff

Enjoying Google Research’s Verse by Verse

GPT-2 Agents

  • I think I want to put together a small command-line app that allows a discussion with the language model. All text from the ongoing conversation is saved and used as the input for the next generation. A nice touch would be to have some small number of responses to choose from, so the conversation follows that branch (a sketch follows this list).
  • Come to think of it, that could be a cool artificial JuryRoom/Eliza
  • Generate compact text for Sim to try training
  • Look into W2V 3D embedding of outputs, and mapping to adjacent outputs (The wo/man walked into the room). We know that there should be some level of alignment
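  • A rough sketch of what that conversation loop might look like, borrowing the ../models/chess_model path from the 11.20 code below; the generation parameters are placeholders, not settled choices:
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("../models/chess_model")
model = TFGPT2LMHeadModel.from_pretrained("../models/chess_model", pad_token_id=tokenizer.eos_token_id, from_pt=True)

conversation = ""  # everything said so far becomes the next prompt
while True:
    user_text = input("> ")
    if user_text == "quit":
        break
    conversation += user_text + " "
    input_ids = tokenizer.encode(conversation, return_tensors="tf")
    prompt_len = input_ids.shape[1]
    # sample a few candidate continuations and let the user pick the branch
    outputs = model.generate(input_ids, do_sample=True, top_k=50, max_length=prompt_len + 50, num_return_sequences=3)
    candidates = [tokenizer.decode(o.numpy()[prompt_len:], skip_special_tokens=True) for o in outputs]
    for i, c in enumerate(candidates):
        print("[{}]: {}".format(i, c))
    choice = int(input("branch: "))
    conversation += candidates[choice]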

GOES

  • Rehearse/record talk
  • Meeting with Vadim. He found Pyomo

Book

  • Working on the Attention + Dominance section

Phil 11.24.20

Anonymous 4: Alleluia: Gratulemur et letemur is lovely

PyTorch BigGraph is an Open Source Framework for Processing Large Graphs

  • Graphs are one of the fundamental data structures in machine learning applications. Specifically, graph-embedding methods are a form of unsupervised learning, in that they learn representations of nodes using the native graph structure. Training data in mainstream scenarios such as social media predictions, internet of things (IoT) pattern detection, or drug-sequence modeling are naturally represented using graph structures. Any one of those scenarios can easily produce graphs with billions of interconnected nodes. While the richness and intrinsic navigation capabilities of graph structures are a great playground for machine learning models, their complexity poses massive scalability challenges. Not surprisingly, the support for large-scale graph data structures in modern deep learning frameworks is still quite limited. Recently, Facebook unveiled PyTorch BigGraph, a new framework that makes it much faster and easier to produce graph embeddings for extremely large graphs in PyTorch models.

GOES

  • Add composite rotation vector to ddict output. It’s kind of doing what it’s supposed to
  • Think about a NN to find optimal contributions? Or a simultaneous solution of the scalars to produce the best approximation of the line? I think this is the way to go (a least-squares sketch of the scalar idea follows this list). I found pymoo: Multi-objective Optimization in Python
    • Our framework offers state of the art single- and multi-objective optimization algorithms and many more features related to multi-objective optimization such as visualization and decision making. Going to ask Vadim to see if it can be used for our needs
  • MORS talk, headshots, slides, etc
  • 11:00 meeting with Vadim
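  • Before handing this to pymoo, the simultaneous-scalars idea can be sanity-checked with plain least squares. A minimal numpy sketch, assuming each reaction wheel contributes a fixed spin-axis vector scaled by an unknown amount (the axes and target below are made up):
import numpy as np

# columns are the (hypothetical) reaction wheel spin axes
wheel_axes = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [0.577, 0.577, 0.577]]).T

# the composite rotation vector we want the wheels to reproduce
target = np.array([0.2, -0.5, 0.1])

# least-squares solution for the per-wheel scalar contributions
scalars, res, rank, sv = np.linalg.lstsq(wheel_axes, target, rcond=None)
print("scalars = {}".format(scalars))
print("approximation = {}".format(wheel_axes.dot(scalars)))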

GPT-2 Agents

Phil 11.23.20

Call Jim Donnie’s! Sheesh – they are booked until January. Going to bring it by for someone to look at

Got the reviews back from AAMAS. Not as bad as SASO, but not great. I think I’ll write a rebuttal because why not? And it’s only 800 words

Call for Papers for Special Issue: Artificial Speakers – Philosophical Questions and Implications

  • With the increasing ubiquity of natural language processing (NLP) algorithms, interacting with “conversational artificial agents” such as speaking robots, chatbots, and personal assistants will be an everyday occurrence for most people. In a rather innocuous sense, we can perform a variety of speech acts with them, from asking a question to telling a joke, as they respond to our input just as any other agent would.

Book

  • Write some of the “Attention + Dominance” paper/chapter outline for Antonio. It’s important to mention that these are monolithic models. It could be a nice place for the Sanhedrin 17a discussion too.

GOES

  • Rework primary_axis_rotations.py to use least-squares. It’s looking pretty good!
https://viztales.files.wordpress.com/2020/11/image-11.png
  • Folding into sim.
https://viztales.files.wordpress.com/2020/11/replayer_11_23_20.gif
  • It’s still not right, dammit! I’m beginning to wonder if the rwheels are correct? Wheels 1 and 4 are behaving oddly, and maybe 3. It’s like they may be spinning the wrong way?
  • Nope, it looks like it is the way the reaction wheel contributions are being calculated?
https://viztales.files.wordpress.com/2020/11/image-12.png
  • I think this is enlightening. There seems to be some kind of interplay between the computed rotation and the approximation based on the rwheels:
https://viztales.files.wordpress.com/2020/11/image-13.png

Phil 11.22.20

The Language Interpretability Tool (LIT): Interactive Exploration and Analysis of NLP Models

  • With these challenges in mind, we built and open-sourced the Language Interpretability Tool (LIT), an interactive platform for NLP model understanding. LIT builds upon the lessons learned from the What-If Tool with greatly expanded capabilities, which cover a wide range of NLP tasks including sequence generation, span labeling, classification and regression, along with customizable and extensible visualizations and model analysis. (GitHub)

Phil 11.20.20

From the Washington Post this morning

Book

  • Read and annotate Michelle’s outline, and add something about attention. That’s also the core of my response to Antonio
  • More cults
  • 2:00 Meeting
  • Thinking about how design must address American Gnosticism and the dangers and opportunities of online “research”, and also how things like maps and diversity injection can potentially make profound impacts

GOES

  • Update test code to use least squares/quaternion technique
  • Look into the celluloid package for animating pyplot (a minimal sketch follows this list)
  • 3:00 Meeting
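  • A minimal celluloid sketch, assuming the usual Camera workflow (the plotted data is a stand-in for the frame drawing):
import matplotlib.pyplot as plt
from celluloid import Camera

fig = plt.figure()
camera = Camera(fig)
for i in range(10):
    plt.plot([0, i], [0, i], color='blue')  # stand-in for the real frame plot
    camera.snap()  # capture the current artists as one animation frame
anim = camera.animate()
anim.save('replay.gif', writer='pillow')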

GPT-2 Agents

import tensorflow as tf
# pip install git+https://github.com/huggingface/transformers.git
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

# options are (from https://huggingface.co/transformers/pretrained_models.html)
# 'gpt2' : 12-layer, 768-hidden, 12-heads, 117M parameters. # OpenAI GPT-2 English model
# 'gpt2-medium' : 24-layer, 1024-hidden, 16-heads, 345M parameters. # OpenAI’s Medium-sized GPT-2 English model
# 'gpt2-large' : 36-layer, 1280-hidden, 20-heads, 774M parameters. # OpenAI’s Large-sized GPT-2 English model
# 'gpt2-xl' : 48-layer, 1600-hidden, 25-heads, 1558M parameters. # OpenAI’s XL-sized GPT-2 English model
tokenizer = GPT2Tokenizer.from_pretrained("../models/chess_model")

# add the EOS token as PAD token to avoid warnings
# model = TFGPT2LMHeadModel.from_pretrained("../models/gpt2-medium", pad_token_id=tokenizer.eos_token_id)
model = TFGPT2LMHeadModel.from_pretrained("../models/chess_model", pad_token_id=tokenizer.eos_token_id, from_pt=True)

wte = model.transformer.wte
wpe = model.transformer.wpe
word_embeddings:tf.Variable = wte.weight  # Word Token Embeddings
print("\nword_embeddings.shape = {}".format(word_embeddings.shape))
terms = ['black', 'white', 'king', 'queen', 'rook', 'bishop', 'knight', 'pawn']
for term in terms:
    text_index_list = tokenizer.encode(term)
    print("\nindex for {} = {}".format(term, text_index_list))
    for ti in text_index_list:
        vec = word_embeddings[ti]
        print("{}[{}] = {}...{}".format(term, ti, vec[:3], vec[-3:]))
  • It gives the following results:
word_embeddings.shape = (50257, 768)
 index for black = [13424]
 black[13424] = [ 0.1466832  -0.03205131  0.13073246]…[ 0.03556942  0.2691384  -0.15679955]
 index for white = [11186]
 white[11186] = [ 0.01213744 -0.08717686  0.09657521]…[-0.01646501  0.05803612 -0.14158668]
 index for king = [3364]
 king[3364] = [ 0.07679952 -0.36437798  0.04769149]…[-0.2532825   0.11794613 -0.22853516]
 index for queen = [4188, 268]
 queen[4188] = [ 0.01280739 -0.12996083  0.10692213]…[0.03401601 0.01343785 0.30656403]
 queen[268] = [-0.17423214 -0.14485645  0.04941033]…[-0.16350408 -0.10608979 -0.03318951]
 index for rook = [305, 482]
 rook[305] = [ 0.08708595 -0.13490516  0.17987011]…[-0.17060643  0.07456072  0.04632077]
 rook[482] = [-0.07434712 -0.01915449  0.04398194]…[ 0.02418434 -0.06441653  0.26534158]
 index for bishop = [27832]
 bishop[27832] = [-0.05137009 -0.11024677  0.0080909 ]…[-0.02372078  0.00158158 -0.08555448]
 index for knight = [74, 3847]
 knight[74] = [ 0.10828184 -0.20851855  0.2618368 ]…[0.10234124 0.1747297  0.15052234]
 knight[3847] = [-0.15940899 -0.14975397  0.13490209]…[ 0.01935775  0.056772   -0.08009521]
 index for pawn = [79, 3832]
 pawn[79] = [-0.02358418 -0.18336709  0.08343078]…[ 0.23536623  0.06735501 -0.13106383]
 pawn[3832] = [0.12719391 0.05303555 0.12345099]…[-0.15112995  0.14558738 -0.05049708]

Phil 11.19.20

GPT-2 Agents

  • Looks like we are getting close to ingesting all the new data
  • Had a meeting with Ashwag last night (Note – we need to move the time), and the lack of ‘story-ness’ in the training set is really coming out in the model. The meta information works perfectly, but it’s wrapped around stochastic tweets, since there is no threading. I think there needs to be some topic structure in the meta information that allows similar topics to be grouped sequentially in the training set.
  • 3:30 Meeting

GOES

  • 9:30 meeting
  • Update code with new limits on how small a step can be. Done, but I’m still having problems with the normals. It could be because I’m normalizing the contributions?
https://viztales.files.wordpress.com/2020/11/replayer_11_19_20.gif
  • Switching to a least-squares approach – done?!
import numpy as np
from pyquaternion import Quaternion

# primary and secondary are Nx3 arrays of corresponding points in the two frames
# pad/unpad convert between 3D points and homogeneous coordinates
pad = lambda x: np.hstack([x, np.ones((x.shape[0], 1))])
unpad = lambda x: x[:, :-1]

# least-squares fit of the affine transform A that maps primary onto secondary
A, res, rank, sv = np.linalg.lstsq(pad(primary), pad(secondary), rcond=None)
transform = lambda x: unpad(np.dot(pad(x), A))

print("\nTarget:")
print(secondary)
print("\nResult:")
print(transform(primary))
print("\nMax error: \n{}".format(np.abs(secondary - transform(primary)).max()))
print("\nA = \n{}".format(A))

# the upper-left 3x3 of A is the rotation part (assuming no scale or shear)
Ap = A[:3, :3]
print("\nrotation matrix = \n{}".format(Ap))

print("getting quaternion")
q = Quaternion(matrix=Ap)
print("got quaternion")
print("Axis = {}".format(q.get_axis()))
print("Degrees = {}".format(q.degrees))

Book

  • More cults. Tying together Jonestown and Moby-Dick seems to be working better than what I was doing before

Phil 11.18.20

GPT-2 Agents

  • Good meeting last night. The next action items are to
    • Figure out how to build embeddings from generated tweets (a rough sketch follows this list)
    • Finish updating database
    • Create new training set
    • Train small and medium models using current and compact meta frame (something like [[month, location, retweets]], or some similar easy regex)
  • Need to register for Emerging Techniques forum: mors.org/Events/Workshops/Emerging-Techniques-Forum
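  • One possible shape for the embedding step, strictly a sketch: mean-pool the static token embeddings from the model’s wte matrix (the same matrix pulled apart in the 11.20 code above). The chess_model path is borrowed from that code; the tweet model’s path would differ:
import numpy as np
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("../models/chess_model")
model = TFGPT2LMHeadModel.from_pretrained("../models/chess_model", pad_token_id=tokenizer.eos_token_id, from_pt=True)
word_embeddings = model.transformer.wte.weight  # (50257, 768)

def embed_tweet(text: str) -> np.ndarray:
    # average the token embedding vectors as a crude tweet-level embedding
    ids = tokenizer.encode(text)
    vecs = tf.gather(word_embeddings, ids)
    return tf.reduce_mean(vecs, axis=0).numpy()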

GOES

  • Need to fix the angle rollover in vehicle (and reference?) frames. I don’t think that it will fix anything though. I just don’t get why the satellite drifts after 70-ish degrees:
  • There is something not right in the normal calculation?
https://viztales.files.wordpress.com/2020/11/replayer_11_18_20.gif?w=986
  • I think the problem is going to be here. Need to dig into this in a scratch file:
def get_angle(self, v1: Tuple, v2: Tuple) -> Tuple[float, float]:
    # angle between two vectors from the normalized dot product
    dot = np.dot(v1, v2)
    v1norm = np.linalg.norm(np.array(v1))
    v2norm = np.linalg.norm(np.array(v2))
    v1n_v2n = v1norm * v2norm
    normdot = np.divide(dot, v1n_v2n)
    # r_angle = np.arccos(np.clip(dot, -1.0, 1.0))
    r_angle = np.arccos(normdot)  # suspect: normdot is never clipped to [-1, 1]
    d_angle = math.degrees(r_angle)
    # print("rads = {}, degrees = {}".format(r_angle, d_angle))
    return r_angle, d_angle
  • The small angle steps seem to also be causing a problem (a more stable sketch follows this list):
https://viztales.files.wordpress.com/2020/11/image-8.png
  • Need to finish my telecommute form
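  • For reference, a sketch of a more robust get_angle (not the production fix): either clip normdot into [-1, 1] before the arccos, or use atan2 of the cross/dot pair, which stays well conditioned at small angles:
import math
from typing import Tuple
import numpy as np

def get_angle_stable(v1: Tuple, v2: Tuple) -> Tuple[float, float]:
    a = np.array(v1, dtype=float)
    b = np.array(v2, dtype=float)
    # atan2(|a x b|, a . b) avoids arccos's precision loss near 0 and 180 degrees
    r_angle = math.atan2(float(np.linalg.norm(np.cross(a, b))), float(np.dot(a, b)))
    return r_angle, math.degrees(r_angle)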

Book

  • Writing cults section.

Phil 11.17.20

GOES

  • Got my panda3D text positioning working. I create a node and then attach the text to it, but I don’t think that’s needed; all you have to do is use render rather than the a2d* options. Here’s how it works:
for n in self.tuple_list:
    ls: LineSegs = self.lp.get_LineSeg(n)
    node = TextNode(n)
    node.setText(n)
    tnp: NodePath = self.render.attachNewNode(node)
    # tnp.set_pos(-1, ypos, 1)
    tnp.setScale(0.1)
    self.text_node_dict[n] = tnp
  • I then access the node paths through the dict

Book

  • Spent a good chunk of the morning discussing the concept of dominance hierarchies and how they affect networks with Antonio
  • Need to write some, dammit!

GOES

  • My abstract has been accepted at the Military Operations Research Society’s (MORS) four-day workshop in November!
  • More Replayer. Working on text nodes. Done! It looks good, and is pointing out some REALLY ODD THINGS. I mean, the reaction wheel axes are not staying with the vehicle frame…
  • 10:00 meeting with Vadim
  • The (well, a) problem was that the reaction wheel vectors weren’t being reset each time, so the multiplies accumulated. Fixed! Now we have some additional problems, but these may be more manageable:
https://viztales.files.wordpress.com/2020/11/image-6.png

GPT-2 Agents

  • Continuing to ingest tweets
  • 3:30 Meeting

Phil 11.16.20

How things are going

GPT-2 Agents

  • Ingesting more data. It may take a while

GOES

  • Continuing with Replayer
  • Got the points moving based on the spreadsheet. I need to label them. It looks pretty straightforward to use 3D positions? I may have to use Billboards to keep things pointing at the camera
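  • If Billboards are needed, Panda3D makes that a one-liner on the NodePath. A sketch, assuming a text NodePath like the tnp from the 11.17 code above:
# keep the label facing the camera regardless of scene rotation
tnp.setBillboardPointEye()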

Phil 11.13.20

Book

  • Look into Prof. Kristin Du Mez (Calvin University – @kkdumez)’s book (Jesus and John Wayne)?
  • More writing
  • Meeting with Michelle. Came up with an interesting tangent about programming/deprogramming wrt software development and cults

GPT-2 Agents

  • Adding an optional split regex to parse_substrings. Here’s how I wound up doing it. This turns out to be slightly tricky, because the lookahead matches are zero-width and don’t include the text we’re interested in, so we need a step that splits out the individual texts. We also throw away the leading fragment before the first match, and never append the trailing text after the last match, since both can be assumed to be incomplete (a toy run follows the code)
import re

# tweet_str is the raw generated text blob
split_regex = re.compile(r"(?=On [a-zA-Z]+ of [0-9]+,)")
split_iter = split_regex.finditer(tweet_str)
start = 0
tweet_list = []
for s in split_iter:
    t = tweet_str[start: s.start()]
    tweet_list.append(t)
    start = s.start()
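  • Run against a toy string, the loop behaves like this (months made up):
tweet_str = "leading garbage On July of 2020, first tweet. On August of 2020, partial trail"
# running the loop above gives:
#   tweet_list == ["leading garbage ", "On July of 2020, first tweet. "]
# the segment from the last match onward is never appended, which is the
# deliberate throw-away of the trailing (probably truncated) tweet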
  • Shimei’s presentation went well!
  • Work on translation

GOES

  • Start on playback of the vehicle and reference frames

Phil 11.12.20

The Next Decade Could Be Even Worse

  • A historian believes he has discovered iron laws that predict the rise and fall of societies. He has bad news.

GPT-2 Agents

  • Tried Sim’s model, it’s very nice!
  • Created a base class for creating and parsing tweets
  • Found a regex that will find any text between two tokens. Thanks, stackoverflow! (A sketch with stand-in tokens follows the example below.)
  • Here’s an example. I need to look into how large the meta information should be before it starts affecting the trajectory
On July of 2020, @MikenzieCromwell (screen name "Mikenzie Cromwell", 838 followers) posted a tweet from Boston, MA. They were using Twitter Web App. The post had 0 replies, 0 quotes, 1 retweets, and 3 favorites. "An example of the importance of the #mentalhealth community's response to #COVID19 is being featured in the @WorldBank survey. Check out the latest #MentalHealthResponse survey data on the state of mental health services in the wake of the pandemic. https://t.co/9qrq4G4XJi" 
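  • The between-two-tokens pattern itself is just a non-greedy capture between two anchors. A sketch with stand-in tokens, run against a fragment of the example above:
import re

s = 'They were using Twitter Web App. The post had 0 replies, 0 quotes, 1 retweets, and 3 favorites.'
# non-greedy: grab everything between "using " and ". The post"
m = re.search(r'using (.*?)\. The post', s, re.DOTALL)
if m:
    print(m.group(1))  # -> Twitter Web App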

GOES

  • More Replayer
  • Got the vertex manipulation! It’s hard to get at it through the geometry, but if you just save the LineSegs object,
ls = LineSegs(name)
self.prim_dict[name] = ls
  • you can manipulate that directly
ls:LineSegs = self.lp.get_LineSeg("test_line")
ls.setVertex(1, x, y, z)
  • Meeting with Vadim at 10:00. We found some pretty bad code that sets the torques on the reaction wheels

Book

  • Write and send letters – done!
  • More Moby Dick

Phil 11.11.20

I did something bad with my data yesterday. This is the correct version (I hope)

https://public.flourish.studio/visualisation/4303726/

GPT-2 Agents

  • Generating some results for Friday
  • Splitting the results on the probes. It looks like the second tweet in a series is better formed. That kind of makes sense, because the second tweet is based on the first. That leads to an interesting idea: maybe we should try building chains of text, using the result from the previous generation as the prompt for the next
  • Generating text with 1000 chars and parsing it, throwing away the first and last element in the list. I can also parse out the tweet, location, and sentiment:
[1]: In December of 2019, @svsvzz (21046 followers, 21784 following) posted a retweet, mentioning @ArticleSpot. It was sent from Saudi Arabia. @svsvzz wrote: "RT @ArticleSpot New update # Comment_study is coming..". The retweet was categorized as "Neutral". 
     Location = Saudi Arabia
     Sentiment = Neutral
     Tweet = RT @ArticleSpot New update # Comment_study is coming..
 [2]: In December of 2019, @HussainALhamad (2340 followers, 29 following) posted a retweet, mentioning @ejazah_ksa. It was sent from Riyadh, Saudi Arabia. @HussainALhamad wrote: "RT @ejazah_ksa Poll: Do you support #Suspension of studying in #Riyadh tomorrow, Monday? If you support (Retweet) If you do not support (Like)". The retweet was categorized as "Positive". 
     Location = Riyadh, Saudi Arabia
     Sentiment = Positive
     Tweet = RT @ejazah_ksa Poll: Do you support #Suspension of studying in #Riyadh tomorrow, Monday? If you support (Retweet) If you do not support (Like)
 [3]: In December of 2019, @mahfouz_nour (11 followers, 57 following) posted a tweet. She wrote: "♪ And the rest of the news about a news that the study was suspended in the study ♪ ♪ And God bless you ♪ ♪ Now ♪". The tweet was categorized as "Negative". 
     Location = None
     Sentiment = Negative
     Tweet = ♪ And the rest of the news about a news that the study was suspended in the study ♪ ♪ And God bless you ♪ ♪ Now ♪
 [4]: In December of 2019, @tansh99huda99 (1211 followers, 519 following) posted a retweet, mentioning @HashKSA. @tansh99huda99 wrote: "RT @HashKSA # comments on Monday at all schools for students in #Dahan, and the decision does not include teachers' and teachers' levels.". The retweet was categorized as "Neutral". 
     Location = None
     Sentiment = Neutral
     Tweet = RT @HashKSA # comments on Monday at all schools for students in #Dahan, and the decision does not include teachers' and teachers' levels.

Created some slides. I think they look pretty good.

Phil 11.10.20

America Is a Lot Sicker Than We Wanted to Believe

  • Nearly half of the voters have seen Trump in all of his splendor—his infantile tirades, his disastrous and lethal policies, his contempt for democracy in all its forms—and they decided that they wanted more of it.

Added some code that makes it easier to compare countries and states and produced an animated GIF. I’m more concerned about Maryland now!

https://public.flourish.studio/visualisation/4302655/

Book

  • Letter to Stuart Kauffman
  • Letter to Frans de Waal

GPT-2 Agents

  • Created an animated GIF of English, Chinese, and Arabic countries for the Friday presentation
  • DB-based topic text extraction
  • 3:30 Meeting

GOES

  • Work on Replayer

Phil 11.9.20

Went down to DC yesterday. So weird to see the White House behind multiple sets of walls, like a US base in Afghanistan

Dentist at 3:00

GPT-2 Agents

  • Working on generating a new normalized data set. It needs to be much smaller to get results by the end of the week. Done. It takes a couple of passes through the data to get the totals needed for percentages (sketched after this list), but it seems to be working well
  • Restarted training
  • Topic extraction from Tweet content
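  • The two-pass shape of that normalization, sketched with hypothetical fields:
from collections import Counter

rows = [{"location": "Riyadh"}, {"location": "Jeddah"}, {"location": "Riyadh"}]

# pass 1: accumulate the totals
totals = Counter(row["location"] for row in rows)

# pass 2: emit percentages against those totals
for row in rows:
    row["location_pct"] = 100.0 * totals[row["location"]] / len(rows)
    print(row)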

GOES

  • Started working on a 3D view of what’s going on with the two frames. I think I’m just going to have to start over with a smaller codebase though, if Vadim can’t find what’s going on in his code.
  • 1:30 Meeting with Vadim