Category Archives: Phil

Phil 3.6.21

https://twitter.com/noahtren/status/1368114923956535296

Arkipelago.space is a searchable map of interesting things on the Internet. The content is taken from a web crawl of 70,000 webpages originating from high-quality, human-curated links via Curius.app. A neural network uses the text content of each page to determine which pages should appear near each other on the map.

It seems to be a bunch of students playing around with cool things.

Huggingface has lots of models to handle speech tagging!

Phil 3.5.21

This is a lot like self-attention in Transformers: How social learning amplifies moral outrage expression in online social networks

  • Moral outrage shapes fundamental aspects of human social life and is now widespread in online social networks. Here, we show how social learning processes amplify online moral outrage expressions over time. In two pre-registered observational studies of Twitter (7,331 users and 12.7 million total tweets) and two pre-registered behavioral experiments (N = 240), we find that positive social feedback for outrage expressions increases the likelihood of future outrage expressions, consistent with principles of reinforcement learning. We also find that outrage expressions are sensitive to expressive norms in users’ social networks, over and above users’ own preferences, suggesting that norm learning processes guide online outrage expressions. Moreover, expressive norms moderate social reinforcement of outrage: in ideologically extreme networks, where outrage expression is more common, users are less sensitive to social feedback when deciding whether to express outrage. Our findings highlight how platform design interacts with human learning mechanisms to impact moral discourse in digital public spaces.

Related: Democracy Is Weakening Right in Front of Us: Is technopessimism our new future?

Book

  • 2:00 Meeting with Michelle

GPT-Agents

  • Finish summary table – Mostly done. Needs tweaking
  • 3:30 Meeting

GOES

  • 11:00 Meeting
  • Continue working on data generation – generating faulty reaction-wheel (RW) sims!

Phil 3.4.21

I wonder if any crazy things are going to happen today? Capitol Police say intelligence shows militia group may be plotting to breach the Capitol

GPT-Agents

  • In EccoToXlsx, add code to iterate over all the samples from a prompt and add the selected token ranks for the selected columns to a summary Dict. Compute the mean and variance (95% intervals?), display the table, and draw a candlestick plot.
  • Set up a mapping directory in GPT-2 Agents. Do some test pulls using the Python API. I think the goal should be to populate a database similar to the gpt2_chess db table_moves (from, to, probe, response), combined with table_output from gpt_experiments (experiment_id, root_id, tag, before_regex, and after_regex).
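The summary-stats step above could be sketched roughly like this (the function name and the normal-approximation interval are my assumptions, not the actual EccoToXlsx code):

```python
import math
import statistics

# Hypothetical sketch (not the actual EccoToXlsx code): summarize the rank of
# a selected token across all samples from one prompt.
def summarize_ranks(ranks):
    mean = statistics.mean(ranks)
    var = statistics.variance(ranks)            # sample variance
    half = 1.96 * math.sqrt(var / len(ranks))   # normal-approx 95% half-width
    return {"mean": mean, "var": var,
            "ci95_low": mean - half, "ci95_high": mean + half}

summary = summarize_ranks([10, 12, 9, 11, 8])
```

Each per-column summary like this could then feed one candle (mean plus interval, with min/max as the wicks) in the candlestick plot.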

Book

  • Work on chapters

GOES

  • Work on fast sim
    • Finish moving code from frame3d_test file to FastRCSGenerator. Keep the plots too, just to make sure everything’s working. Done
    • Realized that the pitch/roll/yaw calculations were being done by ODE, so I had to get them back from the quaternion. It turns out that pyquaternion has yaw_pitch_roll(), but I can’t get to it? Added it to the VecData code
      • Figured it out. The @property decorator means no parens. You treat a method as a variable
    • I don’t think I’m incrementally updating the quaternion right.
    • Turns out I was rotating twice and storing the incremental steps as the rotations. Fixed!
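The @property gotcha above, as a minimal toy sketch (illustrative class, not pyquaternion’s internals):

```python
# Toy illustration of the @property note above; not pyquaternion's internals.
class Attitude:
    def __init__(self, yaw, pitch, roll):
        self._ypr = (yaw, pitch, roll)

    @property
    def yaw_pitch_roll(self):
        # @property means the method is read like an attribute: no parens.
        return self._ypr

att = Attitude(0.1, 0.2, 0.3)
yaw, pitch, roll = att.yaw_pitch_roll  # not att.yaw_pitch_roll()
```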

Phil 3.3.21

Panel Study Of The MAGA Movement

  • WaPo summary article: What explains MAGA supporters’ commitment to Trump and his conspiratorial and racist views? The answer is “status threat,” or the belief that one’s way of life or status is undermined by social and cultural change. As we’ve shown elsewhere, those who are attracted to reactionary movements like MAGA are often motivated by anxiety about possible cultural dispossession — seeing their social and cultural dominance eclipsed by other groups.

This is pretty cool! Not sure if it will work right, but…? Configure remote Python interpreters

Book

  • Work on chapters

GPT-Agents

  • Finished all the models!
  • Set up experiments that run through each model for each set of terms and set of probes. Batch size of 50

SBIR

GOES

  • Sitting in on GSAW keynote
  • Vadim has made progress! 11:00 Meeting
  • 2:00 Meeting
  • Work on fast sim
    • Created data_generators project in PyBullet
    • Copied ScriptReaderScratch to FastRCSGenerator
    • Copied over the classes in least_squares_rotations (VecData, Rwheel, Rwheels, and Frame3D) and made them their own files
    • Wrote up a frame3d_test file to exercise the classes and make sure that I haven’t broken anything. Everything still works!
  • Get connected to repo?
  • More on setting up a BERT-style (autoencoding) transformer for time series. Vector of sin waves at different frequencies first
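The sin-wave starting point above might look something like this hedged sketch (the names and the zero-masking scheme are my assumptions about the BERT-style setup):

```python
import numpy as np

# Hedged sketch of the autoencoding setup: sin waves at different
# frequencies, with random timesteps masked for the model to reconstruct.
def masked_sin_batch(freqs, steps=64, mask_frac=0.15, seed=None):
    rng = np.random.default_rng(seed)
    t = np.arange(steps) * 0.1
    clean = np.stack([np.sin(f * t) for f in freqs])  # (n_freqs, steps) targets
    mask = rng.random(clean.shape) < mask_frac        # True where input is hidden
    masked = np.where(mask, 0.0, clean)               # zero out masked positions
    return masked, clean, mask

masked, clean, mask = masked_sin_batch([1.0, 2.0, 4.0], seed=0)
```

The model would see `masked` and be scored on reconstructing `clean` at the `mask` positions, analogous to BERT’s masked-token objective.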

JuryRoom

  • 5:00 Meeting? Or just online?

Phil 3.2.21

Respond to Alden’s email – done

AI Coffee Break with Letitia

Gotta check out Graph Neural Networks!

GOES

  • Status report! Done!
  • Create a new class based on utils/ScriptReaderScratch that uses the code from least_squares_rotations.py to create data for training
  • Attend the GSAW welcome and overview at 11:50 – missed it
  • Create a more generic generator based on timeseriesML2\generators that will create a numpy ndarray of n-dimensional times series data. Could also use a Dataframe and have labels.
    • Randomized start, within a range
    • Adjustable noise
    • Adjustable time step
    • Different function for each row
    • Input file driven
    • Saves to csv (with a header that describes the data?) or an excel file for humans. Use the to_excel() code from EccoToXlsx for this
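A minimal sketch of the generator described above, assuming an illustrative name (make_series) and numpy for the ndarray; the csv/excel step would hang off the returned array:

```python
import numpy as np

# Hypothetical sketch (names are illustrative): each row is a different
# function of time, with a randomized start, adjustable noise and time step.
def make_series(funcs, steps, dt=0.1, noise=0.0, start_range=1.0, seed=None):
    rng = np.random.default_rng(seed)
    data = np.empty((len(funcs), steps))
    for i, f in enumerate(funcs):
        t0 = rng.uniform(0.0, start_range)              # randomized start
        t = t0 + np.arange(steps) * dt                  # adjustable time step
        data[i] = f(t) + rng.normal(0.0, noise, steps)  # adjustable noise
    return data

data = make_series([np.sin, np.cos], steps=100, dt=0.05, noise=0.01, seed=1)
```

Something like np.savetxt("series.csv", data, delimiter=",") would cover the csv side; wrapping the array in a labeled Dataframe and using to_excel() would cover the human-readable version.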

GPT Agents

  • Run an Ecco experiment and create spreadsheets using the chess data – done
https://viztales.com/wp-content/uploads/2021/03/image-3.png
  • After that, back up the gpt_experiments and commit to svn – done
  • Make sure that the following are on the laptop for the 3:00 Meeting -done
    • updated gpt_experiments
    • small_feb2021
  • Uploading trained models to svn. When the last one is done, zip the whole batch and put it on DropBox
  • I think I know how to contribute to a project that I’m not a member of. I need to clone the project to my repo and work on that version. When I’m at a state that I like, I can do a pull request. That means there will be one version of the source project in External and my branch in Sandboxes

Phil 3.1.21

I reran my monthly COVID-19 visualizations. Here’s my sample of countries. The UK is at the top of the ‘badly handled’ cluster, which includes the USA, Italy, Sweden, France and Switzerland. Germany is a bit better, and Canada really seems to be keeping things under control. The bottom cluster ranges from Finland to Senegal to China. Effective policy doesn’t seem to be related to government, wealth, population or location:

https://public.flourish.studio/visualisation/4504138/

And here’s all 50 states plus territories. I switch between Republican and Democratic governors at the end. You can see that there’s not much difference except for Georgia. Something has gone horribly wrong there:

https://public.flourish.studio/visualisation/4303726/

GPT Agents

  • Running Ecco trend analysis with the new model that Sim made
    • I think there is a multiple embedding problem that we’ll need to address.
    • It looks really good though…
https://viztales.com/wp-content/uploads/2021/03/image-1.png
  • Still training monthly models. At October 2020 now. It takes a bit under 10 hours to train most models

Phil 2.26.21

Pick up RV! done

GPT Agents

  • Working on turning the rank matrix into an EccoTrendAnalytics class – done!
  • Need to make a ‘json’ tag for table_output and load in the ETA dict
  • New version of Ecco out. I need to merge and fold in my changes
  • Still training the April model – done! On to May
  • 3:30 Meeting. We played with the GPT-3 a lot

GOES

  • 11:00 Meeting with Erik & Vadim. Continue working on creating data using the LS model. Architect and train a classifier. Demo the yaw flip to show capability and then focus on Nadir Point.

SBIR

  • 11:30 Meeting to finalize report. Done

Book:

  • 2:00 Meeting with Michelle. More pitch organization

Phil 2.25.21

GPT Agents

  • Continuing to dig into GPT-3 prompt metaprogramming
  • Now training the March model. That’s been running for about 10 hours so far. Finished around 5:00. On to April!
  • Had a short chat with Jay about changes to Ecco and how to submit a pull request through the GitHub website. Maybe I did it right?
  • Working on updating experiment code to handle new format – done
  • Adding json outputs for ecco data into gpt_experiments
  • Got sequence data working:
  • Made it a dataframe
         ' pawn'  ' rook'  ' knight'  ' bishop'  ' queen'  ' king'
 knight        1        2          4          3         5        6
 from         36       29          7         13        19       33
 c            10       11          9         13        14       12
4            264     1208        696        865       887      372
 to          668     3314        486        513       325      533
 e            37      160         14         44       102       53
5            944     3567        452       4937      2836     4243
.           1361     3926        933        149      2472     1468
 Black      2512     3164       1508       1604      1974     1925
 moves         7       57         21         49        37       46
 bishop        5        4          2          1         3        6
 from         11        9          7          5        15       23
 b            13       11         12         10        14        9
7           2324     1244        788       1449      2228     1252

Phil 2.24.21

GPT Agents:

  • Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
    • Prevailing methods for mapping large generative language models to supervised tasks may fail to sufficiently probe models’ novel capabilities. Using GPT-3 as a case study, we show that 0-shot prompts can significantly outperform few-shot prompts. We suggest that the function of few-shot examples in these cases is better described as locating an already learned task rather than meta-learning. This analysis motivates rethinking the role of prompts in controlling and evaluating powerful language models. In this work, we discuss methods of prompt programming, emphasizing the usefulness of considering prompts through the lens of natural language. We explore techniques for exploiting the capacity of narratives and cultural anchors to encode nuanced intentions and techniques for encouraging deconstruction of a problem into components before producing a verdict. Informed by this more encompassing theory of prompt programming, we also introduce the idea of a metaprompt that seeds the model to generate its own natural language prompts for a range of tasks. Finally, we discuss how these more general methods of interacting with language models can be incorporated into existing and future benchmarks and practical applications.
  • Language models are 0-shot interpreters
    • In this post, I present evidence that the efficacy of 0-shot prompts for GPT-3 has been underestimated, and that more powerful models are more effective at deriving information from 0-shot prompts, while less powerful models have greater need for examples on equivalent tasks. From this evidence, I extrapolate three principal claims:
      • Few-shot prompts are not always an efficient or necessary means of task specification for GPT-3.
      • For some tasks, such as translation between well-known languages, GPT-3 is a 0-shot interpreter – a short task description or signifier suffices to invoke its full capabilities.
      • 0-shot performance scales with model size more drastically than few-shot performance, suggesting that 0-shot task specification will become a more important prompting strategy as language models increase in capability.
  • Started training the January 2020 model
  • It looks like Sim got the new-format model trained? It’s up through Feb 21. Need to adjust the query code and parser and do some runs for the queries we discussed last night, as well as month/year prompts. And combos, e.g. October 2020 USA [[COVID-19 happened because
  • Here’s the first try:

SBIR

  • Working on the status report. I’ll distribute tomorrow for input to the financial section

GOES

  • 2:00 Meeting
  • Still waiting on Vadim to get the reaction wheel efficiency in the right place and inertialess reset

JuryRoom

  • Reading “Purakau: Maori Myths Retold by Maori Writers”. Some interesting perspectives on group problem solving and education, particularly in the story ‘Rata’, by Hemi Kelly:
    • ‘“Sail towards the rising sun,” she instructed him, “there you will find Pariroa, the home of Matuku.” After saying this, she handed Rata an old toki, “You will need this to fashion your waka.”’
    • “You didn’t recite the correct karakia – or in fact any karakia. Instead you carelessly chopped down your ancestor, a child of Tāne, for your own gain without offering anything in return.”
    • ‘It’s the same with our rongoā. Anybody can go and pick a leaf and eat it but it’s the process we follow that makes it right. It’s the time we go, the area we visit and the careful selection. The most important thing, though, is our acknowledgement of Tāne through karakia, as it’s the karakia that gives the rongoā its healing properties that make us better.’

Phil 2.23.21

GOES

  • Register for GSAW – done

SBIR

  • More status report

GPT Agents

  • Started digging into the GPT-3 documentation. They have a playground which lets you interactively try prompts on the different models. I think this knowledge could be pulled out in a pretty straightforward way through multiple probes and regexes. Here are some examples:
The great religions of the world are:

Judaism

Christianity

Islam

Hinduism

Buddhism

Sikhism

Jainism

Confucianism

Shinto

A list of the closest religions to Judaism:

Christianity (30%)

Islam (30%)

Buddhism (5%)

Sikhism (5%)

Hinduism (3%)

A list of the closest religions to Christianity:

Judaism

Islam

Hinduism

Buddhism

Agnosticism

Atheism

Christianity

Orthodox

Catholic

Theism

God
  • Note that the Judaism and Christianity lists support each other. This could look a lot like the original Java mapping code?
  • It does not know about the pandemic (prompt is bold): “coronavirus is a member of the Coronaviridae family, which includes animals and birds as known hosts. The virus is a single-stranded, positive-sense RNA virus with a genome of approximately 30 kb. The genome is organized into three segments: S, M, and L.
  • 3:00 meeting today
    • See if I can train up monthly models
    • Create prompts and evaluate their default output
    • Run prompts with Ecco for ranking with our relative terms
    • We’re going to try for the social sensing workshop:
      • The social sensing workshop (started in 2015) is a multidisciplinary meeting place that brings together social scientists and computer scientists, interested in social media analysis, around research that interprets social media as measurement instruments. Social media democratized information production offering an unprecedented view into human habits, customs, culture, stances, and indeed descriptions of physical events that transpire in the world. They also give unprecedented opportunities to spread misinformation, influence opinion, distract from the truth, or advance specific agendas, hidden or overt. The potential of social media to influence populations has brought about an interest in understanding information operations; namely, coordinated efforts on social media meant to alter people’s opinions, emotions, or understanding of events. What are scientific foundations for modeling this new communication, measurement, and influence channel? How to utilize information media signals to better understand social systems, communities, and each other? How to identify and mitigate misuse of this medium? What specifically can one measure or influence, what underlying theoretical framework allows one to do so, and what applications are enabled by the endeavor?  Since measurement and influence operations are well-studied in many physical domains, what can one learn from the physical domain (e.g., from the signal processing literature) to enable novel social media analysis methods? This scope brings about new interdisciplinary research challenges and opportunities at the intersection of communication and sensing, social network analysis, information theory, data mining, natural language processing, artificial intelligence, and social sciences. 
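The “multiple probes and regex” idea from the GPT-3 playground notes above could be sketched like this, assuming list-style completions such as “Christianity (30%)” (the pattern and function names are illustrative):

```python
import re

# Illustrative pattern for completions like "Christianity (30%)".
PAT = re.compile(r"^(?P<name>[A-Za-z ]+?)\s*\((?P<pct>\d+)%\)\s*$")

def parse_ranked_list(text):
    # Pull (name, percent) pairs out of a model's list-style completion.
    results = []
    for line in text.splitlines():
        m = PAT.match(line.strip())
        if m:
            results.append((m.group("name"), int(m.group("pct"))))
    return results

sample = "Christianity (30%)\nIslam (30%)\nBuddhism (5%)"
pairs = parse_ranked_list(sample)
```

Lists without percentages would need a second, looser pattern, but the probe-then-regex loop is the same.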

Phil 2.22.21

Next year this date will be very exciting!

Replace rear tire!

Shopping! Done

Book

  • 2:00 Meeting with Michelle
  • Need to add something about Nomads by choice and nomads by circumstance. One is pathfinding, and the other is abandonment/expulsion

SBIR

  • Bi-monthly report

GPT Agents

  • I just got on the OpenAI GPT-3 beta!
  • Train a single month using the new data and the small model
  • Set up few-shot training by selecting all the tweets that contain the phrase %xxxx% in the corpora and subsampling as needed to the desired number of examples. Then generate and store the desired number of results
  • Use Ecco to illustrate selected examples?
  • Update Ecco and issue a pull request – done!
    • Return the dict if html_output = False
    • html_output = True in args
    • Here’s the non-html version
hello, ecco
2021-02-22 09:02:11.376728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
<LMOutput 'one, two, three, four, five, five, six, six, seven, seven, eight, seven, ten, twelve,' # of lm outputs: 7>
{'tokens': [{'token': 'one', 'token_id': 505, 'type': 'input', 'value': '0.22258912', 'position': 0}, {'token': ',', 'token_id': 11, 'type': 'input', 'value': '0.064860135', 'position': 1}, {'token': ' two', 'token_id': 734, 'type': 'input', 'value': '0.15291078', 'position': 2}, {'token': ',', 'token_id': 11, 'type': 'input', 'value': '0.075403504', 'position': 3}, {'token': ' three', 'token_id': 1115, 'type': 'input', 'value': '0.21873675', 'position': 4}, {'token': ',', 'token_id': 11, 'type': 'input', 'value': '0.04952843', 'position': 5}, {'token': ' four', 'token_id': 1440, 'type': 'input', 'value': '0.17954932', 'position': 6}, {'token': ',', 'token_id': 11, 'type': 'input', 'value': '0.03642196', 'position': 7}, {'token': ' five', 'token_id': 1936, 'type': 'output', 'value': '0', 'position': 8}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 9}, {'token': ' five', 'token_id': 1936, 'type': 'output', 'value': '0', 'position': 10}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 11}, {'token': ' six', 'token_id': 2237, 'type': 'output', 'value': '0', 'position': 12}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 13}, {'token': ' six', 'token_id': 2237, 'type': 'output', 'value': '0', 'position': 14}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 15}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'value': '0', 'position': 16}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 17}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'value': '0', 'position': 18}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 19}, {'token': ' eight', 'token_id': 3624, 'type': 'output', 'value': '0', 'position': 20}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 21}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'value': '0', 'position': 22}, {'token': ',', 
'token_id': 11, 'type': 'output', 'value': '0', 'position': 23}, {'token': ' ten', 'token_id': 3478, 'type': 'output', 'value': '0', 'position': 24}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 25}, {'token': ' twelve', 'token_id': 14104, 'type': 'output', 'value': '0', 'position': 26}, {'token': ',', 'token_id': 11, 'type': 'output', 'value': '0', 'position': 27}], 'attributions': [[0.22258912026882172, 0.06486013531684875, 0.15291078388690948, 0.07540350407361984, 0.21873675286769867, 0.04952843114733696, 0.17954932153224945, 0.03642195835709572], [0.21059739589691162, 0.08506321907043457, 0.10517895966768265, 0.06970299780368805, 0.0833553671836853, 0.06893090158700943, 0.11408285796642303, 0.06513398140668869, 0.19795432686805725], [0.18917866051197052, 0.05591090768575668, 0.11437422782182693, 0.06194028630852699, 0.14940014481544495, 0.05109493061900139, 0.13605113327503204, 0.04323190823197365, 0.16186146438121796, 0.036956313997507095], [0.1881534308195114, 0.06915824860334396, 0.09677033871412277, 0.05285586044192314, 0.07482346147298813, 0.05150040239095688, 0.08954611420631409, 0.052106279879808426, 0.09847243875265121, 0.07199375331401825, 0.15461969375610352], [0.17256230115890503, 0.051994744688272476, 0.09864256531000137, 0.0542512908577919, 0.12412769347429276, 0.048067208379507065, 0.11856795847415924, 0.04466724395751953, 0.115085169672966, 0.03303493186831474, 0.10736697167158127, 0.0316319540143013], [0.1601886749267578, 0.061091333627700806, 0.07975628972053528, 0.04445934668183327, 0.06458994001150131, 0.04427047073841095, 0.06633348762989044, 0.04812125489115715, 0.06941992044448853, 0.04545144364237785, 0.05678795278072357, 0.04917420074343681, 0.2103556990623474], [0.15204396843910217, 0.045879215002059937, 0.08276450634002686, 0.04710154980421066, 0.10755057632923126, 0.04356013983488083, 0.10087976604700089, 0.04368861764669418, 0.1039900854229927, 0.03419572487473488, 0.09188895672559738, 
0.018628299236297607, 0.10064685344696045, 0.027181735262274742], [0.15202827751636505, 0.054488517343997955, 0.07370518893003464, 0.03688199818134308, 0.06367962807416916, 0.03571085259318352, 0.06372495740652084, 0.03937176987528801, 0.0647173747420311, 0.04080101102590561, 0.04775563254952431, 0.03407425060868263, 0.08399699628353119, 0.05491986870765686, 0.15414364635944366], [0.14423254132270813, 0.043195948004722595, 0.07906319946050644, 0.04169945418834686, 0.0998333990573883, 0.03888460621237755, 0.09324592351913452, 0.040613241493701935, 0.10006645321846008, 0.03402014449238777, 0.08717798441648483, 0.02316855825483799, 0.06652144342660904, 0.01834581419825554, 0.0651322454214096, 0.024798991158604622], [0.13373175263404846, 0.04726942256093025, 0.06662306189537048, 0.03134704381227493, 0.055253904312849045, 0.029025832191109657, 0.05654153227806091, 0.03190178796648979, 0.05787787586450577, 0.03615624085068703, 0.04139142856001854, 0.03552539646625519, 0.05441423878073692, 0.03426354378461838, 0.05245133861899376, 0.0413157157599926, 0.19490987062454224], [0.1345134824514389, 0.04032375290989876, 0.07328703999519348, 0.03726156800985336, 0.08852741122245789, 0.03443199023604393, 0.08518599718809128, 0.03616797551512718, 0.09120426326990128, 0.031830523163080215, 0.07864660024642944, 0.023800017312169075, 0.05850181728601456, 0.02105688489973545, 0.05589877441525459, 0.015616844408214092, 0.06888125836849213, 0.02486378327012062], [0.12652075290679932, 0.04134056717157364, 0.06051903963088989, 0.02772645838558674, 0.050938140600919724, 0.025345684960484505, 0.05245476961135864, 0.026518667116761208, 0.05662158131599426, 0.030446955934166908, 0.044499896466732025, 0.03298814222216606, 0.05097997933626175, 0.03056175820529461, 0.04033765196800232, 0.02774038165807724, 0.07083491235971451, 0.043943196535110474, 0.15968146920204163], [0.12978246808052063, 0.03859121724963188, 0.06958351284265518, 0.03402606025338173, 0.08373469114303589, 0.030794424936175346, 
0.07956207543611526, 0.032011084258556366, 0.08528966456651688, 0.02837086282670498, 0.07092872262001038, 0.02181713469326496, 0.053476281464099884, 0.020540550351142883, 0.05198293551802635, 0.018952928483486176, 0.05511363595724106, 0.013460895046591759, 0.05456216633319855, 0.02741875685751438], [0.1180478110909462, 0.04020005092024803, 0.054879143834114075, 0.02707388810813427, 0.047450218349695206, 0.023739859461784363, 0.05038924515247345, 0.024456234648823738, 0.047715943306684494, 0.02747093327343464, 0.03706439211964607, 0.0308961383998394, 0.04147675260901451, 0.031551189720630646, 0.0328393280506134, 0.027200669050216675, 0.0456111878156662, 0.02625013142824173, 0.05143207311630249, 0.03938102349638939, 0.17487379908561707], [0.1171923577785492, 0.034699417650699615, 0.06537913531064987, 0.03106089122593403, 0.0792543962597847, 0.027469279244542122, 0.0716739073395729, 0.02862895093858242, 0.07993895560503006, 0.025614075362682343, 0.06548977643251419, 0.020967641845345497, 0.05116114020347595, 0.02104913629591465, 0.05117170512676239, 0.021118970587849617, 0.05328201875090599, 0.017415646463632584, 0.0483008548617363, 0.010480904020369053, 0.0580831877887249, 0.020567581057548523], [0.10331424325704575, 0.03044593520462513, 0.0515039786696434, 0.020523859187960625, 0.04673514515161514, 0.01886308752000332, 0.04609629884362221, 0.018619094043970108, 0.04607436805963516, 0.020929865539073944, 0.03714223951101303, 0.02411222830414772, 0.04459238797426224, 0.024786628782749176, 0.033377647399902344, 0.023412106558680534, 0.061772845685482025, 0.022222455590963364, 0.050952207297086716, 0.029886750504374504, 0.05736793577671051, 0.043550651520490646, 0.14371803402900696], [0.11561653017997742, 0.03434549644589424, 0.06183779239654541, 0.029705578461289406, 0.07546865940093994, 0.026097867637872696, 0.06843607127666473, 0.027166275307536125, 0.07738236337900162, 0.024367135018110275, 0.06257404386997223, 0.019834842532873154, 0.04871196299791336, 
0.019919248297810555, 0.04681717976927757, 0.020517172291874886, 0.049287013709545135, 0.018004678189754486, 0.0437239371240139, 0.01232482586055994, 0.04339848831295967, 0.008752784691751003, 0.04222355782985687, 0.02348649688065052], [0.10086216032505035, 0.03523547574877739, 0.04474911466240883, 0.02235623449087143, 0.03697269409894943, 0.0194859616458416, 0.04178309068083763, 0.018883418291807175, 0.04281749948859215, 0.020283740013837814, 0.037377286702394485, 0.022277342155575752, 0.03504293039441109, 0.02316695638000965, 0.02779640257358551, 0.02297920361161232, 0.033881187438964844, 0.022866230458021164, 0.03282133489847183, 0.02020248956978321, 0.041888438165187836, 0.025234123691916466, 0.04621865972876549, 0.0418705977499485, 0.18294738233089447], [0.09987115859985352, 0.028910968452692032, 0.053824856877326965, 0.025705233216285706, 0.06502629071474075, 0.02236601710319519, 0.05963760241866112, 0.022843429818749428, 0.06642232835292816, 0.021395988762378693, 0.054698407649993896, 0.018247412517666817, 0.04362887516617775, 0.017933566123247147, 0.04240269213914871, 0.018407726660370827, 0.0463126040995121, 0.017563357949256897, 0.04160071536898613, 0.015998894348740578, 0.03565698862075806, 0.014985213056206703, 0.04064951464533806, 0.0129097243770957, 0.0890578106045723, 0.023942623287439346], [0.0809876024723053, 0.03257425129413605, 0.037600934505462646, 0.02191905491054058, 0.03524129465222359, 0.018587255850434303, 0.04728523641824722, 0.017956718802452087, 0.036090362817049026, 0.018309801816940308, 0.03293214365839958, 0.018760766834020615, 0.03602268546819687, 0.019794179126620293, 0.03185930848121643, 0.020158812403678894, 0.02658083476126194, 0.01986663229763508, 0.02638169191777706, 0.019547127187252045, 0.03760283440351486, 0.02026677317917347, 0.0282907634973526, 0.02577134408056736, 0.0830618366599083, 0.037606652826070786, 0.1689431220293045]]}
C:\Program Files\Python\lib\site-packages\sklearn\decomposition\_nmf.py:1077: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  " improve convergence." % max_iter, ConvergenceWarning)
{'tokens': [{'token': 'one', 'token_id': 505, 'type': 'input', 'position': 0}, {'token': ',', 'token_id': 11, 'type': 'input', 'position': 1}, {'token': ' two', 'token_id': 734, 'type': 'input', 'position': 2}, {'token': ',', 'token_id': 11, 'type': 'input', 'position': 3}, {'token': ' three', 'token_id': 1115, 'type': 'input', 'position': 4}, {'token': ',', 'token_id': 11, 'type': 'input', 'position': 5}, {'token': ' four', 'token_id': 1440, 'type': 'input', 'position': 6}, {'token': ',', 'token_id': 11, 'type': 'input', 'position': 7}, {'token': ' five', 'token_id': 1936, 'type': 'output', 'position': 8}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 9}, {'token': ' five', 'token_id': 1936, 'type': 'output', 'position': 10}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 11}, {'token': ' six', 'token_id': 2237, 'type': 'output', 'position': 12}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 13}, {'token': ' six', 'token_id': 2237, 'type': 'output', 'position': 14}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 15}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'position': 16}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 17}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'position': 18}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 19}, {'token': ' eight', 'token_id': 3624, 'type': 'output', 'position': 20}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 21}, {'token': ' seven', 'token_id': 3598, 'type': 'output', 'position': 22}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 23}, {'token': ' ten', 'token_id': 3478, 'type': 'output', 'position': 24}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 25}, {'token': ' twelve', 'token_id': 14104, 'type': 'output', 'position': 26}, {'token': ',', 'token_id': 11, 'type': 'output', 'position': 27}], 'factors': [[[1.1749050617218018, 0.0, 4.174020796199329e-05, 0.0, 0.0, 
0.0, 0.0, 1.6842181139509194e-05, 1.6842181139509194e-05, 7.907724648248404e-05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00021821660629939288, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.583216549595818e-05, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0006136916927061975, 0.0, 0.0, 0.03745964169502258, 0.004776454996317625, 0.004776454996317625, 0.08487999439239502, 0.0, 0.15284782648086548, 0.006533266510814428, 0.19210265576839447, 0.0, 0.23522864282131195, 0.0, 0.26809075474739075, 0.0, 0.2898506820201874, 0.0, 0.2553434669971466, 0.0005055475048720837, 0.30294179916381836, 0.005593298468738794, 0.03264965862035751, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.419406771659851, 0.0, 1.559943437576294, 0.0009293985203839839, 0.0009293985203839839, 1.383615255355835, 0.014789805747568607, 0.9984501600265503, 0.0, 0.707604169845581, 0.0, 0.4685404300689697, 0.0, 0.2941336929798126, 0.0, 0.13417471945285797, 0.009529122151434422, 0.10319297760725021, 0.007372288033366203, 0.0, 0.0, 0.06743992865085602, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.008554508909583092, 0.0, 0.0334680899977684, 0.0, 0.15881435573101044, 0.0, 0.23637165129184723, 2.5085502784349956e-05, 0.27648571133613586, 0.002731950953602791, 0.31169313192367554, 0.0, 0.26685473322868347, 0.0], [0.0, 0.0, 0.0, 0.003048897720873356, 0.0, 0.0, 0.0, 0.11994873732328415, 0.11994873732328415, 0.00020270798995625228, 0.43330657482147217, 0.0153672369197011, 0.6093799471855164, 0.01629449985921383, 1.021621823310852, 0.016422705724835396, 1.1039609909057617, 0.0, 0.6456769704818726, 0.0, 0.3512823283672333, 0.0, 0.09890929609537125, 0.0, 0.017450863495469093, 0.011394587345421314, 0.0, 0.0], [0.0, 0.0, 0.696977436542511, 0.006282013840973377, 0.2083570659160614, 0.0, 0.060352765023708344, 0.0, 0.0, 0.0, 0.0012489393120631576, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0014060646062716842, 0.0015231686411425471, 0.00017680390737950802, 0.004380697384476662, 0.0, 0.0006365026929415762, 0.0, 0.0, 0.0, 
0.029077952727675438], [0.0, 0.0, 0.0, 0.0, 0.0, 0.5207799673080444, 0.0, 0.6182433366775513, 0.6182433366775513, 0.00682591088116169, 0.5392492413520813, 0.0012549938401207328, 0.4196029305458069, 0.0, 0.1323639303445816, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.042940836399793625, 0.0, 0.0, 0.0, 0.0986984446644783, 0.0], [0.0, 0.0, 0.0, 0.0015103527111932635, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00701142055913806, 0.0, 0.0, 0.0, 0.022583989426493645, 0.0, 0.007366797886788845, 0.00075104262214154, 0.0, 0.0, 0.0035612466745078564, 0.0, 0.13789713382720947, 0.0, 0.018509462475776672, 0.0, 0.7861870527267456, 0.023602478206157684, 0.8958369493484497], [1.92419270206301e-06, 0.8122134208679199, 0.0042172386310994625, 0.1016063317656517, 0.0, 0.0, 0.0, 0.0, 0.0, 0.004280794877558947, 0.030012501403689384, 0.0, 0.011906291358172894, 0.0, 0.0022146401461213827, 0.0, 0.0, 0.0, 0.0005024807178415358, 0.00017763646610546857, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.021231088787317276, 0.0016654416685923934], [0.0, 0.0, 0.0, 1.7516762018203735, 0.003883163910359144, 0.8880190253257751, 0.017067980021238327, 0.3859412372112274, 0.3859412372112274, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.01619700714945793, 0.0, 0.0, 0.0, 0.0, 0.0, 0.042748916894197464, 0.0, 0.025438468903303146, 0.0, 0.0, 0.0]]]}
{'input_tokens': ['one', ',', ' two', ',', ' three', ',', ' four', ',', ' five', ',', ' five', ',', ' six', ',', ' six', ',', ' seven', ',', ' seven', ',', ' eight', ',', ' seven', ',', ' ten', ',', ' twelve', ','], 'output_tokens': ["'one'", "','", "' two'"], 'rankings': array([[ 655,   12,   16],
       [ 380,   17,   29],
       [1365,    2,  184],
       [ 604,    1,  161],
       [ 797,    1,   66],
       [2141,    1,   98]])}

Phil 2.20.21

Huri Whakatau

Jason Edward Lewis is a digital media theorist, poet, and software designer. He is the University Research Chair in Computational Media and the Indigenous Future Imaginary as well as Professor of Computation Arts at Concordia University, Montreal. Born and raised in northern California, Lewis is Hawaiian and Samoan. (Publications)

Aboriginal Territories in Cyberspace is an Aboriginally determined research-creation network whose goal is to ensure Indigenous presence in the web pages, online environments, video games, and virtual worlds that comprise cyberspace.

The Initiative for Indigenous Futures (IIF) is a partnership of universities and community organizations dedicated to developing multiple visions of Indigenous peoples tomorrow in order to better understand where we need to go today.

From Interactions of the ACM: The Humboldt Cup: On narrative, taxonomies, and colonial violence:

  • Historian of science Londa Schiebinger [6] offers a compelling account of how the creation of racial and gender hierarchies has permeated the construction of biology and medicine as fields of knowledge. Engaging with the taxonomical system devised by Swedish naturalist Carl von Linnaeus, she points out that traits such as the breasts or the skull were subjected to processes of racialization and sexualization in attempts to produce arguments that would justify the subjugation of femininity and of all racialized peoples. Schiebinger argues that scientists were, in fact, fundamental actors in the colonizing process: In describing, classifying, taxonomizing, and representing this so-called new world, European powers sought to claim ownership over lands, peoples, flora, and fauna. Classifying entire groups of animals based on the presence of breasts was a choice; other characteristics could have been highlighted, such as the presence of hair [6]. White patriarchal domination was thus asserted through notebooks, measuring tools, pens, and paintbrushes just as much as it was through firearms.

Phil 2.19.21

Made cookies last night. I blame the weather.

Book

GOES

  • Harassment training

GPT Agents

  • Tweaked the Twitter queries.
  • Got all the Ecco parts working with the chess DB!
https://viztales.com/wp-content/uploads/2021/02/image-21.png

Phil 2.18.21

Book

  • Finished the Google Doodle. Next is the balloon challenge.

GOES

SBIR

  • Meeting on NN architecture

GPT Agents

  • Language Models are Few-Shot Learners (video). Lots of good stuff about how to build probes.
  • Was able to get the ecco library to work with my local model!
from transformers import AutoModelForCausalLM, AutoTokenizer
from ecco.lm import LM

# Flags controlling what ecco collects during generation
activations = False
attention = False
hidden_states = True
activations_layer_nums = None  # collect from all layers when activation collection is on

model_str = '../models/chess_model'
tokenizer = AutoTokenizer.from_pretrained(model_str)
model = AutoModelForCausalLM.from_pretrained(model_str,
                                             output_hidden_states=hidden_states,
                                             output_attentions=attention)

lm_kwargs = {
    'collect_activations_flag': activations,
    'collect_activations_layer_nums': activations_layer_nums}
lm = LM(model, tokenizer, **lm_kwargs)

# Input text
text = "Check."

# Generate 100 tokens to complete the input text.
output = lm.generate(text, generate=100, do_sample=True)

print(output)

Had a nice chat with Antonio about an introduction for the special issue

Phil 2.17.21

Today we’re excited to release a big update to the Galaxy visualization, an interactive UMAP plot of graph embeddings of books and articles assigned in the Open Syllabus corpus! (This is using the new v2.5 release of the underlying dataset, which also comes out today.) The Galaxy is an attempt to give a 10,000-meter view of the “co-assignment” patterns in the OS data – basically, which books and articles are assigned together in the same courses. By training node embeddings on the citation graph formed from (syllabus, book/article) edges, we can get really high-quality representations of books and articles that capture the ways in which professional instructors use them in the classroom – the types of courses they’re assigned in, the other books they’re paired with, etc.

https://galaxy.opensyllabus.org/#!viewport/-17.7366/15.2999/-10.5617/8.5290
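The idea of embedding books from (syllabus, book) co-assignment edges can be sketched with a toy example. This is a minimal illustration, not the Open Syllabus pipeline: it uses a plain truncated SVD of a made-up syllabus/book incidence matrix in place of trained node embeddings and UMAP, and all syllabus and book names are invented.

```python
import numpy as np

# Made-up (syllabus, book) assignment edges: two math syllabi, two
# literature syllabi, and one syllabus bridging the two domains.
edges = [
    ("syll1", "calculus"), ("syll1", "linear_algebra"),
    ("syll2", "calculus"), ("syll2", "linear_algebra"),
    ("syll3", "hamlet"), ("syll3", "macbeth"),
    ("syll4", "hamlet"), ("syll4", "macbeth"),
    ("syll5", "calculus"), ("syll5", "hamlet"),
]
sylls = sorted({s for s, _ in edges})
books = sorted({b for _, b in edges})

# Bipartite incidence matrix: rows = syllabi, cols = books
A = np.zeros((len(sylls), len(books)))
for s, b in edges:
    A[sylls.index(s), books.index(b)] = 1.0

# A truncated SVD gives low-dimensional book vectors whose geometry
# reflects co-assignment: books assigned together land close.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
emb = Vt[:2].T * S[:2]  # 2-d embedding per book

def dist(b1, b2):
    return np.linalg.norm(emb[books.index(b1)] - emb[books.index(b2)])

# Co-assigned pairs should sit closer than cross-domain pairs.
print(dist("calculus", "linear_algebra"), dist("calculus", "macbeth"))
```

The real system trains node embeddings on a much larger graph and projects them with UMAP, but the principle is the same: co-assignment structure becomes spatial proximity.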

Working with Antonio on an introduction to the journal

GPT Agents

# to get the week number (in the range 0-53) from a date

select count(*) as COUNT, extract(WEEK from created_at) as WEEK
from twitter_root
where text like '%chinavirus%'
group by extract(WEEK from created_at)
order by extract(WEEK from created_at);

# to drill down from month to week and from week to day for March and April, since they have a significantly larger volume of data

select count(*) as COUNT, date(created_at) as DATE
from twitter_root
where text like '%chinavirus%'
and (month(created_at) = 3 or month(created_at) = 4)
and year(created_at) = 2020
group by date(created_at)
order by date(created_at);
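The same week-then-day drill-down can be sketched in plain Python. This is a toy illustration with made-up tweet dates, not the twitter_root table; note also that Python's `isocalendar()` uses ISO week numbering, which can differ from MySQL's `WEEK()` depending on the mode.

```python
from collections import Counter
from datetime import date

# Made-up (date, text) rows standing in for the real tweet table
tweets = [
    (date(2020, 3, 2), "stop calling it the chinavirus"),
    (date(2020, 3, 3), "chinavirus trending again"),
    (date(2020, 4, 10), "more chinavirus chatter"),
    (date(2020, 4, 10), "unrelated tweet"),
]

# Filter like the SQL "where text like '%chinavirus%'"
matches = [(d, t) for d, t in tweets if "chinavirus" in t]

# Group by ISO week number, then drill down to individual days
by_week = Counter(d.isocalendar()[1] for d, _ in matches)
by_day = Counter(d for d, _ in matches)
print(by_week)  # week number -> count
print(by_day)   # date -> count
```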

Figured out how to do this:

create or replace view long_text_view as
select tr.row_id, tr.created_at,
case when tr.extended_tweet_row_id <> 0 then et.full_text else tr.text end as long_text
from twitter_root tr
left join extended_tweet et on tr.extended_tweet_row_id = et.row_id;
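The view's fallback logic (prefer the extended tweet's full_text when a row links to one, else the root text) can be sketched in Python. The rows and the extended-text lookup here are made up for illustration, not the actual schema contents.

```python
# Hypothetical extended-tweet lookup: row_id -> full_text
extended = {7: "the full 280-character text of tweet two"}

# Hypothetical twitter_root rows; extended_tweet_row_id == 0 means
# the tweet was not truncated and has no extended record.
rows = [
    {"row_id": 1, "text": "short text", "extended_tweet_row_id": 0},
    {"row_id": 2, "text": "truncated text...", "extended_tweet_row_id": 7},
]

# Mirror of the view's CASE expression
long_text = [
    extended[r["extended_tweet_row_id"]] if r["extended_tweet_row_id"] != 0
    else r["text"]
    for r in rows
]
print(long_text)
```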

This is kind of cool. The "chinavirus" term is the most peaky but low volume. The "sars-cov-2" term is low volume but flatter, and the more common terms are pretty similar, with gentler peaks and slower falloff. I want to know what caused that dip around week 9, and the peak at week 5, in the chinavirus plot.

GOES

SBIR

  • 10:00 Meeting