So yesterday the oven broke, and today it's the fridge.
GPT-2 Agents
- Try finetuning the large model on the large dev machine. While trying out the gpt2-large model (because the gpt2-xl model didn't work), I hit an odd problem. When I tried to finetune using my locally saved copy of the model, the system choked on a missing pytorch_model.bin, which apparently doesn't get saved when you just want to use the model itself. My guess is that if you don't point at a local file, it will probably download and work, but my default drive is a small SSD, and I don't want to load it up.
- To see what was going on, I used a script that I had used before to find where transformers stores the model:
from transformers.file_utils import hf_bucket_url, cached_path

# Resolve the CDN URL for the model weights, then find the cached local copy
pretrained_model_name = 'gpt2-large'
archive_file = hf_bucket_url(
    pretrained_model_name,
    filename='pytorch_model.bin',
    use_cdn=True,
)
resolved_archive_file = cached_path(archive_file)
print(resolved_archive_file)
- That printed out the following:
C:\Users\Phil/.cache\torch\transformers\eeb916d81211b381b5ca53007b5cbbd2f5b12ff121e42e938751d1fee0e513f6.999a50942f8e31ea6fa89ec2580cb38fa40e3db5aa46102d0406bcfa77d9142d
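- As far as I can tell, that long filename is a hash of the download URL plus the file's ETag, which is why the cached weights don't end in .bin.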
- After renaming that file to pytorch_model.bin and moving it into my model directory, the finetuning is now working!
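- For reference, a minimal sanity check for that kind of local model directory (a sketch; the directory name is a placeholder). from_pretrained() just needs the config, weights, and tokenizer files together in one place:

import os
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hypothetical local directory holding the finetuned model
model_dir = './models/gpt2-large-finetuned'

# The files from_pretrained() expects to find locally
for fname in ['config.json', 'pytorch_model.bin', 'vocab.json', 'merges.txt']:
    print(fname, os.path.exists(os.path.join(model_dir, fname)))

tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
model = GPT2LMHeadModel.from_pretrained(model_dir)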
- At around 10:30 last night, the checkpoints had filled up my 1TB data drive! Tried a bunch of things to restart from a checkpoint. Pointing the model at the checkpoint directory seems to be the right answer, but it was missing the vocab.json and merges.txt files. Tried pulling those over from the original model, but now I get a:
Traceback (most recent call last):
  File "run_language_modeling.py", line 355, in <module>
    main()
  File "run_language_modeling.py", line 319, in main
    trainer.train(model_path=model_path)
  File "D:\Program Files\Python37\lib\site-packages\transformers\trainer.py", line 621, in train
    train_dataloader = self.get_train_dataloader()
  File "D:\Program Files\Python37\lib\site-packages\transformers\trainer.py", line 417, in get_train_dataloader
    train_sampler = self._get_train_sampler()
  File "D:\Program Files\Python37\lib\site-packages\transformers\trainer.py", line 402, in _get_train_sampler
    if self.args.local_rank == -1
  File "D:\Program Files\Python37\lib\site-packages\torch\utils\data\sampler.py", line 104, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
- Not sure what to do next. Going to try restarting and cleaning out the earlier checkpoints as the code runs.
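- If I'm reading that stack trace right, num_samples=0 means the RandomSampler was handed an empty training dataset, so the Trainer had nothing to sample. One quick check would be to build the dataset the same way run_language_modeling.py does and count the examples (a sketch; the checkpoint and training-file paths are placeholders, and block_size should match whatever the run used):

from transformers import GPT2Tokenizer, TextDataset

# Load the tokenizer from the checkpoint directory (placeholder path)
tokenizer = GPT2Tokenizer.from_pretrained('./path/to/checkpoint')

# Build the dataset the way run_language_modeling.py does and count examples;
# zero examples here would explain num_samples=0
dataset = TextDataset(
    tokenizer=tokenizer,
    file_path='./train.txt',  # placeholder for the actual training file
    block_size=512,
)
print(len(dataset))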
- Another thing that I'm thinking about is the non-narrative nature of the tweets, due to the lack of threading, so I also pulled down the Kaggle dataset of Trump rally speeches and am going to see if I can use that. I think they are particularly interesting because Trump is very attuned to the behavior of the crowd during a rally and will "try out" lines to see if they work, adjusting what he talks about. The speeches should reflect what his base is thinking over the time period.
- Need to start thinking about a short presentation for Nov 13
GOES
- Figure out how to taper the beginning and end of the reference frame rotation
- Add method to adjust the RW contributions. Look at the original spreadsheet and see what the difference is
- Added the tapering and fooled around a lot, exploring how the system behaves. I think the next step is to see why the vehicle doesn't recover its pitch.
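- Since the tapering code isn't shown here, this is a minimal sketch of the kind of ramp I mean: a cosine (Hann-style) ease-in/ease-out applied to the rotation rate. The function name, window length, and rate are all made up for illustration:

import numpy as np

def taper_weights(n_samples: int, ramp: int) -> np.ndarray:
    """Weights that ramp 0 -> 1 over the first `ramp` samples,
    hold at 1, then ramp 1 -> 0 over the last `ramp` samples."""
    w = np.ones(n_samples)
    rise = 0.5 * (1.0 - np.cos(np.linspace(0.0, np.pi, ramp)))  # smooth 0 -> 1
    w[:ramp] = rise
    w[-ramp:] = rise[::-1]  # smooth 1 -> 0
    return w

# Example: ease a constant rotation rate in and out over a 1000-step maneuver
rates = 0.01 * taper_weights(n_samples=1000, ramp=100)  # rad/step, hypothetical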
