# Phil 7.6.21

The idea of the Western construct of time as a source of neurosis came up yesterday. I found this, which kind of supports the idea. It also ties into the thing that we’re trying to work out with indigenous software practices, which might not have the same focus on scheduling and individual adherence to a schedule.

• Temporal experience as a core quality in mental disorders
• The goal of this paper is to introduce Phenomenology and the Cognitive Sciences’ thematic issue on disordered temporalities. The authors begin by discussing the main reason for the neglect of temporal experience in present-day psychiatric nosologies, mainly, its reduction to clock time. Methodological challenges facing research on temporal experience include addressing the felt sense of time, its structure, and its pre-reflective aspects in the life-world setting. In the second part, the paper covers the contributions to the thematic issue concerning temporal experience in anxiety, depression, mania, addiction, post-traumatic stress disorder, autism, and in recovery from psychosis. The authors argue in favor of integrative and cross-disciplinary approaches. In conclusion, they present time as a significant aspect of human suffering.

GPT-Agents

• The model finished training on Thursday, and I got the model putting values into table_synth_review, with an entry in the experiment table as well
• Today, do ten 1,000-review runs and compare. Then compare to the actual data
• Got some nice internally consistent runs!
3:00 Meeting
• Sentiment analyzer on actual data. Test whether predicted sentiment matches stars
• Binary threshold accuracy?
• Correlation analysis of confidence vs stars?
• Does the sentiment analyzer work? Evaluate sentiment analyzer vs star ratings?
• Does ground truth sentiment distribution in GPT data match ground truth distribution in the data?
• Does predicted sentiment distribution in GPT data match ground truth distribution in the data?
• Does predicted sentiment distribution in GPT data match predicted distribution in the data?
• Here’s more data:
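A rough sketch of how the checks from the meeting notes might be coded, using only the standard library. Everything here is an assumption for illustration: the `POSITIVE`/`NEGATIVE` label names, the 3-star cutoff, the function names, and the toy values, which are not real results. Total variation distance is just one possible way to compare two label distributions.

```python
from collections import Counter
from math import sqrt


def stars_to_label(stars, threshold=3):
    """Map a star rating to a binary label. The cutoff of 3 is an assumption."""
    return "POSITIVE" if stars > threshold else "NEGATIVE"


def binary_accuracy(predicted_labels, stars, threshold=3):
    """Fraction of reviews whose predicted label matches the star-derived label."""
    hits = sum(
        label == stars_to_label(s, threshold)
        for label, s in zip(predicted_labels, stars)
    )
    return hits / len(stars)


def pearson(xs, ys):
    """Pearson correlation, e.g. signed analyzer confidence vs. stars.

    Raises ZeroDivisionError if either input has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def label_distribution(labels):
    """Relative frequency of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two label distributions (0 = identical)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Toy values for illustration only, not real results
real_stars = [5, 4, 1, 2, 3, 5]
real_predicted = ["POSITIVE", "POSITIVE", "NEGATIVE", "NEGATIVE", "POSITIVE", "NEGATIVE"]
synth_predicted = ["POSITIVE", "NEGATIVE", "NEGATIVE", "POSITIVE", "POSITIVE", "POSITIVE"]

accuracy = binary_accuracy(real_predicted, real_stars)
gap = total_variation(
    label_distribution(real_predicted), label_distribution(synth_predicted)
)
```

The same `total_variation` call would cover each of the three distribution questions by swapping in ground truth or predicted labels for the GPT and real data.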

SBIR

9:00 Sprint review (slides!)
• Phase 2 proposal – updated. Now I need to fill in some initial working text
4:00 Tagup

# Phil 7.5.21

After sprinting to finish a pile of writing, and before the next sprint, I took a looooong weekend!

• First detected in 2015, the NSO Group’s Pegasus malware has reportedly been used in at least 45 countries worldwide to infect the phones of activists, journalists and human rights defenders. Having learnt that our former collaborators and close associates were hacked by Pegasus, Forensic Architecture undertook 15 months of extensive open-source research, interviews assisted by Laura Poitras, and developed bespoke software to present this data as an interactive 3D platform, along with video investigations narrated by Edward Snowden to tell the stories of the individuals targeted and the web of corporate affiliations within which NSO is nested. Supported by Amnesty International and the Citizen Lab, our analysis reveals relations and patterns between separate incidents in the physical and digital sphere, demonstrating how infections are entangled with real world violence, and extend within the professional and personal networks of civil society actors worldwide.

# Phil 6.30.21

SBIR(s)

• Roll in Clay’s last-minute changes
\caption[position=bottom]{Title of Figure\label{fig:F1}}

Submit!

Book

• Keep on adding content to new project

GPT Agents

• Welcome to this ‘101 course’ on getting started with academic research using the Twitter API. The objective of this course is to help academic researchers learn how to get Twitter data using the new Twitter API v2.
• Meaningful measures of human society in the twenty-first century
• Science rarely proceeds beyond what scientists can observe and measure, and sometimes what can be observed proceeds far ahead of scientific understanding. The twenty-first century offers such a moment in the study of human societies. A vastly larger share of behaviours is observed today than would have been imaginable at the close of the twentieth century. Our interpersonal communication, our movements and many of our everyday actions, are all potentially accessible for scientific research; sometimes through purposive instrumentation for scientific objectives (for example, satellite imagery), but far more often these objectives are, literally, an afterthought (for example, Twitter data streams). Here we evaluate the potential of this massive instrumentation—the creation of techniques for the structured representation and quantification—of human behaviour through the lens of scientific measurement and its principles. In particular, we focus on the question of how we extract scientific meaning from data that often were not created for such purposes. These data present conceptual, computational and ethical challenges that require a rejuvenation of our scientific theories to keep up with the rapidly changing social realities and our capacities to capture them. We require, in other words, new approaches to manage, use and analyze data.
• Start testing model when it’s ready
• Still chunking along:

# Phil 6.29.21

Stewardship of Ourselves

• The first (and perhaps foremost) of my concerns is the impact that the perturbation of our social dynamics may have on our collective cognitive abilities. In Cognitive Democracy, Henry Farrell and Cosma Shalizi make the (credible) case that democracy is intrinsically better at solving complex problems (of the kind that have rugged solution landscapes) than markets or hierarchies/bureaucracies.
• I am far more concerned with how uniform these algorithms are across huge populations. The underlying insight that explains why diverse groups are better at complex problems is that a diverse set of intellectual tools and viewpoints will be better at finding solutions on a rugged landscape. In mediating so much of humankind’s discovery through the tiny funnel of a handful of systems, we are creating an unprecedented impoverishment of our intellectual toolbox. I am far less concerned about filter bubbles than I am about turning a complex, likely scale-free network of discovery into a fully-mediated hub-and-spokes structure in which everything flows through a system of very limited variety.

SBIR

• Roll in Clay’s changes today!
Done????

Book

• Start putting all the chapters in the same place. Take out all the placeholders and let’s see what we have

GPT Agents

• Start training
• yelp_American: started 7:55
3:00 Meeting

# Phil 6.28.21

I had a pretty wild dream last night. I was working at Google building physical neural networks. I think we were precipitating them out of a metallic semiconductor solution. My sense is that it was something where the input buffer was the cathode and the output was the anode. The finished systems were placed in mineral oil tanks, so they were basically artificial brains in a bucket. They looked something like this:

Netron is a viewer for neural network, deep learning and machine learning models.

Sentence Transformers in the Hugging Face Hub

• Sentence Transformers is a framework for sentence, paragraph and image embeddings. This allows to derive semantically meaningful embeddings (1) which is useful for applications such as semantic search or multi-lingual zero shot classification. As part of Sentence Transformers v2 release, there are a lot of cool new features:

Start enjoying those backcountry roads with the OHV 3″ lift kit engineered specifically for the RAM ProMaster chassis. Out of the factory, the van is naturally lower on the front end, a 3″ lift on the front axle, and a 2.25″ lift in the rear helps level out the vehicle. In addition to greater clearance, the lift kit also increases protection for any gear you may have mounted underneath. The ProMaster lift kit is truly designed to let you go anywhere.

GPT Agents

• Create a view for reviews and businesses. Done
• Search for types and start pulling out reviews + stars. Done. Here’s the estimate of the number of rows based on the number of rows it takes to get to 100 samples:

The same info as a chart:

• Once I figure that out start making training corpora. I think I’ll stick to those cuisines that have more than 100k estimated reviews – code is done, running the queries and creating the test/train corpora. I’m adding useful_votes, funny_votes, and cool_votes for some more ground truth numbers to look at. The format should work for Excel too, so the stats can be computed from there
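The row estimate described above could be sketched like this. It assumes reviews for a given cuisine are spread roughly evenly through the table, so the number of rows scanned to reach 100 samples can be scaled up to the full table. The function names, the table size, and the scan counts are all hypothetical.

```python
def estimate_matching_rows(total_rows, rows_scanned, samples_found=100):
    """Scale the hit rate seen while scanning up to the whole table."""
    if rows_scanned <= 0:
        raise ValueError("rows_scanned must be positive")
    return round(total_rows * samples_found / rows_scanned)


def select_cuisines(scan_counts, total_rows, min_reviews=100_000):
    """Keep cuisines whose estimated review count exceeds min_reviews."""
    estimates = {
        cuisine: estimate_matching_rows(total_rows, scanned)
        for cuisine, scanned in scan_counts.items()
    }
    return {c: n for c, n in estimates.items() if n > min_reviews}


# Hypothetical numbers: rows scanned before 100 matching reviews were found
total_rows = 8_000_000
scan_counts = {"American": 2_000, "Ethiopian": 400_000}

selected = select_cuisines(scan_counts, total_rows)
```

If the table is ordered in a way that clusters cuisines together, the estimate will be off, so this is only a first pass for deciding which corpora are worth building.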

Book

• Roll in Upendra’s changes
• Start updating Overleaf

SBIR

4:00 Tagup. Ping Andreea

# Phil 6.27.21

Learning to hesitate

• We investigate how people make choices when they are unsure about the value of the options they face and have to decide whether to choose now or wait and acquire more information first. In an experiment, we find that participants deviate from optimal information acquisition in a systematic manner. They acquire too much information (when they should only collect little) or not enough (when they should collect a lot). We show that this pattern can be explained as naturally emerging from Fechner cognitive errors. Over time participants tend to learn to approximate the optimal strategy when information is relatively costly.
• Overall, participants make their decisions too quickly (sample too little information) when information is relatively cheap. Inversely, they hesitate too long (sample too much information) when information is relatively expensive. They stop after approximately 9 draws in the $0.10 treatment, 7 draws in the $0.50 treatment and 4 draws in the $1 treatment. In the lower cost treatment, this average is below the theoretical prediction, and in the two other costs treatments it is above it. The average stopping time is significantly different from the theoretical one in each treatment (p<0.001 for a Wilcoxon signed-rank test in $0.10 and $1 treatments, p=0.0085 in the $0.50 treatment).

Book

• People are drawn to the easy and to the easiest side of the easy. But it is clear that we must hold ourselves to the difficult, as it is true for everything alive. Everything in nature grows and defends itself in its own way and against all opposition, straining from within and at any price, to become distinctively itself. It is good to be solitary, because solitude is difficult, and that a thing is difficult must be even more of a reason for us to undertake it.

SBIR

• Read whatever I finished writing on Friday and edit/fix – done!

# Phil 6.25.21

Book

• Went up to Princeton yesterday to have lunch with Barna Donovan to discuss conspiracy theories and the GPT-3. Fun!
• 2:00 Meeting with Michelle. Need to move all the content over to Overleaf

SBIR

• Writing – done with the first draft!

GPT-Agents

• Ingest the DB on the ML box
• Ping David McClure about GPT-3 Mapping? He used AllenNLP for this, which looks like it’s worth a deep dive.

# Phil 6.23.21

Look! A map with beliefs!

Book

• Add some content on modern maps – Done!

GPT Agents

• Changing the primary key to row_id. Slow! Done!

SBIR

10:00 Meeting – all kinds of IT-related Zoom and Google Meet problems
• Working on final report. Working on folding the first two reports into the methods/results narrative

# Phil 6.22.21

Back from a long weekend off with some interesting adventures. And I managed to break the RV again. Need to contact Jim Donnie’s today and get something scheduled. Done! Drop off tomorrow.

Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Time series analysis is an essential component of Data Science and Engineering work at industry, from understanding the key statistics and characteristics, detecting regressions and anomalies, to forecasting future trends. Kats aims to provide the one-stop shop for time series analysis, including detection, forecasting, feature extraction/embedding, multivariate analysis, etc. Kats is released by Facebook’s Infrastructure Data Science team. It is available for download on PyPI.

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

• We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes at a random location of a large image. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects. We bring the improvement upon previous arts by 3.1 AUCs when learning representations from scratch. By transfer learning on pretrained representations on ImageNet, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches to allow localizing defective areas without annotations during training.

Search

Andrej Karpathy (Tesla): CVPR 2021 Workshop on Autonomous Vehicles

SBIR

• Starting to really dig into the phase 1 final report. Trying to make it useful as a source for a conference paper

Book

• Continue on the Diana section. Done?

GPT-2 Agents

• The re-indexing is done! Need to change the primary key to row_id and generate some text
3:00 Meeting

# Phil 6.16.21

The BBC did a show on using outside technology rather than HR as a way of tracking harassment. The idea seems to be that you can report an incident, and if the app detects enough activity around one person (which can be anonymously reported), then the company can be contacted. This sounds a lot like a model for trustworthy anonymous citizen journalism. Need to look into it some more.

Also, I realize that this is in some ways coordination without (explicit) communication. It could also be a framework for a more updated form of collective action such as unions. The issue to solve with that would be the ability to negotiate with the union as a whole, rather than an individual negotiator.
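A minimal sketch of the threshold idea, as I understand it from the show and not how any actual app works: reports about a person are held until enough distinct reporters have filed one, and only then is anything escalated. The class name, the threshold of 3, and the opaque reporter tokens are all assumptions.

```python
from collections import defaultdict


class ReportEscrow:
    """Hold reports until enough distinct reporters name the same subject."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        # subject_id -> set of opaque reporter tokens (no identities stored)
        self._reporters = defaultdict(set)

    def report(self, reporter_token, subject_id):
        """Record a report; return True once the threshold is reached."""
        self._reporters[subject_id].add(reporter_token)
        return len(self._reporters[subject_id]) >= self.threshold
```

Counting distinct tokens rather than raw reports means one person filing repeatedly cannot trigger escalation alone, which is where the coordination without explicit communication comes from.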

3pm – 3:40pm meeting with Upendra

JuryRoom

7:00 Waikato meeting

SBIR

• Abstract for 007 – done
• Abstract for 008 – done
• Cyber training – done. Painful!

# Phil 6.15.21

• During my time at Facebook, I saw compromised accounts functioning in droves in Latin America, Asia, and elsewhere. Most of these accounts were commandeered through autolikers: online programs which promise users automatic likes and other engagement for their posts. Signing up for the autoliker, however, requires the user to hand over account access. Then, these accounts join a bot farm, where their likes and comments are delivered to other autoliker users, or sold en masse, even while the original user maintains ownership of the account. Although motivated by money rather than politics — and far less sophisticated than government-run human troll farms — the sheer quantity of these autoliker programs can be dangerous.

Book

• Princess Di and others!
• Maybe a postscript?

SBIR

• First drafts of 008 and 007 are done! Time to clean up and edit. Done. Just need to do an abstract for each

GPT-Agents

• At 7.5M reviews indexed
• Need to bow out of meeting today, but maybe send a copy of the article draft?

# Phil 6.14.21

• Does contact across social groups influence sociopolitical behavior? This question is among the most studied in the social sciences with deep implications for the harmony of diverse societies. Yet, despite a voluminous body of scholarship, evidence around this question is limited to cross-sectional surveys that only measure short-term consequences of contact or to panel surveys with small samples covering short time periods. Using advances in machine learning that enable large-scale linkages across datasets, we examine the long-term determinants of sociopolitical behavior through an unprecedented individual-level analysis linking contemporary political records to the 1940 U.S. Census. These linked data allow us to measure the exact residential context of nearly every person in the United States in 1940 and, for men, connect this with the political behavior of those still alive over 70 years later. We find that, among white Americans, early-life exposure to black neighbors predicts Democratic partisanship over 70 years later.
• Diversity injection works!

SBIR

• Try to get a pass through 007
• Add Peter’s bits to 008

GPT-Agents

4:30 Meeting with Andreea

# Phil 6.11.21

Book

• More writing – send current state to Michelle this morning!
2:00 Meeting

SBIR

• More writing
• Done with the first draft of 008!

# Phil 6.10.21

Social Cooling is a name for the long-term negative side effects of living in a reputation economy

Big Sleep (Github) – Ryan Murdock has done it again, combining OpenAI’s CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.

Brain rhythms that help us to detect borders

• Oscillations in neuronal activity in the medial temporal lobe of the human brain encode proximity to boundaries such as walls, both when navigating while walking and when watching another person do so.
• Lots of very interesting projects here, too

Book

• More writing

SBIR(s)

9:15 standup
• cybersecurity training
• More writing