Category Archives: Phil

Phil 6.27.21

Learning to hesitate

  • We investigate how people make choices when they are unsure about the value of the options they face and have to decide whether to choose now or wait and acquire more information first. In an experiment, we find that participants deviate from optimal information acquisition in a systematic manner. They acquire too much information (when they should collect only a little) or not enough (when they should collect a lot). We show that this pattern can be explained as naturally emerging from Fechner cognitive errors. Over time, participants tend to learn to approximate the optimal strategy when information is relatively costly.
  • Overall, participants make their decisions too quickly (sample too little information) when information is relatively cheap. Inversely, they hesitate too long (sample too much information) when information is relatively expensive. They stop after approximately 9 draws in the $0.10 treatment, 7 draws in the $0.50 treatment and 4 draws in the $1 treatment. In the lowest-cost treatment, this average is below the theoretical prediction, and in the two other cost treatments it is above it. The average stopping time is significantly different from the theoretical one in each treatment (p<0.001 for a Wilcoxon signed-rank test in $0.10 and $1 treatments, p=0.0085 in the $0.50 treatment).
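The cost/accuracy tradeoff in the abstract can be illustrated with a toy model (this is a stylized sketch of my own, not the paper's model): if each draw costs c and the estimation error after n draws shrinks like sigma/sqrt(n), then the total penalty sigma/sqrt(n) + c*n is minimized at n* = (sigma/(2c))^(2/3), so the optimal number of draws falls as information gets more expensive.

```python
def optimal_draws(sigma: float, cost: float) -> float:
    """Toy optimal-stopping rule: minimize sigma/sqrt(n) + cost*n over n.

    Setting the derivative -sigma/2 * n**(-3/2) + cost to zero gives
    n* = (sigma / (2*cost)) ** (2/3).
    """
    return (sigma / (2.0 * cost)) ** (2.0 / 3.0)


# Higher sampling cost -> fewer optimal draws, matching the qualitative
# pattern in the experiment's three cost treatments.
draws = {c: optimal_draws(1.0, c) for c in (0.10, 0.50, 1.00)}
```

The exact exponent depends on the assumed error curve; the point is only the monotone decrease in optimal sampling as cost rises.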

Book

  • People are drawn to the easy and to the easiest side of the easy. But it is clear that we must hold ourselves to the difficult, as it is true for everything alive. Everything in nature grows and defends itself in its own way and against all opposition, straining from within and at any price, to become distinctively itself. It is good to be solitary, because solitude is difficult, and that a thing is difficult must be even more of a reason for us to undertake it.

SBIR

  • Read whatever I finished writing on Friday and edit/fix – done!

Phil 6.25.21

Book

  • Went up to Princeton yesterday to have lunch with Barna Donovan to discuss conspiracy theories and GPT-3. Fun!
  • 2:00 Meeting with Michelle. Need to move all the content over to Overleaf

SBIR

  • Writing – done with the first draft!

GPT-Agents

  • Ingest the DB on the ML box
  • Ping David Mclure about GPT-3 Mapping? He used AllenNLP for this which looks like it’s worth a deep dive.
https://twitter.com/clured

Phil 6.23.21

Drop truck off for service – done

Look! A map with beliefs!

https://twitter.com/baseballot/status/1407544690996547584

Book

  • Doing a readthrough
  • Add some content on modern maps – Done!

GPT Agents

  • Changing the primary key to row_id. Slow! Done!
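SQLite-style engines can't change a primary key in place, so the usual pattern is create-new / copy / drop / rename. A minimal sketch of that pattern, assuming SQLite and hypothetical column names (the diary doesn't say which engine or schema is actually in use):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Stand-in for the existing table with its old primary key.
    CREATE TABLE table_review (review_id TEXT PRIMARY KEY, body TEXT);
    INSERT INTO table_review VALUES ('a', 'great'), ('b', 'bad');

    -- Rebuild with row_id as the new integer primary key.
    CREATE TABLE table_review_new (
        row_id INTEGER PRIMARY KEY,   -- auto-assigned on insert
        review_id TEXT,
        body TEXT
    );
    INSERT INTO table_review_new (review_id, body)
        SELECT review_id, body FROM table_review;
    DROP TABLE table_review;
    ALTER TABLE table_review_new RENAME TO table_review;
""")
rows = con.execute(
    "SELECT row_id, review_id FROM table_review ORDER BY row_id").fetchall()
```

On millions of rows the copy step is exactly where the "Slow!" comes from, since every row is rewritten.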

SBIR

  • 10:00 Meeting – all kinds of IT-related Zoom and Google Meet problems
  • Working on final report. Working on folding the first two reports into the methods/results narrative

Phil 6.22.21

Back from a long weekend off with some interesting adventures. And I managed to break the RV again. Need to contact Jim Donnie’s today and get something scheduled. Done! Drop off tomorrow.

Kats is a toolkit to analyze time series data: a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Time series analysis is an essential component of Data Science and Engineering work in industry, from understanding the key statistics and characteristics, detecting regressions and anomalies, to forecasting future trends. Kats aims to provide the one-stop shop for time series analysis, including detection, forecasting, feature extraction/embedding, multivariate analysis, etc. Kats is released by Facebook’s Infrastructure Data Science team. It is available for download on PyPI.

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

  • We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes at a random location of a large image. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects. We bring the improvement upon previous arts by 3.1 AUCs when learning representations from scratch. By transfer learning on pretrained representations on ImageNet, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches to allow localizing defective areas without annotations during training.
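The augmentation itself is simple enough to sketch. A toy version on a 2D list of pixel values (the paper operates on real images with a learned classifier on top; this only shows the cut-and-paste step):

```python
import random

def cut_paste(img, patch_h, patch_w, rng=random):
    """CutPaste-style augmentation: copy a random patch of the image and
    paste it at another random location, producing a synthetic 'anomaly'."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # work on a copy, leave the input intact
    sy = rng.randrange(h - patch_h + 1)  # source corner
    sx = rng.randrange(w - patch_w + 1)
    dy = rng.randrange(h - patch_h + 1)  # destination corner
    dx = rng.randrange(w - patch_w + 1)
    for i in range(patch_h):
        for j in range(patch_w):
            out[dy + i][dx + j] = img[sy + i][sx + j]
    return out
```

A self-supervised classifier is then trained to tell original images from cut-pasted ones, which forces it to notice local irregularities.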

Search

Andrej Karpathy (Tesla): CVPR 2021 Workshop on Autonomous Vehicles

SBIR

  • Starting to really dig into the phase 1 final report. Trying to make it useful as a source for a conference paper

Book

  • Continue on the Diana section. Done?

GPT-2 Agents

  • The re-indexing is done! Need to change the primary key to row_id and generate some text
  • 3:00 Meeting

Phil 6.16.21

The BBC did a show on using outside technology, rather than HR, as a way of tracking harassment. The idea seems to be that you can report an incident (anonymously), and if the app detects enough activity around one person, then the company can be contacted. This sounds a lot like a model for trustworthy anonymous citizen journalism. Need to look into it some more.

Also, I realize that this is in some ways coordination without (explicit) communication. It could also be a framework for a more updated form of collective action such as unions. The issue to solve with that would be the ability to negotiate with the union as a whole, rather than with an individual negotiator.

https://twitter.com/anilkseth/status/1404878880213770241

3pm – 3:40pm meeting with Upendra

JuryRoom

  • 7:00 Waikato meeting

SBIR

  • Abstract for 007 – done
  • Abstract for 008 – done
  • Cyber training – done. Painful!

Phil 6.15.21

I saw millions compromise their Facebook accounts to fuel fake engagement

  • During my time at Facebook, I saw compromised accounts functioning in droves in Latin America, Asia, and elsewhere. Most of these accounts were commandeered through autolikers: online programs which promise users automatic likes and other engagement for their posts. Signing up for the autoliker, however, requires the user to hand over account access. Then, these accounts join a bot farm, where their likes and comments are delivered to other autoliker users, or sold en masse, even while the original user maintains ownership of the account. Although motivated by money rather than politics — and far less sophisticated than government-run human troll farms — the sheer quantity of these autoliker programs can be dangerous.

Book

  • Princess Di and others!
  • Maybe a postscript?

SBIR

  • First drafts of 008 and 007 are done! Time to clean up and edit. Done! Just need to do an abstract for each

GPT-Agents

  • At 7.5M reviews indexed
  • Need to bow out of meeting today, but maybe send a copy of the article draft?

Phil 6.14.21

Childhood cross-ethnic exposure predicts political behavior seven decades later: Evidence from linked administrative data

  • Does contact across social groups influence sociopolitical behavior? This question is among the most studied in the social sciences with deep implications for the harmony of diverse societies. Yet, despite a voluminous body of scholarship, evidence around this question is limited to cross-sectional surveys that only measure short-term consequences of contact or to panel surveys with small samples covering short time periods. Using advances in machine learning that enable large-scale linkages across datasets, we examine the long-term determinants of sociopolitical behavior through an unprecedented individual-level analysis linking contemporary political records to the 1940 U.S. Census. These linked data allow us to measure the exact residential context of nearly every person in the United States in 1940 and, for men, connect this with the political behavior of those still alive over 70 years later. We find that, among white Americans, early-life exposure to black neighbors predicts Democratic partisanship over 70 years later.
  • Diversity injection works!

SBIR

  • Try to get a pass through 007
  • Add Peter’s bits to 008

GPT-Agents

  • 4:30 Meeting with Andreea

Phil 6.11.21

Book

  • More writing – send current state to Michelle this morning!
  • 2:00 Meeting

SBIR

  • More writing
    • Done with the first draft of 008!

Phil 6.10.21

Social Cooling is a name for the long-term negative side effects of living in a reputation economy

Big Sleep (Github) – Ryan Murdock has done it again, combining OpenAI’s CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.

Brain rhythms that help us to detect borders

  • Oscillations in neuronal activity in the medial temporal lobe of the human brain encode proximity to boundaries such as walls, both when navigating while walking and when watching another person do so.
  • Lots of very interesting projects here, too
https://github.com/lucidrains

Book

  • More writing

SBIR(s)

  • 9:15 standup
  • cybersecurity training
  • More writing

Phil 6.9.21

https://twitter.com/mcxfrank/status/1402367796940472322

Thinking ahead: spontaneous prediction in context as a keystone of language in humans and machines

  • Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). These models are trained to generate appropriate linguistic responses in a given context using a self-supervised prediction task. We provide empirical evidence that the human brain and autoregressive DLMs share two computational principles: 1) both are engaged in continuous prediction; 2) both represent words as a function of the previous context. Behaviorally, we demonstrate a match between humans and DLM’s next-word predictions given sufficient contextual windows during the processing of a real-life narrative. Neurally, we demonstrate that the brain, like autoregressive DLMs, constantly predicts upcoming words in natural speech, hundreds of milliseconds before they are perceived. Finally, we show that DLM’s contextual embeddings capture the neural representation of context-specific word meaning better than arbitrary or static semantic embeddings. Our findings suggest that autoregressive DLMs provide a novel and biologically feasible computational framework for studying the neural basis of language.

GPT-Agents

  • Even though I have Windows updates turned off, it seems that MS rebooted my machine last night. Figuring out where the updates stopped so that I can pick up in a reasonable way. Currently,
    select count(*) from table_review where row_id is not null;
    has taken 25 minutes to return. Grrr. And I need a new valve in the shower. Grrr!
  • Update – it took 89 minutes and 32 seconds. There are 3,954,779 values set
    • Back to adding row numbers. Had to figure out where in the table we were, but an hour of coding beats the hell out of a few days of redundant inserts!
  • Pinged Antonio about meeting at 10:00 on Friday
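The resume-instead-of-restart step above can be sketched like this, using an in-memory SQLite stand-in with the diary's table and column names (the real database and logic may differ): find the highest row_id already written, then number only the rows that are still NULL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE table_review (review_id TEXT, row_id INTEGER);
    INSERT INTO table_review (review_id) VALUES ('a'), ('b'), ('c'), ('d');
""")
# Pretend an earlier run numbered the first two rows before the reboot.
con.execute("UPDATE table_review SET row_id = 1 WHERE review_id = 'a'")
con.execute("UPDATE table_review SET row_id = 2 WHERE review_id = 'b'")

# Resume: start after the highest row_id already set, touch only NULL rows.
(next_id,) = con.execute(
    "SELECT COALESCE(MAX(row_id), 0) + 1 FROM table_review").fetchone()
for (rid,) in con.execute(
        "SELECT rowid FROM table_review WHERE row_id IS NULL ORDER BY rowid"
        ).fetchall():
    con.execute("UPDATE table_review SET row_id = ? WHERE rowid = ?",
                (next_id, rid))
    next_id += 1
```

The hour of coding buys exactly this: the already-numbered 3.9M rows are never rewritten.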

Book

  • More Writing

SBIR

  • More writing

Phil 6.8.21

Re-ran my COVID states model, basically to see how Georgia is doing. It’s still a mess

https://public.flourish.studio/visualisation/4303726/
https://github.com/3b1b/manim
  • Manim is an engine for precise programmatic animations, designed for creating explanatory math videos.

Computer-Assisted Keyword and Document Set Discovery from Unstructured Text

  • The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend on this choice, researchers usually pick keywords in ad hoc ways that are far from optimal and usually biased. Paradoxically, this often means that the validity of the most sophisticated text analysis methods depends in practice on the inadequate keyword counting or matching methods they are designed to replace. Improved methods of keyword selection would also be valuable in many other areas, such as following conversations that rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look-alike modeling; intelligence analysis; and sentiment and topic analysis. We develop a computer-assisted (as opposed to fully automated) statistical approach that suggests keywords from available text without needing structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and summarizing results with Boolean search strings. We illustrate how the technique works with analyses of English texts about the Boston Marathon Bombings, Chinese social media posts designed to evade censorship, among others.
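A much-simplified sketch of the computer-assisted idea: score candidate keywords by how strongly they separate a small set of on-topic documents from the rest of the corpus. This is a plain smoothed frequency-ratio score of my own, not the paper's classifier-mistake algorithm:

```python
from collections import Counter

def suggest_keywords(target_docs, other_docs, k=5):
    """Rank words by (smoothed) relative frequency in the target set
    versus the rest of the corpus; higher ratio = better keyword."""
    tgt = Counter(w for d in target_docs for w in d.lower().split())
    oth = Counter(w for d in other_docs for w in d.lower().split())
    n_t = sum(tgt.values()) or 1
    n_o = sum(oth.values()) or 1
    # Add-one smoothing so unseen words in the other set don't divide by zero.
    score = {w: ((tgt[w] + 1) / n_t) / ((oth[w] + 1) / n_o) for w in tgt}
    return [w for w, _ in sorted(score.items(), key=lambda x: -x[1])[:k]]
```

The paper's contribution is more subtle (mining the classifier's mistakes and emitting Boolean search strings), but the ranking-by-discriminative-power intuition is the same.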

Book

  • More writing

SBIR

  • More writing

Phil 6.7.21

Retractable Pergola Covers & Awnings

  • Unlike drop awnings, the Verona is a traditional horizontal awning or angled awning that provides excellent shade coverage. The Verona is designed to be installed on top of a pergola or trellis providing a natural and effective means for light and temperature control while still allowing open air spaces. The Verona can also be installed over traditional construction such as conservatories, glass ceilings, atriums, solariums and skylights to control interior light, ultraviolet rays, glare, and heat. The box awning frame of the Verona uses compact mounting hardware that makes it simple to install over almost any kind of frame.

SocialSens 2021

  • Keynote: Cecile Paris, CSIRO, Australia, “Mapping Emotions on Social Media”
  • 10:45 session
  • Done with my presentation! This person looks interesting: Adriana Iamnitchi
    • My research is rooted in distributed systems, with emphasis on characterizing cyber-social systems and designing, implementing and experimenting with algorithms, services and applications for large-scale networked-systems. In a typical project cycle, in our group we quantitatively characterize socio-technical phenomena at scale, model them, apply new understandings to the design of distributed systems, and experimentally measure the performance differences. In the process we often rely on, and contribute to, research from other fields. Recently we have used research from sociology, psychology and political science to build better understandings of quantitative observations or to inform my design and experiments. While my recent work is related mainly to online social interactions and big data processing, the same research practice (of quantitatively evaluating socio-technical environments and then applying observations to the design of distributed systems or services) defines my early work in scientific grids and peer-to-peer systems. For more details, please refer to my research statement.
  • Had to bail to frantically assemble 3 near-useless quad charts by 4:00

SBIR

  • Had to assemble 3 near-useless quad charts by COB because someone realized that LM needed them today. First time I seriously thought about quitting this company

Phil 6.4.21

Tesla sees a truck carrying traffic lights (via Twitter):

Ping Tim!

Send David money!

GPT Agents

  • Finish slides
  • 3:30 Walkthrough

Book

  • Started the “Do you see yourself here” section. Thought a lot about John 1:1
  • 2:00 Meeting with Michelle

Phil 6.3.21

Decision Transformer: Reinforcement Learning via Sequence Modeling

  • We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
  • I think this means that the backwards transformer could be trained to write questions that are most likely to result in a particular answer.
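The "conditioning on desired return" idea reduces to a data-preparation step that is easy to sketch: compute the return-to-go at each timestep and interleave it with states and actions, so the model learns to emit actions consistent with the requested return. A stylized sketch, not the authors' code:

```python
def returns_to_go(rewards):
    """Suffix sums of rewards: at each step, the return still to be earned."""
    out, total = [], 0
    for r in reversed(rewards):
        total += r
        out.append(total)
    return out[::-1]

def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples, the token layout
    a Decision-Transformer-style model is trained on autoregressively."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq += [("R", g), ("s", s), ("a", a)]
    return seq
```

At inference time you would seed the sequence with the return you *want* and let the model fill in actions, which is what makes the backwards question-for-a-target-answer analogy plausible.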

Book

  • Did a little fixing of the maps and chapters when I realized that the government is not like a large company. Companies are much more tied up in money, which makes sense. The government is about the power to protect, punish, and hide knowledge. It’s much closer to Greek/Roman gods?
  • Need to respond to Upendra today

SBIR

  • More final report writing
  • 9:15 standup
  • 10:30 proposal meeting

GPT-Agents

  • More slide re-working
  • At 600k updates. So this will take about 2 weeks