Does contact across social groups influence sociopolitical behavior? This question is among the most studied in the social sciences with deep implications for the harmony of diverse societies. Yet, despite a voluminous body of scholarship, evidence around this question is limited to cross-sectional surveys that only measure short-term consequences of contact or to panel surveys with small samples covering short time periods. Using advances in machine learning that enable large-scale linkages across datasets, we examine the long-term determinants of sociopolitical behavior through an unprecedented individual-level analysis linking contemporary political records to the 1940 U.S. Census. These linked data allow us to measure the exact residential context of nearly every person in the United States in 1940 and, for men, connect this with the political behavior of those still alive over 70 years later. We find that, among white Americans, early-life exposure to black neighbors predicts Democratic partisanship over 70 years later.
Oscillations in neuronal activity in the medial temporal lobe of the human brain encode proximity to boundaries such as walls, both when navigating while walking and when watching another person do so.
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). These models are trained to generate appropriate linguistic responses in a given context using a self-supervised prediction task. We provide empirical evidence that the human brain and autoregressive DLMs share two computational principles: 1) both are engaged in continuous prediction; 2) both represent words as a function of the previous context. Behaviorally, we demonstrate a match between humans’ and DLMs’ next-word predictions given sufficient contextual windows during the processing of a real-life narrative. Neurally, we demonstrate that the brain, like autoregressive DLMs, constantly predicts upcoming words in natural speech, hundreds of milliseconds before they are perceived. Finally, we show that DLMs’ contextual embeddings capture the neural representation of context-specific word meaning better than arbitrary or static semantic embeddings. Our findings suggest that autoregressive DLMs provide a novel and biologically feasible computational framework for studying the neural basis of language.
Even though I have Windows updates turned off, it seems that MS rebooted my machine last night. Now I'm figuring out where the updates stopped so that I can pick up in a reasonable way. Currently,
select count(*) from table_review where row_id is not null;
has taken 25 minutes to return. Grrr. And I need a new valve in the shower. Grrr!
Update – it took 89 minutes and 32 seconds. There are 3,954,779 values set.
Back to adding row numbers. Had to figure out where in the table we were, but an hour of coding beats the hell out of a few days of redundant inserts!
The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend on this choice, researchers usually pick keywords in ad hoc ways that are far from optimal and usually biased. Paradoxically, this often means that the validity of the most sophisticated text analysis methods depends in practice on the inadequate keyword counting or matching methods they are designed to replace. Improved methods of keyword selection would also be valuable in many other areas, such as following conversations that rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look-alike modeling; intelligence analysis; and sentiment and topic analysis. We develop a computer-assisted (as opposed to fully automated) statistical approach that suggests keywords from available text without needing structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and summarizing results with Boolean search strings. We illustrate how the technique works with analyses of English texts about the Boston Marathon Bombings, Chinese social media posts designed to evade censorship, and others.
Unlike drop awnings, the Verona is a traditional horizontal awning or angled awning that provides excellent shade coverage. The Verona is designed to be installed on top of a pergola or trellis, providing a natural and effective means for light and temperature control while still allowing open air spaces. The Verona can also be installed over traditional construction such as conservatories, glass ceilings, atriums, solariums, and skylights to control interior light, ultraviolet rays, glare, and heat. The box awning frame of the Verona uses compact mounting hardware that makes it simple to install over almost any kind of frame.
Keynote: Cecile Paris, CSIRO, Australia, “Mapping Emotions on Social Media”
My research is rooted in distributed systems, with emphasis on characterizing cyber-social systems and designing, implementing, and experimenting with algorithms, services, and applications for large-scale networked systems. In a typical project cycle, our group quantitatively characterizes socio-technical phenomena at scale, models them, applies new understandings to the design of distributed systems, and experimentally measures the performance differences. In the process we often rely on, and contribute to, research from other fields. Recently we have used research from sociology, psychology, and political science to build better understandings of quantitative observations or to inform our design and experiments. While my recent work is related mainly to online social interactions and big data processing, the same research practice (of quantitatively evaluating socio-technical environments and then applying observations to the design of distributed systems or services) defines my early work in scientific grids and peer-to-peer systems. For more details, please refer to my research statement.
Had to bail to frantically assemble 3 near-useless quad charts by 4:00
Had to assemble 3 near-useless quad charts by COB because someone realized that LM needed them today. First time I seriously thought about quitting this company
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
I think this means that the backwards transformer could be trained to write questions that are most likely to result in a particular answer.
Did a little fixing of the maps and chapters when I realized that the government is not like a large company. Companies are much more tied up in money, which makes sense. The government is about the power to protect, punish, and hide knowledge. It’s much closer to Greek/Roman gods?
Gave up on the SQL update because I think it was taking more time per line. I'm now using the primary key for a select/update pair and it seems to be taking the same time for each. Looks like around 25k updates/hour?
Updating my slides to show the gpt generating text:
We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention. In an AFT layer, the key and value are first combined with a set of learned position biases, the result of which is multiplied with the query in an element-wise fashion. This new operation has a memory complexity linear w.r.t. both the context size and the dimension of features, making it compatible with both large input and model sizes. We also introduce AFT-local and AFT-conv, two model variants that take advantage of the idea of locality and spatial weight sharing while maintaining global connectivity. We conduct extensive experiments on two autoregressive modeling tasks (CIFAR10 and Enwik8) as well as an image recognition task (ImageNet-1K classification). We show that AFT demonstrates competitive performance on all the benchmarks, while providing excellent efficiency at the same time.
More writing. It turns out that the conference that I was aiming for had a (required) early submission for US authors that I missed. Sigh
Wrote a description of cloud computing for big science for Eric H
Worked on 2 proposal overviews for Orest
Working on conspiracy article/chapter
Still running the statement that I put together Saturday
3:00 – ICWSM rehearsal – lots of good comments, which means lots of revisions. Another walkthrough this Friday at 3:30
Adding an incrementing value to an existing table in MySQL
I’ve been working on the Yelp dataset, and realized that I had forgotten to have some simple way to order the table. There is a review ID and date, but those can take a lot of time to work with. I wanted to add a row_id field, after creating the table, and then fill it with incrementing numbers. That took a little work to figure out, but here’s a full toy example based on this stackoverflow post. The table is very simple:
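The CREATE TABLE statement didn't survive into this post. A minimal definition consistent with the inserts and procedure below would be something like this (column types and the VARCHAR length are assumptions reconstructed from the queries, not the original DDL):

```sql
/* assumed toy schema: a nullable row_id to be filled in later, plus the string payload */
CREATE TABLE table_test (
    row_id INT DEFAULT NULL,
    str VARCHAR(255)
);
```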
I initially populate it with only str values:
insert into table_test(str) values ('qwerty'), ('asdfgh'), ('zxcvbn'), ('qwerty');
That sets values in the table:
I then create the procedure with a delimiter:
/* set delimiter */
DELIMITER $$

/* remove procedure if exists... */
DROP PROCEDURE IF EXISTS insert_it $$

/* create procedure */
CREATE PROCEDURE insert_it ()
BEGIN
    DECLARE varcount INT DEFAULT 1;
    DECLARE varmax INT DEFAULT 4;

    WHILE varcount <= varmax DO
        /* number one not-yet-numbered row per pass */
        UPDATE table_test SET row_id = varcount WHERE row_id IS NULL LIMIT 1;
        SET varcount = varcount + 1;
    END WHILE;
END $$

/* reset delimiter back to normal */
DELIMITER ;
Then you can run it and check the results:

/* call procedure */
CALL insert_it();
SELECT * FROM table_test;
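As a sanity check, the same fill-in-the-NULLs loop can be sketched outside MySQL. Here's a rough Python/sqlite3 equivalent of the toy example (sqlite doesn't support `UPDATE ... LIMIT 1` by default, so this version targets one NULL row explicitly by rowid):

```python
import sqlite3

# in-memory stand-in for the MySQL toy table
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE table_test (row_id INTEGER DEFAULT NULL, str TEXT)")
cur.executemany("INSERT INTO table_test(str) VALUES (?)",
                [("qwerty",), ("asdfgh",), ("zxcvbn",), ("qwerty",)])

# equivalent of the stored procedure: number one unnumbered row per pass
varcount, varmax = 1, 4
while varcount <= varmax:
    # pick the lowest-rowid row that hasn't been numbered yet
    cur.execute("SELECT rowid FROM table_test WHERE row_id IS NULL "
                "ORDER BY rowid LIMIT 1")
    target = cur.fetchone()[0]
    cur.execute("UPDATE table_test SET row_id = ? WHERE rowid = ?",
                (varcount, target))
    varcount += 1

print(cur.execute("SELECT row_id, str FROM table_test ORDER BY row_id").fetchall())
# → [(1, 'qwerty'), (2, 'asdfgh'), (3, 'zxcvbn'), (4, 'qwerty')]
```

One difference worth noting: the MySQL `UPDATE ... WHERE row_id IS NULL LIMIT 1` has no ORDER BY, so the order in which rows get numbered there isn't guaranteed; the sqlite sketch makes it deterministic by ordering on rowid.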
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IOs). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007 to February 2020), over 50,000 accounts, 17 countries, and different account types including both trolls and bots. Our system detects IO accounts with 96% precision, 79% recall, and a 96% area under the precision-recall (P-R) curve; maps out salient network communities; and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from US Congressional reports, investigative journalism, and IO datasets provided by Twitter.
Choosing among spatially-distributed options is a central challenge for animals, from deciding among alternative potential food sources or refuges, to choosing with whom to associate. Using an integrated theoretical and experimental approach (employing immersive virtual reality), we consider the interplay between movement and vectorial integration during decision-making regarding two, or more, options in space. In computational models of this process we reveal the occurrence of spontaneous and abrupt “critical” transitions (associated with specific geometrical relationships) whereby organisms spontaneously switch from averaging vectorial information among, to suddenly excluding one among, the remaining options. This bifurcation process repeats until only one option—the one ultimately selected—remains. Thus we predict that the brain repeatedly breaks multi-choice decisions into a series of binary decisions in space-time. Experiments with fruit flies, desert locusts, and larval zebrafish reveal that they exhibit these same bifurcations, demonstrating that, across taxa and ecological contexts, fundamental geometric principles are essential to explain how, and why, animals move the way they do.
Working on map chapter/article
2:00 Meeting with Michelle
Pulled some papers for Ron
Need to sync up with Rukan – done! Really nice work. We need to produce better statistics for analyzing ensembles
More writing. The abstracts are due Monday! Uploaded map:
Finished processing the Yelp files. Backing up the DB
Only star ratings
Business name, type, review, then star ratings
Generate 1,000,000 line samples that are based on different businesses?
Automatic ablation study of 10k, 20k, … 1M corpora
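The ablation corpora could be cut as prefixes of a single shuffle, so each larger sample contains all the smaller ones. A quick sketch (sizes scaled way down here, and the function name and seed are just placeholders):

```python
import random

def nested_samples(lines, sizes, seed=1):
    """Return nested subsets: each larger sample contains the smaller ones,
    because every sample is a prefix of the same shuffled list."""
    rng = random.Random(seed)          # fixed seed so the ablation is repeatable
    shuffled = lines[:]
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes)}

# stand-in for the Yelp review lines; the real run would use 10k, 20k, ... 1M
corpus = [f"review {i}" for i in range(1000)]
samples = nested_samples(corpus, [10, 100, 1000])
assert set(samples[10]) <= set(samples[100]) <= set(samples[1000])
```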
Look for different ways to name the same thing that tell something about who you are. (look for racist ways of describing food?) Analog of #chinavirus and #Sars-Cov-2
Paki vs. Pakistani, Curry vs. Indian, Chinese vs. Takeout.
Invite all for a presentation next Tuesday at 3:00 for 90 minutes (include Fatima and Arpita)
Growing popular and industry interest in high-performing natural language generation models has led to concerns that such models could be used to generate automated disinformation at scale. This report examines the capabilities of GPT-3–a cutting-edge AI system that writes text–to analyze its potential misuse for disinformation. A model like GPT-3 may be able to help disinformation actors substantially reduce the work necessary to write disinformation while expanding its reach and potentially also its effectiveness.
A quick thought about organizing topics from the GPT-3.
For each topic, have the GPT define the phrase – something like “___ is a complex subject. Here is a one-paragraph overview of ___”
Using Doc2Vec or something similar, cluster all the overview paragraphs
Order the topic names by occurrence, possibly with some similarity filtering as well
Use the best topic, and keep the descriptions for enhancing the map display using popups
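The clustering step above could be sketched without Doc2Vec at all. Here's a stdlib-only stand-in that greedily groups overview paragraphs by bag-of-words cosine similarity (the threshold, names, and toy overviews are all hypothetical; the real version would swap in Doc2Vec vectors):

```python
from collections import Counter
import math

def cosine(a, b):
    """Bag-of-words cosine similarity between two paragraphs."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_overviews(overviews, threshold=0.5):
    """Greedy clustering: each (topic, overview) pair joins the first
    cluster whose seed paragraph it resembles, else starts a new one."""
    clusters = []
    for topic, text in overviews:
        for cl in clusters:
            if cosine(text, cl[0][1]) >= threshold:
                cl.append((topic, text))
                break
        else:
            clusters.append([(topic, text)])
    return clusters

# toy GPT-style one-paragraph overviews
overviews = [
    ("maps", "maps show spatial layouts of places"),
    ("cartography", "maps show spatial layouts of regions"),
    ("cooking", "recipes describe how to prepare food"),
]
clusters = cluster_overviews(overviews)
print(len(clusters))  # → 2: {maps, cartography} and {cooking}
```

Within each cluster, the "best topic" could then be the most frequent topic name, with the overview paragraphs kept around for the map popups.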
Need to integrate the DB into the interactive code
Need to clean up the interactive code so that there is a callback dispatcher that handles all the ins and outs, rather than the current multiple callbacks
Need to make a component class that keeps the html/dash elements along with names, Inputs and Outputs so that the important elements aren’t scattered all over the code
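A rough shape of that component class and dispatcher (plain-Python stand-ins; in the real code `element`, `inputs`, and `outputs` would hold dash html/dcc elements and `dash.dependencies.Input`/`Output` objects, and every name here is hypothetical):

```python
class UIComponent:
    """Bundle an element with its name, Inputs, Outputs, and handler,
    so the important pieces aren't scattered all over the code."""
    def __init__(self, name, element, handler, inputs=None, outputs=None):
        self.name = name
        self.element = element        # stand-in for a dash html/dcc element
        self.handler = handler        # the callback body for this component
        self.inputs = inputs or []    # stand-ins for dash.dependencies.Input
        self.outputs = outputs or []  # stand-ins for dash.dependencies.Output

class Dispatcher:
    """One callback dispatcher that routes all the ins and outs by
    component name, replacing the current pile of separate callbacks."""
    def __init__(self):
        self.components = {}

    def register(self, comp):
        self.components[comp.name] = comp

    def dispatch(self, name, *args):
        return self.components[name].handler(*args)

# usage sketch: register one component, route a value through the dispatcher
d = Dispatcher()
d.register(UIComponent("slider", element="html.Div()", handler=lambda v: v * 2))
print(d.dispatch("slider", 21))  # → 42
```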
3:30 Meeting with Sim to go over Twitter API
Good discussion with Rukan. We were able to do a bit of regression analysis on loss with respect to parameters, though the Bayesian search got stuck in an odd place. Turns out that less is better. Trying a grid search next
Coordination without communication abstract
Made more progress on the article than I thought I would
2:00 Meeting with Michelle – she likes the direction it’s going! Need something for the beginning. Also, incorporate her edits on Scratch.