Oscillations in neuronal activity in the medial temporal lobe of the human brain encode proximity to boundaries such as walls, both when navigating while walking and when watching another person do so.
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language model (DLM). These models are trained to generate appropriate linguistic responses in a given context using a self-supervised prediction task. We provide empirical evidence that the human brain and autoregressive DLMs share two computational principles: 1) both are engaged in continuous prediction; 2) both represent words as a function of the previous context. Behaviorally, we demonstrate a match between human and DLM next-word predictions given sufficient contextual windows during the processing of a real-life narrative. Neurally, we demonstrate that the brain, like autoregressive DLMs, constantly predicts upcoming words in natural speech, hundreds of milliseconds before they are perceived. Finally, we show that DLMs’ contextual embeddings capture the neural representation of context-specific word meaning better than arbitrary or static semantic embeddings. Our findings suggest that autoregressive DLMs provide a novel and biologically feasible computational framework for studying the neural basis of language.
Even though I have Windows updates turned off, it seems that MS rebooted my machine last night. Now figuring out where the updates stopped so that I can pick up in a reasonable way. Currently,
select count(*) from table_review where row_id is not null;
has taken 25 minutes to return. Grrr. And I need a new valve in the shower. Grrr!
Update – it took 89 minutes and 32 seconds. There are 3,954,779 values set.
Back to adding row numbers. Had to figure out where in the table we were, but an hour of coding beats the hell out of a few days of redundant inserts!
The (unheralded) first step in many applications of automated text analysis involves selecting keywords to choose documents from a large text corpus for further study. Although all substantive results depend on this choice, researchers usually pick keywords in ad hoc ways that are far from optimal and usually biased. Paradoxically, this often means that the validity of the most sophisticated text analysis methods depends in practice on the inadequate keyword counting or matching methods they are designed to replace. Improved methods of keyword selection would also be valuable in many other areas, such as following conversations that rapidly innovate language to evade authorities, seek political advantage, or express creativity; generic web searching; eDiscovery; look-alike modeling; intelligence analysis; and sentiment and topic analysis. We develop a computer-assisted (as opposed to fully automated) statistical approach that suggests keywords from available text without needing structured data as inputs. This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm. Our specific approach is based on training classifiers, extracting information from (rather than correcting) their mistakes, and summarizing results with Boolean search strings. We illustrate how the technique works with analyses of English texts about the Boston Marathon Bombings and Chinese social media posts designed to evade censorship, among others.
Unlike drop awnings, the Verona is a traditional horizontal or angled awning that provides excellent shade coverage. The Verona is designed to be installed on top of a pergola or trellis, providing a natural and effective means for light and temperature control while still allowing open air spaces. The Verona can also be installed over traditional construction such as conservatories, glass ceilings, atriums, solariums and skylights to control interior light, ultraviolet rays, glare, and heat. The box awning frame of the Verona uses compact mounting hardware that makes it simple to install over almost any kind of frame.
Keynote: Cecile Paris, CSIRO, Australia, “Mapping Emotions on Social Media”
My research is rooted in distributed systems, with emphasis on characterizing cyber-social systems and designing, implementing and experimenting with algorithms, services and applications for large-scale networked-systems. In a typical project cycle, in our group we quantitatively characterize socio-technical phenomena at scale, model them, apply new understandings to the design of distributed systems, and experimentally measure the performance differences. In the process we often rely on, and contribute to, research from other fields. Recently we have used research from sociology, psychology and political science to build better understandings of quantitative observations or to inform my design and experiments. While my recent work is related mainly to online social interactions and big data processing, the same research practice (of quantitatively evaluating socio-technical environments and then applying observations to the design of distributed systems or services) defines my early work in scientific grids and peer-to-peer systems. For more details, please refer to my research statement.
Had to bail to frantically assemble 3 near-useless quad charts by 4:00
Had to assemble 3 near-useless quad charts by COB because someone realized that LM needed them today. First time I seriously thought about quitting this company.
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, our Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
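To make the "conditioning on the desired return, past states, and actions" part concrete, here is a minimal sketch of how a trajectory might be laid out for such a model. It assumes the return-to-go at each step is the sum of the rewards from that step onward, which is my reading of the paper rather than something the abstract spells out. The function names and the tuple token format are made up for illustration, and no Transformer is involved here, only the input layout.

```python
def returns_to_go(rewards):
    """Sum of rewards from each timestep to the end of the trajectory."""
    rtg = []
    running = 0.0
    # Walk backwards so each step adds its own reward to the tail sum.
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def build_sequence(rewards, states, actions):
    """Interleave (return-to-go, state, action) triples, one triple per timestep."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq.extend([("return", g), ("state", s), ("action", a)])
    return seq


if __name__ == "__main__":
    # Toy trajectory with hypothetical state and action labels.
    for token in build_sequence([1, 0, 2], ["s0", "s1", "s2"], ["a0", "a1", "a2"]):
        print(token)
```

At generation time the first return token would be set to the return you want, and the model would be asked for the action that follows each state.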
I think this means that the backwards transformer could be trained to write questions that are most likely to result in a particular answer.
Did a little fixing of the maps and chapters when I realized that the government is not like a large company. Companies are much more tied up in money, which makes sense. The government is about the power to protect, punish, and hide knowledge. It’s much closer to Greek/Roman gods?
Gave up on the SQL update because I think it was taking more time per line as it went. I’m now using the primary key for a select/update pair and it seems to be taking the same time for each. Looks like around 25k updates/hour?
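A rough sketch of the select/update pair, for reference. It uses Python's built-in sqlite3 module as a stand-in, since the point is the access pattern and not the specific database. table_review and row_id are the real names from above; the primary key column name (id), the batch size, and the number_rows helper are assumptions for illustration.

```python
import sqlite3


def number_rows(conn, batch_size=1000):
    """Fill table_review.row_id with keyed select/update pairs.

    Returns the number of rows updated. Assumes an integer primary key
    column named id (hypothetical name).
    """
    cur = conn.cursor()
    # Resume point: continue after the highest row number already assigned.
    next_row_id = cur.execute(
        "SELECT COALESCE(MAX(row_id), 0) FROM table_review"
    ).fetchone()[0] + 1
    last_id = 0
    updated = 0
    while True:
        # Keyed select: seek past the last primary key handled so far.
        rows = cur.execute(
            "SELECT id FROM table_review "
            "WHERE id > ? AND row_id IS NULL ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for (pk,) in rows:
            # Keyed update: one primary key lookup per row.
            cur.execute(
                "UPDATE table_review SET row_id = ? WHERE id = ?",
                (next_row_id, pk),
            )
            next_row_id += 1
            updated += 1
        last_id = rows[-1][0]
        conn.commit()  # commit per batch so a reboot loses at most one batch
    return updated
```

Because both the select and the update go through the primary key, each pair should cost about the same no matter how far into the table the job is, and an interrupted run picks up where it left off.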
Updating my slides to show the GPT generating text:
We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention. In an AFT layer, the key and value are first combined with a set of learned position biases, the result of which is multiplied with the query in an element-wise fashion. This new operation has a memory complexity linear w.r.t. both the context size and the dimension of features, making it compatible with both large input and model sizes. We also introduce AFT-local and AFT-conv, two model variants that take advantage of the idea of locality and spatial weight sharing while maintaining global connectivity. We conduct extensive experiments on two autoregressive modeling tasks (CIFAR10 and Enwik8) as well as an image recognition task (ImageNet-1K classification). We show that AFT demonstrates competitive performance on all the benchmarks, while providing excellent efficiency at the same time.
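For my own understanding, a small pure-Python sketch of the basic AFT operation as I read it from the paper: keys plus position biases are exponentiated and used as per-feature weights over the values, and the normalized result is gated element-wise by a sigmoid of the query. The aft_full name and the list-of-lists layout are illustrative only, the learned projections that produce Q, K and V are left out, and this version is not causally masked (an autoregressive model would only sum over positions up to t).

```python
import math


def aft_full(Q, K, V, w):
    """Q, K, V: T rows of d floats each. w: T x T position biases.

    Returns Y where, for each position t and feature j,
    Y[t][j] = sigmoid(Q[t][j]) * sum_s(exp(K[s][j] + w[t][s]) * V[s][j])
                               / sum_s(exp(K[s][j] + w[t][s]))
    """
    T, d = len(Q), len(Q[0])
    Y = []
    for t in range(T):
        row = []
        for j in range(d):
            logits = [K[s][j] + w[t][s] for s in range(T)]
            m = max(logits)  # subtract the max before exp for numerical stability
            weights = [math.exp(x - m) for x in logits]
            num = sum(wt * V[s][j] for s, wt in enumerate(weights))
            den = sum(weights)
            gate = 1.0 / (1.0 + math.exp(-Q[t][j]))  # element-wise query gate
            row.append(gate * num / den)
        Y.append(row)
    return Y
```

Note that there is no query-key dot product anywhere: everything is element-wise per feature, which is where the efficiency claim in the abstract comes from.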
More writing. It turns out that the conference that I was aiming for had a (required) early submission for US authors that I missed. Sigh.
Wrote a description of cloud computing for big science for Eric H
Worked on 2 proposal overviews for Orest
Working on conspiracy article/chapter
Still running the statement that I put together Saturday
3:00 – ICWSM rehearsal – lots of good comments, which means lots of revisions. Another walkthrough this Friday at 3:30