Author Archives: pgfeldman

Phil 8.24.21

Learning to predict the cosmological structure formation

  • Matter evolved under the influence of gravity from minuscule density fluctuations. Nonperturbative structure formed hierarchically over all scales and developed non-Gaussian features in the Universe, known as the cosmic web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and use a large ensemble of computer simulations to compare with the observed data to extract the full information of our own Universe. However, to evolve billions of particles over billions of years, even with the simplest physics, is a daunting task. We build a deep neural network, the Deep Density Displacement Model (D3M), which learns from a set of prerun numerical simulations, to predict the nonlinear large-scale structure of the Universe with the Zel’dovich Approximation (ZA), an analytical approximation based on perturbation theory, as the input. Our extensive analysis demonstrates that D3M outperforms the second-order perturbation theory (2LPT), the commonly used fast-approximate simulation method, in predicting cosmic structure in the nonlinear regime. We also show that D3M is able to accurately extrapolate far beyond its training data and predict structure formation for significantly different cosmological parameters. Our study proves that deep learning is a practical and accurate alternative to approximate 3D simulations of the gravitational structure formation of the Universe.

GPT-Agents

  • Generating content for the small-corpora models. The 6k is done, and the 3k is done as well
  • Generated sentiment
  • Do this to speed up the load of a mysql database (via stackoverflow). Turning off autocommit wraps the whole load in a single transaction instead of committing after every INSERT:
mysql> use db_name;
mysql> SET autocommit=0; SOURCE the_sql_file.sql; COMMIT;
  • 3:00 Meeting
  • https://www.pnas.org/authors/submitting-your-manuscript – set up a paper repo in Overleaf and start to rough out
  • Need to get the spreadsheets built for the 3k and 6k models
  • Build a spreadsheet (and template?) for the LIWC data
  • Sent Shimei reviews from the 50k, 25k, 12k, 6k, and 3k models
  • One of the really observable results is that the model tends to amplify the number of items that exist in larger quantities in the training corpora and to reduce the number of items that are less common. However, the tokens within a review seem to be unchanged, and the average number of stars associated with a POSITIVE or NEGATIVE review seems very resilient.
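  • A quick way to check that amplification is to compare relative star-rating frequencies between the ground truth and the generated reviews. A minimal sketch in Python – the file names and the "stars: N" one-review-per-line format are assumptions, not the actual pipeline:
# Compare star-rating frequencies between the training corpus and the
# model's generated reviews. File names and the "stars: N" line format
# are assumptions, not the real pipeline.
from collections import Counter
import re

def star_counts(path):
    """Count occurrences of patterns like 'stars: 4' in a review file."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            m = re.search(r"stars:\s*([1-5])", line)
            if m:
                counts[int(m.group(1))] += 1
    return counts

def relative(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}

truth = relative(star_counts("ground_truth_reviews.txt"))
synth = relative(star_counts("generated_reviews.txt"))
for star in sorted(truth):
    ratio = synth.get(star, 0) / truth[star]
    # ratio > 1: the model over-produces this rating; < 1: under-produces
    print(f"{star} stars: truth={truth[star]:.3f} synth={synth.get(star, 0):.3f} ratio={ratio:.2f}")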

SBIRs

  • Writing the consumer
  • That’s working too!
  • Seems plenty speedy when batched up, too
  • 9:15 standup
  • 1:00 Meeting about the sim for ARL. Going to talk about Missile Command, where the physics are simple but the tactics are difficult.

Book

  • Clean up chapter thumbnails. Done!

Phil 8.23.21

SBIR(s)

  • Was getting started with Zach and then lost power from about 8:30 to 1:30
  • Looking into RabbitMQ
  • Finishing up the NASA initial writeup

GPT Agents

  • Based on the good results, trying 6k and 3k models just to see how small we can get
  • Trained up in less than 30 minutes! Generating content now

Phil 8.20.21

Need to look at this article in Science that does some multidimensional similarity mapping between COVID-19 variants.

  • Derek Smith, an evolutionary biologist at the University of Cambridge, has worked for decades on visualizing immune evasion in the influenza virus in so-called antigenic maps. The farther apart two variants are on Smith’s maps, the less well antibodies against one virus protect against the other. In a recently published preprint, Smith’s group, together with David Montefiori’s group at Duke University, has applied the approach to mapping the most important variants of SARS-CoV-2.

The Geometry of Shape Space: Application to Influenza

  • Shape space was proposed over 20 years ago as a conceptual formalism in which to represent antibody/antigen binding. It has since played a key role in computational immunology. Antigens and antibodies are considered to be points in an abstract “shape space”, where coordinates of points in this space represent generalized physico-chemical properties associated with various (unspecified) physical properties related to binding, such as geometric shape, hydrophobicity, charge, etc. Distances in shape space between points representing antibodies and (the shape complement) of antigens are assumed to be related to their affinity, with small distances corresponding to high affinity.
  • In this paper, we provide algorithms, related to metric and ordinal multidimensional scaling algorithms first developed in the mathematical psychology literature, which construct explicit, quantitative coordinates for points in shape space given experimental data such as hemagglutination inhibition assays, or other general affinity assays. Previously, such coordinates had been conceptual constructs and totally implicit. The dimension of shape space deduced from hemagglutination inhibition assays for influenza is low, approximately five dimensional.
  • The deduction of the explicit geometry of shape space given experimental affinity data provides new ways to quantify the similarity of antibodies to antibodies, antigens to antigens, and the affinity of antigens to antibodies. This has potential utility in, e.g., strain selection decisions for annual influenza vaccines, among other applications. The analysis techniques presented here are not restricted to the analysis of antibody–antigen interactions and are generally applicable to affinity data resulting from binding assays.
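  • The coordinate construction is essentially metric multidimensional scaling on assay-derived distances. A minimal sketch with scikit-learn – the toy distance matrix is invented; in the paper the distances come from hemagglutination inhibition titers, and the recovered space is roughly five dimensional:
# Metric MDS on a precomputed antigenic distance matrix. The 4x4 toy
# matrix is invented; real distances would come from HI assay data.
import numpy as np
from sklearn.manifold import MDS

D = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 1.5, 2.5],
    [2.0, 1.5, 0.0, 1.2],
    [3.0, 2.5, 1.2, 0.0],
])  # symmetric; rows/cols = antigens or antibodies

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)  # explicit shape-space coordinates
print(coords)
print("stress:", mds.stress_)  # lower stress = better geometric fit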

SBIR(s)

  • Meeting with Zach on the Webapp Framework. Made a lot of progress, though I only kind of know what’s going on. We were able to access MySQL on the server and add a D3 chart:
Behold! SvelteKit with D3 and MySql!
  • Working on the NASA proposal

GPT Agents

  • Make spreadsheets for other models and compare to 100k

Phil 8.19.21

“Before we say “explainable AI” we must decide WHAT is it that we wish to explain. Are we about to explain the function that the system fitted to the data? or are we about to explain the world behind the data? Science writers seem unaware of the difference.” – Judea Pearl

Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 1 – Introduction

  • From the syllabus: To realize the dreams and impact of AI requires autonomous systems that learn to make good decisions. Reinforcement learning is one powerful paradigm for doing so, and it is relevant to an enormous range of tasks, including robotics, game playing, consumer modeling and healthcare. This class will provide a solid introduction to the field of reinforcement learning and students will learn about the core challenges and approaches, including generalization and exploration. Through a combination of lectures, and written and coding assignments, students will become well versed in key ideas and techniques for RL. Assignments will include the basics of reinforcement learning as well as deep reinforcement learning — an extremely promising new area that combines deep learning techniques with reinforcement learning.

GPT-Agents

  • Generate synthesized data – running
  • Calculate sentiment (a minimal sketch follows this list)
  • Create spreadsheets (make a new directory for review-stars)
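  • A minimal sketch of the sentiment step using the Hugging Face pipeline – whether this is the model actually used here is an assumption, as is the one-review-per-line file:
# Sentiment pass over generated reviews. The pipeline's default English
# model (DistilBERT fine-tuned on SST-2) and the file format are assumptions.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")

with open("generated_reviews.txt", encoding="utf-8") as f:
    reviews = [line.strip() for line in f if line.strip()]

# Each result is {'label': 'POSITIVE'|'NEGATIVE', 'score': float}
for review, result in zip(reviews, sentiment(reviews, truncation=True)):
    print(result["label"], f"{result['score']:.3f}", review[:60])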

SBIR(s)

  • 9:15 standup – done
  • 10:30 NASA meeting – done. Write up a 3-page version that describes a minimum viable project and then future work that extends the MVP
  • Ping Zach for a meeting to set up project – done
  • Start framing out paper on Overleaf – done
  • EXPENSE REPORT – this is chewing up hours. I STILL don’t have a code that works

JuryRoom

  • Write some rants for Tamahau

Phil 8.18.21

https://twitter.com/naunihalpublic/status/1427999617539522561

GPT-Agents

  • Finished creating the 50k, 25k, and 12k models
  • Uploading to repo – done
  • Generate synthesized data – running
  • Calculate sentiment
  • Create spreadsheets (make a new directory for review-stars)

SBIR(s)

  • Meeting with Ron at 9:00 – lots of details about phase 1 and LAIC
  • Read through and write up paragraphs for NASA – I am becoming confused, but managed to write up an approach that I think makes sense. Sent it off to John, and we’ll have a meeting about it tomorrow morning
  • Ping Zach for a meeting to set up project
  • EXPENSE REPORT – this is chewing up hours. I don’t have a code that works

Phil 8.17.21

I want to write a paper about the one unambiguously good option that AI/ML + simulation provides – problem domain exploration and the industrialization of imagination. The failures in Vietnam, Iraq, and Afghanistan, not to mention 9/11 and Pearl Harbor, have all been described as failures of imagination. These failures exist at multiple levels: the tactical (think Jimmy Doolittle) and the strategic (human nature). AI/ML allows us to safely explore these domains before the unimaginable occurs. Because these potentials can be visualized in narratives, it is possible to present these possibilities broadly and compellingly, and to increase the effectiveness and resiliency of our choices in combat and combat-adjacent domains.

  • Enhanced simulation means that ML can explore tactical options
    • Deliver the right amount of energy in the right place for the lowest cost
  • Language model maps means that ML can explore strategic options
    • And maybe avoid a fourth Vietnam

labml.ai Annotated PyTorch Paper Implementations

This is a collection of simple PyTorch implementations of neural networks and related algorithms. These implementations are documented with explanations, and the website renders these as side-by-side formatted notes. We believe these would help you understand these algorithms better. We are actively maintaining this repo and adding new implementations.

GPT-Agents

  • Need to do some preliminary (e.g. stars) evaluations on the synthesized and ground truth data before meeting
  • 3:30 Meeting
    • Went over results
    • Make a new 50k, 25k, and 12k model and do the same tests
    • Sent Shimei a set of CSV files for
  • On the Opportunities and Risks of Foundation Models
    • AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on conventional deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

SBIR(s)

  • Something something NASA proposal?
  • Meeting with Rukan
  • Sprint planning

Phil 8.16.21

Rather than say anything about Afghanistan here, I’d rather urge you to go read Thieves of State, by Sarah Chayes. Or, if you only have a few minutes, this blog post: The Ides of August

SBIR(s)

  • Sprint demos – done!
  • Lots more training – done!

GPT_Agents

  • Generated 10k synthetic reviews and added sentiment. Need to do that for ground truth now
  • Got that done. Next let’s see how they compare

Phil 8.13.21

This looks super interesting for building domain-specific belief maps:

https://twitter.com/ssgrn/status/1425615542837075968?s=12

Here’s a link to the paper: DEMix Layers: Disentangling Domains for Modular Language Modeling

  • We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text. A DEMix layer is a collection of expert feedforward networks, each specialized to a domain, that makes the LM modular: experts can be mixed, added or removed after initial training. Extensive experiments with autoregressive transformer LMs (up to 1.3B parameters) show that DEMix layers reduce test-time perplexity, increase training efficiency, and enable rapid adaptation with little overhead. We show that mixing experts during inference, using a parameter-free weighted ensemble, allows the model to better generalize to heterogeneous or unseen domains. We also show that experts can be added to iteratively incorporate new domains without forgetting older ones, and that experts can be removed to restrict access to unwanted domains, without additional training. Overall, these results demonstrate benefits of explicitly conditioning on textual domains during language modeling.
  • Git repo: github.com/kernelmachine/demix (a toy sketch of the expert-routing idea follows)
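  • Not the paper's code – just a toy PyTorch sketch of the core routing idea: one feedforward expert per domain, selected by a domain id, so experts can be added after initial training without touching the others:
# Toy DEMix-style layer: a ModuleDict of per-domain feedforward experts.
import torch
import torch.nn as nn

def make_expert(d_model, d_ff):
    return nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

class DEMixLayer(nn.Module):
    def __init__(self, d_model, d_ff, domains):
        super().__init__()
        self.d_model, self.d_ff = d_model, d_ff
        self.experts = nn.ModuleDict({name: make_expert(d_model, d_ff) for name in domains})

    def add_domain(self, name):
        # A new expert can be trained while the old ones stay frozen
        self.experts[name] = make_expert(self.d_model, self.d_ff)

    def forward(self, x, domain):
        # Route the whole batch to the expert for this domain
        return self.experts[domain](x)

layer = DEMixLayer(d_model=64, d_ff=256, domains=["news", "reviews"])
out = layer(torch.randn(2, 10, 64), domain="reviews")  # (batch, seq, d_model)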

GPT Agents

  • Get the review extraction working and produce some content. Got everything running and generating 10,000 reviews. We’ll see how the pattern of stars looks first, and then do a sentiment run on the stored data (a minimal generation sketch follows this list)
  • Export the DB and run sentiment analysis
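  • A minimal sketch of the batch-generation step with the transformers API – the model directory and the bare BOS prompt are assumptions about this project, not its actual code:
# Generate a batch of reviews from a finetuned GPT-2 model.
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

tokenizer = GPT2Tokenizer.from_pretrained("./yelp_review_gpt2")  # invented path
model = GPT2LMHeadModel.from_pretrained("./yelp_review_gpt2")
model.eval()

input_ids = tokenizer.encode(tokenizer.bos_token, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        input_ids,
        do_sample=True,           # sample rather than greedy decode
        max_length=128,
        top_k=50,
        top_p=0.95,
        num_return_sequences=10,  # loop this to reach 10,000 reviews
        pad_token_id=tokenizer.eos_token_id,
    )
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))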

SBIR(s)

  • Had a long talk yesterday with Aaron about what to do with MARE. I think it becomes the framework for training and using our enhanced simulation scenario explorer. Basically AlphaZero but for physics-based games like tennis.
  • Got Andrew to buy off on the LAIC stories and show me how to put them properly(!) in Jira, so I’ll do that today
  • Endless, mind-numbing training
  • EXPENSE REPORT

Book

  • Skipping this week – Michelle has meetings

Phil 8.12.21

Just back from a conference in Huntsville. Lots of very expensive ways to deliver energy to a point in space at a particular time. I need to write up my thoughts in more detail later. Also EXPENSE REPORT!

Announcing AI21 Studio and Jurassic-1 Language Models

  • We are thrilled to announce the launch of AI21 Studio, our new developer platform where you can use our state-of-the-art Jurassic-1 language models to build your own applications and services. Jurassic-1 models come in two sizes, where the Jumbo version, at 178B parameters, is the largest and most sophisticated language model ever released for general use by developers. AI21 Studio is currently in open beta, allowing anyone to sign up and immediately start querying Jurassic-1 using our API and interactive web environment.

Research community dynamics behind popular AI benchmarks

  • The widespread use of experimental benchmarks in AI research has created competition and collaboration dynamics that are still poorly understood. Here we provide an innovative methodology to explore these dynamics and analyse the way different entrants in these challenges, from academia to tech giants, behave and react depending on their own or others’ achievements. We perform an analysis of 25 popular benchmarks in AI from Papers With Code, with around 2,000 result entries overall, connected with their underlying research papers. We identify links between researchers and institutions (that is, communities) beyond the standard co-authorship relations, and we explore a series of hypotheses about their behavior as well as some aggregated results in terms of activity, performance jumps and efficiency. We characterize the dynamics of research communities at different levels of abstraction, including organization, affiliation, trajectories, results and activity. We find that hybrid, multi-institution and persevering communities are more likely to improve state-of-the-art performance, which becomes a watershed for many community members. Although the results cannot be extrapolated beyond our selection of popular machine learning benchmarks, the methodology can be extended to other areas of artificial intelligence or robotics, and combined with bibliometric studies.

The Learning on Graphs and Geometry Reading Group

Alpha Zero’s “Alien” Chess Shows the Power, and the Peculiarity, of AI

  • What’s also remarkable, though, Hassabis explained, is that it sometimes makes seemingly crazy sacrifices, like offering up a bishop and queen to exploit a positional advantage that led to victory. Such sacrifices of high-value pieces are normally rare. In another case the program moved its queen to the corner of the board, a very bizarre trick with a surprising positional value. “It’s like chess from another dimension,” Hassabis said.

SBIR

  • Standup – done
  • Respond to Steve – done multiple
  • Schedule story time with Andrew – done. Now I just need to put them in Jira
  • Schedule golf with Aaron? Done! Sim first (using MARE and enhanced sim), then prototype, then build a trade show version (indoor so no weather), then try fielding at some willing golf course? Paul could probably help with that

Phil 8.9.21

Nice ride on Saturday. An 18mph average pace and I still got dropped by the lead group! But I did hang on for over 40 miles

Book

  • Want a \TODO{write something here} that can disappear as needed? Use these two versions of TODO:
% requires \usepackage{xcolor} for \textcolor
%\newcommand\TODO[1]{\textcolor{red}{(TODO: #1)}} % show
\newcommand\TODO[1]{} % hide

SBIRs

  • Go over stories with Aaron?
  • MARCOM meeting
  • Off to the SMD symposium

GPT Agents

  • Setting up the DB to handle sentiment and PoS – done (a sketch of the schema change is below)
  • Generating and parsing the review/stars model. When an exception is thrown while debugging, the IDE loses the ability to edit?
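  • A sketch of what the schema change might look like with pymysql – the table, columns, and credentials are invented for illustration:
# Add sentiment and part-of-speech columns to the review table.
import pymysql

conn = pymysql.connect(host="localhost", user="user",
                       password="password", database="gpt_reviews")
try:
    with conn.cursor() as cur:
        cur.execute("ALTER TABLE review ADD COLUMN sentiment VARCHAR(16)")
        cur.execute("ALTER TABLE review ADD COLUMN sentiment_score FLOAT")
        cur.execute("ALTER TABLE review ADD COLUMN pos_tags TEXT")
    conn.commit()
finally:
    conn.close()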

Phil 8.7.21

There is a version of DALL-E at huggingface for text to image! (huggingface.co/spaces/flax-community/dalle-mini)

Example prompts: “A man in a room” / “A woman in a room” (generated images not shown)

Need to fix my timesheet for Monday

A Network Framework of Cultural History

  • The emergent processes driving cultural history are a product of complex interactions among large numbers of individuals, determined by difficult-to-quantify historical conditions. To characterize these processes we have reconstructed aggregate intellectual mobility over two millennia through the birth and death locations of more than 150,000 notable individuals. The tools of network and complexity theory were then used to identify characteristic statistical patterns and determine the cultural and historical relevance of deviations. The resulting network of locations provides a macroscopic perspective of cultural history, which helps us to retrace cultural narratives of Europe and North America using large-scale visualization and quantitative dynamical tools and to derive historical trends of cultural centers beyond the scope of specific events or narrow time intervals.

Phil 8.6.21

Had to get my truck serviced yesterday (oil change and recalls), which took a bunch of hours, so I brought my bike and went on a really nice ride on a wonderful day

Speaking of the truck. These folks (103 Creek Ridge Road, Greensboro, North Carolina 27406) will install lift kits from these folks. I could also do wheels and tires. Stay at Haw River State Park?

Book

  • Put the proposal in the Overleaf folder using proposal.tex as the root document
  • 2:00 Meeting. We are getting very close! I need to make the TODOs vanish

SBIRs

  • The Delta tix did not save to PDF worth a damn, so I created a new document with screenshots that didn’t suck. Delta is horrible and expensive. I think if I have to go to Huntsville again I’ll try to take the train
  • Steve has many questions. Did some answering and pointed him at Microsoft Flight Simulator, which is getting more amazing all the time. It’s over 40 years old!

More stupid travel stuff. Clay suggests American for next time

Phil 8.4.21

Finished Stewardship of global collective behavior. It’s quite good and a nice way to frame all this research

  • Collective behavior provides a framework for understanding how the actions and properties of groups emerge from the way individuals generate and share information. In humans, information flows were initially shaped by natural selection yet are increasingly structured by emerging communication technologies. Our larger, more complex social networks now transfer high-fidelity information over vast distances at low cost. The digital age and the rise of social media have accelerated changes to our social systems, with poorly understood functional consequences. This gap in our knowledge represents a principal challenge to scientific progress, democracy, and actions to address global crises. We argue that the study of collective behavior must rise to a “crisis discipline” just as medicine, conservation, and climate science have, with a focus on providing actionable insight to policymakers and regulators for the stewardship of social systems.

Put my bids in for ICTAI-2021 reviews

GPT Agents

  • Building the 6-epoch review, stars model – done! Need to verify it works

SBIR

Phil 8.3.21

Examining the consumption of radical content on YouTube

  • Daily share of news consumption on YouTube, a social media platform with more than 2 billion monthly users, has increased in the last few years. Constructing a large dataset of users’ trajectories across the full political spectrum during 2016–2019, we identify several distinct communities of news consumers, including “far-right” and “anti-woke.” Far right is small and not increasing in size over the observation period, while anti-woke is growing, and both grow in consumption per user. We find little evidence that the YouTube recommendation algorithm is driving attention to this content. Our results indicate that trends in video-based political news consumption are determined by a complicated combination of user preferences, platform features, and the supply-and-demand dynamics of the broader web.

GPT Agents

  • I now have 3 and 6 epoch runs for name, review, stars models.
  • Evaluate stars to see how much has changed
  • Maybe try to train up a bigger model? Start with the xl model and step back to find the largest model that will fit. Then train that with the name, review, stars corpora
    • Nope, the 117M model is the biggest that will fit. When I’ve got the time, try the Huggingface Course and see how to do cloud training
  • 3:00 Meeting. Went over results and the mapping tool proposal
    • Need to adjust the counts to relative percentages for easier comparison (see the sketch after this list)
    • Try training a model from scratch on the stars/votes corpora? That way we could see if it learns the ratios better. This could be an artifact of finetuning
    • Create models for review+star since the name sets up the review
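  • A small pandas sketch of the relative-percent adjustment – the counts are invented:
# Convert raw star counts to percentages so models trained on different
# corpus sizes can be compared directly. The numbers are invented.
import pandas as pd

counts = pd.DataFrame(
    {"ground_truth": [1200, 900, 1500, 3100, 5300],
     "model_50k":    [800, 600, 1400, 3500, 6700]},
    index=[1, 2, 3, 4, 5],  # star rating
)
percent = counts.div(counts.sum(axis=0), axis=1) * 100
print(percent.round(1))  # columns now sum to 100, directly comparable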

SBIRs

  • Sprint planning
    • Plan LM Epic – DSR-646
    • SMD conference – DSR-645
  • Long-ish chat with Rukan about transforms in scene graphs