Monthly Archives: September 2022

Phil 9.30.2022

We offer comprehensive evidence of preferences for ideological congruity when people engage with politicians, pundits, and news organizations on social media. Using 4 years of data (2016–2019) from a random sample of 1.5 million Twitter users, we examine three behaviors studied separately to date: (i) following of in-group versus out-group elites, (ii) sharing in-group versus out-group information (retweeting), and (iii) commenting on the shared information (quote tweeting). We find that the majority of users (60%) do not follow any political elites. Those who do follow in-group elite accounts at much higher rates than out-group accounts (90 versus 10%), share information from in-group elites 13 times more frequently than from out-group elites, and often add negative comments to the shared out-group information. Conservatives are twice as likely as liberals to share in-group versus out-group content. These patterns are robust, emerge across issues and political elites, and exist regardless of users’ ideological extremity.

Tasks

Jim Donnies (winterize and generator) – done
BGE
ProServ

SBIRs

9:30 RCSNN design discussion – done
2:00 Meeting with Loren – done
Write up some sort of trip report
Reach out to folks from conference – done
Start on distributed data dictionary? Kind of?

Book

Roll in Brenda’s Changes – continuing
Ping Ryan for chapter/paper/article on authoritarians and sociotechnical systems

GPT Agents

Add cluster ID to console text when a node is clicked and a button to “exclude topic” that adds an entry to “table_exclude” that has experiment_id, keyword (or “all_keywords”), and cluster_id. These clusters are excluded when a corpus is generated.
Re-clustering will cause these rows to be deleted from the table
Add training corpora generation with checkboxes for meta-wrappers and dropdown for “before” or “after”
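A rough sketch of how the “table_exclude” idea above could work. This uses sqlite3 so it runs anywhere; the real DB, the helper names, and the choice to clear rows per experiment on re-clustering are all assumptions, not the app's actual code.

```python
import sqlite3


def create_exclude_table(conn):
    # Columns follow the note above; the surrogate id is an assumption.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS table_exclude (
               id INTEGER PRIMARY KEY,
               experiment_id INTEGER NOT NULL,
               keyword TEXT NOT NULL DEFAULT 'all_keywords',
               cluster_id INTEGER NOT NULL)"""
    )


def exclude_topic(conn, experiment_id, cluster_id, keyword="all_keywords"):
    # What the hypothetical "exclude topic" button would call.
    conn.execute(
        "INSERT INTO table_exclude (experiment_id, keyword, cluster_id) "
        "VALUES (?, ?, ?)",
        (experiment_id, keyword, cluster_id),
    )


def excluded_clusters(conn, experiment_id, keyword):
    # Clusters to skip when generating a corpus for this keyword.
    rows = conn.execute(
        "SELECT cluster_id FROM table_exclude "
        "WHERE experiment_id = ? AND keyword IN (?, 'all_keywords')",
        (experiment_id, keyword),
    ).fetchall()
    return {row[0] for row in rows}


def clear_exclusions(conn, experiment_id):
    # Re-clustering invalidates old cluster ids, so drop the rows
    # (assumed here to be scoped to one experiment).
    conn.execute(
        "DELETE FROM table_exclude WHERE experiment_id = ?", (experiment_id,)
    )
```

Corpus generation would then filter out any tweet whose cluster id is in the set returned by excluded_clusters().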

Phil 9.29.2022

MORS conference wraps up today. They seem like a good organization with a strong AI/ML component

Bridging Divides & Strengthening Democracy: From Science to Practice (agenda)

Efficient Python Tricks and Tools for Data Scientists

GPT-Agents

Reading and writing the reduced and clustering data? Hooray! I had to do some funky things to deal with different representations of number lists: r"/-?\d+.\d"
Loaded graph with clusters from DB:

If reducing and clustering has been changed for the data, I should list that somewhere. Maybe a “parameters” table
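For the number-list parsing mentioned above, here is a minimal sketch of one way to pull floats out of differently formatted strings (Python list style, NumPy print style, and so on). The pattern and the function name are illustrative assumptions, not the regex actually used in the app.

```python
import re

# Optional sign, digits with an optional fraction (or a bare fraction),
# and an optional exponent. Illustrative only.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_number_list(text):
    """Return every number found in text as a float, in order."""
    return [float(match) for match in NUMBER.findall(text)]
```

Both "[1.0, 2.5, -3]" and "[ 1.   2.5 -3.e-01]" come back as plain lists of floats, so the DB round trip does not depend on how the list was written out.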

Phil 9.28.2022

I have gotten to the point where I am proud of my regex-fu: r"[^\d^\w^\s^[:punct:]]"

SBIRs

I’m at the MORS conference and did my first presentation to people in three years. Very pleasant. Based on how ML is going over in the other areas, I think MOR is about 2 years behind where I am, which is about a year behind SOTA. This includes large companies like Raytheon. Transfer learning is magic, people are still working with RNNs and LSTMs, and no one knows about Transformers and Attention.
Dataiku is pretty neat, but doesn’t learn from their users what works best for a dataset. So odd
Had a good chat with Dr. Ryan Barrett about how authoritarian leaders in general and Putin in particular trap themselves in an ever more extreme echo chamber by controlling the media in such a way that the egalitarian voices that use mockery are silenced, while the more extreme hierarchicalists are allowed to continue.

GPT Agents

Got the Explore option working, where subsampled, clickable nodes are drawn to the canvas. I was able to find out that several clusters in my pull were in German:

Need to save all this to the DB next. Started on the buttons. I will also need to load up these other fields, so the retreive_tweet_data_callback() will have to be changed

Phil 9.26.2022

Part-Based Models Improve Adversarial Robustness

We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks by introducing a part-based model for object classification. We believe that the richer form of annotation helps guide neural networks to learn more robust features without requiring more samples or larger models. Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and then classify the segmented object. Empirically, our part-based models achieve both higher accuracy and higher adversarial robustness than a ResNet-50 baseline on all three datasets. For instance, the clean accuracy of our part models is up to 15 percentage points higher than the baseline’s, given the same level of robustness. Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations. The code is publicly available at this https URL.

Named Tensors

Named Tensors allow users to give explicit names to tensor dimensions. In most cases, operations that take dimension parameters will accept dimension names, avoiding the need to track dimensions by position. In addition, named tensors use names to automatically check that APIs are being used correctly at runtime, providing extra safety. Names can also be used to rearrange dimensions, for example, to support “broadcasting by name” rather than “broadcasting by position”.

SBIRs

9:00 Sprint review – done
Added stories for next sprint
2:00 MDA Meeting – done. Loren’s sick
Set up overleaf for Q3 – done
Go to DC for forum

GPT Agents

Load up laptop
Get some more done on embedding? Yup, split out each step so that changing clustering (very fast) doesn’t have to wait for loading and manifold reduction

Save everything back to the DB and make sure the reduced embeddings and clusters are loaded if available
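The step-splitting described above can be sketched as a small pipeline that caches the loaded embeddings and the reduced coordinates, so only the clustering step reruns when its parameters change. The class and the stand-in functions are hypothetical; in the real app the stages would be the DB load, the manifold reduction, and DBSCAN.

```python
class EmbeddingPipeline:
    """Caches the slow stages so re-clustering stays fast."""

    def __init__(self, load_fn, reduce_fn, cluster_fn):
        self.load_fn = load_fn
        self.reduce_fn = reduce_fn
        self.cluster_fn = cluster_fn
        self._embeddings = None
        self._reduced = None
        self._reduce_params = None

    def load(self):
        if self._embeddings is None:
            self._embeddings = self.load_fn()
        return self._embeddings

    def reduce(self, **params):
        # Redo the reduction only if its parameters changed.
        if self._reduced is None or params != self._reduce_params:
            self._reduced = self.reduce_fn(self.load(), **params)
            self._reduce_params = dict(params)
        return self._reduced

    def cluster(self, reduce_params, **cluster_params):
        # Clustering is cheap, so it always runs on the cached reduction.
        return self.cluster_fn(self.reduce(**reduce_params), **cluster_params)


# Toy stand-ins that count how often each stage runs.
calls = {"load": 0, "reduce": 0, "cluster": 0}


def fake_load():
    calls["load"] += 1
    return [[0.0, 0.1, 9.0], [0.2, 0.0, 9.0], [5.0, 5.1, 9.0]]


def fake_reduce(points, n_components=2):
    calls["reduce"] += 1
    return [p[:n_components] for p in points]


def fake_cluster(points, eps=1.0):
    calls["cluster"] += 1
    # Toy rule: label 0 if the first coordinate is within eps of zero.
    return [0 if p[0] <= eps else 1 for p in points]
```

Calling cluster() twice with different eps values but the same reduction parameters runs the load and reduce stand-ins once each and the cluster stand-in twice.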

Book

4:00 Meeting

Phil 9.23.2022

And just like that, the season has changed:

Send check for painting

Get paid account with Overleaf. I need the history feature! Done

Try fixing laptop

SBIRs

Add a method to the DataDictionary that returns a list of entries by type. Done
Update graphics code to use and commit. Done
Slides for Monday’s review
Meeting with Aaron? Done
10:00 Meeting with Erika. Went well
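For reference, the entries-by-type method added to the DataDictionary could look something like the sketch below. Everything here (EntryType, DictionaryEntry, the method name) is a hypothetical stand-in, since the actual class is not shown in these notes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class EntryType(Enum):
    # Hypothetical entry types for illustration.
    FLOAT = "float"
    STRING = "string"
    OBJECT3D = "object3d"


@dataclass
class DictionaryEntry:
    name: str
    type: EntryType
    value: Any = None


class DataDictionary:
    def __init__(self):
        self._entries: Dict[str, DictionaryEntry] = {}

    def add_entry(self, entry: DictionaryEntry) -> None:
        self._entries[entry.name] = entry

    def get_entries_by_type(self, entry_type: EntryType) -> List[DictionaryEntry]:
        # Return entries in insertion order, filtered by type.
        return [e for e in self._entries.values() if e.type == entry_type]
```

The graphics code can then ask only for the entries it knows how to draw instead of walking the whole dictionary.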

Book

Sent a copy off to Greg
Working on edits – About 10-20 pages to go

GPT Agents

Try running the new clustering on the Tweet data
Add user adjustments for perplexity (TSNE), eps and min_samples (DBSCAN) in the clustering app
Maybe add the average distance code from here to help with options?
Try drawing to the canvas and see how slow that is
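For the eps option, one rough starting point is the mean distance from each point to its k-th nearest neighbor. The sketch below is pure Python and brute force, so it is only for small samples, and it is not the linked average-distance code. The usual refinement is to look for the knee in the sorted k-distance curve; the mean is just a quick default for a slider.

```python
import math
import statistics


def kth_neighbor_distances(points, k):
    """Distance from each point to its k-th nearest other point."""
    if k < 1 or k >= len(points):
        raise ValueError("need 1 <= k < number of points")
    distances = []
    for i, p in enumerate(points):
        others = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        distances.append(others[k - 1])
    return distances


def suggest_eps(points, min_samples=5):
    # sklearn's DBSCAN counts the point itself in min_samples,
    # so look at the (min_samples - 1)-th other neighbor.
    k = max(1, min_samples - 1)
    return statistics.mean(kth_neighbor_distances(points, k))
```

A single far-away outlier pulls the mean up quite a bit, so the median of the same distances may be a safer default on noisy pulls.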

Ivermectin with PCA reduction to 10 dimensions and DBSCAN clustering

Progress! Just not as much as I’d like

Phil 9.22.2022

Book

Rolling in changes
Bumped into Greg C last night. Need to say hi and send a copy after fixing the Deep Bias chapter

SBIRs

More 3D. Hopefully get everything running. Then start building out the scenario

Had a good discussion about the experiment logging. I think we need tables for the following:
- Program (id, name, brief description, contract start, contract end, status)
- Project (id, name, brief description, program_id)
- Code (text) (id, experiment_id, filename, date_stored, code)
- Figures (blobs) (id, experiment_id, figure name, image)
- Parameters (id, experiment_id, name, type, string value)
- Results (same as Parameters)
- Experiment (id, name, user, brief description, date run, project_id)
9:15 standup
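The experiment-logging tables listed above might look like this as a schema. This is a sketch in sqlite3 for convenience; the column types, the snake_case names, and the log_parameter() helper are assumptions layered on the field lists from the discussion.

```python
import sqlite3

SCHEMA = """
CREATE TABLE program (
    id INTEGER PRIMARY KEY, name TEXT, description TEXT,
    contract_start TEXT, contract_end TEXT, status TEXT);
CREATE TABLE project (
    id INTEGER PRIMARY KEY, name TEXT, description TEXT,
    program_id INTEGER REFERENCES program(id));
CREATE TABLE experiment (
    id INTEGER PRIMARY KEY, name TEXT, user TEXT, description TEXT,
    date_run TEXT, project_id INTEGER REFERENCES project(id));
CREATE TABLE code (
    id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
    filename TEXT, date_stored TEXT, code TEXT);
CREATE TABLE figures (
    id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
    figure_name TEXT, image BLOB);
CREATE TABLE parameters (
    id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
    name TEXT, type TEXT, string_value TEXT);
CREATE TABLE results (
    id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
    name TEXT, type TEXT, string_value TEXT);
"""


def create_experiment_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_parameter(conn, experiment_id, name, value, table="parameters"):
    # Parameters and results share a layout: the value is stored as a
    # string alongside its Python type name so it can be cast back later.
    if table not in ("parameters", "results"):
        raise ValueError("table must be 'parameters' or 'results'")
    conn.execute(
        f"INSERT INTO {table} (experiment_id, name, type, string_value) "
        "VALUES (?, ?, ?, ?)",
        (experiment_id, name, type(value).__name__, str(value)),
    )
```

Storing the type name next to the string value keeps one table layout for ints, floats, and strings, at the cost of a cast on the way back out.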

GPT Agents

More DBSCAN. Try to figure out a way to automatically work it out given some statistical measures of the dimensions?
Got the basics of dbscan working. I was using the wrong id and name

Added PCA dimension reduction as an option. On the same dataset reduced to 10 dimensions from 100, the clustering still looks good.

I’m clearly going to need user adjustments for perplexity (TSNE), eps and min_samples (DBSCAN) in the clustering app. Tomorrow.

Phil 9.21.2022

SBIRs

Meeting with Aaron to go over MORS slides at 10:30 – done and sent in. Make a copy though in case there is some kind of problem
Working on 3D – good progress. The hierarchy is being called and doing its simple task. The Base3dObject is moving data in the dictionary
Timesheets! Done

Book

Rolling in changes

GPT Agents

More clustering – Got DBSCAN in but
Had to fix GitHub again because I keep forgetting to delete large files. Created an SVN repo to handle those.

Phil 9.20.2022

Got pinged by Twitter about my work, which is nice

Need to try the Hardware and Devices troubleshooter on my busted laptop as per here.

Book

Good talk with Brenda. Need to roll in the changes – rolling
Probably need to add this to the diversity injection section as an example of personal-level, not technological level approaches to misinformation: Interventions to reduce partisan animosity
- Rising partisan animosity is associated with a reduction in support for democracy and an increase in support for political violence. Here we provide a multi-level review of interventions designed to reduce partisan animosity, which we define as negative thoughts, feelings and behaviours towards a political outgroup. We introduce the TRI framework to capture three levels of intervention—thoughts (correcting misconceptions and highlighting commonalities), relationships (building dialogue skills and fostering positive contact) and institutions (changing public discourse and transforming political structures)—and connect these levels by highlighting the importance of motivation and mobilization. Our review encompasses both interventions conducted as part of academic research projects and real-world interventions led by practitioners in non-profit organizations. We also explore the challenges of durability and scalability, examine self-fulfilling polarization and interventions that backfire, and discuss future directions for reducing partisan animosity.

SBIRs

Finish MORS slides. Need a fortification map bridge, and then back to the main deck. Done!
Started to add the 3d visualization. Panda3d is in and loading the hierarchy. Need to put in the trivial top level that moves an object with the data dictionary

GPT Agents

Pinged Shimei and got a response. Restart meetings next Wednesday after the conference?

Phil 9.19.2022

Out of One, Many: Using Language Models to Simulate Human Samples

We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the “algorithmic bias” within one such tool — the GPT-3 language model — is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property “algorithmic fidelity” and explore its extent in GPT-3. We create “silicon samples” by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterize human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.

SBIRs

1:00 MDA presentation – done!
Finish MORS deck – nope, but closer. Need to have a slide that shows we are building small maps – almost like fortification scale
Ping Erika

GPT Agents

Start clustering with sklearn.cluster.DBSCAN
Send paper to the gang. Done

Phil 9.16.2022

Put in a ride for tomorrow!

https://twitter.com/amasad/status/1570598156160897024

SBIRs

Finish up MORS slides
Create a single-level hierarchy that moves a target in 3D space and set up the visualizer to track through the data dictionary

GPT Agents

Clustering

Phil 9.15.2022

SBIRs

9:15 standup – done
Sanity check on slides, look through Loren’s slides – done
Next pass through MORS deck
Travel forms for MORS and CHIRP – done. This is a nice site for getting government per diem rates
Register for MORS – done
Make cards! – done

GPT Agents

Try umap-learn on another box – done. Same problems
Clustering

Book

Pinged Katy at Elsevier

Phil 9.14.2022

Train station at noon!

Git Re-Basin: Merging Models modulo Permutation Symmetries

The success of deep learning is thanks to our ability to solve certain massive non-convex optimization problems with relative ease. Despite non-convex optimization being NP-hard, simple algorithms — often variants of stochastic gradient descent — exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes contain (nearly) a single basin, after accounting for all possible permutation symmetries of hidden units. We introduce three algorithms to permute the units of one model to bring them into alignment with units of a reference model. This transformation produces a functionally equivalent set of weights that lie in an approximately convex basin near the reference model. Experimentally, we demonstrate the single basin phenomenon across a variety of model architectures and datasets, including the first (to our knowledge) demonstration of zero-barrier linear mode connectivity between independently trained ResNet models on CIFAR-10 and CIFAR-100. Additionally, we identify intriguing phenomena relating model width and training time to mode connectivity across a variety of models and datasets. Finally, we discuss shortcomings of a single basin theory, including a counterexample to the linear mode connectivity hypothesis.

https://twitter.com/SamuelAinsworth/status/1569719499263471616

SBIRs

Finish first pass at slide deck
Register for MORS

GPT Agents

Set up keyword data repo
Thinking that I can store multiple variants of the manifold reductions as a list of dicts in the EmbeddedText object
Tried umap-learn with a brand-new Python 3.10 install. Same problem
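A sketch of the list-of-dicts idea above for keeping several manifold reductions on one object. The fields and method names are hypothetical; the real EmbeddedText class is not shown here, so this only illustrates the storage shape.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EmbeddedText:
    text: str
    embedding: List[float]
    # Each dict holds one reduction variant: method, params, coords.
    reductions: List[Dict[str, Any]] = field(default_factory=list)

    def add_reduction(self, method: str, coords: List[float], **params) -> None:
        self.reductions.append(
            {"method": method, "params": params, "coords": list(coords)}
        )

    def get_reduction(self, method: str, **params) -> Optional[Dict[str, Any]]:
        # Return the most recent variant matching the method and any
        # parameters that were given.
        for reduction in reversed(self.reductions):
            if reduction["method"] == method and all(
                reduction["params"].get(k) == v for k, v in params.items()
            ):
                return reduction
        return None
```

Keeping the parameters with each variant means a TSNE run at perplexity 30 and one at perplexity 50 can sit side by side and be told apart later.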

Phil 9.13.2022

SBIRs

Sprint planning
Slides for Q1-Q2 presentation

Book

Put in Brenda’s changes and sent her the updated version
Still need to find the first use of credibility and trustworthiness

GPT Agents

Working on TSNE manifold reduction from data
3:30 Meeting? It’s not on the calendar…
Here’s some initial clustering from the twitter data. This is TSNE down to 2 dimensions:
Paxlovid, then Ivermectin:

Need to add clustering, probably at higher dimensions and visualization from that reduced set, just to keep things related (and maybe faster?). Anyway, enough for today.
Got tired of making slides. Here are a few perplexity tests:

It looks like there are several clusters in the Paxlovid space, but people are mostly talking about the same thing wrt ivermectin with a few small outliers?

Phil 9.12.2022

SBIRs

Sprint demos
2:00 MDA Meeting
More work on MORS presentation
Register for MORS? Waiting for info
TRAVEL REQUESTS

GPT Agents

Manifold reduction and clustering
I had a problem which I was expecting, but dreading nonetheless. I tried to push a file larger than 100MB to GitHub. Ooops! And I’m doing this within the JetBrains IDE, so the command line options are… difficult
To fix this problem in JetBrains, go to the Git->show Git log menu item that brings up the ‘log’ display:

Right clicking on the problem commit will bring up a menu. Select Revert Commit (Or possibly Undo Commit? both may work). That will clear the branch.
Then re-commit the current branch in the normal way
Cannot get umap-learn to import. It just hangs. TSNE works though, so working out how all that works. Success!
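To catch the large-file problem above before committing, a small check like the following could be run over the working tree. It is a sketch: the function name is made up, and the default threshold is GitHub's 100 MiB per-file block.

```python
import os
from pathlib import Path

GITHUB_LIMIT_BYTES = 100 * 1024 * 1024  # GitHub blocks files above 100 MiB


def find_large_files(root, limit=GITHUB_LIMIT_BYTES):
    """Return (relative path, size) for files under root larger than limit."""
    root = Path(root)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository's own object store.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_file():  # e.g. a broken symlink
                continue
            size = path.stat().st_size
            if size > limit:
                hits.append((path.relative_to(root).as_posix(), size))
    return sorted(hits)
```

Anything it reports can go into .gitignore (or somewhere other than GitHub) before the commit is made, which avoids having to revert afterwards.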

Book

4:00 Meeting with Brenda
Ping Katy on Thursday

Phil 9.9.2022

Call powerwasher! Dave Tobias 410 271-8795 – done

The antibiotics seem to be working on the bronchitis… slowly

Book

Read through Brenda’s notes. See if I want to fix anything

SBIRs

Demo Slides
MORS slides for Workshop (Tuesday-Thursday, 27-29 September 2022) – first pass done. Good chat with Aaron about changing the framing
Slides for Q2 report (Fri Sept 16) – roughed out
Hotel for CHIRP November 15-16
Travel Requests for MORS and CHIRP

GPT Agents

Start on visualizing and clustering embeddings. Started. Can update the db
- umap-learn 0.5.3
- sklearn.cluster.DBSCAN

viztales

Dimension reduction, State, Orientation, and Speed
