Phil 10.12.2022

I think that this could be a really interesting way of debugging vision systems around edge cases. Mask out everything but the attribution area and run separate image-to-text models to see what they identify and then check for consensus.
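A minimal sketch of that debugging loop: black out everything except the attribution region, caption the masked image with several independent image-to-text models, and check for consensus. The captioners here are hypothetical stand-ins (real ones would be BLIP, OFA, etc.); only the masking and consensus logic is shown.

```python
# Sketch: mask out everything but the attribution area, then check whether
# several captioning models agree on what remains. The "models" are assumed
# stand-ins; only masking and consensus are implemented here.
from collections import Counter

def mask_to_attribution(image, mask):
    """Zero every pixel outside the attribution mask.
    image: rows of pixel values; mask: same shape, 1 = keep."""
    return [[px if m else 0 for px, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]

def caption_consensus(captions, threshold=0.5):
    """Return the majority caption if enough models agree, else None."""
    label, count = Counter(captions).most_common(1)[0]
    return label if count / len(captions) >= threshold else None
```

With three captioners returning ["dog", "dog", "cat"], `caption_consensus` reports "dog"; with no agreement it returns None, which is the flag-for-review case.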

https://twitter.com/lintool/status/1579830653126086656

This also makes me think of what it might mean for human-monitored AI systems. People will need to be trained to quickly identify whether the model is behaving poorly and flag it, as opposed to judging whether each individual decision is correct. It’s more like driving a race car, where you have to monitor the performance of the vehicle and adapt to it. You can’t stop racing if the tires are wearing out or you’re running out of fuel. You have to adjust your behavior to optimize the behavior of the system as it is. Which implies that we need simulators of AI systems that break a lot.

Third World War: System, Process and Conflict Dynamics

How digital media drive affective polarization through partisan sorting

  • Politics has in recent decades entered an era of intense polarization. Explanations have implicated digital media, with the so-called echo chamber remaining a dominant causal hypothesis despite growing challenge by empirical evidence. This paper suggests that this mounting evidence provides not only reason to reject the echo chamber hypothesis but also the foundation for an alternative causal mechanism. To propose such a mechanism, the paper draws on the literatures on affective polarization, digital media, and opinion dynamics. From the affective polarization literature, we follow the move from seeing polarization as diverging issue positions to seeing it as rooted in sorting: an alignment of differences which is effectively dividing the electorate into two increasingly homogeneous megaparties. To explain the rise in sorting, the paper draws on opinion dynamics and digital media research to present a model which essentially turns the echo chamber on its head: it is not isolation from opposing views that drives polarization but precisely the fact that digital media bring us to interact outside our local bubble. When individuals interact locally, the outcome is a stable plural patchwork of cross-cutting conflicts. By encouraging nonlocal interaction, digital media drive an alignment of conflicts along partisan lines, thus effacing the counterbalancing effects of local heterogeneity. The result is polarization, even if individual interaction leads to convergence. The model thus suggests that digital media polarize through partisan sorting, creating a maelstrom in which more and more identities, beliefs, and cultural preferences become drawn into an all-encompassing societal division.

SBIRs

  • 9:00 kernel methods discussion with Rukan. Need to look at a way of using SD to look at “outlier-ness” maybe SD of all points – SD of all other points
  • 10:00 experiment logger review
  • Got GeoPandas installed!
  • An example for a Windows Python 3.7 install, run from the directory containing the .whl files, in order:
    • pip install .\GDAL-3.4.2-cp37-cp37m-win_amd64.whl
    • pip install .\Fiona-1.8.21-cp37-cp37m-win_amd64.whl
    • pip install geopandas
  • Integrate GeoPandas and start on a textured-polygon map import that is lat/long accurate. Start with Mercator?
  • Start on one of the many papers that are due over the next few weeks
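A sketch of the “outlier-ness” idea from the kernel-methods discussion: score each point by how much the standard deviation drops when that point is removed, i.e. SD of all points minus SD of all other points. The exact statistic discussed may differ; this is just the leave-one-out version of the note above.

```python
# Leave-one-out SD outlier score: SD(all points) - SD(all other points).
# Points that inflate the overall spread get large positive scores.
import statistics

def outlier_scores(points):
    sd_all = statistics.pstdev(points)
    return [sd_all - statistics.pstdev(points[:i] + points[i + 1:])
            for i in range(len(points))]
```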
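For the lat/long-accurate map import: with GeoPandas in place, reprojection is just `gdf.to_crs(epsg=3857)`, but the underlying Web Mercator math is small enough to sketch directly. This is only the forward projection on the spherical approximation, not a substitute for the GeoPandas call.

```python
# Minimal Web Mercator (EPSG:3857) forward projection, the math behind
# gdf.to_crs(epsg=3857). Spherical approximation with the WGS84 semi-major
# axis, which is what Web Mercator actually uses.
import math

R = 6378137.0  # WGS84 semi-major axis, meters

def to_web_mercator(lon_deg, lat_deg):
    """Project a WGS84 lon/lat pair to Web Mercator x/y in meters."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y
```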

GPT Agents

  • Got the EmbeddingExplorer App mostly done yesterday. Need to get author information for the other keywords, generate some corpora, and train a model!
  • 4:00 Meeting

Book

  • Roll in Brenda’s edits

Phil 10.11.2022

Slow day. The SBIR meetings got delayed until the afternoon, so I worked on the TweetEmbedExplorer. There is a lot to the schema now:

And tooltips work!

Phil 10.10.2022

Drop the truck off today

SBIRs

  • 9:00 sprint review
  • Write stories for next sprint

Book

  • 4:00 meeting with Brenda

GPT Agents

  • Wire up selections and generate a corpus!

Phil 10.7.2022

The Seagull is tomorrow! Need to do my chores early and then prep. Getting there by 7:00am means leaving at 4:45 or so. Brrr!

The News Literacy Project is a nonpartisan education nonprofit building a national movement to create a more news-literate America.

SBIRs

  • Need to talk to Aaron about getting the geopandas out of the base classes

GPT Agents

  • Meetings are back on for 4:00 Wednesdays

Book

  • Going to try to catch up to Brenda

Phil 10.6.2022

BSO! Done

GPT Agents

  • Got the user downloading working. Smooth, with no surprises!
  • Write a class for creating corpora. It will take the settings from the UI and generate files with the appropriate wrapping
  • New view that includes cluster exclusion True/False
  • Dig up the code that produces percentage meta info

SBIRs

  • Asked Loren and Rukan to keep their parts of the Overleaf up to date
  • 9:15 standup
  • Port Panda3d visualizer. Mostly done, but I’m having problems with objects being multiply defined. Turns out the setup() method was being called twice – Once when the class was instanced, once when the visualizer called setup() directly. Fixed.
  • Start outlining update of AI & Weapons paper – nope
  • Continue with Docker/Kafka tutorial – nope
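The double-setup fix from the Panda3d port above boils down to making setup() idempotent, so it doesn’t matter whether the constructor or the visualizer calls it first. Panda3D specifics are omitted; only the guard pattern is shown.

```python
# Sketch of the setup()-called-twice fix: guard the method so repeated
# calls (once at instancing, once from the visualizer) are harmless.
class Visualizer:
    def __init__(self):
        self._setup_done = False
        self.objects = []
        self.setup()  # first call, at instancing

    def setup(self):
        if self._setup_done:  # second call from the visualizer is a no-op
            return
        self._setup_done = True
        self.objects.append("scene_root")
```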

Book

  • 4:00 Meeting with Brenda – good progress

Phil 10.5.2022

It has been raining since Saturday!

Nice interactive statistical power calculator

Somewhat similar to the way that Putin worked to consolidate control over independent media outlets because they mocked him, Trump is suing CNN for defamation:

https://twitter.com/JuddLegum/status/1577370249506209804

SBIRs

GPT Agents

  • Write up status report – done
  • Add export buttons (all, by keyword) – done
  • Add user-specific checkmarks that include a notification that author download is separate – done

Phil 10.4.2022

https://www.whitehouse.gov/ostp/ai-bill-of-rights/

Collective intelligence for deep learning: A survey of recent developments

  • In the past decade, we have witnessed the rise of deep learning to dominate the field of artificial intelligence. Advances in artificial neural networks alongside corresponding advances in hardware accelerators with large memory capacity, together with the availability of large datasets enabled practitioners to train and deploy sophisticated neural network models that achieve state-of-the-art performance on tasks across several fields spanning computer vision, natural language processing, and reinforcement learning. However, as these neural networks become bigger, more complex, and more widely used, fundamental problems with current deep learning models become more apparent. State-of-the-art deep learning models are known to suffer from issues that range from poor robustness, inability to adapt to novel task settings, to requiring rigid and inflexible configuration assumptions. Collective behavior, commonly observed in nature, tends to produce systems that are robust, adaptable, and have less rigid assumptions about the environment configuration. Collective intelligence, as a field, studies the group intelligence that emerges from the interactions of many individuals. Within this field, ideas such as self-organization, emergent behavior, swarm optimization, and cellular automata were developed to model and explain complex systems. It is therefore natural to see these ideas incorporated into newer deep learning methods. In this review, we will provide a historical context of neural network research’s involvement with complex systems, and highlight several active areas in modern deep learning research that incorporate the principles of collective intelligence to advance its capabilities. We hope this review can serve as a bridge between the complex systems and deep learning communities.

Tasks

  • Chat with Dave tonight? – Need to send links to papers, OpenAI, Stable Diffusion thread, Paul Scharre’s books, etc

SBIRs

  • Fill out reimbursement forms – done
  • Travel to Chirp – tried in Concur. Hopeless mess
  • More RCSNN2
  • 9:15 standup
  • Experiment logger meeting – done
  • Reached out to Dr. Giddings on format of white paper

GPT Agents

  • Change table_user to have user ID as unique, primary key and see if update into works right – done
  • Add checkboxes for optional user attributes (requires downloading users for tweets)
  • No meeting is scheduled, so write up a status report
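With user ID as the unique primary key, the “update into” behavior in the first bullet becomes a plain upsert. A self-contained sqlite3 illustration follows; if the project DB is MySQL, the equivalent is `INSERT ... ON DUPLICATE KEY UPDATE`, and the column names here are guesses.

```python
# Illustration of the table_user change: id as PRIMARY KEY lets a repeated
# insert update in place instead of duplicating the row. Column names are
# assumptions; sqlite3 is used only to keep the sketch runnable.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_user (id INTEGER PRIMARY KEY, name TEXT, location TEXT)")

def upsert_user(uid, name, location):
    con.execute(
        "INSERT INTO table_user (id, name, location) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name, location = excluded.location",
        (uid, name, location))

upsert_user(42, "someuser", "nowhere")
upsert_user(42, "someuser", "Maryland")  # updates the existing row
```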

Book

Phil 10.3.2022

Tasks

  • BGE

SBIRs

  • RCSNN 2?
  • Write up some sort of trip report
  • Reach out to folks from conference – done

Book

  • Roll in Brenda’s Changes
  • Ping Ryan for chapter/paper/article on authoritarians and sociotechnical systems

GPT Agents

  • Get exclude coloring working
  • Adding author information. It gets some location information

Phil 9.30.2022

Getting Started With Stable Diffusion: A Guide For Creators

Most users do not follow political elites on Twitter; those who do show overwhelming preferences for ideological congruity

  • We offer comprehensive evidence of preferences for ideological congruity when people engage with politicians, pundits, and news organizations on social media. Using 4 years of data (2016–2019) from a random sample of 1.5 million Twitter users, we examine three behaviors studied separately to date: (i) following of in-group versus out-group elites, (ii) sharing in-group versus out-group information (retweeting), and (iii) commenting on the shared information (quote tweeting). We find that the majority of users (60%) do not follow any political elites. Those who do follow in-group elite accounts at much higher rates than out-group accounts (90 versus 10%), share information from in-group elites 13 times more frequently than from out-group elites, and often add negative comments to the shared out-group information. Conservatives are twice as likely as liberals to share in-group versus out-group content. These patterns are robust, emerge across issues and political elites, and exist regardless of users’ ideological extremity.

Tasks

  • Jim Donnies (winterize and generator) – done
  • BGE
  • ProServ

SBIRs

  • 9:30 RCSNN design discussion – done
  • 2:00 Meeting with Loren – done
  • Write up some sort of trip report
  • Reach out to folks from conference – done
  • Start on distributed data dictionary? Kind of?

Book

  • Roll in Brenda’s Changes – continuing
  • Ping Ryan for chapter/paper/article on authoritarians and sociotechnical systems

GPT Agents

  • Add cluster ID to the console text when a node is clicked, and a button to “exclude topic” that adds an entry to “table_exclude” containing experiment_id, keyword (or “all_keywords”), and cluster_id. These clusters are excluded when a corpus is generated.
  • Re-clustering will cause these rows to be deleted from the table
  • Add training corpora generation with checkboxes for meta-wrappers and dropdown for “before” or “after”
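The exclusion rule above can be sketched as a single query: a tweet is dropped when table_exclude has a matching (experiment_id, cluster_id) row whose keyword is either the tweet’s keyword or “all_keywords”. The schema and column names beyond those three fields are assumptions; sqlite3 keeps the sketch runnable.

```python
# Sketch of corpus generation with cluster exclusion via table_exclude
# (experiment_id, keyword or "all_keywords", cluster_id). Schema is assumed.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tweets (experiment_id INT, keyword TEXT, cluster_id INT, text TEXT);
CREATE TABLE table_exclude (experiment_id INT, keyword TEXT, cluster_id INT);
""")
con.executemany("INSERT INTO tweets VALUES (?,?,?,?)", [
    (1, "ivermectin", 0, "keep me"),
    (1, "ivermectin", 7, "excluded cluster"),
    (1, "vaccine", 7, "kept: other keyword")])
con.execute("INSERT INTO table_exclude VALUES (1, 'ivermectin', 7)")

def corpus(experiment_id):
    return [r[0] for r in con.execute("""
        SELECT t.text FROM tweets t
        WHERE t.experiment_id = ?
          AND NOT EXISTS (
            SELECT 1 FROM table_exclude e
            WHERE e.experiment_id = t.experiment_id
              AND e.cluster_id = t.cluster_id
              AND e.keyword IN (t.keyword, 'all_keywords'))""",
        (experiment_id,))]
```

Re-clustering then only needs to delete the matching table_exclude rows, as the second bullet notes.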

Phil 9.29.2022

MORS conference wraps up today. They seem like a good organization with a strong AI/ML component

Bridging Divides & Strengthening Democracy: From Science to Practice (agenda)

Efficient Python Tricks and Tools for Data Scientists

GPT-Agents

  • Reading and writing the reduced and clustered data? Hooray! I had to do some funky things to deal with different representations of number lists: r"/-?\d+.\d"
  • Loaded graph with clusters from DB:
Working!
  • If reducing and clustering has been changed for the data, I should list that somewhere. Maybe a “parameters” table
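The number-list funkiness above comes from embeddings round-tripping through the DB as strings. A hedged sketch of the recovery step (the exact stored format, and therefore the regex, is an assumption):

```python
# Sketch: recover a list of floats from a stringified list such as
# "[-0.5, 1.25, 3]". The stored representation is assumed.
import re

def parse_number_list(s):
    return [float(tok) for tok in re.findall(r"-?\d+\.?\d*(?:e-?\d+)?", s)]
```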

Phil 9.28.2022

I have gotten to the point where I am proud of my regex-fu: r"[^\d^\w^\s^[:punct:]]"

SBIRs

  • I’m at the MORS conference and did my first presentation to people in three years. Very pleasant. Based on how ML is going over in the other areas, I think MOR is about 2 years behind where I am, which is about a year behind SOTA. This includes large companies like Raytheon. Transfer learning is magic, people are still working with RNNs and LSTMs, and no one knows about Transformers and Attention.
  • Dataiku is pretty neat, but doesn’t learn from their users what works best for a dataset. So odd
  • Had a good chat with Dr. Ryan Barrett about how authoritarian leaders in general and Putin in particular trap themselves in an ever more extreme echo chamber by controlling the media in such a way that the egalitarian voices that use mockery are silenced, while the more extreme hierarchicalists are allowed to continue.

GPT Agents

  • Got the Explore option working, where subsampled, clickable nodes are drawn to the canvas. I was able to find out that several clusters in my pull were in German:
Cluster of German Tweets
  • Need to save all this to the DB next. Started on the buttons. I will also need to load up these other fields, so the retreive_tweet_data_callback() will have to be changed

Phil 9.26.2022

Part-Based Models Improve Adversarial Robustness

  • We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks by introducing a part-based model for object classification. We believe that the richer form of annotation helps guide neural networks to learn more robust features without requiring more samples or larger models. Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and then classify the segmented object. Empirically, our part-based models achieve both higher accuracy and higher adversarial robustness than a ResNet-50 baseline on all three datasets. For instance, the clean accuracy of our part models is up to 15 percentage points higher than the baseline’s, given the same level of robustness. Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations. The code is publicly available at this https URL.

Named Tensors

  • Named Tensors allow users to give explicit names to tensor dimensions. In most cases, operations that take dimension parameters will accept dimension names, avoiding the need to track dimensions by position. In addition, named tensors use names to automatically check that APIs are being used correctly at runtime, providing extra safety. Names can also be used to rearrange dimensions, for example, to support “broadcasting by name” rather than “broadcasting by position”.

SBIRs

  • 9:00 Sprint review – done
  • Added stories for next sprint
  • 2:00 MDA Meeting – done. Loren’s sick
  • Set up overleaf for Q3 – done
  • Go to DC for forum

GPT Agents

  • Load up laptop
  • Get some more done on embedding? Yup, split out each step so that changing clustering (very fast) doesn’t have to wait for loading and manifold reduction
Nice clustering!
  • Save everything back to the DB and make sure the reduced embeddings and clusters are loaded if available
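The step-splitting above is a caching pattern: load and manifold reduction are slow and run once, while clustering is fast and can be re-run with new parameters. A sketch with cheap stand-in stage functions (the real stages would be the DB load, TSNE, and DBSCAN):

```python
# Sketch of splitting the embedding pipeline into cached stages so that
# re-clustering doesn't repeat loading or manifold reduction. Stage
# functions are injected stand-ins for the real load/reduce/cluster code.
class StagedPipeline:
    def __init__(self, load_fn, reduce_fn, cluster_fn):
        self.load_fn, self.reduce_fn, self.cluster_fn = load_fn, reduce_fn, cluster_fn
        self._raw = None
        self._reduced = None

    def cluster(self, **params):
        if self._raw is None:       # slow: hit the DB once
            self._raw = self.load_fn()
        if self._reduced is None:   # slow: manifold reduction once
            self._reduced = self.reduce_fn(self._raw)
        return self.cluster_fn(self._reduced, **params)  # fast: rerun freely
```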

Book

  • 4:00 Meeting

Phil 9.23.2022

And just like that, the season has changed:

Send check for painting

Get paid account with Overleaf. I need the history feature! Done

Try fixing laptop

SBIRs

  • Add a method to the DataDictionary that returns a list of entries by type Done
  • Update graphics code to use and commit. Done
  • Slides for Monday’s review
  • Meeting with Aaron? Done
  • 10:00 Meeting with Erika. Went well

Book

  • Sent a copy off to Greg
  • Working on edits – About 10-20 pages to go

GPT Agents

  • Try running the new clustering on the Tweet data
  • Add user adjustments for perplexity (TSNE), eps and min_samples (DBSCAN) in the clustering app
  • Maybe add the average distance code from here to help with options?
  • Try drawing to the canvas and see how slow that is
Ivermectin with PCA reduction to 10 dimensions and DBSCAN clustering
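For the eps/min_samples adjustments above, the usual DBSCAN heuristic looks at each point’s distance to its k-th nearest neighbor (k = min_samples) and picks eps near the knee of that sorted curve. A 1-D sketch, with a crude mean in place of the average-distance code mentioned in the bullet:

```python
# Sketch of an eps-picking helper for DBSCAN: sort each point's distance to
# its k-th nearest neighbor and suggest a value from that curve. 1-D points
# and a plain mean keep the sketch self-contained; a knee-finder is better.
def kth_neighbor_distances(points, k):
    dists = []
    for i, p in enumerate(points):
        others = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        dists.append(others[k - 1])
    return sorted(dists)

def suggest_eps(points, k):
    d = kth_neighbor_distances(points, k)
    return sum(d) / len(d)
```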

Progress! Just not as much as I’d like

Phil 9.22.2022

Book

  • Rolling in changes
  • Bumped into Greg C last night. Need to say hi and send a copy after fixing the Deep Bias chapter

SBIRs

  • More 3D. Hopefully get everything running. Then start building out the scenario
Success!
  • Had a good discussion about the experiment logging. I think tables for the following:
    • Program (id, name, brief description, contract start, contract end, status)
    • Project (id, name, brief description, program_id)
    • Code (text) (id, experiment_id, filename, date_stored, code)
    • Figures (blobs) (id, experiment_id, figure name, image)
    • Parameters (id, experiment_id, name, type, string value)
    • Results (same as Parameters)
    • Experiment (id, name, user, brief description, date run, project_id)
  • 9:15 standup
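The table list above as a runnable sqlite3 sketch. The fields follow the notes (including Results mirroring Parameters); the SQL types and foreign-key wiring are assumptions.

```python
# Experiment-logger schema sketch from the discussion notes. Types and
# foreign keys are assumed; sqlite3 is used only to make it runnable.
import sqlite3

SCHEMA = """
CREATE TABLE program    (id INTEGER PRIMARY KEY, name TEXT, description TEXT,
                         contract_start TEXT, contract_end TEXT, status TEXT);
CREATE TABLE project    (id INTEGER PRIMARY KEY, name TEXT, description TEXT,
                         program_id INTEGER REFERENCES program(id));
CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT, user TEXT,
                         description TEXT, date_run TEXT,
                         project_id INTEGER REFERENCES project(id));
CREATE TABLE code       (id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
                         filename TEXT, date_stored TEXT, code TEXT);
CREATE TABLE figures    (id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
                         figure_name TEXT, image BLOB);
CREATE TABLE parameters (id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
                         name TEXT, type TEXT, string_value TEXT);
CREATE TABLE results    (id INTEGER PRIMARY KEY, experiment_id INTEGER REFERENCES experiment(id),
                         name TEXT, type TEXT, string_value TEXT);
"""
con = sqlite3.connect(":memory:")
con.executescript(SCHEMA)
```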

GPT Agents

Working DBSCAN
  • Added PCA dimension reduction as an option. On the same dataset reduced to 10 dimensions from 100, the clustering still looks good.
PCA + TSNE + DBSCAN
  • I’m clearly going to need user adjustments for perplexity (TSNE), eps and min_samples (DBSCAN) in the clustering app. Tomorrow.