Monthly Archives: November 2021

Phil 11.30.2021

Today I learned that I’m working on Computational Sociolinguistics

GPT Agents

  • Found a bug with my GML code. I seem to be saving only the source->target relationships
  • Need to explicitly add the target->source info as well, but I think there are multiple maps. The image above is the outbound map. It might make sense to have inbound and both
  • Rewrote the entire GML export to use database queries. Makes more sense now
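A minimal sketch of what the query-driven export could look like, with outbound, inbound, and both maps. The table name `topic_link`, its `source` and `target` columns, and the function names are assumptions for illustration, not the actual schema or code:

```python
import sqlite3

def fetch_edges(conn, mode="both"):
    """Return sorted (source, target) label pairs from a hypothetical topic_link table.

    mode: 'outbound' keeps the stored rows, 'inbound' reverses them,
    'both' is the union of the two.
    """
    rows = conn.execute("SELECT source, target FROM topic_link").fetchall()
    edges = set()
    for source, target in rows:
        if mode in ("outbound", "both"):
            edges.add((source, target))
        if mode in ("inbound", "both"):
            edges.add((target, source))
    return sorted(edges)

def to_gml(edges):
    """Render directed edges as GML text. Labels are assumed to contain no quotes."""
    labels = sorted({name for edge in edges for name in edge})
    ids = {label: i for i, label in enumerate(labels)}
    lines = ["graph [", "  directed 1"]
    for label in labels:
        lines.append(f'  node [ id {ids[label]} label "{label}" ]')
    for source, target in edges:
        lines.append(f"  edge [ source {ids[source]} target {ids[target]} ]")
    lines.append("]")
    return "\n".join(lines)
```

Building the edge set from a query means the inbound map is just the same rows reversed, so nothing extra has to be stored.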

SBIRs

  • 9:15 standup
  • Working on the paper. Now at 13.5 pages. Working on Results

Phil 11.29.2021

Having a hard time getting started today

Try calling Principal again

SBIRs

  • 10:00 Belief map meeting. Just discussed getting the paper through the review process
  • Working on paper – finished cleaning up the intro and am working on background

GPT Agents

  • Good meeting with Andreea. She introduced me to sociolinguistics, which looks to be the root of a lot of the ‘unique populations’ that we can train a GPT to mimic. William Labov (en.wikipedia.org/wiki/William_Labov) is one of the founders of the field. Got some of his books and Introducing Sociolinguistics by Miriam Meyerhoff (scholar)

Phil 11.24.2021

GPT Agents

  • Got my Twitter developer account!
  • Test topic connections
    • Write and read to db – maybe? Verify. Seems to be working!
    • Continue with conspiracy theories
  • Had some interesting ideas.
    • Once a map is built, narratives can be created by the GPT, either standalone or in dialog with others. The trajectories of these stories across the map can then be traced to see the major flows. In turn, the flows can be used to adjust the layout.
    • The re-clustering can still use the group labels as the ‘topics’. So we might get Jews/Puppetmasters and Government/False Flags. The order can depend on the ratio
  • GML output
  • Got the groups writing out:
Groups

And the topics in the groups. Here’s PuppetMasters:

Puppet Masters Topic Group

SBIRs

  • GML output
  • Work with Aaron on the RFI text?
  • Work the paper (due Dec 10).
    • Single-column IEEE format – done and on Overleaf using their template
    • 12 pages (Currently at 15!)

Phil 11.23.2021

Get bike! Nope, still needs bearing races

Updated conspiracy chart for 2021

GPT Agents

  • Got my Twitter developer account! Now I need to see if I can figure out threading
  • Test topic connections
    • Add connections to MapTopic.to_string() – done
    • Pull found text from raw – done
    • Write and read to db – maybe? Verify
    • Continue with conspiracy theories

SBIRs

  • Write stories for today’s sprint planning
    • GML output
    • Topic work
    • Refactoring
    • Change player so that it can handle any number of ‘players’
    • Add GML input to player
    • Meeting about tasking and IP. I need to write up something about when patents are good (slowly changing environments, where defense is important) and bad (dynamic, disruptive environments where exploration is easy). Incidentally, the Patent Office needs something like ArXiv, where ideas can be published as prior art so that they cannot be taken from the public domain. It should be a free service.
      • Preliminary meeting with Aaron. Moved to Monday at 10:00
    • The only real task is the paper. Which is due on Dec 10, so that’s probably a good thing

Phil 11.22.2021

It seems as though I’m kinda burnt out from the last effort

Get bike!

GPT Agents

  • Playing around with the conspiracy theory map since I know that area and it needs another pass
    • I have a thought about counting new results from the GPT. When there is a response from the GPT, there should be a button(?) that counts and deletes all substrings in the return that match a topic in a node, with the appropriate connections between topics and nodes. I’ll need to add a count value to table_topic. It can have a default of 1
  • Start generating gml files
  • 3:30 Meeting with Andreea. Looked at the word frequencies for the Yelp vegetarian data to see if these are the kind of frequencies she mentioned last week as a linguist
  • Got my Twitter dev API!

SBIRs

  • Sprint demos! Done!
  • Add brownbag of technique for next sprint
  • Set up meeting to discuss next steps for 4:00 tomorrow
  • Got pulled into some RFI thing?

Phil 11.19.2021

NLP+CSS 201: Beyond the basics

  • This website hosts the upcoming tutorial series for advanced NLP methods, for computational social science scholars.
  • Every few weeks, we will host some experts in the field of computational social science to present a new method in NLP, and to lead participants in an interactive exploration of the method with code and sample text data. If you are a graduate student or researcher who has some introductory knowledge of NLP (e.g. has learned text analysis from SICSS) and wants to “level up”, come join us!
  • Watch past tutorials on our YouTube channel.

OpenAI makes GPT-3 generally available through its API | VentureBeat

Book

  • Submitted proposal to Oxford press!

GPT Agents

  • Adding connections between topics – done!
  • Playing around with the conspiracy theory map since I know that area and it needs another pass
    • I have a thought about counting new results from the GPT. When there is a response from the GPT, there should be a button(?) that counts and deletes all substrings in the return that match a topic in a node, with the appropriate connections between topics and nodes. I’ll need to add a count value to table_topic. It can have a default of 1
  • Monday start generating gml files
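A rough sketch of the count-and-delete idea, assuming topics are plain strings matched case-insensitively. The function name and the longest-first ordering are my assumptions, and the write-back to the count column in table_topic is left out:

```python
import re
from collections import Counter

def count_and_strip_topics(response, topics):
    """Count occurrences of known topics in a GPT response and remove them.

    Returns (counts, remaining_text). Longer topics are handled first so a
    short topic cannot eat part of a longer one.
    """
    counts = Counter()
    remaining = response
    for topic in sorted(topics, key=len, reverse=True):
        pattern = re.compile(re.escape(topic), re.IGNORECASE)
        remaining, hits = pattern.subn("", remaining)
        if hits:
            counts[topic] += hits
    # Collapse the gaps left behind by the deletions
    remaining = re.sub(r"\s{2,}", " ", remaining).strip()
    return counts, remaining
```

Whatever text is left after the known topics are stripped is what would need a look for new topics.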

Phil 11.18.2021

SBIRs

GPT Agents

  • Add in topic links
  • Add check Wikipedia button? I guess at the bottom row of buttons – done
  • Work on GML output once the topic links are in
  • Start conspiracy map

Phil 11.17.2021

GPT-Agents

  • Add correlation to the columns of the LIWC data – done
  • When saving as GML, export the MapGroup as one file and then each set of MapTopics for that group as a file – started
  • 4:15 Meeting – went over the paper and the new spreadsheets

SBIRs

  • Add ground-truth checking to slides – done
  • Add qualitative slide – done
  • Added some screenshots for the live demo
  • Walk through presentation with Aaron. Done.
  • Change the script to match the current. Done
  • Add a final slide?

Book

  • Add Resume/CV

JuryRoom

  • 6:00 Meeting. Lots of work on Jarod’s stuff. Did a practice walkthrough of the LAIC work

Phil 11.16.2021

Registered for graduation! Done!

Set up physical appt

JuryRoom

  • Continuing to run Andreea’s probes
  • Responded to Jarod’s email on large transformer language models
  • Meeting with Andreea today

SBIRs

  • 9:15 standup – done
  • Working on adding connections between topics: TODO: add self.connect_topics(self.selected_topic, self.selected_seed_topic) # Verify this works!
    • Added connection_set:Set to MapTopic, but no implementation yet
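One possible shape for the missing implementation, as a sketch only. The real MapTopic has more fields than this, and storing the target's name on the source topic (one direction only) is an assumption:

```python
from typing import Set

class MapTopic:
    """Stripped-down stand-in for the real class, just enough to show connections."""

    def __init__(self, name: str):
        self.name = name
        self.connection_set: Set[str] = set()

    def to_string(self) -> str:
        connections = ", ".join(sorted(self.connection_set))
        return f"{self.name} -> [{connections}]"

def connect_topics(source: MapTopic, target: MapTopic) -> None:
    """Record a source->target link. Self-links are ignored, and the set prevents duplicates."""
    if source.name != target.name:
        source.connection_set.add(target.name)
```

Using a set means that calling connect_topics() twice on the same pair is harmless.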

GPT Agents

  • Send draft to Jimmy and Shimei – done

Book

  • Evaluate proposal and send in?

Phil 11.15.2021

SBIRs

  • Worked most of the weekend. I have a first pass on the methods and results section, some notes on discussion, and still need to start the conclusions. Also, fold the text into the white paper – DONE!
  • Meetings with Aaron?

GPT Agents

  • Got LIWC data from Shimei
  • Need to run probes for Andreea – running

Phil 11.11.2021

Armistice Day is commemorated every year on 11 November to mark the armistice signed between the Allies of World War I and Germany at Compiègne, France, at 5:45 am for the cessation of hostilities on the Western Front of World War I, which took effect at eleven in the morning – the “eleventh hour of the eleventh day of the eleventh month” of 1918. (Wikipedia)

Dynamics of online hate and misinformation

  • Online debates are often characterised by extreme polarisation and heated discussions among users. The presence of hate speech online is becoming increasingly problematic, making necessary the development of appropriate countermeasures. In this work, we perform hate speech detection on a corpus of more than one million comments on YouTube videos through a machine learning model, trained and fine-tuned on a large set of hand-annotated data. Our analysis shows that there is no evidence of the presence of “pure haters”, meant as active users posting exclusively hateful comments. Moreover, coherently with the echo chamber hypothesis, we find that users skewed towards one of the two categories of video channels (questionable, reliable) are more prone to use inappropriate, violent, or hateful language within their opponents’ community. Interestingly, users loyal to reliable sources use on average a more toxic language than their counterpart. Finally, we find that the overall toxicity of the discussion increases with its length, measured both in terms of the number of comments and time. Our results show that, coherently with Godwin’s law, online debates tend to degenerate towards increasingly toxic exchanges of views.

GPT Agents

  • Sent data off to Shimei to run through LIWC
  • Good meeting yesterday
  • Start writing some outline text

SBIRs

  • Write!
  • Try running the script text through the text matcher to see which nodes it goes to. That can be used in the results section. Also add a “recalculate” button to the text compare popup?

Phil 11.10.2021

I remember years ago, it must have been some time in the ’90s, seeing billboards for ISPs on the 101 heading from SFO to San Francisco. This feels like that:

GPT Agents

  • Add KL-divergence and Total Variation Distance to analysis.
    • Need to normalize everything, then compare the normalized versions in a new spreadsheet. Don’t forget the offset and scalar! (x*scalar + offset)
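A sketch of the two measures on normalized columns, using only the standard library. The function names are mine, and applying x*scalar + offset before normalizing is my reading of the note above:

```python
import math

def normalize(values, scalar=1.0, offset=0.0):
    """Apply x*scalar + offset to each value, then scale so the result sums to 1."""
    adjusted = [v * scalar + offset for v in values]
    if any(a < 0 for a in adjusted) or sum(adjusted) <= 0:
        raise ValueError("adjusted values must be non-negative with a positive sum")
    total = sum(adjusted)
    return [a / total for a in adjusted]

def kl_divergence(p, q):
    """KL(p || q) in nats. Terms with p_i == 0 contribute nothing.

    Raises ZeroDivisionError if q_i == 0 where p_i > 0; a small positive
    offset in normalize() is one way to avoid that.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation_distance(p, q):
    """Half the L1 distance between two distributions over the same bins."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))
```

KL-divergence is asymmetric, so the spreadsheet needs to be clear about which column is the ground truth. Total Variation Distance is symmetric and always falls between 0 and 1.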

SBIR(s)

  • Start writing methods section

Book

  • See what else needs to be done for the Oxford proposal and send it off by the end of the week?

Phil 11.9.2021

https://twitter.com/MattGrossmann/status/1457789601980948480

In a related thread:

https://twitter.com/dannybarefoot/status/1457784428462153730

Mapping Affinities: Democratizing Data Visualization (book)

  • Nowadays, many of our actions are transformed into digital information, which we can use to draw diagrams that describe complex operations, such as those of institutions. This book introduces us to the reading of complex systems through the concept of affinity: the alchemy that brings people together and makes them creative and productive.
  • Affinity’s mapping is a data visualization method that allows us to observe the dynamics of an organization subdivided into complex systems: institutions, universities, governments, etc. It is a graphical tool based on the collaboration variable. Mapping Affinities is, according to the author, an instrument for deciphering complex organizations and improving them. By inserting individuals on these maps, it is also a way of helping them to understand how to evolve in life within an institution. The book tackles this problem with a case study concerning the Federal Polytechnic School of Lausanne. Data from the actions of researchers at the Lausanne institution are brought together and transformed into an innovative and attractive map
  • Stored as an epub in gdrive Books

GPT Agents

  • Generate spreadsheets from single stars – done
  • Add KL-divergence and Total Variation Distance to analysis.
    • Created query for ground truth by vegetarian options by stars and saved to spreadsheets
    • I need to normalize everything, then compare the normalized versions in a new spreadsheet. Don’t forget the offset and scalar! (x*scalar + offset)

SBIR(s)

  • Stories for next sprint – done

Book

  • See what else needs to be done for the Oxford proposal and send it off by the end of the week?

Phil 11.8.2021

This looks like a great corpus for comparing the text characteristics of these two groups:

Book

  • Why We’re Polarized review – done
  • Liars – done

GPT Agents

  • Generate spreadsheets from single stars
  • Add KL-divergence and Total Variation Distance to analysis

SBIRs

  • 9:00 Sprint demos. Slides! Done!
  • 10:00? 2:00? LAIC demo Done!
  • Write stories for next sprint (basically writing and tweaking?)