Monthly Archives: November 2022

Phil 11.26.22

Pyktok is a simple module to collect video, text, and metadata from TikTok.
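A minimal sketch of what a Pyktok pull looks like. The save_tiktok() call and its arguments are taken from my reading of the project README, so treat the exact signature as an assumption; the video URL is just a placeholder.

```python
# Minimal Pyktok sketch. Assumes the save_tiktok() signature shown in the
# project README; the video URL below is a placeholder.
import pyktok as pyk

# Some recent versions also want pyk.specify_browser('chrome') first so the
# module can read cookies -- check the README for your version.
pyk.save_tiktok(
    "https://www.tiktok.com/@someuser/video/1234567890123456789",  # placeholder URL
    True,                   # also download the .mp4
    "tiktok_metadata.csv")  # metadata rows get appended here
```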

Book

  • Finished fixing all the images that I can handle without going to a photo service. Made this map of White Stork sightings:

Phil 11.24.22

This Stability-AI repository contains Stable Diffusion models trained from scratch and will be continuously updated with new checkpoints. The following list provides an overview of all currently available models. More coming soon.

Adjective Ordering Across Languages

  • Adjective ordering preferences stand as perhaps one of the best candidates for a true linguistic universal: When multiple adjectives are strung together in service of modifying some noun, speakers of different languages—from English to Mandarin to Hebrew—exhibit robust and reliable preferences concerning the relative order of those adjectives. More importantly, despite the diversity of the languages investigated, the very same preferences surface over and over again. This tantalizing regularity has led to decades of research pursuing the source of these preferences. This article offers an overview of the findings and proposals that have resulted.

Disinformation Watch is a fortnightly newsletter covering the latest news about disinformation, including case studies, research and reporting from the BBC, international media and leading experts in the field.

Book

  • Working on chasing down pictures that I can use. Folks, I strongly suggest never using images that might have copyright issues as placeholders. You can get very attached to them!
  • Finished with figures. Here’s an example of what needed to be done. Here’s the before, with the placeholder:
  • And here’s the after, using assets from Wikimedia and an hour or so with Illustrator.
  • It looks better, I think, but it was a lot of work.

Phil 11.22.2022

This is really good: Annals of a Warming Planet, “Climate Change from A to Z”: the stories we tell ourselves about the future.

Book

  • Work on permissions. It doesn’t look like any particular list is required, but I’ll put together a spreadsheet
  • Take pix of metronomes!
  • Keywords

GPT Agents

  • Meeting with Jason at 2:00
  • Set up reviewer status for Springer AI and Ethics
  • CICERO: An AI agent that negotiates, persuades, and cooperates with people
    • The key to our achievement was developing new techniques at the intersection of two completely different areas of AI research: strategic reasoning, as used in agents like AlphaGo and Pluribus, and natural language processing, as used in models like GPT-3, BlenderBot 3, LaMDA, and OPT-175B. CICERO can deduce, for example, that later in the game it will need the support of one particular player, and then craft a strategy to win that person’s favor – and even recognize the risks and opportunities that that player sees from their particular point of view.

Phil 11.21.22

SBIRs

  • 9:00 Sprint review
  • 2:00 MDA
  • More writing

GPT Agents

  • Need to look at the Mastodon API with an eye towards anonymous journalism
  • Had a good chat with Aaron about how population thinking is kind of like NN models, with all the odd artifacts and the dimension reduction required for the loss function. This tends to explain how companies like Facebook approximate the canonical paperclip-maximizer AI and consume everything to create engagement and grow the network.

Book

  • Screen Capture – Copyright Violation or Fair Use?
    • To determine whether a use of a copyrighted work is fair use, you must evaluate it against the four-factor balancing test:
      • The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
      • The nature of the copyrighted work;
      • The amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
      • The effect of the use upon the potential market for or value of the copyrighted work.

Phil 11.18.2022

Watching Twitter fall apart is a sight to behold:

Book

  • The metronomes work! Need to take some pictures
  • More migrating – done! Need to create some keywords and fix all the figures.

GPT Agents

  • In contact with Jason Baumgartner at PushShift to work out the Mastodon API. Meeting next Tuesday at 2:00!

SBIRs

  • A bit more reading and then work on the lit review
  • Started copying relevant quotes into the background section. Finished the LAWS/ethics section. There is more effort toward a ban than I was aware of.

Phil 11.17.2022

GPT Agents

  • Connected with Azin and generated some data

Book

  • More migrating
  • Discovered gbif.org, which has a lot of tracked wildlife data, including white storks. I can make a new map with Plotly Express maps (see the sketch after this list)
  • Got my metronomes yesterday! Need to get a lightweight platform and see if the experiment works
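Roughly what the new map script might look like: pull occurrence records from the public GBIF API and drop them onto a Plotly Express geo scatter. The field names follow GBIF’s Darwin Core terms (decimalLatitude/decimalLongitude); treat the exact parameters as assumptions and check the API docs.

```python
# Sketch of the stork map: occurrence records from the public GBIF API,
# plotted with Plotly Express.
import requests
import pandas as pd
import plotly.express as px

resp = requests.get(
    "https://api.gbif.org/v1/occurrence/search",
    params={"scientificName": "Ciconia ciconia",  # white stork
            "hasCoordinate": "true",
            "limit": 300})
records = resp.json()["results"]

df = pd.DataFrame(records)[["decimalLatitude", "decimalLongitude"]].dropna()
fig = px.scatter_geo(df, lat="decimalLatitude", lon="decimalLongitude",
                     title="White Stork (Ciconia ciconia) sightings")
fig.show()
```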

SBIRs

  • 9:15 standup
  • Great discussion with Aaron about the JMOR paper. Looking at War Elephants and Mahouts (mAIhouts? Nah) as a useful metaphor for models and handlers. Aaron’s going to write a short story introduction.

Phil 11.16.2022

When robots set advertising

Eclipse next year in the West: www.timeanddate.com/eclipse/map/2023-october-14

Galactica is an AI trained on humanity’s scientific knowledge. You can use it as a new interface to access and manipulate what we know about the universe. (Made by Papers with Code, Meta AI)

Characterizing Emergent Phenomena in Large Language Models

  • In “Emergent Abilities of Large Language Models,” recently published in the Transactions on Machine Learning Research (TMLR), we discuss the phenomena of emergent abilities, which we define as abilities that are not present in small models but are present in larger models. More specifically, we study emergence by analyzing the performance of language models as a function of language model scale, as measured by total floating point operations (FLOPs), or how much compute was used to train the language model. However, we also explore emergence as a function of other variables, such as dataset size or number of model parameters (see the paper for full details). Overall, we present dozens of examples of emergent abilities that result from scaling up language models. The existence of such emergent abilities raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.

Book

  • More migration. Done with part one! Don is working on getting me a studio.

SBIRs

GPT Agents

Phil 11.15.2022

GPT Agents

  • Tweaked my Twitter counts to work with other languages. Here are the trends for “world cup” in Persian (see the sketch after this list):
  • Trying the fentanyl pull again with more traps for days with zero tweets – done! Got all the user info as well. Total of 5,507,159 tweets and 2,402,215 users
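For reference, a sketch of the language-specific count query against the Twitter v2 recent-counts endpoint. The bearer token is a placeholder, and the query string (the Persian phrase for “world cup” plus the lang:fa operator) is my assumption about how the pull is phrased.

```python
# Sketch of a language-specific tweet-count query against the Twitter v2
# counts endpoint. BEARER_TOKEN is a placeholder.
import requests

BEARER_TOKEN = "..."  # your app's bearer token
resp = requests.get(
    "https://api.twitter.com/2/tweets/counts/recent",
    headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    params={"query": '"جام جهانی" lang:fa',   # "world cup" in Persian
            "granularity": "day"})
for bucket in resp.json()["data"]:
    print(bucket["start"], bucket["tweet_count"])
```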

Book

  • More migration
  • Order metronomes! Done!

SBIRs

  • Had a discussion with Aaron and Rukan about what our first model should be – commands or target priority. We’ll start with targets and the idea that there might be multiple NNs within a controller
  • More reading

Phil 11.14.2022

The Mastodon public API looks like it might be pretty straightforward? https://docs.joinmastodon.org/client/public/

Also, every user has a public RSS feed, so you barely need an API at all, just a list of users. Doesn’t work for hashtags, though.
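A quick sketch of both options: the public-timeline endpoint from the docs above, and a user’s public RSS feed. The instance and username are placeholders; no authentication should be needed for either.

```python
# Two "barely an API" ways to read public Mastodon content.
import requests

# 1) Public timeline via the REST API (no auth needed for public posts)
statuses = requests.get(
    "https://fediscience.org/api/v1/timelines/public",
    params={"limit": 20}).json()
for s in statuses:
    print(s["created_at"], s["account"]["acct"])

# 2) Every account also exposes a public RSS feed at @username.rss
rss = requests.get("https://fediscience.org/@someuser.rss").text  # placeholder user
print(rss[:200])
```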

Book

  • Get chapter 1 and the biblio working

GPT Agents

  • Continue the Great download
  • Look for slang terms for fentanyl
  • Start poking at the Mastodon REST API

SBIRs

  • More writing and structuring
  • password!
  • 1:00 Meeting
  • 2:00 Meeting

Phil 11.11.2022

GPT Agents

  • I’ve found the terms I want, which are the top 10 keywords from my set. I really want to pull 1k/day for 3 years, which would be 3,650,000 tweets. I should be able to do a clamped, balanced pull that will give me two samples of 500 (or maybe 3 samples of 500, depending on the rounding) per day (see the sketch after this list). Going to start with one keyword at a time so I can time things. It will also produce unique experiment table entries, which is probably fine
  • Send note back to the First Line
  • Started a balanced pull at 8:45am, finished at 10:45, so 2 hours for 500k tweets. Not bad!
    • Second pull at 10:48. Third pull at 1:30 – it seems to be running slower? 4th pull at 4:00. 5th pull at 5:45
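For the record, a sketch of what the clamped, balanced pull looks like: one fixed-size sample per window, two windows per day, so every day contributes the same number of tweets. The date range, keyword, and bearer token are placeholders, and this assumes academic access to the v2 full-archive search via tweepy.

```python
# Clamped, balanced pull: equal-sized samples per day via v2 full-archive search.
import datetime
import tweepy

client = tweepy.Client(bearer_token="...", wait_on_rate_limit=True)  # placeholder token
KEYWORD = "some keyword"  # placeholder

day = datetime.datetime(2019, 11, 1, tzinfo=datetime.timezone.utc)  # placeholder range
end = datetime.datetime(2022, 11, 1, tzinfo=datetime.timezone.utc)
while day < end:
    # two half-day windows -> up to 2 x 500 tweets per day
    for start_hour, stop_hour in [(0, 12), (12, 24)]:
        resp = client.search_all_tweets(
            query=KEYWORD,
            start_time=day + datetime.timedelta(hours=start_hour),
            end_time=day + datetime.timedelta(hours=stop_hour),
            max_results=500)
        tweets = resp.data or []   # trap for zero-tweet windows
        # ... write tweets to the experiment table here ...
    day += datetime.timedelta(days=1)
```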

Book

  • Download and backup current project – done!
  • Set up Elsevier template in Overleaf – done!
  • Move material over
  • Create new DARPA map for balloon challenge
  • Send Brenda her $$ and a link to the GPT3

SBIRs

  • Finish reading China paper – done!

Phil 11.10.2022

See if I can get BSO tix for Friday -yes!

Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers

  • Recent developments in natural language generation (NLG) using neural language models have brought us closer than ever to the goal of building AI-powered creative writing tools. However, most prior work on human-AI collaboration in the creative writing domain has evaluated new systems with amateur writers, typically in contrived user studies of limited scope. In this work, we commissioned 13 professional, published writers from a diverse set of creative writing backgrounds to craft stories using Wordcraft, a text editor with built-in AI-powered writing assistance tools. Using interviews and participant journals, we discuss the potential of NLG to have significant impact in the creative writing domain–especially with respect to brainstorming, generation of story details, world-building, and research assistance. Experienced writers, more so than amateurs, typically have well-developed systems and methodologies for writing, as well as distinctive voices and target audiences. Our work highlights the challenges in building for these writers; NLG technologies struggle to preserve style and authorial voice, and they lack deep understanding of story contents. In order for AI-powered writing assistants to realize their full potential, it is essential that they take into account the diverse goals and expertise of human writers.

SBIRs

  • More reading. Need to search each paper for “loop”, “centaur”, and “team” and check those paragraphs at least. As you might think, the reality is more complex. All the papers have some parts of the concepts, but they often don’t use the terms
  • Chat with Aaron. Really good. I think I was able to explain my concept. We’re going to write some sections and worry about the structure later
  • 9:15 standup
  • 2:00 Presentation. Went OK. Steve needs to join Toastmasters

Book

  • Roll in more edits – DONE!
  • Set up new template?

GPT Agents

  • Add a week of numbers with min/max/avg to the spreadsheet to see what the pull size should be – done

Phil 11.9.2022

Fooling around with Mastodon a bit. The lack of advertising and the associated visual clutter is… remarkable

Book

  • Rolling in edits
  • Replied to my new Editorial Project Manager. This is starting to feel very real and scheduled

SBIRs

  • More reading. Going to work through human factors

GPT-Agents

  • Having an interesting discussion with Jack Chen about synthetic story spaces
  • Continuing documentation while watching Twitter implode. I’ll probably rework the tools to work with the Reddit API (rough sketch after this list)
  • 4:00 meeting
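If the tools do get reworked for Reddit, the core pull is pretty simple with PRAW. The credentials, subreddit, and search term are placeholders; the search() call is standard PRAW, but check the docs for rate limits.

```python
# Rough sketch of a Reddit pull via PRAW; credentials are placeholders.
import praw

reddit = praw.Reddit(client_id="...", client_secret="...",
                     user_agent="tweet-tools-port/0.1")

for submission in reddit.subreddit("news").search("fentanyl", limit=100):
    print(submission.created_utc, submission.title)
```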

Phil 11.8.2022

Election day! Absolutely no idea how any of this is likely to play out

Also, create a Mastodon account? I probably have enough info at this point. Applied at fediscience.org – done and set up! I have even tooted

SBIRs

  • 9:00 planning meeting
  • 10:00 MC meeting?
  • Paper
    • Finish populating annotated bibliography
    • Add category and “LMN ranking” to spreadsheet
    • Read top 3(?) papers for each
    • Search all papers for HitL/HotL (human-in/on-the-loop) statements and add those quotes to the spreadsheets.
    • Tempted to do some embedding clustering, but that’s overkill
  • Add a task to Rukan to check out minGPT as a possible NN for our modules? Done

Book

  • Roll in changes

GPT Agents

  • More documentation – finished TweetEmbedExplorer
  • Start Twitter pull?

Phil 11.7.2022

Move hotel to January

SBIRs

  • Adversarial Policies Beat Professional-Level Go AIs
    • We attack the state-of-the-art Go-playing AI system, KataGo, by training an adversarial policy that plays against a frozen KataGo victim. Our attack achieves a >99% win-rate against KataGo without search, and a >50% win-rate when KataGo uses enough search to be near-superhuman. To the best of our knowledge, this is the first successful end-to-end attack against a Go AI playing at the level of a top human professional. Notably, the adversary does not win by learning to play Go better than KataGo — in fact, the adversary is easily beaten by human amateurs. Instead, the adversary wins by tricking KataGo into ending the game prematurely at a point that is favorable to the adversary. Our results demonstrate that even professional-level AI systems may harbor surprising failure modes. See this https URL for example games.
  • 9:00 Sprint Review
  • More reading
  • Used the LMN tools to figure out what to emphasize and find more papers

GPT Agents

  • More documenting
  • Figure out some keywords for various groups and start pulling tweets. I think 10k per group a week would be manageable.
    • Watching Twitter implode. Maybe I should just use the Pushshift API? (See the sketch after this list.)
  • Reply to First line with some examples
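A hedged sketch of what a Pushshift pull would look like as the alternative. This uses the historical Pushshift Reddit search endpoint; its availability and the exact fields returned have varied over time, so treat the parameters as assumptions.

```python
# Sketch of a Pushshift query (historical Reddit archive endpoint).
import requests

resp = requests.get(
    "https://api.pushshift.io/reddit/search/comment/",
    params={"q": "fentanyl", "size": 100, "sort": "desc"})
for comment in resp.json()["data"]:
    print(comment["created_utc"], comment["subreddit"])
```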

Book

  • Meeting with Brenda

Phil 11.4.2022

Sheesh – still don’t feel particularly good

10:00 Dentist

Large Language Models Are Human-Level Prompt Engineers

  • By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model, and most effective prompts have been handcrafted by humans. Inspired by classical program synthesis and the human approach to prompt engineering, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. In our method, we treat the instruction as the “program,” optimized by searching over a pool of instruction candidates proposed by an LLM in order to maximize a chosen score function. To evaluate the quality of the selected instruction, we evaluate the zero-shot performance of another LLM following the selected instruction. Experiments on 24 NLP tasks show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 19/24 tasks. We conduct extensive qualitative and quantitative analyses to explore the performance of APE. We show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness, as well as to improve few-shot learning performance by simply prepending them to standard in-context learning prompts. Please check out our webpage at this https URL.

How Online Mobs Act Like Flocks Of Birds

  • A growing body of research suggests human behavior on social media is strikingly similar to collective behavior in nature.

Book

  • Done rolling in current edits
  • Review and sign contract
  • Spend some time working on better terrain. Done!

SBIRs