Phil 11.7.2022

Move hotel to January


  • Adversarial Policies Beat Professional-Level Go AIs
    • We attack the state-of-the-art Go-playing AI system, KataGo, by training an adversarial policy that plays against a frozen KataGo victim. Our attack achieves a >99% win-rate against KataGo without search, and a >50% win-rate when KataGo uses enough search to be near-superhuman. To the best of our knowledge, this is the first successful end-to-end attack against a Go AI playing at the level of a top human professional. Notably, the adversary does not win by learning to play Go better than KataGo — in fact, the adversary is easily beaten by human amateurs. Instead, the adversary wins by tricking KataGo into ending the game prematurely at a point that is favorable to the adversary. Our results demonstrate that even professional-level AI systems may harbor surprising failure modes. See this https URL for example games.
  • 9:00 Sprint Review
  • More reading
  • Used the LMN tools to figure out what to emphasize and find more papers

GPT Agents

  • More documenting
  • Figure out some keywords for various groups and start pulling tweets. I think 10k tweets per group per week would be manageable.
    • Watching Twitter implode. Maybe I should just use the Pushshift API?
  • Reply to First line with some examples
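
Rough sketch for the keyword-pull item above, assuming the Pushshift route. It only builds the query URLs and works out how many pages the 10k weekly cap implies; it does not make any requests. The group names, keywords, page size, endpoint URL, and parameter names (q, after, before, size) are all placeholders or assumptions to verify against the current Pushshift documentation before use.

```python
from urllib.parse import urlencode

# Hypothetical keyword groups; the real groups and keywords are still to be decided.
KEYWORD_GROUPS = {
    "group_a": ["keyword one", "keyword two"],
    "group_b": ["keyword three"],
}

WEEKLY_CAP = 10_000  # "10k per group a week" from the note above
PAGE_SIZE = 100      # assumed page size; check the API's actual limit

# Assumed endpoint; it may have changed or be unavailable.
BASE_URL = "https://api.pushshift.io/reddit/search/comment/"


def pages_needed(cap=WEEKLY_CAP, page_size=PAGE_SIZE):
    """Number of pages required to reach the cap (ceiling division)."""
    return -(-cap // page_size)


def build_query_urls(keywords, after, before, size=PAGE_SIZE):
    """Build one query URL per keyword for the given epoch-second window."""
    urls = []
    for keyword in keywords:
        params = {"q": keyword, "after": after, "before": before, "size": size}
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls


def per_keyword_cap(keywords, cap=WEEKLY_CAP):
    """Split a group's weekly cap evenly across its keywords."""
    return cap // len(keywords)


if __name__ == "__main__":
    week_start, week_end = 1667260800, 1667865600  # example one-week window
    for group, keywords in KEYWORD_GROUPS.items():
        print(group, per_keyword_cap(keywords), "items per keyword")
        for url in build_query_urls(keywords, week_start, week_end):
            print("  ", url)
```

Splitting the cap evenly per keyword is just one option; a busy keyword could otherwise crowd out the rest of its group.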


  • Meeting with Brenda