Phil 9.11.2023

Twenty-two years ago, I remember this day starting as a crisp autumn morning with infinite, clear blue skies.

Sam Bankman-Fried’s jail conditions offer a glimpse at systemic failure

Everything I’ll forget about prompting LLMs

SBIRs

  • Submit expenses
  • We need another story. In this case, it’s another war room vignette, but this time from the defense’s side. Maybe with M again? Of course, part of this is figuring out what defenses might actually look like. One thing I’d like to re-use is the idea of diverse operator teams looking for misbehaving models. In this case though, the models are trained to be honeypots for attacks maybe? They go along in their day-to-day, sending emails, running dummy companies, having dates, etc. When they start acting too aligned, then it’s time to start looking for trouble. Maybe digital twins of important people?
  • 9:00 Sprint demos. Make Slides
  • 2:00 Weekly MDA meeting
  • 3:00 Sprint planning

GPT Agents

  • Start filling out IRB form

Phil 9.8.2023

SBIRs

  • Had a good chat with Rukan yesterday. What worked with the hdfproc data didn’t work with the new offsets? He’s going to run some tests
  • I really want to add a new project to the LLM IRAD. Something like NNMap-enabled group support. Need a better name, some slides (mentioning “killer app” and all the possible uses), and a schedule.
  • Tweaked the Jan6 AI subsection to integrate better into the rest of the section
  • Need to add a “Detect and Defend” section
  • Need to add an “AI Arms Control for Societal AI Weapons” section. Show that this is in everyone’s best interests. Authoritarian regimes are potentially at greater risk, particularly for Spanner and Lobotomy attacks.

Phil 09.07.2023

SBIRs

  • 9:00 standup
  • LLM schedule planning with Aaron. Done
  • 2:00 Dahlgren follow-up meeting
  • More scale paper. Add QAnon as the other main component of Jan6

GPT Agents

  • Tests with Roger and/or Aaron?

Phil 09.06.2023

SBIRs

  • Submitted my technical fellows stuff
  • Steve’s presentation – added comments
  • Installing sw on the laptop – done
  • MDA next steps (intersection of TI and current time allows for sync. We’d need several points, but not too many)
  • LLM planning. Need to create schedules?

GPT Agents

  • 3:00 meeting? Yup. Alden seems to be finding traction

Phil 9.5.2023

Nice three-day weekend, but boy did it end hot!

SBIRs

  • Q6 Report:
    • Add an overview of SEG’s white paper to commercialization section. Done
    • Submit – done!
    • Check with Aaron about the white paper to see if it’s good to go in – done
    • Spending a lot of time with Rukan on seeing if the propagation is the same for the two setups
    • Work through the Solid getting-started documentation. Once I have a framework up and running, I can load the StampedeTheory chapter summaries into Supabase, then access them using LangChain
    • MCWL meeting

GPT Agents

  • Ping Roger for some testing?

Phil 9.1.2023

Wow. September. And the Halloween candy displays have been up for a while already

SBIRs

GPT Agents

  • Sent out an email to the team to schedule some user testing

And I’ve run out of gas. Going to clean house

Phil 8.31.2023

Yesterday must have been pretty busy. I never made any notes.

I wrote a pitch to RadioLab about doing a story on the “living in a simulation” thing. I also turned that into a blog post

SBIRs

  • Had a good discussion on SEG’s white paper. They are reviewing their changes and will get back to me with their final today. Hopefully.
  • Got some good stuff done on the Scale paper. More today

GPT Agents

  • Made a lot of progress here! All the new variables are in. I added some instructions. Still need to have the prompt titles and randomize – DONE! I think it’s ready to try out again, though I need to flip the switch back to GPT-4.
  • Added my times for potential meetings

Phil 8.29.2023

SBIRs

  • Got the preliminary schedule and task list done yesterday, so waiting for a response to discuss on Wednesday.
  • Work more on the historical examples section.
  • Speaking of simple sabotage: Poland investigates train mishaps for possible Russian connection
    • Saboteurs exploited the vulnerability in the Polish “radio stop” command system, which automatically brings trains to a stop when three tonal signals are broadcast through the railway’s radio network.

GPT Agents

Phil 8.28.2023

SBIRs

  • Sprint demos
  • 2:00 MDA Meeting – done
  • Finished the LOE part for the white paper. SEG needs to add some clarification
  • 3:00 Sprint Planning – done

GPT Agents

  • Start rolling in changes to test app. Changed the db and a little of experimentService

Phil 8.25.2023

Do Aaron’s letter

SBIRs

  • 10:00 Meeting with Bob
  • Wrote up some comments on the LLM tool. I think the scope should be expanded
  • Put more guidance in the War Room story – done

GPT Agents

  • Add a textarea that displays the read-in text or PDF as a validation tool – done
  • Need to read in a few of the new files and get back to work on the new section

Phil 8.24.2023

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

  • Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive “indicator properties” of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.

SBIRs

  • Now that I have a better way to organize groups, I re-labeled and split up some of the groups that I was using
  • Working on email to Bob S. Done. Meeting set for tomorrow at 10:00

Phil 8.23.2023

Selling the American People: Advertising, Optimization, and the Origins of Adtech

  • Algorithms, data extraction, digital marketers monetizing “eyeballs”: these all seem like such recent features of our lives. And yet, Lee McGuigan tells us in this eye-opening book, digital advertising was well underway before the widespread use of the Internet. Explaining how marketers have brandished the tools of automation and management science to exploit new profit opportunities, Selling the American People traces data-driven surveillance all the way back to the 1950s, when the computerization of the advertising business began to blend science, technology, and calculative cultures in an ideology of optimization. With that ideology came adtech, a major infrastructure of digital capitalism.

SBIRs

  • Phase 2.5 meeting
  • Maybe a meeting about our need for a simple method that allows for a single trajectory point input with a single analytic output. Corresponding with Bob to figure out what to do.
  • Working on the paper. Need to do the motive, means and opportunity, illustrated by MI6 in the early 1940s and Jan 6, to show how manipulation through the use of technology always targets the same human nature, and that the goal is to detect and disrupt those.
  • Found a helpful RAND paper from 2020: Whose Story Wins: Rise of the Noosphere, Noopolitik, and Information-Age Statecraft
    • In this Perspective, the authors urge strategists to consider a new concept for adapting U.S. grand strategy to the information age—noopolitik, which favors the use of “soft power”—as a successor to realpolitik, with its emphasis on “hard power.” The authors illuminate how U.S. adversaries are already deploying dark forms of noopolitik—e.g., weaponized narratives, strategic deception, epistemic attacks. The authors propose new ways to fight back and discuss how the future of noopolitik might depend on what happens to the global commons—i.e., the parts of the Earth and space that fall outside national jurisdictions and to which all nations are supposed to have access.
  • AI Ethics?
  • Added the ability to filter projects by group, since there are now so many of them

GPT Agents

  • Alden meeting
  • Social analytics meeting. Demo!

Phil 8.22.2023

Order replacement speaker – sent an email

This is very cool: A Manifold View of Connectivity in the Private Backbone Networks of Hyperscalers

SBIRs

  • Get some more training done – finished ethics training
  • Aaron’s letter of recommendation
  • Ping Eric for Fellow recommendation? He’s on PTO – revisit on September 7
  • 9:00 Standup – mention Rukan’s struggles and how that should not be his job
  • 2:00 MDA – wrote up a description of the problem and sent it off to Clay, since Lauren is missing?
  • Work on lobotomy analysis – finished!

GPT Agents

  • I went through and validated that everything works last night. Need to start trying it out on people.

Phil 8.21.2023

Upgrading my IntelliJ. Hate this part

GPT Agents

  • Add “unhelpful” to context and no_context bar chart – done
  • Hide “re-run” button – done

SBIRs

  • Submit MORS abstract – done!
  • Create spreadsheet of tasks, FTEs, and milestones for 2:00 meeting – done-ish. More on Wednesday
  • 10:00 RFAST meeting – done
  • 11:30 LM meeting with Dave M – done

Phil 8.19.2023

Nice (hard!) ride today!

I have no idea what to make of this:

Cats learn the names of their friend cats in their daily lives

  • Humans communicate with each other through language, which enables us to talk about things beyond time and space. Do non-human animals learn to associate human speech with specific objects in everyday life? We examined whether cats matched familiar cats’ names and faces (Exp.1) and human family members’ names and faces (Exp.2). Cats were presented with a photo of the familiar cat’s face on a laptop monitor after hearing the same cat’s name or another cat’s name called by the subject cat’s owner (Exp.1) or an experimenter (Exp.2). Half of the trials were in a congruent condition where the name and face matched, and half were in an incongruent (mismatch) condition. Results of Exp.1 showed that household cats paid attention to the monitor for longer in the incongruent condition, suggesting an expectancy violation effect; however, café cats did not. In Exp.2, cats living in larger human families were found to look at the monitor for increasingly longer durations in the incongruent condition. Furthermore, this tendency was stronger among cats that had lived with their human family for a longer time, although we could not rule out an effect of age. This study provides evidence that cats link a companion’s name and corresponding face without explicit training.