Phil 7.9.2025

Another day of painting

McDonald’s AI Hiring Bot Exposed Millions of Applicants’ Data to Hackers Using the Password ‘123456’ | WIRED

So about the whole Grok thing yesterday, I’d say that this new example matches the theme of the Conversation article (AI misalignment), but the specifics are different:

  1. This was not a “rogue employee.” It was a deliberate rollout of a new model, aimed at avoiding the “garbage in any foundation model trained on uncorrected data.”
  2. Although the system prompt was adjusted in minor ways, the behavior of the model was broader and more “nuanced,” and included emergent behaviors such as Towerwaffen, where racial slurs are built interactively (e.g. starting with “N”).
  3. The model is also behaving in similar ways in other languages, such as Turkish.
  4. As with the “white genocide” event in May, X is deleting many posts, but since the behavior doesn’t stem from a ham-fisted adjustment to the system prompt that could be caught with a simple keyword search, items like the Towerwaffen posts above seem to be unaffected. This one will be harder to clean up. Note that training is explicitly referenced in the “oops” post:

I think that these new elements and their implications should be mentioned in any update to the article. It’s a significant next step by xAI, one that builds on the first. Anyway, write the update to the blog post regardless.

Musk’s Grok AI bot generates expletive-laden rants to questions on Polish politics | Artificial intelligence (AI) | The Guardian

The reworked Conversation article is out: Grok’s antisemitic rant shows how generative AI can be weaponized

Meeting with Alden. His paper looks good!