Category Archives: Phil

Phil 8.6.2025

Codeberg is a non-profit, community-led effort that provides Git hosting and other services for free and open source projects.

So I’m a reviewer for an AI conference with seven papers to review. I came across one paper early on that had some pretty egregious-sounding LLM text in the intro. You know the kind, where sparkling adjectives augment the points in the text, sometimes hiding them behind flowery phrasing – and the use of dashes – in ways that don’t add value for someone delving into the document.

ChatPDF provides an AI detector that is probably based on perplexity, along the lines of GPTZero, and it flagged it – 100% AI generated. But since then, I’ve been trying it out on sections of text that are well written but don’t have that AI “smell.” Turns out that almost every paper is using LLMs for writing, at least according to the detector.
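The idea behind perplexity-based detection is simple: score the text under a language model, and flag text the model finds too predictable (low perplexity) as likely machine-generated. Here’s a toy sketch of the mechanism – a unigram word model with add-one smoothing standing in for the real LLM that detectors like GPTZero actually use, with a made-up corpus and example sentences:

```python
import math
from collections import Counter

def perplexity(text, model_counts, total):
    """Perplexity of `text` under a unigram word model with add-one smoothing.
    Lower = more predictable to the model."""
    words = text.lower().split()
    vocab = len(model_counts)
    log_prob = 0.0
    for w in words:
        # add-one (Laplace) smoothed probability of each word
        p = (model_counts.get(w, 0) + 1) / (total + vocab)
        log_prob += math.log(p)
    # perplexity = exp of the average negative log-likelihood per word
    return math.exp(-log_prob / len(words))

# Tiny "training" corpus standing in for the detector's language model
corpus = "the model writes fluent text the model writes predictable text".split()
counts = Counter(corpus)
total = len(corpus)

# In-distribution text scores low; out-of-distribution text scores high
print(perplexity("the model writes fluent text", counts, total))
print(perplexity("zebras juggle quantum pancakes", counts, total))
```

A real detector does the same thing with a transformer’s token probabilities (plus burstiness and other features), then thresholds the score – which is also why well-written human prose can trip it.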

Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation

  • Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions? We address this question using a novel method – generative social simulation – that embeds Large Language Models within Agent-Based Models to create socially rich synthetic platforms. We create a minimal platform where agents can post, repost, and follow others. We find that the resulting following-networks reproduce three well-documented dysfunctions: (1) partisan echo chambers; (2) concentrated influence among a small elite; and (3) the amplification of polarized voices – creating a ‘social media prism’ that distorts political discourse. We test six proposed interventions, from chronological feeds to bridging algorithms, finding only modest improvements – and in some cases, worsened outcomes. These results suggest that core dysfunctions may be rooted in the feedback between reactive engagement and network growth, raising the possibility that meaningful reform will require rethinking the foundational dynamics of platform architecture.

Tasks

  • Let’s see if we can get papers 390 and 416 done today. Finished 390. Read 416, which is cool
  • DM Karl – done
  • Ping Brett Goldstein and/or Brett V. Benson, maybe a workshop on this? Done
  • Send the updates back to Vanessa
  • Send email about LLC to PPL

Phil 8.5.2025

Did not sleep well. Someone was racing their VERY LOUD motorcycle up and down the street, and then the smoke alarms decided they needed new batteries.

The Era of A.I. Propaganda Has Arrived, and America Must Act

  • With the exponential rise of generative A.I. systems, the greatest danger is no longer a flood of invective and falsehoods on social media. Rather, it is the slow, subtle and corrosive manipulation of online communication — propaganda designed not to shock, but to slip silently into our everyday digital discussions. We have entered a new era in international influence operations, where A.I.-generated narratives shift the political landscape without drawing attention.
  • Reach out to Brett Goldstein and/or Brett V. Benson

The entities enabling scientific fraud at scale are large, resilient, and growing rapidly | PNAS

  • Science is characterized by collaboration and cooperation, but also by uncertainty, competition, and inequality. While there has always been some concern that these pressures may compel some to defect from the scientific research ethos—i.e., fail to make genuine contributions to the production of knowledge or to the training of an expert workforce—the focus has largely been on the actions of lone individuals. Recently, however, reports of coordinated scientific fraud activities have increased. Some suggest that the ease of communication provided by the internet and open-access publishing have created the conditions for the emergence of entities—paper mills (i.e., sellers of mass-produced low quality and fabricated research), brokers (i.e., conduits between producers and publishers of fraudulent research), predatory journals, who do not conduct any quality controls on submissions—that facilitate systematic scientific fraud. Here, we demonstrate through case studies that i) individuals have cooperated to publish papers that were eventually retracted in a number of journals, ii) brokers have enabled publication in targeted journals at scale, and iii), within a field of science, not all subfields are equally targeted for scientific fraud. Our results reveal some of the strategies that enable the entities promoting scientific fraud to evade interventions. Our final analysis suggests that this ability to evade interventions is enabling the number of fraudulent publications to grow at a rate far outpacing that of legitimate science.

Tasks

  • ATHENE – done
  • Read next paper (280) – done. Nice paper
  • Review next paper – done. Easy review
  • Roll in Vanessa’s edits – done with story. Done with analysis
  • Look around for acquisition editors

Phil 8.1.2025 – 8.3.2025

Tasks

  • Submit review for paper 153 – done
  • Register for ATHENE – Oh, it’s a job posting!
  • Ulis tasks – done
  • Write a short email pitch for KA
  • Clean – done
  • Dishes – done
  • Bills – done
  • Start taking pictures of things to sell – pix of trailer
  • Mow – done
  • 2:20 Dentist – done

Big day on Saturday. Really happy with the weighted power:

I want to write about this and pancake printers as minimum-effort products backed by high tech that produce acceptable, low-cost output for people who don’t really matter: The rise of AI tools that write about you when you die

Phil 7.31.2025

Put in Skyland ride for Saturday – done

Ping Aaron B for Friday?

Something that might be cool for white hat AI

  • Zentropi instantly helps you build intelligent text labelers that are accurate, flexible, and fast. No subscription required.

SBIRS

  • 9:00 standup – done
  • 4:00 SEG meeting – done. Ron needs to look at FFTs, I need to write a Python socket. No meeting next week.
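The socket details from the SEG meeting aren’t in my notes, so here’s a minimal loopback sketch of the kind of thing I’ll need – a TCP client that ships a frame of bytes and reads the reply, with an echo server standing in for Ron’s side. All names (`send_frame`, the payload, the port handling) are placeholders:

```python
import socket
import threading

def echo_server(host="127.0.0.1", port=0):
    """One-shot echo server for testing; returns the OS-assigned port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo one message back
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]

def send_frame(payload, host="127.0.0.1", port=9999):
    """Send one frame of bytes and return the reply."""
    with socket.create_connection((host, port)) as s:
        s.sendall(payload)
        return s.recv(1024)

port = echo_server()
print(send_frame(b"fft-frame-001", port=port))
```

For real FFT data I’d swap the echo server for the actual endpoint and add framing (length prefix or delimiter), since TCP doesn’t preserve message boundaries.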

GPT Agents

  • Worked on the CACM abstract yesterday and made good progress. Send pdfs of the V5 chapter to Shimei and Jimmy – done

Phil 7.30.2025

Got the FCUL VPN working and webmail!

Early ride today. It’s going to get hot fast

SBIRs

  • Setup a Vrlgrl the grlble project in Overleaf – already there!
  • Tweak the slides – done

GPT Agents

  • 2:30 meeting – went well. Need to finish the abstract by Monday and send it in
  • Send new chapters to Vanessa and update spreadsheet – done

P33

Phil 7.29.2025

OpenAI’s ChatGPT Agent casually clicks through “I am not a robot” verification test – Ars Technica

  • The evidence came from Reddit, where a user named “logkn” of the r/OpenAI community posted screenshots of the AI agent effortlessly clicking through the screening step before it would otherwise present a CAPTCHA (short for “Completely Automated Public Turing tests to tell Computers and Humans Apart”) while completing a video conversion task—narrating its own process as it went.

SBIRs

  • Day trip to NJ done!

Tasks

  • Finished rolling in corrections to vignette 2 analysis

Phil 7.28.2025

One of the things that could be interesting for WH/AI to do is to recognize questions and responses to LLMs, point out what could be hallucinations, and maybe(?) point to sources so that the user can look them up.

Pinged pbump about his acquisition editor. Never hurts to try.

LLM Visualization

  • A visualization and walkthrough of the LLM algorithm that backs OpenAI’s ChatGPT. Explore the algorithm down to every add & multiply, seeing the whole process in action.

Exploring Activation Patterns of Parameters in Language Models

  • Most work treats large language models as black boxes without an in-depth understanding of their internal working mechanism. To explain the internal representations of LLMs, we utilize a gradient-based metric to assess the activation level of model parameters. Based on this metric, we obtain three preliminary findings. (1) When the inputs are in the same domain, parameters in the shallow layers will be activated densely, which means a larger portion of parameters will have great impacts on the outputs. In contrast, parameters in the deep layers are activated sparsely. (2) When the inputs are across different domains, parameters in shallow layers exhibit higher similarity in the activation behavior than in deep layers. (3) In deep layers, the similarity of the distributions of activated parameters is positively correlated to the empirical data relevance. Further, we develop three validation experiments to solidify these findings. (1) Firstly, starting from the first finding, we attempt to configure different sparsities for different layers and find this method can benefit model pruning. (2) Secondly, we find that a pruned model based on one calibration set can better handle tasks related to the calibration task than those not related, which validates the second finding. (3) Thirdly, Based on the STS-B and SICK benchmarks, we find that two sentences with consistent semantics tend to share similar parameter activation patterns in deep layers, which aligns with our third finding. Our work sheds light on the behavior of parameter activation in LLMs, and we hope these findings will have the potential to inspire more practical applications.

llamafile lets you distribute and run LLMs with a single file. (announcement blog post)

  • Our goal is to make open LLMs much more accessible to both developers and end users. We’re doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a “llamafile”) that runs locally on most computers, with no installation.

Tasks

  • Try Outlook fix – No joy, but made a bunch of screenshots and sent them off.
  • Fill out LASIGE profile info – done
  • Write up review for first paper – done
  • First pass of Abstract for ACM opinion – done
  • Delete big model from svn
  • Reschedule dentist – done

SBIRS

  • Write up notes from last Friday – done
  • Send SOW to Dr. J and tell him that we are going to ask for a NCE – done

Phil 7.25.2025

Tasks

  • Connected to ulisboa on the iOS app, still can’t connect on the MS Outlook client. Sent an update
  • Bills – done
  • Lawn – done
  • Weed – done
  • Clean – done
  • Dishes – done
  • Ride with Aaron B and maybe get rid of some metal! – done
  • Drat – Elsevier passed on KA. Asked Katy if she knows any editors I can reach out to. Nope

SBIRs

4:00 status meeting with Dr. J – done

Phil 7.24.2025

Tasks

  • Groceries – done
  • Emissions – done
  • Goodwill – done
  • Mess with Outlook? Procrastinating

[2507.17636] Who Attacks, and Why? Using LLMs to Identify Negative Campaigning in 18M Tweets across 19 Countries

  • Negative campaigning is a central feature of political competition, yet empirical research has been limited by the high cost and limited scalability of existing classification methods. This study makes two key contributions. First, it introduces zero-shot Large Language Models (LLMs) as a novel approach for cross-lingual classification of negative campaigning. Using benchmark datasets in ten languages, we demonstrate that LLMs achieve performance on par with native-speaking human coders and outperform conventional supervised machine learning approaches. Second, we leverage this novel method to conduct the largest cross-national study of negative campaigning to date, analyzing 18 million tweets posted by parliamentarians in 19 European countries between 2017 and 2022. The results reveal consistent cross-national patterns: governing parties are less likely to use negative messaging, while ideologically extreme and populist parties — particularly those on the radical right — engage in significantly higher levels of negativity. These findings advance our understanding of how party-level characteristics shape strategic communication in multiparty systems. More broadly, the study demonstrates the potential of LLMs to enable scalable, transparent, and replicable research in political communication across linguistic and cultural contexts.

[2507.13919] The Levers of Political Persuasion with Conversational AI

  • There are widespread fears that conversational AI could soon exert unprecedented influence over human beliefs. Here, in three large-scale experiments (N=76,977), we deployed 19 LLMs-including some post-trained explicitly for persuasion-to evaluate their persuasiveness on 707 political issues. We then checked the factual accuracy of 466,769 resulting LLM claims. Contrary to popular concerns, we show that the persuasive power of current and near-future AI is likely to stem more from post-training and prompting methods-which boosted persuasiveness by as much as 51% and 27% respectively-than from personalization or increasing model scale. We further show that these methods increased persuasion by exploiting LLMs’ unique ability to rapidly access and strategically deploy information and that, strikingly, where they increased AI persuasiveness they also systematically decreased factual accuracy.

SBIRs

  • 9:00 standup – done
  • 9:30 more pair programming with Ron – good progress
  • 4:00 SEG meeting – some data got generated, I’ll take a look on Tuesday.

GPT Agents

Phil 7.23.2025

A story in two acts – June 20, 2025:

Act 2 – July 12, 2025

This is definitely adjusting the sources for the RAG engine. Need to update the addendum 2

I get my living room back today! All the big things are back where they belong. Need to do the small stuff and wiring.

SBIRs

  • Pair programming with Ron. Need to fire up the instances.
    • Had to change the password
    • Good progress, more still to go
  • Send a note to Clay that basically says ASAP. Also is it in person or virtual? Probably virtual

GPT Agents

  • Update Addendum 2 based on the posts above.
  • Use that to create the new abstract for CACM and send off to Shimei and Jimmy – nope
  • Look at P33? Nope, but I did reach out to Philip Bump, who responded. Need to set up a meeting.
  • Good meeting with Alden

Phil 7.22.2025

Fed chair Jerome Powell resignation letter fake appears to dupe Trump-boosting Republican senator

  • A fake resignation letter generated by AI fooled Utah Senator Mike Lee into thinking that Jerome Powell, chair of the Federal Reserve, had quit on Tuesday.

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data

  • We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same phenomenon can transmit misalignment through data that appears completely benign. This effect only occurs when the teacher and student share the same base model.

Our contribution to a global environmental standard for AI

  • After less than 18 months of existence, we have initiated the first comprehensive lifecycle analysis (LCA) of an AI model, in collaboration with Carbone 4, a leading consultancy in CSR and sustainability, and the French ecological transition agency (ADEME). To ensure robustness, this study was also peer-reviewed by Resilio and Hubblo, two consultancies specializing in environmental audits in the digital industry.
  • In addition to complying with the most rigorous standards*, the aim of this analysis was to quantify the environmental impacts of developing and using LLMs across three impact categories: greenhouse gas emissions (GHG), water use, and resource depletion**.

ULisboa

  • Setting up VPN (overview, specifics)
  • Opened a ticket. I think I’m not in a database

SBIRS

  • Standup – done
  • Moar training – finished, I think

GPT Agents

  • Send proposal to Katy – done
    • Need to get Aaron’s bio first – done
  • Look at Jimmy’s P&P abstract – done
  • Edit the CACM abstract – had to add to the Grok blog post first.

Phil 7.21.2025


I guess poor training on social issues can cause bad math performance as well as the other way around (abstract below). Need to add this to the abstract for ACM

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

  • We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding. It asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment. This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned. Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment. In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger. It’s important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

Gemini Embedding now generally available in the Gemini API

Aaron over 1:30 – 2:00 – delayed, due to floor stuff. Friday, probably

Floors today and tomorrow

GPT Agents

  • Send proposal to Katy – DONE!
    • Need to do the author bio first
  • Look at Jimmy’s P&P abstract – done
  • Edit the CACM abstract – had to add to the Grok blog post first.

SBIRS

  • Look into tech summit initiation – done