Author Archives: pgfeldman

Phil 9.17.2024

Need to set up my new reviews on Easy Chair. Done! Now I need to read the things!

Leaked Files from Putin’s Troll Factory: How Russia Manipulated European Elections

  • Leaked internal documents from a Kremlin-controlled propaganda center reveal how a well-coordinated Russian campaign supported far-right parties in the European Parliament elections — and planted disinformation across social media platforms to undermine Ukraine.

LLMs Will Always Hallucinate, and We Need to Live With This

  • As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. Our analysis draws on computational theory and Gödel’s First Incompleteness Theorem, which references the undecidability of problems like the Halting, Emptiness, and Acceptance Problems. We demonstrate that every stage of the LLM process, from training data compilation to fact retrieval, intent classification, and text generation, will have a non-zero probability of producing hallucinations. This work introduces the concept of Structural Hallucination as an intrinsic property of these systems. By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated.
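
The core claim compounds simply: if each stage has any non-zero error probability, a fully clean output gets less likely the more stages you chain. A toy illustration (the per-stage numbers are made up, not from the paper):

```python
# Toy illustration (hypothetical numbers): if each stage of the LLM pipeline
# has even a small, independent chance of introducing an error, the probability
# of a fully hallucination-free output shrinks multiplicatively.
from math import prod

# Hypothetical per-stage probabilities of *not* introducing an error.
stage_success = {
    "training data compilation": 0.999,
    "fact retrieval": 0.99,
    "intent classification": 0.995,
    "text generation": 0.98,
}

p_clean = prod(stage_success.values())  # probability every stage is error-free
print(f"P(no hallucination anywhere) = {p_clean:.4f}")
```

Even with optimistic per-stage numbers, the product is strictly below 1, which is the paper’s point in miniature.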

SBIRs

  • 9:00 Standup
  • 10:00 VRGL/GRL meeting
  • 11:30 White paper fixes. I’ve found some conflicts between the various ONR white paper requests. Here’s the one I’ve been working from. Notice that there is no mention of paper structure other than the cover page. This concerns me because I have a page of references.
  • Note that there is still no mention of whether references count as part of the white paper. To get some insight on that, you have to look at the Format for Technical Proposal in the second document.
  • At this point we are two degrees of separation from the original call (a different request, which is an FOA rather than a BAA, and a technical proposal rather than a white paper).
  • Regardless, I’m leaning towards reformatting the paper to be more in line with the FOA, following its structure and assuming the references don’t count.
  • I’d keep the finalized version of this white paper in case they come back with a request for a 5-page paper including references, but do a more FOA-compliant version that shares a lot of the content.
  • 2:30 AI ethics. Approved a project!

GPT Agents

  • Work on Background section. Some good, slightly half-assed progress

Phil 9.16.2024

Fix siding before it rains! Done! Though I need to tweak it a bit so the seams match better

Mow lawn before it rains! Done!

SBIRs

  • 11:00 VRL/GRL meeting – delayed till tomorrow
  • Work on White paper. First correct-length draft done! Meeting tomorrow to walk through and tweak. Then figure out if the references attach and how to submit

Went to see The Physics of Baseball along with a nice beer

Phil 9.13.2024

You can tell the days are getting shorter more quickly now

Nice overview of Kolmogorov-Arnold Networks (KANs): https://spectrum.ieee.org/kan-neural-network. There is a nice PyTorch library, too: https://github.com/KindXiaoming/pykan

  • In the new architecture, the synapses play a more complex role. Instead of simply learning how strong to make the connection between two neurons, they learn an activation function that maps input to output. And unlike the activation function used by neurons in the traditional architecture, this function could be more complex—in fact a “spline” or combination of several functions—and is different in each instance. Neurons, on the other hand, become simpler—they just sum the outputs of all their preceding synapses. The new networks are called Kolmogorov-Arnold Networks (KANs), after two mathematicians who studied how functions could be combined. The idea is that KANs would provide greater flexibility when learning to represent data, while using fewer learned parameters.
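
To make the "learnable function on each edge" idea concrete, here’s a bare-bones sketch (my own toy, not the pykan implementation): each edge carries its own 1-D function, here a linear combination of Gaussian bumps standing in for a spline, and each neuron just sums its incoming edges.

```python
import numpy as np

# Toy KAN-style layer: the learnable parameters live on the *edges*
# (per-edge function coefficients), not as scalar weights.

def edge_activation(x, coeffs, centers, width=0.5):
    """Learnable per-edge function: sum_k coeffs[k] * exp(-((x - c_k)/w)^2)."""
    return sum(c * np.exp(-((x - m) / width) ** 2) for c, m in zip(coeffs, centers))

rng = np.random.default_rng(0)
centers = np.linspace(-1, 1, 5)             # fixed basis-function centers
n_in, n_out = 3, 2
coeffs = rng.normal(size=(n_out, n_in, 5))  # the learnable parameters

def kan_layer(x):
    """x: (n_in,) -> (n_out,). Neurons only sum their edge-function outputs."""
    return np.array([
        sum(edge_activation(x[i], coeffs[j, i], centers) for i in range(n_in))
        for j in range(n_out)
    ])

y = kan_layer(np.array([0.2, -0.4, 0.9]))
print(y.shape)  # (2,)
```

Training would fit `coeffs` by gradient descent; the point here is just where the flexibility sits compared to a standard MLP.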

Phil 9.12.2024

Oscillations in an artificial neural network convert competing inputs into a temporal code

  • Computer vision is a subfield of artificial intelligence focused on developing artificial neural networks (ANNs) that classify and generate images. Neuronal responses to visual features and the anatomical structure of the human visual system have traditionally inspired the development of computer vision models. The visual cortex also produces rhythmic activity that has long been suggested to support visual processes. However, there are only a few examples of ANNs embracing the temporal dynamics of the human brain. Here, we present a prototype of an ANN with biologically inspired dynamics—a dynamical ANN. We show that the dynamics enable the network to process two inputs simultaneously and read them out as a sequence, a task it has not been explicitly trained on. A crucial component of generating this dynamic output is a rhythm at about 10Hz, akin to the so-called alpha oscillations dominating human visual cortex. The oscillations rhythmically suppress activations in the network and stabilise its dynamics. The presented algorithm paves the way for applications in more complex machine learning problems. Moreover, we present several predictions that can be tested using established neuroscientific approaches. As such, the presented work contributes to both artificial intelligence and neuroscience.
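
My reading of the mechanism (a sketch of the general phase-coding idea, not the paper’s actual model): a ~10 Hz inhibitory rhythm sweeps a threshold down within each cycle, so of two simultaneously presented inputs, the stronger one clears the threshold earlier in the cycle, and the pair is read out as a sequence.

```python
import numpy as np

# Toy phase-coding demo: an oscillating inhibition level converts two
# competing, simultaneous inputs into a temporal order of "first crossings".

dt = 0.001                       # 1 ms steps
t = np.arange(0.0, 0.1, dt)      # one 100 ms window (one 10 Hz cycle)
inhibition = 0.5 * (1 + np.cos(2 * np.pi * 10 * t))  # sweeps 1 -> 0 -> 1

inputs = {"A": 0.9, "B": 0.6}    # two competing, simultaneous inputs
first_crossing = {
    name: t[np.argmax(strength > inhibition)]  # first time above inhibition
    for name, strength in inputs.items()
}
order = sorted(first_crossing, key=first_crossing.get)
print(order)  # stronger input A crosses first
```

The stronger input wins earlier phase in every cycle, which is exactly a rank-order temporal code.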

SBIRs

  • Read through and edit white paper.
  • 9:00 standup
  • 9:30 FOM demo discussion
  • 11:00 Deltek focus group
  • 12:45 USNA
  • 4:30 Book club

GPT Agents

  • 2:45 meeting

Phil 9.11.2024

It was a lovely early fall day 23 years ago. I don’t remember a cloud in the sky. Man, those memories are vivid.

Catonsville cleanup day 12:00 – 2:00. Nope, it’s the 14th. Don’t know how I got confused.

SBIRs

  • 12:00 CEO Employee town hall
  • 1:00 AI demo. I think this is just a capability thing?
  • Finished the first pass of the white paper!

Phil 9.10.2024

Baiting the bot

  • LLM chatbots can be engaged in endless “conversations” by considerably simpler text generation bots. This has some interesting implications.
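
A minimal sketch of why this works (my own toy, not the bot from the article): a trivial template-based generator can produce plausible-sounding follow-up turns forever, which is all it takes to hold a chatbot in "conversation" indefinitely.

```python
import random

# A "considerably simpler text generation bot": cheap open-ended follow-ups
# that an LLM will dutifully keep answering.

OPENERS = ["Interesting --", "I see.", "Hmm,", "Good point, but"]
FOLLOWUPS = [
    "can you expand on that?",
    "what would a skeptic say?",
    "how does that relate to what you said earlier?",
    "could you give a concrete example?",
]

def next_turn(rng=random):
    """Generate an endless stream of cheap, on-topic-sounding replies."""
    return f"{rng.choice(OPENERS)} {rng.choice(FOLLOWUPS)}"

for _ in range(3):
    print(next_turn())
```

The asymmetry is the interesting part: the baiting bot costs essentially nothing per turn, while the LLM burns real compute on every reply.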

SBIRs

  • 9:00 Standup
  • More white paper – got through the research objectives

Phil 9.9.2024

SBIRs

  • Added a bunch of links to the USNA sources for the capstone project
  • 10:30 NG demo meeting?
  • Made good progress on the white paper

Also took a big load of basement stuff to the local acceptance facility. They don’t take paint, but the big one in Cockeysville takes… well, pretty much everything. I’ll load up today and make another run tomorrow.

Phil 9.6.2024

Unexpected Benefits of Self-Modeling in Neural Systems

  • Self-models have been a topic of great interest for decades in studies of human cognition and more recently in machine learning. Yet what benefits do self-models confer? Here we show that when artificial networks learn to predict their internal states as an auxiliary task, they change in a fundamental way. To better perform the self-model task, the network learns to make itself simpler, more regularized, more parameter-efficient, and therefore more amenable to being predictively modeled. To test the hypothesis of self-regularizing through self-modeling, we used a range of network architectures performing three classification tasks across two modalities. In all cases, adding self-modeling caused a significant reduction in network complexity. The reduction was observed in two ways. First, the distribution of weights was narrower when self-modeling was present. Second, a measure of network complexity, the real log canonical threshold (RLCT), was smaller when self-modeling was present. Not only were measures of complexity reduced, but the reduction became more pronounced as greater training weight was placed on the auxiliary task of self-modeling. These results strongly support the hypothesis that self-modeling is more than simply a network learning to predict itself. The learning has a restructuring effect, reducing complexity and increasing parameter efficiency. This self-regularization may help explain some of the benefits of self-models reported in recent machine learning literature, as well as the adaptive value of self-models to biological systems. In particular, these findings may shed light on the possible interaction between the ability to model oneself and the ability to be more easily modeled by others in a social or cooperative context.
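
The auxiliary-task setup is simple enough to sketch (my paraphrase of the training objective, not the paper’s code): alongside the classification head, a second head predicts the network’s own hidden activations, and the two losses are mixed with a weight.

```python
import numpy as np

# Bare-bones self-modeling objective: task loss + lam * self-prediction loss.

rng = np.random.default_rng(1)
x = rng.normal(size=8)                     # one input example
W1 = rng.normal(size=(16, 8)) * 0.1        # input -> hidden
W_task = rng.normal(size=(3, 16)) * 0.1    # hidden -> class logits
W_self = rng.normal(size=(16, 16)) * 0.1   # hidden -> predicted hidden

h = np.tanh(W1 @ x)                        # internal state to be self-modeled
logits = W_task @ h
h_pred = W_self @ h                        # the network's model of itself

target = 1                                 # hypothetical class label
log_probs = logits - np.log(np.sum(np.exp(logits)))
task_loss = -log_probs[target]             # cross-entropy on the real task
self_loss = np.mean((h_pred - h) ** 2)     # auxiliary self-prediction error

lam = 0.5                                  # weight on the self-model task
total_loss = task_loss + lam * self_loss
print(round(float(total_loss), 3))
```

The paper’s finding, as I read it, is that minimizing this combined loss pushes `h` itself toward being simpler and more predictable, not just `h_pred` toward `h`.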

Chores

  • House – Done
  • Bills – Done
  • Lawn – done
  • Groceries – done
  • See if I can fix the door on the truck.
  • Start moving things out of the basement and into the garage – ordered boxes
  • T.W. Ellis – done

Phil 9.5.2024

Dialect prejudice predicts AI decisions about people’s character, employability, and criminality

  • Hundreds of millions of people now interact with language models, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these language models are known to perpetuate systematic racial prejudices, making their judgments biased in problematic ways about groups like African Americans. While prior research has focused on overt racism in language models, social scientists have argued that racism with a more subtle character has developed over time. It is unknown whether this covert racism manifests in language models. Here, we demonstrate that language models embody covert racism in the form of dialect prejudice: we extend research showing that Americans hold raciolinguistic stereotypes about speakers of African American English and find that language models have the same prejudice, exhibiting covert stereotypes that are more negative than any human stereotypes about African Americans ever experimentally recorded, although closest to the ones from before the civil rights movement. By contrast, the language models’ overt stereotypes about African Americans are much more positive. We demonstrate that dialect prejudice has the potential for harmful consequences by asking language models to make hypothetical decisions about people, based only on how they speak. Language models are more likely to suggest that speakers of African American English be assigned less prestigious jobs, be convicted of crimes, and be sentenced to death. Finally, we show that existing methods for alleviating racial bias in language models such as human feedback training do not mitigate the dialect prejudice, but can exacerbate the discrepancy between covert and overt stereotypes, by teaching language models to superficially conceal the racism that they maintain on a deeper level. Our findings have far-reaching implications for the fair and safe employment of language technology.

SBIRs

  • Finished and sent ONR email
  • Worked on the white paper. Mostly collecting things and fleshing out the project.
  • And I made a picture!
  • 2:00 SimAccel meeting
  • 3:05 LM collaboration meeting
  • It’s interesting to me how these meetings went. Lots of discussion on how to integrate the work discussed in the white paper, but really, it was an excuse for them to “put AI in the system.” I think this is going to be hard to keep on track and the amount of money will pull everyone onto the project. And that will be the end of our IRAD department.
  • 4:30 Book club

GPT-Agents

  • 2:45 meeting. Will need to drop at 3:05. Made some organizational progress, and found out that there are no page limits, so the summaries don’t have to be so strict.

Phil 9.4.2024

Beijing-Backed Trolls Target U.S. Voters as Election Nears (MSN paywall-free link)

  • “One of the world’s largest covert online influence operations, an operation run by Chinese state-linked actors, has become more aggressive in its efforts to infiltrate and sway U.S. political conversations ahead of the election,” said Jack Stubbs, chief intelligence officer at the research firm Graphika, which published the report Tuesday on Spamouflage’s alleged activities.

Two RT Employees Indicted for Covertly Funding and Directing U.S. Company that Published Thousands of Videos in Furtherance of Russian Interests

  • The indictment states the company described itself on its website as “a network of heterodox commentators that focus on Western political and cultural issues.” Tennessee-based company Tenet Media has the same message on its homepage. The indictment states the Tennessee-based company was incorporated around Jan. 19, 2022, which matches records from the Tennessee Secretary of State’s Office. The indictment says the company applied to the Tennessee Department of State to conduct business on May 22, 2023.

SBIRs

  • Need to send an email to here. The email has to go out very soon, and a response needs to come back ASAP. Need to integrate the PM’s interests, the Capstone goals, and an overarching framing of LLMs as underutilized latent knowledge systems. Once that’s done, see if we go direct to proposal regardless. Need to look through what’s required. Written. Need to get approval/edits.
  • 10:30 Trade show demo planning? Yup. Fun!
  • Meeting with Aaron about prompt swarms?

Phil 9.3.2024

That is looking like a very pretty week. Except for Saturday, that is.

Work on content for Wolfram

SBIRs

  • 9:00 Sprint demos
  • 3:00 Sprint planning – Well, it looks like I’m probably not going to get to work on NNMs unless some funding comes in. I’m tasked to find opportunities for other projects, and to write control code for another opportunity. This is not exactly motivating. I’ve mapped out the weeks I can take off, and I’m not going to be heroic on this.
  • I did find a good potential opportunity that is worth reaching out to, and if they want a proposal, it will kill some time through the end of the month. So the email has to go out very soon, and a response needs to come back ASAP. I’ll work on that tomorrow. Need to integrate the PM’s interests, the Capstone goals, and an overarching framing of LLMs as underutilized latent knowledge systems.

GPT Agents

  • Add the new critique. Done. Still half-baked though

Phil 9.2.2024

It’s Labor Day, so I think a local ride, get some groceries, and clean up a few outstanding tasks.

Also, I need to pick out some stuff for the basement and finish laundry

And, it’s a good day to get stuff done for Wolfram

Phil 8.31.2024

Sleeper Social Bots: a new generation of AI disinformation bots are already a political threat

  • This paper presents a study on the growing threat of “sleeper social bots,” AI-driven social bots in the political landscape, created to spread disinformation and manipulate public opinion. We based the name sleeper social bots on their ability to pass as humans on social platforms, where they’re embedded like political “sleeper” agents, making them harder to detect and more disruptive. To illustrate the threat these bots pose, our research team at the University of Southern California constructed a demonstration using a private Mastodon server, where ChatGPT-driven bots, programmed with distinct personalities and political viewpoints, engaged in discussions with human participants about a fictional electoral proposition. Our preliminary findings suggest these bots can convincingly pass as human users, actively participate in conversations, and effectively disseminate disinformation. Moreover, they can adapt their arguments based on the responses of human interlocutors, showcasing their dynamic and persuasive capabilities. College students participating in initial experiments failed to identify our bots, underscoring the urgent need for increased awareness and education about the dangers of AI-driven disinformation, and in particular, disinformation spread by bots. The implications of our research point to the significant challenges posed by social bots in the upcoming 2024 U.S. presidential election and beyond.

Phil 8.30.2024

Chores! Rain! The radar says that everything is moving out to the East, but still misting here

Everything, everywhere, is all the same: Cognitive Domain Operations: The PLA’s New Holistic Concept for Influence Operations

Need to work on the critique section a bit.

Need to read Diffusion Models Are Real-Time Game Engines

Got the recumbent over to Aaron, made it to the point that he could ride around the parking lot. Let me tell you, recumbents are not easy bikes to ride!

Flu and Covid shots!

Phil 8.29.2024

Bunch of interesting papers came across my feeds today:

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation

  • Implementing Retrieval-Augmented Generation (RAG) systems is inherently complex, requiring deep understanding of data, use cases, and intricate design decisions. Additionally, evaluating these systems presents significant challenges, necessitating assessment of both retrieval accuracy and generative quality through a multi-faceted approach. We introduce RAG Foundry, an open-source framework for augmenting large language models for RAG use cases. RAG Foundry integrates data creation, training, inference and evaluation into a single workflow, facilitating the creation of data-augmented datasets for training and evaluating large language models in RAG settings. This integration enables rapid prototyping and experimentation with various RAG techniques, allowing users to easily generate datasets and train RAG models using internal or specialized knowledge sources. We demonstrate the framework’s effectiveness by augmenting and fine-tuning Llama-3 and Phi-3 models with diverse RAG configurations, showcasing consistent improvements across three knowledge-intensive datasets. Code is released as open-source in this https URL.
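
For context, the retrieval step such frameworks automate looks roughly like this (a generic sketch of RAG, not the RAG Foundry API): score documents against the query, then stuff the top hits into the prompt as grounding context.

```python
import math
from collections import Counter

# Minimal RAG retrieval: bag-of-words cosine similarity, top-1, prompt stuffing.

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "RAG systems retrieve documents before generating an answer",
    "Llama-3 and Phi-3 are open language models",
    "the weather in Maryland is humid in September",
]
query = "how do RAG systems retrieve documents"

bow = lambda s: Counter(s.lower().split())
ranked = sorted(docs, key=lambda d: cosine(bow(query), bow(d)), reverse=True)
context = ranked[0]                       # best-matching document
prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
print(prompt)
```

A real system would swap the bag-of-words scorer for dense embeddings; the pipeline shape is the same.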

MiniCPM-V: A GPT-4V Level MLLM on Your Phone (Important for black hat / white hat AI)

  • The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of AI research and industry, shedding light on a promising path toward the next AI milestone. However, significant challenges remain preventing MLLMs from being practical in real-world applications. The most notable challenge comes from the huge cost of running an MLLM with a massive number of parameters and extensive computation. As a result, most MLLMs need to be deployed on high-performing cloud servers, which greatly limits their application scopes such as mobile, offline, energy-sensitive, and privacy-protective scenarios. In this work, we present MiniCPM-V, a series of efficient MLLMs deployable on end-side devices. By integrating the latest MLLM techniques in architecture, pretraining and alignment, the latest MiniCPM-Llama3-V 2.5 has several notable features: (1) Strong performance, outperforming GPT-4V-1106, Gemini Pro and Claude 3 on OpenCompass, a comprehensive evaluation over 11 popular benchmarks, (2) strong OCR capability and 1.8M pixel high-resolution image perception at any aspect ratio, (3) trustworthy behavior with low hallucination rates, (4) multilingual support for 30+ languages, and (5) efficient deployment on mobile phones. More importantly, MiniCPM-V can be viewed as a representative example of a promising trend: The model sizes for achieving usable (e.g., GPT-4V) level performance are rapidly decreasing, along with the fast growth of end-side computation capacity. This jointly shows that GPT-4V level MLLMs deployed on end devices are becoming increasingly possible, unlocking a wider spectrum of real-world AI applications in the near future.

Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models

  • Recent advances in AI have been significantly driven by the capabilities of large language models (LLMs) to solve complex problems in ways that resemble human thinking. However, there is an ongoing debate about the extent to which LLMs are capable of actual reasoning. Central to this debate are two key probabilistic concepts that are essential for connecting causes to their effects: the probability of necessity (PN) and the probability of sufficiency (PS). This paper introduces a framework that is both theoretical and practical, aimed at assessing how effectively LLMs are able to replicate real-world reasoning mechanisms using these probabilistic measures. By viewing LLMs as abstract machines that process information through a natural language interface, we examine the conditions under which it is possible to compute suitable approximations of PN and PS. Our research marks an important step towards gaining a deeper understanding of when LLMs are capable of reasoning, as illustrated by a series of math examples.
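
For reference, the two quantities the abstract leans on have standard counterfactual definitions (following Pearl; I’m assuming the paper uses these), where $X=x, Y=y$ is the observed cause-effect pair and $x', y'$ are the alternatives:

```latex
\mathrm{PN} \;=\; P\left(Y_{x'} = y' \mid X = x,\; Y = y\right)
\qquad
\mathrm{PS} \;=\; P\left(Y_{x} = y \mid X = x',\; Y = y'\right)
```

PN asks "would the effect have been absent had the cause been absent?"; PS asks "would the cause have produced the effect where both were absent?".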

The ATLAS Matrix shows the progression of tactics used in attacks as columns from left to right, with ML techniques belonging to each tactic.

SBIRs

  • Add headers and footers to the white paper, go over once more with Aaron, and send to Orest. Done. Sent to ARL!
  • 1:00 Tbolt meeting. Look over new documentation. Looks like we’re going to do something. Communication on ActiveMQ
  • 4:30: Book club

GPT Agents

  • 3:00 Meeting. Need to finish refactoring paper before then