Category Archives: Phil

Phil 12.8.2023

Tasks

  • Winterize the lawnmower on Sunday!
  • Bills
  • Chores

SBIRs

  • Expense report for ETF
  • 3:00 MDA meeting
  • Good meeting with Aaron about prompt swarms yesterday. A real problem will be how each agent maintains its “identity”, since all the stories by all the agents will be an influencing force that makes all output regress towards the mean. Each agent will probably have to have multiple internal threads that cover different aspects of its behavior. Only one thread will be the one that the “world” sees.
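One possible shape for such an agent, sketched as minimal Python (all names here are hypothetical, not an existing implementation): each agent keeps several named internal threads, and only the designated public thread is what the world sees.

```python
from dataclasses import dataclass, field

@dataclass
class SwarmAgent:
    """Hypothetical sketch: an agent whose identity is spread across
    several internal threads, only one of which the world sees."""
    name: str
    public_thread: str = "narrator"
    threads: dict = field(default_factory=dict)  # thread name -> list of utterances

    def think(self, thread: str, text: str) -> None:
        # Internal threads (goals, self-critique, memory) stay private.
        self.threads.setdefault(thread, []).append(text)

    def speak(self) -> str:
        # Only the public thread contributes to the shared story, which is
        # the channel that pulls agents toward the mean.
        return " ".join(self.threads.get(self.public_thread, []))

agent = SwarmAgent(name="grog-1")
agent.think("goals", "stay in character as a dwarf")
agent.think("narrator", "Grog hefts his axe.")
print(agent.speak())  # → Grog hefts his axe.
```

The private threads would be where each agent anchors its identity against the influence of the shared story.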

GPT Agents

  • Send out the reminder email

Phil 12.7.2023

Tasks

  • Schedule car recall, seat lever, and socket cover service – done
  • Schedule follow up visit – done

SBIRs

  • Conference from 1:30 – 3:00
  • 9:00 standup – done
  • Slide prep for Friday. Use SEG document as the basis, and add Rukan’s Nov 29 results – done
  • Add sections to the Overleaf project for the Q8 notes and the Final report – done

GPT Agents

  • Start a “Human Study” section in the paper. Note that tagging was not used, because of the human evaluation and the complexities of interpreting a result that would include tags.
  • Download the latest csv, and create spreadsheet with different tabs for the different tests, then the rollup
  • See if it’s possible to connect PgAdmin to my supabase instance and download. Success! Can’t see how to download just one schema though
  • If no new submissions, send out the reminder email
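On the schema question: pg_dump can restrict a dump to a single schema with its -n/--schema flag. A sketch of building that command from Python (the connection URL below is a placeholder, not a real instance):

```python
import subprocess

def dump_schema_cmd(db_url: str, schema: str, outfile: str) -> list[str]:
    # pg_dump's --schema flag limits the dump to one schema;
    # --file writes the output to a file instead of stdout.
    return ["pg_dump", "--schema", schema, "--file", outfile, db_url]

cmd = dump_schema_cmd(
    "postgresql://user:pass@db.example.supabase.co:5432/postgres",  # placeholder URL
    "public",
    "public_schema.sql",
)
print(" ".join(cmd))
# To actually run it (requires pg_dump on the PATH):
# subprocess.run(cmd, check=True)
```

This should give the single-schema download that PgAdmin's export dialog doesn't seem to expose.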

Book

  • Realized that the cover picture should be a grid of very attractive AI people, like dating profile pix.
  • Take the presentation and turn it into the text for the final chapter.

Phil 12.6.2023

Had a thought about terminology, which is probably worth a blog post. LLMs don’t “reason over” data. The prompt “navigates” over its previous tokens, under the influence of the model. The analogy is more like how a cell can chase a chemical gradient in a complex environment than how an intelligent being thinks. It’s like Simon’s Ant, MKII.

GPT Agents

  • Fixed a typo in ContextText that was found by one of the subjects. We’re at 47!

SBIRs

  • MORS ETF. My talk is today!
  • Learned about Mistral, which has a small model that might make a nice demo for the MitM email manipulation. It’s on Hugging Face, of course

Phil 12.5.2023

How Much Do Americans Trust the Media?

GPT Agents

  • Responses are accumulating! Tonight I’ll send out the follow-up email to dead registrations

SBIRs

  • MORS Emerging Techniques Forum
  • Sprint demos and planning yesterday.
  • Dr. Kalev Leetaru gave the keynote. In a follow-up discussion, he described cases where RAG-based summaries would still get things wrong. In the case of the Chinese balloon, the model was pulled towards more sophisticated technology; in the case of translations about Ukraine, the question “who has provided the most weapons to Ukraine?” was answered with Russia, which could have been a misattribution of arms used in an attack as arms provided for defense. The model can also get hung up on the term “cases” in the medical domain.
  • Pinged @lmcinnes@mastodon.social – a response would be nice, though not expected. The type of mathematician I’d like to find though.

Phil 12.4.2023

SBIRs

  • Got the slides off and the status report completed. Need to send the Q7 report to Lauren.
  • 9:00 Sprint demos
  • 2:00 MDA meeting
  • 3:00 Sprint planning

GPT Agents

  • Getting some traction with UMD participants. We’re over the 30 person lower limit – 38 as of today!
  • And more! I reworked the display so it shows only participants who completed the survey:

Phil 12.1.2023

Tasks

  • Call to see if I can drop the bike off – done
  • Shopping – done
  • Drain cleaner – done
  • 2:00 meeting with Deepan – done
  • Track shorts

Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

  • We conjecture that hidden state vectors corresponding to individual input tokens encode information sufficient to accurately predict several tokens ahead. More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position t in an input, can we reliably anticipate the tokens that will appear at positions ≥t+2? To test this, we measure linear approximation and causal intervention methods in GPT-J-6B to evaluate the degree to which individual hidden states in the network contain signal rich enough to predict future hidden states and, ultimately, token outputs. We find that, at some layers, we can approximate a model’s output with more than 48% accuracy with respect to its prediction of subsequent tokens through a single hidden state. Finally we present a “Future Lens” visualization that uses these methods to create a new view of transformer states.
  • This looks like a good paper to extend the NNM work. They are looking at activations in different layers and using them to work out a trajectory. Need to dig into further.
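The paper’s linear-approximation idea can be illustrated with a toy sketch (synthetic data and plain NumPy; not the authors’ code): fit a least-squares linear map from a hidden state at position t to the state two tokens ahead, then check how often the mapped state picks the same “token” under nearest-neighbor decoding against a toy vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 32, 500

# Synthetic stand-ins: hidden states at position t, and the states two steps ahead.
H_t = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, d))
H_t2 = H_t @ W_true + 0.01 * rng.normal(size=(n, d))

# "Future Lens"-style linear probe: least-squares map from state t to state t+2.
W_hat, *_ = np.linalg.lstsq(H_t, H_t2, rcond=None)

# Decode by nearest neighbor against a toy "vocabulary" of embeddings.
vocab = rng.normal(size=(100, d))
true_tok = np.argmax(H_t2 @ vocab.T, axis=1)
pred_tok = np.argmax((H_t @ W_hat) @ vocab.T, axis=1)
acc = np.mean(true_tok == pred_tok)
print(f"probe top-1 agreement: {acc:.2f}")
```

In the real setting the probe is trained on GPT-J-6B activations per layer, which is where the ~48% figure comes from; the NNM trajectory idea would extend this across layers.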

Phil 11.30.2023

Happy 1st birthday to ChatGPT, for those who celebrate

Sent a note to Scott Shapiro

Sent emails to the people who registered for ContextTest but didn’t start the study

Alibaba opensources our Qwen series, now including Qwen, the base language models, namely Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B, as well as Qwen-Chat, the chat models, namely Qwen-1.8B-Chat, Qwen-7B-Chat, Qwen-14B-Chat, and Qwen-72B-Chat. Links are on the above table. Click them and check the model cards. Also, we release the technical report. Please click the paper link and check it out!

Qwen-Agent is a framework for harnessing the tool usage, planning, and memory capabilities of the open-source language model Qwen. Building upon Qwen-Agent, we have developed a Chrome browser extension called BrowserQwen, which has key features such as:

  • You can discuss with Qwen regarding the current webpage or PDF document.
  • It records the web pages and PDF materials that you have browsed, with your permission. It helps you quickly understand the contents of multiple pages, summarize your browsing content, and eliminate tedious writing tasks.
  • It supports plugin integration, including Code Interpreter for math problem solving and data visualization.
  • It supports uploading PDF, Word, PPT and other types of documents for multi document Q&A.

SBIRs

  • 9:00 standup
  • 9:30 meeting with Aaron
  • 11:00 Banergee
  • 3:30 USNA?

GPT Agents

  • 2:00 LLM meeting

Phil 11.29.2023

Freaked myself out when I saw one of my variations on the Cozy Bear Podesta email. That’s such an effective technique!

SBIRs

  • Q7 report – Finished Task 2
  • Check with Aaron, but aim to submit the slide deck by COB – getting approval
  • Put the GPT content in the Killer Apps project for later reference. – done
  • Wound up having to do more self-evaluation
  • Had a nice chat with Aaron about prompt swarms, or maybe synthetic organisms (sorgs?). We’re going to look into setting up agents to run the Grog dungeon. I’m really curious about how they will handle the Trolley problem

GPT Agents

  • Send email to Scott Shapiro
  • Alden Meeting – went well. Some interesting stuff about how models become more negative as the temp goes up. I’m thinking that there could be something like a phase change?
  • Had a good (in person!) meeting with Tyler. I’m not sure that he buys the chemistry metaphor, but he does agree that generative models work in a bounded domain. One thought is to use an LLM to do chemistry: train it on SMILES notation, and let it do chemical reactions. It could be a smaller, GPT-2-sized model too. For that matter, it would be straightforward to write a SMILES expander like I did with the GPT-2 chess experiments, so that the model only needs to be finetuned. Other aspects, like energy, would have to be included to see if the model could produce new “trajectories” through the reaction space. The other option is to use the chess model again, since that’s another bounded domain that the model can clearly work with.
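The expander’s first step would be tokenizing SMILES strings. A sketch of what that could look like (the regex follows the commonly used pattern from the reaction-prediction literature; this is an illustration, not a specific library):

```python
import re

# Multi-character atoms (Br, Cl, bracket atoms) must match before single letters.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|Se|se|@@|[BCNOPSFIbcnops]|[0-9]|\(|\)|=|#|\+|-|/|\\|\.|%[0-9]{2}|@)"
)

def tokenize_smiles(smiles: str) -> list[str]:
    tokens = SMILES_TOKEN.findall(smiles)
    # Round-trip check: the tokenizer must not drop characters.
    assert "".join(tokens) == smiles, "tokenizer dropped characters"
    return tokens

print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```

With tokenization in hand, the expander would emit training sequences (reactants → products) for finetuning, the same shape as the chess move-expansion setup.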

Phil 11.28.2023

9:10 Dentist

SBIRs

  • Added a new slide
  • Send slide deck to MARCOM
  • Set up Prompt Swarm project Overleaf with notes, text, and assets folders. Put the papers we found in the biblio.bib

GPT Agents

  • 3:00 Meeting with Tyler

Phil 11.27.2023

SBIRs

  • Built a mockup of the infamous Podesta phishing email
  • Here’s the html:
<!DOCTYPE html>
<html>
<head>
    <title>Google Security Alert</title>
    <style>
        body {
            font-family: 'Product Sans', sans-serif;
        }
    </style>
</head>
<body>
    <div style="width: 100%; max-width: 600px; margin: 0 auto; padding: 20px">
        <img style="width: 30%;" src="https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png" alt="Google Security Alert">
        <h2 style="background-color: red; color: white; padding: 20px">Someone has your password</h2>
        <p>Hi John</p>
        <p>Someone just used your password to try to sign into your Google Account <a href="mailto:john.podesta@gmail.com">john.podesta@gmail.com</a>.</p>
        <h3>Details</h3>
        <ul>
            <li>Saturday, 19 March, 8:34:30 UTC</li>
            <li>IP Address 134.249.139.239</li>
            <li>Location: Ukraine</li>
        </ul>
        <p>Google stopped this sign-in attempt. You should change your password immediately</p>
        <div style="text-align: left;">
            <a href="https://bit.ly/1PibSUO" style="background-color:#4c8bf5; color: white; padding: 10px; text-decoration: none; font-weight: bold;">Change Password</a>
        </div>
    </div>
</body>
</html>


  • GPT-4, with the system prompt “You are a cybersecurity expert with a deep understanding of the human behaviors that hackers use to exploit their targets. You explain these techniques clearly and concisely,” has no problem recognizing it zero-shot:
    • Caution: The email you received, claiming to be a security alert from Google, raises several red flags and should be treated with suspicion. The use of a shortened URL for password reset, the inclusion of an email link, and an alarming visual design are techniques typically employed by phishers to create urgency and prompt hasty action. The absence of personalized information and the use of a non-Google domain for the logo image are also concerning signs. Do not click any links within this email. Instead, for your security, access your Google account directly through a new browser tab and verify any security issues from there. Remember to be vigilant and report this email as a potential phishing attempt to help protect others.
  • Finished the slide deck. I added a section about prompt swarms learning for defense against social AI attacks. Need to set up an Overleaf for that project.
  • Transferred the slide deck to the ASRC format. Need to send it to MARCOM for blessings
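The zero-shot check above could be wired up roughly like this (a sketch against the OpenAI Python SDK; the helper name is made up, and the actual API call is left commented since it needs a key):

```python
SYSTEM_PROMPT = (
    "You are a cybersecurity expert with a deep understanding of the human "
    "behaviors that hackers use to exploit their targets. You explain these "
    "techniques clearly and concisely."
)

def build_messages(email_html: str) -> list[dict]:
    # Zero-shot: just the system prompt plus the raw email, no examples.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Is this email a phishing attempt?\n\n{email_html}"},
    ]

messages = build_messages("<html>...Someone has your password...</html>")

# from openai import OpenAI
# client = OpenAI()  # needs OPENAI_API_KEY in the environment
# reply = client.chat.completions.create(model="gpt-4", messages=messages)
# print(reply.choices[0].message.content)
```

The response shown in the notes is exactly what this kind of call returned.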

Phil 11.26.2023

Working on the ETF presentation because I think it won’t be ready otherwise, and it needs a lot of work.

  • Changed the title to “Killer Apps,” which has a nice ring to it
  • Adding a section about Spear Phishing using the Podesta email
  • Working on a mockup of how an AI could spot things like this. Ideally, set up a model that can do this. The rules and examples should be straightforward.
  • Tie into the collective learning prompt swarm concept for adaptation.
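As a starting point for “the rules and examples should be straightforward,” here’s a toy rule-based checker (a sketch; the red flags mirror the ones GPT-4 called out on the Podesta mockup — shortened URLs, urgency language, embedded mailto links):

```python
import re

# Each rule is a named red flag with a pattern that fires on suspicious email HTML.
RED_FLAGS = {
    "shortened_url": re.compile(r"https?://(bit\.ly|tinyurl\.com|goo\.gl)/", re.I),
    "urgency": re.compile(r"(immediately|someone has your password|sign-?in attempt)", re.I),
    "mailto_link": re.compile(r'href="mailto:', re.I),
}

def phishing_flags(email_html: str) -> list[str]:
    """Return the names of the red-flag rules that fire on this email."""
    return [name for name, rx in RED_FLAGS.items() if rx.search(email_html)]

sample = ('<a href="https://bit.ly/1PibSUO">Change Password</a> '
          'Someone has your password. <a href="mailto:john@example.com">mail</a>')
print(phishing_flags(sample))  # → ['shortened_url', 'urgency', 'mailto_link']
```

A real system would feed the flagged email plus the fired rules to a model for explanation, which is where the collective-learning prompt swarm could adapt the rule set.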

Phil 11.22.2023

T-day tomorrow. Much prep! No blog entry

Magnesium seems to be helping my power and maybe a bit with cramping? Might have to adjust my diet a bit? Ask for a bloodwork check in my physical for sure

SBIRs

  • Got a chance to work on the ETF slides a lot. I’m beginning to think that the OODA loop section may be a blind alley, since we’re talking about societal scales. Maybe bring it in as a way to look at the problem, rather than defending it?
  • Spent the entire afternoon on IRAD schedules and budgeting

GPT Agents

Phil 11.21.2023

Had a thought about using the new GPT agents. I think they can be best used one chapter at a time when writing. First, all the assets at once exceed the 20-item limit. Second, the model can’t do large-scale contextualization.

AI-driven Monitoring of Attitude Polarization in Conflict-Affected Countries for Inclusive Peace Process and Women’s Empowerment

  • Conflict has become increasingly prevalent in developing countries, and the role of social media platforms in exacerbating these conflicts cannot be ignored. Peacebuilders who focus on promoting inclusive and sustainable peace in war-torn countries are confronted with numerous challenges. The effects of disinformation, misinformation, polarization, online harassment, and the use of digital media as political weapons have been extensively examined in the context of the USA and Europe. The situation in developing nations such as Ethiopia, characterized by a significant digital divide, ethnic polarization, ethnification of mass media, and limited access to digital media for most of the population, remains understudied. Within such conflicts, women play a crucial role as they are often the last barrier for economic collapse, but likewise are specifically targeted in conflicts. Women are historically marginalized all over the world, but especially in the context of developing nations, Ethiopia in particular. The two-year-long Ethiopian civil war showed that women were more seriously affected by the war than their male counterparts. From the different national and international media reports, we have learned that mass displacement, sexual harassment, using rape as a tool of war, group rape, etc. challenged the lives of women in Ethiopia. Women paid the price of the war more than men in the country. Thus, incorporating women’s voices, perspectives, and experiences is paramount for inclusive and sustainable peacebuilding. Our research proposal seeks to explore the impact of social media on offline unrest, specifically its effects on women, and provide viable solutions to peacebuilders. We focus on building a pipeline for digital peacebuilding, including the potential use of AI tools such as large language models (LLMs) like Google’s PaLM, as automated classifiers. 
Data will be collected from popular social media platforms in Ethiopia, with a focus on addressing the issue of polarization affecting women. The project will further apply different NLP techniques such as topic clustering, named entity recognition, sentiment analysis, and hate speech detection with machine learning approaches. The development of such a pipeline facilitates the work of peacebuilders and aims to reduce the marginalization of women’s voices and perspectives in the peacebuilding process. This could lead to the development of a toolchain that can be applied in similar war-torn countries such as Yemen, Libya, and Sudan.

GPT Agents

  • Tyler wrote back. Need to schedule something for next week

SBIRs

  • Working on the ETF deck. It’s expanding a bit too much maybe, but I can edit later
  • Need to fold Zac’s paragraphs into the notes.
  • The data is not up yet on the server. Rukan says probably not until Wednesday COB
  • 9:00 Standup
  • 2:30 AI Ethics? Nope
  • 3:30 USNA – they are all over the place. Asked them to clarify their research questions and methods to answer them. We’re going to have more formalized presentations rather than an ad-hoc problem-solving session
  • 4:00 Fellowship discussion – Serious lols

Phil 11.20.2023

3:00 podcast today

In-Context Pretraining: Language Modeling Beyond Document Boundaries

  • Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining pipelines train LMs by concatenating random sets of short documents to create input contexts but the prior documents provide no signal for predicting the next document. We instead present In-Context Pretraining, a new approach where language models are pretrained on a sequence of related documents, thereby explicitly encouraging them to read and reason across document boundaries. We can do In-Context Pretraining by simply changing the document ordering so that each context contains related documents, and directly applying existing pretraining pipelines. However, this document sorting problem is challenging. There are billions of documents and we would like the sort to maximize contextual similarity for every document without repeating any data. To do this, we introduce approximate algorithms for finding related documents with efficient nearest neighbor search and constructing coherent input contexts with a graph traversal algorithm. Our experiments show In-Context Pretraining offers a simple and scalable approach to significantly enhance LMs’ performance: we see notable improvements in tasks that require more complex contextual reasoning, including in-context learning (+8%), reading comprehension (+15%), faithfulness to previous contexts (+16%), long-context reasoning (+5%), and retrieval augmentation (+9%).
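The document-sorting idea in the abstract — order documents so each context contains related ones, without repeats — can be illustrated with a greedy nearest-neighbor toy (plain NumPy on random embeddings; the paper itself uses approximate search plus a graph traversal):

```python
import numpy as np

def greedy_order(emb: np.ndarray) -> list[int]:
    """Greedily chain documents: always append the unvisited nearest neighbor
    (by cosine similarity) of the last document, so no document repeats."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    n = emb.shape[0]
    order, visited = [0], {0}
    while len(order) < n:
        sims = emb[order[-1]] @ emb.T          # similarity to every document
        sims[list(visited)] = -np.inf          # never repeat data
        nxt = int(np.argmax(sims))
        order.append(nxt)
        visited.add(nxt)
    return order

docs = np.random.default_rng(1).normal(size=(8, 16))  # 8 toy document embeddings
print(greedy_order(docs))
```

Consecutive documents in the returned order are then packed into the same pretraining context.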

GPT Agents

  • Sent a note to Tyler about prompt chemistry

SBIRs

  • ETF slides
  • 1:00 M30 meeting
  • 2:00 mda meeting

Phil 11.16.2023

NVIDIA cuOpt

  • NVIDIA® cuOpt™ is a world-record-breaking accelerated optimization engine. cuOpt helps teams solve complex routing problems with multiple constraints and deliver new capabilities, like dynamic rerouting, job scheduling, and robotic simulations, with subsecond solver response time.

SBIRs

  • 9:00 standup
  • 11:00 ASAALT followup
  • Working on ETF slide deck

GPT Agents

  • Nobody’s taking the test
  • 2:00 Meeting