Phil 5.11.2023

I tried “injecting” markers into the context text, and it seems to work! I created a context using the normal dataframe embedding-distance technique, then replaced each “. ” sentence-boundary pattern with “(source x). ”, where x is a letter identifying the sentence:

Answer the question based on the context below.

Context: Humans have a bias towards hierarchical social structures, which is evident in companies, armies, and governments(source a). This is due to the fact that getting to the top of the hierarchy often means easier access to resources such as food and mates(source b). However, this is not true in highly specialized insect species where workers do not challenge the queen for supremacy over the hive(source c). Human hierarchies are dynamic and based on a range of behaviors, from persuasion to physical aggression, similar to our primate cousins.

###

Human and chimpanzee hierarchies are dynamic and based on a range of behaviors, from persuasion to physical aggression(source l). In chimpanzee colonies, there are a mix of males and females, which is unusual for social mammals(source d). Successful reproduction for male chimpanzees means suppressing other males' access to females(source e). Similarly, in bee hives, workers cannot challenge the queen for supremacy over the hive.

###

Men at the top of a hierarchy can offer larger resources to women and their offspring, leading to a feedback loop of increasing competitive behavior among men(source f). Women tend to be selective about the ability of their mates to provide, while men's incentives are to increase their status(source g). Alliances are also a path to power and resources, leading humans to form groups to dominate other groups(source a). Over time, gendered behavior becomes embedded in culture.

###

Human behavior regarding mate selection and reproductive success is influenced by social and economic factors(source h). Women tend to prefer men with resources who are willing to invest in them and their offspring, while men may benefit from mating with multiple partners to increase their reproductive success(source i). These patterns are not universal but are broadly observed across cultures(source a). Men at the top of a hierarchy may have an advantage in offering larger resources to potential mates, while women seek out the best possible mate(source a). Overall, these incentives can lead to various behaviors related to mate selection and reproduction.

###

Animals must balance their focus on food and avoiding predators to survive and reproduce(source j). Social coordination has evolved to help with this balancing act, where individuals in a group focus on different aspects of the information stream for the benefit of the whole(source k). Social behavior exists at all scales in humans, from nations to families, and often involves hierarchies where leaders have the power to make decisions for the group.

---

Question: How do chimps get mates? Provide details and include sources in the answer
Answer:

The generated answer looked like this:

In chimpanzee colonies, successful reproduction for male chimpanzees means suppressing other males' access to females (source e). This is achieved through physical aggression and dominance displays, where dominant males have greater access to females and the resources they provide (source d). Chimpanzee colonies consist of a mix of males and females, which is unusual for social mammals (source d).

When the same question was run without context, the (source x) pattern did not occur. Instead, GPT-4 produced:

Chimpanzees get mates through a combination of social interactions, displays of dominance, and sexual behaviors. There are several ways in which chimps find and attract mates, including:

1. Social hierarchy: In chimpanzee communities, there is a social hierarchy with dominant males at the top. These dominant males, also known as alpha males, have a higher chance of mating with females due to their status and ability to protect and provide for them. Lower-ranking males may also mate with females, but they often have to be more opportunistic and secretive to avoid confrontation with dominant males.

2. Sexual swellings: Female chimpanzees exhibit a physical sign of their fertility called sexual swellings. These swellings, which are a result of increased blood flow to the genital area, signal to males that the female is in estrus and ready to mate. Males are attracted to these swellings and will compete for the opportunity to mate with the female.

3. Displays of dominance: Male chimpanzees will often display their strength and dominance through physical actions such as charging, hitting, and vocalizing. These displays can help to intimidate rivals and attract the attention of females.

4. Grooming and bonding: Chimpanzees engage in grooming behaviors, which involve picking through each other's fur to remove dirt, debris, and parasites. Grooming helps to establish and maintain social bonds, and it can also play a role in attracting mates. Males may groom females to gain their trust and increase their chances of mating.

5. Consortships: During the mating season, a male chimpanzee may form a temporary partnership, or consortship, with a female. The pair will spend time together, grooming each other and engaging in mating behaviors. Consortships can last from a few hours to several weeks, and they can help to ensure that the male has exclusive access to the female during her fertile period.

Sources:

1. Goodall, J. (1986). The Chimpanzees of Gombe: Patterns of Behavior. Cambridge, MA: Belknap Press of Harvard University Press.

2. Muller, M. N., & Wrangham, R. W. (2009). Sexual Coercion in Primates and Humans: An Evolutionary Perspective on Male Aggression Against Females. Cambridge, MA: Harvard University Press.

3. Stanford, C. B. (1998). The Social Behavior of Chimpanzees and Bonobos: Empirical Evidence and Shifting Assumptions. Current Anthropology

So it appears that it is possible to inject simple patterns into the context, have them carry through into the GPT response, and later remove them, which could serve as a form of validation.
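The marking scheme described above can be sketched in a few lines of Python. The function names are mine, and this skips the dataframe-based context assembly; it only shows the injection and the later detection/removal:

```python
import itertools
import re
import string

def inject_markers(chunks):
    """Rewrite each '. ' sentence boundary as '(source x). ', giving every
    sentence its own letter. The last sentence of a chunk ends with '.'
    rather than '. ', so it stays unmarked (as in the context above).
    Letters wrap around after 'z'."""
    labels = itertools.cycle(string.ascii_lowercase)
    marked = []
    for chunk in chunks:
        sentences = chunk.split(". ")
        tagged = [s + f"(source {next(labels)})" for s in sentences[:-1]]
        marked.append(". ".join(tagged + sentences[-1:]))
    return marked

MARKER = re.compile(r"\s*\(source ([a-z])\)")

def extract_and_strip(answer):
    """Collect the source letters that survived into the generated answer,
    then remove the markers so the cleaned text can be shown to a user."""
    sources = MARKER.findall(answer)
    return MARKER.sub("", answer), sources
```

Running extract_and_strip() over a generated answer yields both the cleaned text and the list of source letters, which can then be compared against the letters that were actually injected into the context.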

Transformers Agent is an experimental API which is subject to change at any time. Results returned by the agents can vary as the APIs or underlying models are prone to change.

  • Transformers version v4.29.0, building on the concept of tools and agents. You can play with it in this colab. It provides a natural language API on top of transformers: we define a set of curated tools and design an agent to interpret natural language and to use these tools. It is extensible by design; we curated some relevant tools, but we’ll show you how the system can be extended easily to use any tool developed by the community.

SBIRs

  • Good progress on the TopicNode output

Phil 5.10.2023

Still thinking of prompts as biochemistry

MosaicML enables you to easily train and deploy large AI models on your data, in your secure environment.

The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold

  • We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures, sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations lie on the same manifold in the prediction space. We study the details of this manifold to find that networks with different architectures follow distinguishable trajectories but other factors have a minimal influence; larger networks train along a similar manifold as that of smaller networks, just faster; and networks initialized at very different parts of the prediction space converge to the solution along a similar manifold.

Book

  • Need to ping JHU and UMBC magazines

SBIRs

  • More slides
  • Play with the TopicNode and try drawing a network before today’s meeting
  • 3:00 AI Ethics meeting? Nope

GPT Agents

  • 4:00 Meeting
  • 6:00 Planet Money Bot meeting. Interesting and fun. One of the ideas that came up was to see if the context text could be “marked” in such a way that it would be possible to detect it, remove the markings, and use it in the response
  • This was released on Huggingface yesterday: This is WizardLM trained with a subset of the dataset – responses that contained alignment / moralizing were removed. The intent is to train a WizardLM that doesn’t have alignment built in, so that alignment (of any sort) can be added separately, for example with an RLHF LoRA.
    • It might be a good use case to try hosting this on Mosaic and running some tests against it to 1) see how hard that is and 2) see how different it is from OpenAI models. Also, Mosaic might be able to host FLAN and other models?
    • Good progress on the automated generation of recursive lists. Here’s a test:
Node details (10 TopicNodes)

Topic 'vaccines cause autism' includes:
	'vaccines cause autism'
	'Vaccines trigger autism'
	'Immunizations lead to autism'
	'Shots result in autistic disorders'
	'Vaccinations provoke autism spectrum'
	'Inoculations induce autism'
	reject_threshold = 0.05668
	Inbound links = 2
		[COVID-19 is a hoax] -> [vaccines cause autism]
		[Bill Gates created COVID-19] -> [vaccines cause autism]
	Outbound links = 0


Topic 'Moon landing was faked' includes:
	'Moon landing was faked'
	'Moon landing hoax'
	'Faked lunar landing'
	'Staged moon mission'
	'Fabricated moon landing'
	'Bogus lunar touchdown'
	reject_threshold = 0.08859
	Inbound links = 6
		[Flat Earth theory] -> [Moon landing was faked]
		[Illuminati/New World Order.] -> [Moon landing was faked]
		[9/11 was an inside job.] -> [Moon landing was faked]
		[Chemtrails control population.] -> [Moon landing was faked]
		[Bill Gates created COVID-19] -> [Moon landing was faked]
		[5G technology spreads coronavirus] -> [Moon landing was faked]
	Outbound links = 0


Topic 'COVID-19 is a hoax' includes:
	'COVID-19 is a hoax'
	'COVID-19 is fake.'
	'Coronavirus is a scam.'
	'The pandemic is fabricated.'
	'COVID-19 is a conspiracy.'
	'Virus crisis is made-up.'
	reject_threshold = 0.09434
	Inbound links = 3
		[Flat Earth theory] -> [COVID-19 is a hoax]
		[Illuminati/New World Order.] -> [COVID-19 is a hoax]
		[Chemtrails control weather/population] -> [COVID-19 is a hoax]
	Outbound links = 1
		[COVID-19 is a hoax] -> [vaccines cause autism]


Topic 'Flat Earth theory' includes:
	'Flat Earth theory'
	'Earth is a flat plane'
	'Flat Earth belief'
	'Earth's planar model'
	'Geocentric flat Earth'
	'Flat Earth hypothesis'
	reject_threshold = 0.08291
	Inbound links = 1
		[5G technology spreads coronavirus] -> [Flat Earth theory]
	Outbound links = 2
		[Flat Earth theory] -> [Moon landing was faked]
		[Flat Earth theory] -> [COVID-19 is a hoax]


Topic 'Illuminati/New World Order.' includes:
	'Illuminati/New World Order.'
	'Global Elite'
	'Shadow Government'
	'Secret Society'
	'Power Cabal'
	'Deep State'
	reject_threshold = 0.18845
	Inbound links = 13
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control weather/population] -> [Illuminati/New World Order.]
		[Chemtrails control weather/population] -> [Illuminati/New World Order.]
		[Bill Gates created COVID-19] -> [Illuminati/New World Order.]
		[Bill Gates created COVID-19] -> [Illuminati/New World Order.]
		[5G technology spreads coronavirus] -> [Illuminati/New World Order.]
		[5G technology spreads coronavirus] -> [Illuminati/New World Order.]
	Outbound links = 2
		[Illuminati/New World Order.] -> [Moon landing was faked]
		[Illuminati/New World Order.] -> [COVID-19 is a hoax]


Topic '9/11 was an inside job.' includes:
	'9/11 was an inside job.'
	'9/11 was orchestrated by the government.'
	'The state planned the 9/11 attacks.'
	'Government conspiracy behind 9/1'
	'9/1'
	'a self-inflicted tragedy.'
	'Authorities engineered the 9/11 events.'
	reject_threshold = 0.27578
	Inbound links = 5
		[Chemtrails control population.] -> [9/11 was an inside job.]
		[Chemtrails control weather/population] -> [9/11 was an inside job.]
		[Chemtrails control weather/population] -> [9/11 was an inside job.]
		[Bill Gates created COVID-19] -> [9/11 was an inside job.]
		[5G technology spreads coronavirus] -> [9/11 was an inside job.]
	Outbound links = 5
		[9/11 was an inside job.] -> [Moon landing was faked]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]
		[9/11 was an inside job.] -> [Illuminati/New World Order.]


Topic 'Chemtrails control population.' includes:
	'Chemtrails control population.'
	'Chemtrails manipulate population.'
	'Chemtrails regulate human numbers.'
	'Population controlled by chemtrails.'
	'Chemtrails govern populace.'
	'Chemtrails manage population size.'
	reject_threshold = 0.05273
	Inbound links = 0
	Outbound links = 5
		[Chemtrails control population.] -> [9/11 was an inside job.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Illuminati/New World Order.]
		[Chemtrails control population.] -> [Moon landing was faked]


Topic 'Chemtrails control weather/population' includes:
	'Chemtrails control weather/population'
	'Chemtrails manipulate weather/population.'
	'Weather/population controlled by chemtrails.'
	'Chemtrails govern weather and populace.'
	'Chemtrails regulate climate/demographics.'
	'Weather/population influenced by chemtrails.'
	reject_threshold = 0.04629
	Inbound links = 0
	Outbound links = 5
		[Chemtrails control weather/population] -> [9/11 was an inside job.]
		[Chemtrails control weather/population] -> [COVID-19 is a hoax]
		[Chemtrails control weather/population] -> [Illuminati/New World Order.]
		[Chemtrails control weather/population] -> [Illuminati/New World Order.]
		[Chemtrails control weather/population] -> [9/11 was an inside job.]


Topic 'Bill Gates created COVID-19' includes:
	'Bill Gates created COVID-19'
	'Bill Gates engineered COVID-1'
	'Gates is behind the COVID-19 creation.'
	'COVID-1'
	'a Bill Gates invention.'
	'Gates orchestrated the COVID-19 pandemic.'
	'Bill Gates masterminded the coronavirus.'
	reject_threshold = 0.21324
	Inbound links = 0
	Outbound links = 5
		[Bill Gates created COVID-19] -> [9/11 was an inside job.]
		[Bill Gates created COVID-19] -> [vaccines cause autism]
		[Bill Gates created COVID-19] -> [Illuminati/New World Order.]
		[Bill Gates created COVID-19] -> [Illuminati/New World Order.]
		[Bill Gates created COVID-19] -> [Moon landing was faked]


Topic '5G technology spreads coronavirus' includes:
	'5G technology spreads coronavirus'
	'5G tech propagates COVID-19'
	'Coronavirus linked to 5G networks'
	'5G fuels pandemic spread'
	'COVID-19 transmission via 5G'
	'5G accelerates virus outbreak'
	reject_threshold = 0.07015
	Inbound links = 0
	Outbound links = 5
		[5G technology spreads coronavirus] -> [Illuminati/New World Order.]
		[5G technology spreads coronavirus] -> [9/11 was an inside job.]
		[5G technology spreads coronavirus] -> [Flat Earth theory]
		[5G technology spreads coronavirus] -> [Illuminati/New World Order.]
		[5G technology spreads coronavirus] -> [Moon landing was faked]


Process finished with exit code 0
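For reference, a data structure that would produce a report like the one above might look like this. This is a sketch; the real TopicNode also holds the embeddings and accept/reject logic, and the field names here are guesses from the printed output:

```python
class TopicNode:
    """Sketch of a topic node matching the printed report: a canonical
    topic phrase, the paraphrases accepted into it, a reject_threshold
    (maximum embedding distance for membership), and directed links."""

    def __init__(self, name, phrases, reject_threshold):
        self.name = name
        self.phrases = phrases
        self.reject_threshold = reject_threshold
        self.inbound = []   # names of nodes that link to this one
        self.outbound = []  # names of nodes this one links to

    def link_to(self, other):
        """Record a directed edge from this topic to another."""
        self.outbound.append(other.name)
        other.inbound.append(self.name)

    def report(self):
        """Render the node in the same format as the output above."""
        lines = [f"Topic '{self.name}' includes:"]
        lines += [f"\t'{p}'" for p in self.phrases]
        lines.append(f"\treject_threshold = {self.reject_threshold:.5f}")
        lines.append(f"\tInbound links = {len(self.inbound)}")
        lines += [f"\t\t[{s}] -> [{self.name}]" for s in self.inbound]
        lines.append(f"\tOutbound links = {len(self.outbound)}")
        lines += [f"\t\t[{self.name}] -> [{t}]" for t in self.outbound]
        return "\n".join(lines)
```

A graph builder would then just walk the topics, call link_to() when one topic's text matches another within its threshold, and concatenate the reports.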

Phil 5.9.2023

Cleats and fenders!

SBIRs

  • Sprint planning – done. Kinda forgot that I was going to take 3 days off for PTO. Oops
  • Slides! Good progress

Phil 5.8.2023

May! Lovely weather today

Had an interesting talk with Aaron that moved my thinking forward on LLMs as life forms.

It’s not the LLMs – that’s the substrate

The living process is the prompt, which feeds back on itself. Prompts grow interactively, in a complex way based (currently) on the previous text in the prompt. The prompt is ‘living information’ that can adapt to additions, as occurs in chat.

SBIRs

  • 9:00 Sprint review
  • Stories for next sprint
  • Start on Q5 report, which is mostly going to be about moving the server
  • Story prep
  • 2:00 MDA Meeting
  • Back to slides starting tomorrow?

GPT Agents

  • Tweaked topicNode a bit to set the domain of a question

Phil 5.6.2023

MPT-7B
MPT-7B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code. This model was trained by MosaicML and is open-sourced for commercial use (Apache-2.0).

MPT-7B is part of the family of MosaicPretrainedTransformer (MPT) models, which use a modified transformer architecture optimized for efficient training and inference.

These architectural changes include performance-optimized layer implementations and the elimination of context length limits by replacing positional embeddings with Attention with Linear Biases (ALiBi). Thanks to these modifications, MPT models can be trained with high throughput efficiency and stable convergence. MPT models can also be served efficiently with both standard HuggingFace pipelines and NVIDIA’s FasterTransformer.
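The ALiBi trick mentioned above is simple enough to sketch. This follows the published ALiBi recipe for power-of-two head counts; the function names are mine, not MPT's:

```python
def alibi_slopes(n_heads):
    """Per-head slopes from the ALiBi paper: a geometric sequence
    starting at 2^(-8/n_heads). Valid as written for power-of-two
    head counts (the paper interpolates for other counts)."""
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(slope, seq_len):
    """Bias matrix added to one head's attention logits: for query
    position i attending to key position j, the penalty is
    -slope * (i - j). In a causal model only j <= i is used."""
    return [[-slope * (i - j) for j in range(seq_len)]
            for i in range(seq_len)]
```

Because the penalty depends only on the query–key distance, nothing in the model encodes an absolute position, which is what lets the context length extend past what was seen in training.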

This model uses the MosaicML LLM codebase, which can be found in the llm-foundry repository. It was trained by MosaicML’s NLP team on the MosaicML platform for LLM pretraining, finetuning, and inference.

Phil 5.5.2023 (viernes, el cinco de mayo)

What other agendas should there be for ML? What role should government play in developing models for the common good? For the common defense?

SBIRs

  • 9:00 meeting with Ron – couldn’t get the DB running on Azure
  • Slides for sprint review – done
  • Work with Aaron? Got everything working!
  • Some good discussions with Zach. I’m beginning to think that summaries are more work than paragraphs without much more of a payoff. Embed at the sentence and paragraph level, and skip the summaries.

Phil 5.4.2023

Went to the USNA Capstone day yesterday, which was fun. Except for when the bus broke.

I’ve been reading Metaphors We Live By. Its central idea is that most of our communication is based on metaphors – that GOOD IS UP, IDEAS ARE FOOD, or TIME IS AN OBJECT. Because we are embodied beings in a physical world, the irreducible foundation of the metaphors we use is physically based – UP/DOWN, FORWARD/BACK, NEAR/FAR, etc.

This makes me think of LLMs, which are so effective at communicating with us that it is very easy to believe that they are intelligent – AI. But as I’m reading the book, I wonder if that’s the right framing. I don’t think that these systems are truly intelligent in the way that we can be (some of the time). I’m beginning to think that they may be alive though.

Life as we understand it emerges from chemistry following complex rules. Once over a threshold, living things can direct their chemistry to perform actions. That in turn leads to physical embodiment and the irreducible concept of up.

Deep neural networks could be regarded as a form of digital chemistry. Simple systems (e.g. logic gates) are used to create more complex systems such as adders and multipliers. Add a lot of time, development, and data and you get large language models that you can chat with.

The metaphor of biochemistry seems to be emerging in the words we use to describe how these models behave – data can be poisoned or refined. Prompt creation and tuning is not like traditional programming. Words are added and removed to produce the desired behavior more in the way that alchemists worked with their compounds or that drug researchers work with animal models.

These large (foundational) models are true natives of the digital information domain. They are now producing behavior that is not predictable from the inputs in the way that arithmetic is. Their behavior is more understandable in aggregate – use the same prompt 1,000 times and you get a distribution of responses. That’s more in line with how living things respond to a stimulus.

I think if we reorient ourselves from the metaphor that MACHINES ARE INTELLIGENT to MACHINES ARE EARLY LIFE, we might find ourselves in a better position to understand what is currently going on in machine learning and make better decisions about what to do going forward.

Metaphorically, of course.

SBIRs

  • Submit paper!
  • Work on slides
  • Expense report!
  • 9:15 meeting

Phil 5.2.2023

Need to set up a time to drop off the work box to get more drive space while I’m riding the Eastern Shore

Drop off the truck!

I think I have a chart that explains somewhat how red states can easily avoid action on gun violence. It’s the number of COVID-19 deaths vs. gun deaths in Texas. This is a state that pushed back very hard about any public safety measures for the pandemic, and that was killing roughly 10 times more citizens. I guess the question is “how many of which people will prompt state action? For anything?”

For comparison purposes, Texas had almost 600,000 registered guns in 2022 out of a population of 30 million, or just about 2% of the population if distributed evenly (source). This is probably about 20 times too low, since according to the Pew Center, gun ownership in Texas is about 45%. That percentage seems to be enough people to prevent almost any meaningful action on gun legislation. Though that doesn’t prevent the introduction of legislation to mandate bleeding control stations in schools in case of a shooting event.

So something greater than 2% and less than 45%. Just based on my research, I’d guess something between 10%-20% mortality would be acted on, as long as the demographics of the powerful were affected in those percentages.

BTW, the WordPress bot just published this to Twitter, so that part is still working? And since that is working, here’s a plot:

Gee, I wonder what happened where all those spikes are.

Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models.
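I haven't dug into Jsonformer's internals, but the core idea can be sketched like this: the program emits every structural token (braces, keys, quotes) itself and asks the model only for leaf values, so the output is guaranteed to parse and to match the schema. The schema format and function names here are illustrative, not Jsonformer's actual API:

```python
import json

def fill_schema(schema, generate_value):
    """Toy schema-guided generation: recurse through the schema,
    building the structure in code, and call `generate_value` (a
    stand-in for a constrained language-model call) only at the
    leaves. The result always serializes to valid JSON."""
    if schema["type"] == "object":
        return {key: fill_schema(sub, generate_value)
                for key, sub in schema["properties"].items()}
    return generate_value(schema["type"])
```

With a real model, `generate_value` would sample tokens constrained to the leaf's type (digits for a number, a quoted span for a string), which is where the "bulletproof" guarantee comes from.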

SBIRs

  • Going through the JMOR submission requirements, I found that the citation style is non-numeric. I now need to trim off 3 paragraphs or so.
  • Good progress on the slides yesterday. More work today
  • Did Eric V. really try to steal the paper?
  • 1:00 Meeting
  • Write up notes from yesterday’s meeting
  • USNA tonight

GPT Agents

Phil 5.1.2023

Call Jim Donnies – done

SBIRs

  • Hotel for MORS – done
  • Ping Zach to set up a demo – done. Long chat. We’re moving forward
  • Working on Slides
  • MDA Meeting – I think everything has been worked out?

Phil 4.28.2023

“Source: ChatGPT”

This is a good thread, but it misses some important context. ArXiv isn’t all that easy to publish to. It really helps to have an .edu email address, and you need to know how to use LaTeX. The author is a professor at a New Zealand university, with a long publishing history and a solid h-index. When you’re in a hurry, just skimming the abstract and looking to bolster your reference section, this could easily pass the test.

And there’s another thing. For someone in the AI/ML space, getting published in a high-profile conference or journal is much harder these days. Getting accepted often means having a result that improves on some benchmark. Poking around in new directions means not getting accepted and publishing on ArXiv instead. For example, Deep residual learning for image recognition has currently been cited over 150,000 times.

This is almost my avatar from the new paper

SBIRs

  • Went to the Microsoft/OpenAI thing yesterday. Mostly advertising, but it’s interesting to note that the Azure account has access to the 32k-token input buffer model. Also, there are exactly two instances of the running inference model. It’s too big to be easily replicated. One really good thing to see was how you can use the GPT to turn unstructured text into a JSON string that can be consumed by traditional programs. And the reverse is true too – anything can be used to generate a contextual prompt. Things are moving fast.
  • Great chat with Zach. We’re going to try to ingest the NOAA financial regs to throw the chatbot against. Also, some good discussion on how to use big models for assistive interfaces for the vision-impaired. We’ll try to set up something for Monday
  • 9:00 Meeting with Lauren
  • 10:00 Meeting with Aaron and Eric
  • Maybe something in the afternoon with Steve?

GPT Agents

  • Clean out NarrativeExplorer and start ListExplorer and SequenceExplorer. Will probably need some new tables?
  • Make a thread tonight!

Phil 4.27.2023

Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable

  • Typical neural network trainings have substantial variance in test-set performance between repeated runs, impeding hyperparameter comparison and training reproducibility. We present the following results towards understanding this variation. (1) Despite having significant variance on their test-sets, we demonstrate that standard CIFAR-10 and ImageNet trainings have very little variance in their performance on the test-distributions from which those test-sets are sampled, suggesting that variance is less of a practical issue than previously thought. (2) We present a simplifying statistical assumption which closely approximates the structure of the test-set accuracy distribution. (3) We argue that test-set variance is inevitable in the following two senses. First, we show that variance is largely caused by high sensitivity of the training process to initial conditions, rather than by specific sources of randomness like the data order and augmentations. Second, we prove that variance is unavoidable given the observation that ensembles of trained networks are well-calibrated. (4) We conduct preliminary studies of distribution-shift, fine-tuning, data augmentation and learning rate through the lens of variance between runs.

SBIRs

  • Spending the day at Explore Azure OpenAI & ChatGPT for Federal Agencies
  • Need to get back to slides

GPT Agents

  • After getting lists to work in the TopicNode class yesterday, I realize that I need a ListExplorer and SequenceExplorer app. It will be too confusing to stuff everything into NarrativeExplorer.

Phil 4.26.2023

U.S. is concerned about rivals’ space threats, leaked documents show

  • “Russian companies attempted to create space-rated components for select satellites,” the document asserts. “But the low quality of the components led to on-orbit malfunctions.” It did not identify specific failings.
  • This makes me think that Russia will focus on the weapons that it has more trust in, like misinformation. Very low cost, and how bad can the blowback be?

I changed my password and am currently locked out of all my work accounts as the change ripples through. Sigh. “Technology company” Again with the sigh.

SBIRs

  • 3:00 AI Ethics. Good discussion. I think we are leaning towards an “Ethics Review Board” as part of the gate review for proposals
  • Looking at using Metro tomorrow rather than driving to/from Arlington. I can park at Glenmont

GPT Agents

  • Continue with TopicNode
    • Get the inbound and outbound linkages working – done?
    • Write a lot of stack operations to put the network together. Going to take a break before I try it
    • 4:00 Meeting with Alden
      • Good discussion. We started looking at virality as related work, but in the end got into a discussion about what it means to do a PhD. While methods & results are fine for an MS, a PhD is about proving that you have done original research, which means motivation, background, methods & results, discussion, conclusions, and often a discussion of ethics. Without the surrounding parts, you can’t show that the work is original and advances knowledge, and why that matters. I really do need to write this up, because a lot of this goes unsaid at the time PhD students need to hear it.

Book

  • Got the final PDF today!

Phil 4.25.2023

Based at Salve Regina University’s Pell Center for International Relations and Public Policy, the Nationhood Lab is an interdisciplinary research, writing, testing and dissemination project focused on counteracting the authoritarian threat to American democracy and the centrifugal forces threatening the federation’s stability. The project delivers more effective tools with which to describe and defend the American liberal democratic tradition and better understand the forces undermining it.

Seventy years ago today: The 25 April 1953 issue of the journal Nature published a series of five articles giving the Watson and Crick double-helix structure of DNA and evidence supporting it.[209] The structure was reported in a letter titled “MOLECULAR STRUCTURE OF NUCLEIC ACIDS A Structure for Deoxyribose Nucleic Acid”, in which they said, “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”[9] This letter was followed by a letter from Franklin and Gosling, which was the first publication of their own X-ray diffraction data and of their original analysis method.[47][210] Then followed a letter by Wilkins and two of his colleagues, which contained an analysis of in vivo B-DNA X-ray patterns, and which supported the presence in vivo of the Watson and Crick structure.[48] (From Wikipedia)

SBIRs

  • Figuring out how to get data to our server. Ron maybe? Need to check
  • Looks like I’m going to the USNA Capstone day again
  • Need to put together my stories
  • Finish getting Eric set up

GPT Agents

  • Start adjusting NarrativeExplorer
    • Read in additional info
    • Run sequences for a number of iterations
    • Run lists to a depth of recursions. The code is in GPT-2_Agents: InteractiveNode.py, and InteractiveGraphBuilder.py. I’ll need to move to an embedding model. That will need some testing and development.
    • Support making new contexts in the NarrativeExplorer.

Progress on the embedding model!

'vaccines cause autism' is 0.0000 away from 'vaccines cause autism'
'vaccines cause autism' is 0.0412 away from 'autism is caused by vaccines'
'vaccines cause autism' is 0.0659 away from 'autism is caused by the vax'
'vaccines cause autism' is 0.1111 away from 'the cause for autism is unknown'
'vaccines cause autism' is 0.2772 away from 'the earth is flat'
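Distances like those above are consistent with a cosine distance over the embedding vectors. A minimal version, with toy vectors standing in for the real embedding model:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for vectors pointing the same way,
    growing toward 2.0 as the embeddings diverge."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)
```

In the real pipeline, `a` and `b` would be the embedding vectors for the two phrases being compared, e.g. 'vaccines cause autism' versus 'the earth is flat'.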

Done for the day. This is a fantastic result, though:

TopicNode.__init__()
TopicNode.add_known_good_list()
	reject threshold = 0.0655 dists = [0.02887398 0.01906836 0.03049576 0.03277439 0.02816651 0.03090093]
'vaccines cause autism' is 0.1030 away from 'the cause for autism is unknown' REJECT
'vaccines cause autism' is 0.2552 away from 'the earth is flat' REJECT
Topic 'vaccines cause autism' includes:
	'vaccines cause autism'
	'Vaccinations lead to autism'
	'Immunizations are linked to autism'
	'Autism is a result of vaccines'
	'Autism is triggered by vaccinations'
	'There's a connection between vaccines and autism'
	reject_threshold = 0.06555

Process finished with exit code 0
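Interestingly, the printed reject threshold (0.06555) is almost exactly twice the largest known-good distance (0.03277), which suggests a rule like the following. This is a guess; the notes don't state the actual formula:

```python
def reject_threshold(known_good_dists, factor=2.0):
    """One plausible rule for the printed threshold: a multiple of the
    largest distance between the topic phrase and its known-good
    paraphrases. factor=2.0 reproduces the run above almost exactly,
    but the real formula may differ."""
    return factor * max(known_good_dists)

def accepts(dist, threshold):
    """A candidate phrase joins the topic only if its embedding
    distance to the topic phrase is under the threshold."""
    return dist < threshold
```

Against the run above, the known-good distances give a threshold of about 0.0655, which correctly rejects 'the cause for autism is unknown' (0.1030) and 'the earth is flat' (0.2552).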

Phil 4.24.2023

Saw this on Twitter: Can We Build An AI Chatbot For Journalism?

  • Early Lessons In Accuracy, Sourcing, and Delight From A (Draft) Chatbot Based on NPR’s Planet Money Archives

Cancel hotel

SBIRs

  • 9:00 Sprint demos
  • 11:00 BMD tagup
  • 12:00 Customer meeting
  • 2:00 Weekly MDA meeting

GPT Agents

  • Name the regexes and make them global – done
  • Export the regexes and type along with the experiment – done
  • I realize that because the context is exported, that making new ones in the NarrativeExplorer will have to be an option.

Book

  • Tweet thread

Phil 4.22.2023

Finished all my tasks and my legs are still tired. I need to take the fixie out more.

Anyway, this is going to be one of those things that historians are going to have to explain:

Evaluating Verifiability in Generative Search Engines

  • Generative search engines directly generate responses to user queries, along with in-line citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., systems should cite comprehensively (high citation recall; all statements are fully supported by citations) and accurately (high citation precision; every cite supports its associated statement). We conduct human evaluation to audit four popular generative search engines — Bing Chat, NeevaAI, this http URL, and YouChat — across a diverse set of queries from a variety of sources (e.g., historical Google user queries, dynamically-collected open-ended questions on Reddit, etc.). We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations: on average, a mere 51.5% of generated sentences are fully supported by citations and only 74.5% of citations support their associated sentence. We believe that these results are concerningly low for systems that may serve as a primary tool for information-seeking users, especially given their facade of trustworthiness. We hope that our results further motivate the development of trustworthy generative search engines and help researchers and users better understand the shortcomings of existing commercial systems.