Category Archives: Phil

Phil 4.18.2023

GPT Agents

  • Added threads to ContextExplorer, which worked really well
    • 🧬🦍 Alpha Male Chimpanzee Behavior: A Thread 🦍🧬
    • 1/5: Alpha males can use a range of behaviors to achieve their position, from dominance through violence to alliance building through activities such as grooming. The first pattern involves a single male using aggression to achieve alpha status, while the second pattern involves males building coalitions that are more powerful than any individual. #AlphaMales #Chimpanzees
    • 2/5: In Jane Goodall’s book “The Chimpanzees of Gombe,” she describes the aggressive behavior of male chimpanzees, with an average of one fight every 62 hours, compared to females’ average of one fight every 106 hours. Alpha males like Humphrey have an even higher attack rate, with one fight every 9 hours. #JaneGoodall #ChimpanzeeBehavior
    • 3/5: Becoming an alpha male allows access to the best food and females in estrus, which helps them pass on their genes. However, this does not emphasize alliance building, leaving the male responsible for his own defense and more likely to be displaced by a stronger, younger male. #AlphaMaleAdvantages #ChimpanzeeHierarchy
    • 4/5: Alpha males who gain their status through alliance building take longer to reach that status but typically stay in the position longer because they are defended by their allies. Alliances among chimpanzees are complex networks, and stable alliances can help a male reach alpha status. #AllianceBuilding #ChimpanzeeAlliances
    • 5/5: In Goodall’s study of the Gombe group, the male Figan was able to become alpha due to his stable alliance with his older brother Faben and less stable alliances with other dominant males. Once the hierarchy is established, there is less violent behavior in the group, but violence can emerge when two individuals close in rank engage in a test of dominance. #GombeChimpanzees #FiganAndFaben
    • 🔬 Understanding the complex social dynamics of chimpanzees can provide valuable insights into the evolution of human social behavior and the importance of cooperation and alliances in achieving and maintaining power. #Evolution #SocialBehavior
  • Need to finish wiring up the parsing for lists and sequences then add that to NarrativeExplorer

Book

  • Reply to ToC changes and mention that xxx needs credit for his picture

SBIRs

  • 9:15 standup – done
  • 1:00 BMD meeting -done
  • Spent some time with Rukan looking at pricing for Lambda Cloud. It looks much nicer than Azure – twice the GPUs and each GPU has twice the VRAM
  • Working on slide deck. Add something on foundational attacks – in process
  • Worked on ContextExplorer, getting Lists and Sequences to work. Still need to clean up context for Lists
  • Work on story

Phil 4.17.2023

So I had a lot of fun with the ContextExplorer tool over the weekend. I added a prompt that generates tweets based on the stored content. In this case, it chooses a section of text at random and then generates something in Twitter format:

Beliefs that change at the right pace allow for group cohesion, like birds in a flock or fish in a school. Our brains synchronize when we share stories, but too slow or too fast can lead to boredom or overwhelm. Communication media influences human behavior and group structures. #science #groupdynamics

Stampede Theory, 2023

That’s a tweet generated from a random chunk of chapter one of my book.

That was neat, so I tried a few other formats, like “factoid” and my current favorite, “press release:”

FOR IMMEDIATE RELEASE

How Technology Has Influenced the Way Humans Move and Behave in Groups

New York, NY – The way humans move and behave in groups has been influenced by technology in various ways. From the development of language to the use of maps and GPS, technology has changed the way we navigate the world and interact with each other.

According to recent research, beliefs are shared and can be located in a terrain, allowing individuals to move and form groups. However, this behavior can also lead to dangerous situations, such as stampedes, as seen in the 1913 Italian Hall disaster. Technology, such as maps and GPS, has changed the way we navigate the world and has also affected our behavior in social spaces.

“Technology has changed the way we behave in physical and virtual spaces, blurring the lines between being lost or offline,” said a spokesperson for the research team. “Our instincts developed for the physical world are applied to virtual group coordination. Changes in belief have an optimal velocity, not too slow to avoid boredom and not too fast to prevent exhaustion. Novelty is preferred, but being overwhelmed is not.”

Beliefs that change at the right pace allow people to move together like birds in a flock or fish in a school. When a group of people share similar beliefs and goals, they can move together through belief space and find it easier to do things as a group. Surprisingly, our brains synchronize when we share a story, firing at the same times and places. If beliefs change too slowly, people get bored, but if they change too quickly, people become overwhelmed. The right pace of change is important for group cohesion.

In conclusion, technology has had a significant impact on the way humans move and behave in groups. From the development of language to the use of maps and GPS, technology has changed the way we navigate the world and interact with each other. As we continue to evolve, it is important to understand the role that technology plays in shaping our behavior and beliefs.

Contact:
John Smith
Email: johnsmith@email.com
Phone: 555-555-5555

Since I’ve read in other texts, you can generate press releases for them too! Here’s one for Moby-Dick:

FOR IMMEDIATE RELEASE

New York, NY – Herman Melville’s classic novel, Moby-Dick, continues to captivate readers with its thrilling tale of a vengeful whale and the men who pursue him. Even today, whalemen still hold a great respect for the Sperm Whale, which is seen as a fearsome creature that has been known to attack humans and is feared by other creatures in the sea.

Despite the dangers associated with hunting the Sperm Whale, some fishermen were still willing to pursue Moby Dick, the infamous white whale. Superstitious beliefs in the dangers of hunting the Sperm Whale caused some to be reluctant to embark on this new and daring warfare, believing that to attempt it would be to be torn into a quick eternity.

Whalers have found evidence of whales being captured in both the Pacific and Greenland seas, leading some to believe that the Nor’ West Passage was never a problem for whales, making their experiences with Moby Dick seem almost as mythical as old stories of ships floating up to the surface of a lake and waters coming from the Holy Land by an underground passage.

The White Whale was distinguished by its uncommon size, snow-white wrinkled forehead, and high, pyramidical white hump, which allowed it to be identified from a long distance in the uncharted seas.

Moby-Dick is a timeless classic that continues to inspire and intrigue readers with its thrilling tale of adventure and revenge. It is a must-read for anyone who loves a good story and wants to experience the thrill of the hunt for the infamous white whale.

For more information on Moby-Dick, visit your local bookstore or online retailer today.

Contact:
John Smith
Public Relations Manager
123-456-7890
johnsmith@email.com

Fun stuff. It will also generate text for the King James Bible and Sun Tzu’s “Art of War,” which I’ve also read in. Note that in the Moby-Dick press release, the model adds some meta-information about the book – it’s a “classic novel,” available at “your local bookstore or online retailer.”

GPT Agents

  • Add context and source info to the auto-generation – done
  • Add regex for list processing and code that puts the appropriate output in the “response” text area

SBIRs

  • Work on story
  • Work on slides – started
  • Help Eric get his code running
  • 2:00 MDA meeting – done

Phil 4.15.2023

This interactive visualization displays 21 million scientific papers collected in the PubMed database, maintained by the United States National Library of Medicine and encompassing all biomedical and life science fields of research.

You can scroll the narration in the left part of the screen, and interact with the visualization in the right part of the screen. Zooming in loads additional papers. Information about each individual paper appears on mouse-over, and clicking on a paper opens its PubMed page in a separate window. Search over titles is available in the upper-right corner.

Explanatory overview thread here

Why transformers are obviously good models of language

  • Nobody knows how language works, but many theories abound. Transformers are a class of neural network that process language automatically with more success than alternatives, both those based on neural computations and those that rely on other (e.g. more symbolic) mechanisms. Here, I highlight direct connections between the transformer architecture and certain theoretical perspectives on language. The empirical success of transformers relative to alternative models provides circumstantial evidence that the linguistic approaches that transformers embody should be, at least, evaluated with greater scrutiny by the linguistics community and, at best, considered to be the currently best available theories.

Phil 4.14.2023

Book

  • Finish tweets and provide 2 versions for the “mocking” and send off with links to copyright content – done!

SBIRs

  • Pinged Jarod with a link to the paper and changed the title to something bland, but accurate

GPT Agents

  • Set up list string to be in the form “Here’s a list of items/concepts/phrases that are similar to {}: || first concept seed || second concept seed || … || last concept seed”
  • The regex will split on the “||” and feed the first string with the following strings. In the case of the explorer app, all the lists will be shown in the output at once.
  • There will need to be a regex field for splitting lists and sequences that will be saved to the input file for NarrativeExplorer
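As a sketch, the split-and-regroup step described above might look like the following (the function name and sample strings are mine, not from the actual ContextExplorer code):

```python
import re

def parse_list_response(response: str) -> list[str]:
    """Split a '||'-delimited model response into its pieces.

    The first piece is the intro/seed phrase; the rest are the
    concept seeds, mirroring the format described above.
    """
    return [p.strip() for p in re.split(r"\|\|", response) if p.strip()]

resp = ("Here's a list of items that are similar to birds: "
        "|| sparrow || hawk || owl")
parts = parse_list_response(resp)
intro, items = parts[0], parts[1:]
```

The same regex string could be saved to the input file so NarrativeExplorer splits lists and sequences identically.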

Phil 4.13.2023

SBIRs

  • 9:15 standup
  • 11:30 Touchpoint
  • 3:00 SEG meeting – Loren is leaving!
  • Sent requests for access and training

GPT Agents

  • Work on adding lists and sequences. I should have some of the recursive code from GPT agents

Book

  • Finish tweet screenshots – done! Need to send them off

Phil 4.12.2023

Spent most of yesterday doing chores

SBIRs

  • Worked with Aaron to get the talking points together for the commercialization meeting and made some slides. The meeting went well. No friction at all, really. All I need to do is take the points on the slide and put them in the report, I think.
  • Wrote up a proposal for the Scale paper. Came up with some other sections that need to be in the paper

GPT Agents

  • 4:00 meeting
  • See how training a variational autoencoder on original embeddings to put them in a map domain idea goes over
  • Review IUI?

Book

  • Rework tweets

Phil 4.9.2023

Wrapping up in Sydney. The bike is packed and ready to go. It’s a lovely day, so I’m going to go downtown, do some sightseeing, grab some lunch, then come back and finish packing for tomorrow AM. Here’s all the riding (I don’t know why the elevation didn’t get captured for the first few days):

Foundation models are getting expensive. Anthropic’s $5B, 4-year plan to take on OpenAI

Phil 4.5.2023

The forecast is looking good! Going to try a 40mi/65km ride:

Continuing with the story. It’s interesting – the best ratio for summarizing text appears to be about 3:1 – 5:1, which is about the same ratio at which the GPT expands prompts into narratives.

I realize that I really want to be able to search Substack, which is becoming more of a thing. They have no API, and the Google CSE is too expensive. But Bing may be affordable. They have a pretty complex a la carte menu here, but it’s something to think about. It’s still $3-$7 per 1k searches, so no big pulls, but counts might work.

Bing does have site search, so maybe this can work? Here’s a search for the famous dog-whistle “George Soros.” This may be another way of getting at Twitter and Mastodon.social without breaking the bank.

Phil 4.4.2023

There is a lot of rain in the forecast. I think I’m going to do a regular day’s work and get a walk in around lunchtime. The neighborhood of Pennant Hills seems quite nice and walkable

GPT Agents

  • Exploring the system’s interaction with my book, which worked the first time! I’m tweaking the size of the context, and adding a “copy to clipboard” capability to keep good/bad prompts and responses.
  • Updated the requirements.txt

SBIRs

  • Working on the scale paper. Made some good progress on the story. It’s hard writing dystopian fiction

Phil 4.3.2023

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace

  • Solving complicated AI tasks with different domains and modalities is a key step toward artificial general intelligence (AGI). While there are abundant AI models available for different domains and modalities, they cannot handle complicated AI tasks. Considering large language models (LLMs) have exhibited exceptional ability in language understanding, generation, interaction, and reasoning, we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks and language could be a generic interface to empower this. Based on this philosophy, we present HuggingGPT, a system that leverages LLMs (e.g., ChatGPT) to connect various AI models in machine learning communities (e.g., HuggingFace) to solve AI tasks. Specifically, we use ChatGPT to conduct task planning when receiving a user request, select models according to their function descriptions available in HuggingFace, execute each subtask with the selected AI model, and summarize the response according to the execution results. By leveraging the strong language capability of ChatGPT and abundant AI models in HuggingFace, HuggingGPT is able to cover numerous sophisticated AI tasks in different modalities and domains and achieve impressive results in language, vision, speech, and other challenging tasks, which paves a new way towards AGI.

I’ve been working on creating an interactive version of my book using the GPT. This has entailed splitting the book into one text file per chapter, then trying out different versions of the GPT to produce summaries. This has been far more interesting than I expected, and it has some implications for Foundational models.

The versions of the GPT I’ve been using are Davinci-003, GPT-3.5-turbo, and GPT-4. And they each have distinct “personalities.” Since I’m having them summarize my book, I know the subject matter quite well, so I’m able to get a sense of how well these models summarize something like 400 words down to 100. Overall, I like the Davinci-003 model the best for capturing the feeling of my writing, and the GPT-4 for getting more details. The GPT-3.5 falls in the middle, so I’m using it.

They all get some details wrong, but in aggregate, they are largely better than any single summary. That is some nice support for the idea that multiple foundational models are more resilient than any single model. It also suggests a path to making resilient Foundational systems: keep some of the old models around to use as an ensemble when the risks are greater.

Multiple responses also help with hallucinations. One of the examples I like to use to show this is to use the prompt “23, 24, 25” to see what the model generates. Most often, the response continues the series for a while, but then it will usually start to generate code – e.g. “23, 24, 25, 26, 27, 28];” – where it places the square bracket and semicolon to say that this is an array in a line of software. It has started to hallucinate that it is writing code.

The thing is, the only elements that all the models will agree on across repeated responses to the same prompt are the ones most likely to be trustworthy. For a model, the “truth” is the common denominator, while hallucinations are unique.
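That agreement test can be sketched as a simple vote over set-valued responses. This is a toy illustration – the step that reduces each model’s free-text output to a set of claims is assumed, and all names here are hypothetical:

```python
from collections import Counter

def consensus(responses: list[set[str]], threshold: float = 1.0) -> set[str]:
    """Keep only the elements present in at least `threshold` fraction
    of the responses. Hallucinated elements tend to be unique to one
    response, so they fall below the threshold."""
    counts = Counter()
    for r in responses:
        counts.update(r)
    need = threshold * len(responses)
    return {elem for elem, c in counts.items() if c >= need}

# Three hypothetical model outputs for the prompt "23, 24, 25",
# reduced to sets of extracted tokens:
r1 = {"23", "24", "25", "26"}
r2 = {"23", "24", "25", "];"}          # one model drifts into code
r3 = {"23", "24", "25", "26", "27"}
agreed = consensus([r1, r2, r3])        # only the shared continuation survives
```

Lowering `threshold` (e.g. to 0.66 for a two-of-three majority) trades strictness for coverage.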

This approach makes systems more resilient for the cost of keeping the old systems online. It doesn’t address how a deliberate attack on a Foundational model could be handled. After all, an adversary would still have exploits for the earlier models and could apply them as well.

Still…

If all models lined up and started to do very similar things, that could be a sign that there was something fishy going on, and a cue for the human operators of these systems to start looking for the nefarious activity.

Phil 4.2.2023

It’s still raining in Sydney. Going to see a show at the Opera House

This looks very useful: Using the ChatGPT streaming API from Python

  • I wanted to stream the results from the ChatGPT API as they were generated, rather than waiting for the entire thing to complete before displaying anything. Here’s how to do that with the openai Python library:
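The core of that pattern, as I understand it: pass `stream=True` and assemble the content deltas as chunks arrive. Here’s a sketch with the assembly step factored out so it runs without a network call; the chunk shapes mirror the 2023-era openai library, and the function name is mine:

```python
# The real call would look roughly like:
#   for chunk in openai.ChatCompletion.create(
#           model="gpt-3.5-turbo", messages=msgs, stream=True):
#       ...
# Each streamed chunk carries a "delta" with a fragment of the reply.

def assemble_stream(chunks) -> str:
    """Concatenate the content deltas as they arrive, printing
    each fragment immediately for incremental display."""
    text = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        piece = delta.get("content")
        if piece:
            print(piece, end="", flush=True)
            text.append(piece)
    return "".join(text)

# Fake chunks in the shape the streaming API returns:
fake = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk: role only
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},                      # final chunk: empty delta
]
text = assemble_stream(fake)
```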

Adding an “auto-question” button that looks through the text and gets a question that fits a randomly selected range of text

Phil 3.31.2023

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed primarily at Meta’s Fundamental AI Research group.
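For reference, what Faiss’s simplest index (`IndexFlatL2`) computes is exhaustive L2 nearest-neighbor search. Here’s a numpy-only sketch of that same operation, useful as a correctness baseline before swapping in Faiss itself (the function name is mine):

```python
import numpy as np

def flat_l2_search(database: np.ndarray, queries: np.ndarray, k: int):
    """Return (distances, indices) of the k nearest database vectors
    per query by squared L2 distance, mirroring what a flat index's
    search() returns."""
    # (n_queries, n_database) matrix of squared distances via broadcasting
    d2 = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]
    dist = np.take_along_axis(d2, idx, axis=1)
    return dist, idx

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 64)).astype("float32")
q = db[:3] + 0.01  # queries sitting right next to known database rows
dist, idx = flat_l2_search(db, q, k=5)
```

Faiss does this (and much more, like quantized and GPU indexes) without materializing the full distance matrix, which is what makes it viable at millions of vectors.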

Phil 3.30.2023

Well that was quick – ChatPDF

IUI 2023

  • Demo today! So of course I had to change some code at the last second.
  • Had a nice chat yesterday with the author of this: Benefits of Diverse News Recommendations for Democracy: A User Study
    • News recommender systems provide a technological architecture that helps shaping public discourse. Following a normative approach to news recommender system design, we test utility and external effects of a diversity-aware news recommender algorithm. In an experimental study using a custom-built news app, we show that diversity-optimized recommendations (1) perform similar to methods optimizing for user preferences regarding user utility, (2) that diverse news recommendations are related to a higher tolerance for opposing views, especially for politically conservative users, and (3) that diverse news recommender systems may nudge users towards preferring news with differing or even opposing views. We conclude that diverse news recommendations can have a depolarizing capacity for democratic societies.
    • Some nice cites on Google Scholar, too
  • Also interesting: Towards the Web of Embeddings: Integrating multiple knowledge graph embedding spaces with FedCoder
    • The Semantic Web is distributed yet interoperable: Distributed since resources are created and published by a variety of producers, tailored to their specific needs and knowledge; Interoperable as entities are linked across resources, allowing to use resources from different providers in concord. Complementary to the explicit usage of Semantic Web resources, embedding methods made them applicable to machine learning tasks. Subsequently, embedding models for numerous tasks and structures have been developed, and embedding spaces for various resources have been published. The ecosystem of embedding spaces is distributed but not interoperable: Entity embeddings are not readily comparable across different spaces. To parallel the Web of Data with a Web of Embeddings, we must thus integrate available embedding spaces into a uniform space.
    • Current integration approaches are limited to two spaces and presume that both of them were embedded with the same method — both assumptions are unlikely to hold in the context of a Web of Embeddings. In this paper, we present FedCoder— an approach that integrates multiple embedding spaces via a latent space. We assert that linked entities have a similar representation in the latent space so that entities become comparable across embedding spaces. FedCoder employs an autoencoder to learn this latent space from linked as well as non-linked entities.
    • Our experiments show that FedCoder substantially outperforms state-of-the-art approaches when faced with different embedding models, that it scales better than previous methods in the number of embedding spaces, and that it improves with more graphs being integrated whilst performing comparably with current approaches that assumed joint learning of the embeddings and were, usually, limited to two sources. Our results demonstrate that FedCoder is well adapted to integrate the distributed, diverse, and large ecosystem of embeddings spaces into an interoperable Web of Embeddings.

Phil 3.29.2023

At IUI 2023. Rode my bike to the venue and got very rained on. And the forecast looks less than stellar for getting some good riding in:

How to create a private ChatGPT with your own data

  • It has some good stuff in it, including overlapping context windows, which I need to do. Also, it may make sense to break the query into vectors if it is multi-sentence. Keep the original, and then look at the relationship between the vectors. For example, “provide details” should probably be stripped as an outlier.
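A minimal sketch of the overlapping-context-window idea (the function name and the size/overlap values are arbitrary placeholders, not from the linked article):

```python
def overlapping_chunks(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters, each window
    starting `size - overlap` after the previous one, so that context
    spanning a chunk boundary appears intact in at least one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# Windows start at 0, 2, 4, 6:
chunks = overlapping_chunks("abcdefghij", size=4, overlap=2)
```

In practice, splitting on sentence or token boundaries instead of raw characters would keep the embeddings cleaner.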