Phil 9.14.2022

Train station at noon!

Git Re-Basin: Merging Models modulo Permutation Symmetries

  • The success of deep learning is thanks to our ability to solve certain massive non-convex optimization problems with relative ease. Despite non-convex optimization being NP-hard, simple algorithms — often variants of stochastic gradient descent — exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes contain (nearly) a single basin, after accounting for all possible permutation symmetries of hidden units. We introduce three algorithms to permute the units of one model to bring them into alignment with units of a reference model. This transformation produces a functionally equivalent set of weights that lie in an approximately convex basin near the reference model. Experimentally, we demonstrate the single basin phenomenon across a variety of model architectures and datasets, including the first (to our knowledge) demonstration of zero-barrier linear mode connectivity between independently trained ResNet models on CIFAR-10 and CIFAR-100. Additionally, we identify intriguing phenomena relating model width and training time to mode connectivity across a variety of models and datasets. Finally, we discuss shortcomings of a single basin theory, including a counterexample to the linear mode connectivity hypothesis.
https://twitter.com/SamuelAinsworth/status/1569719499263471616

SBIRs

  • Finish first pass at slide deck
  • Register for MORS

GPT Agents

  • Set up keyword data repo
  • Thinking that I can store multiple variants of the manifold reductions as a list of dicts in the EmbeddedText object
  • Tried umap-learn with a brand-new Python 3.10 install. Same problem
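
A minimal sketch of the list-of-dicts idea above. The `EmbeddedText` fields, the `add_reduction` and `get_reduction` helpers, and the dict keys (`method`, `params`, `coords`) are all assumptions for illustration, not the actual class:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EmbeddedText:
    """Hypothetical stand-in for the real EmbeddedText class."""
    text: str
    embedding: List[float]
    # one dict per manifold-reduction variant
    reductions: List[Dict[str, Any]] = field(default_factory=list)

    def add_reduction(self, method: str, params: Dict[str, Any], coords: List[float]) -> None:
        """Store one reduced-coordinate variant along with how it was made."""
        self.reductions.append({"method": method, "params": params, "coords": coords})

    def get_reduction(self, method: str, **params: Any) -> Optional[Dict[str, Any]]:
        """Return the first stored variant matching method and all given params."""
        for r in self.reductions:
            if r["method"] == method and all(r["params"].get(k) == v for k, v in params.items()):
                return r
        return None


# made-up values, just to show two variants living side by side
et = EmbeddedText("example tweet text", [0.1, 0.2, 0.3])
et.add_reduction("tsne", {"perplexity": 15, "dims": 2}, [1.5, -0.4])
et.add_reduction("tsne", {"perplexity": 50, "dims": 2}, [0.7, 2.1])
```

Keeping the parameters next to the coordinates means a perplexity sweep can be stored on the same object and looked up later without recomputing.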

Phil 9.13.2022

SBIRs

  • Sprint planning
  • Slides for Q1-Q2 presentation

Book

  • Put in Brenda’s changes and sent her the updated version
  • Still need to find the first use of credibility and trustworthiness

GPT Agents

  • Working on TSNE manifold reduction from data
  • 3:30 Meeting? It’s not on the calendar…
  • Here’s some initial clustering from the twitter data. This is TSNE down to 2 dimensions:
  • Paxlovid, then Ivermectin:
Paxlovid
Ivermectin
  • Need to add clustering, probably at higher dimensions and visualization from that reduced set, just to keep things related (and maybe faster?). Anyway, enough for today.
  • Got tired of making slides. Here are a few perplexity tests:
Paxlovid
Ivermectin
  • It looks like there are several clusters in the Paxlovid space, but people are mostly talking about the same thing wrt ivermectin with a few small outliers?
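
For reference on what the perplexity tests are varying: in t-SNE, each point gets a Gaussian neighbor distribution, and its bandwidth is tuned until 2 raised to the entropy of that distribution matches the requested perplexity. So perplexity acts roughly as an effective neighbor count. A stdlib-only sketch of that one step, with made-up distances and hypothetical function names (this is not the scikit-learn implementation):

```python
import math
from typing import List


def perplexity(sq_dists: List[float], sigma: float) -> float:
    """Perplexity 2**H of the Gaussian neighbor distribution for one point."""
    d_min = min(sq_dists)  # shift so the nearest neighbor never underflows to zero
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists: List[float], target: float, iters: int = 60) -> float:
    """Bisection search for the bandwidth whose perplexity matches the target."""
    lo, hi = 1e-3, 1e3
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if perplexity(sq_dists, mid) < target:
            lo = mid  # too few effective neighbors: widen the Gaussian
        else:
            hi = mid
    return (lo + hi) / 2.0


# made-up squared distances from one point to its 8 neighbors
toy_sq_dists = [0.5, 0.8, 1.0, 1.5, 4.0, 9.0, 16.0, 25.0]
sigma_small = sigma_for_perplexity(toy_sq_dists, target=2.0)
sigma_large = sigma_for_perplexity(toy_sq_dists, target=6.0)
```

A low perplexity gives a narrow bandwidth, so only the closest points count as neighbors and the layout emphasizes small local clusters. A high perplexity pulls in more distant points and tends to merge them.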

Phil 9.12.2022

SBIRs

  • Sprint demos
  • 2:00 MDA Meeting
  • More work on MORS presentation
  • Register for MORS? Waiting for info
  • TRAVEL REQUESTS

GPT Agents

  • Manifold reduction and clustering
  • I had a problem which I was expecting, but dreading nonetheless. I tried to push a file larger than 100MB to GitHub. Ooops! And I’m doing this within the JetBrains IDE, so the command line options are… difficult
  • To fix this problem in JetBrains, go to the Git->Show Git Log menu item, which brings up the ‘log’ display:
  • Right-clicking on the problem commit will bring up a menu. Select Revert Commit (or possibly Undo Commit? Both may work). That will clear the branch.
  • Then re-commit the current branch in the normal way
  • Cannot get umap-learn to import. It just hangs. TSNE works though, so working out how all that works. Success!
TSNE with colored group membership
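
To catch the oversized-file problem before committing next time, here is a small sketch that walks a working directory and lists anything over the limit. The function name is hypothetical, and the exact byte value used for "100MB" is an assumption:

```python
import os
from typing import List, Tuple

LIMIT_BYTES = 100 * 1024 * 1024  # assumed byte value for the 100MB push limit


def find_large_files(root: str, limit: int = LIMIT_BYTES) -> List[Tuple[str, int]]:
    """Return (path, size) for every file under root larger than limit, biggest first."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # skip Git's own object store
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file
            if size > limit:
                hits.append((path, size))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

Calling `find_large_files(".")` from the repo root before a commit returns the offenders, which can then go into `.gitignore` or be stored somewhere else.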

Book

  • 4:00 Meeting with Brenda
  • Ping Katy on Thursday

Phil 9.9.2022

Call powerwasher! Dave Tobias 410 271-8795 – done

The antibiotics seem to be working on the bronchitis… slowly

Book

  • Read through Brenda’s notes. See if I want to fix anything

SBIRs

  • Demo Slides
  • MORS slides for Workshop (Tuesday-Thursday, 27-29 September 2022) – first pass done. Good chat with Aaron about changing the framing
  • Slides for Q2 report (Fri Sept 16) – roughed out
  • Hotel for CHIRP November 15-16
  • Travel Requests for MORS and CHIRP

GPT Agents

Phil 9.8.2022

TOLKIEN’S ILLUSTRATORS: SERGEY YUKHIMOV

https://img0.liveinternet.ru/images/attach/c/0//51/804/51804945_37315702.jpg

SBIRs

  • Need to start the MORS slide deck
  • Need to start demo slides
  • 9:15 Standup
  • Sent a note to Erika Mackin – done
  • Tweaked MapDisplay1 so that it works again. Need to port it to the laptop
  • Looks like the presentation is next Friday at 1:00

GPT Agents

  • Got a good run of threads, but I need to verify a few things:
    • There is an imbalance. Is this because the run ended improperly? Looks like it all checks out. There are simply more threads with “ivermectin” based on the sample
    • The view is broken for threads. Need to fix or make a new one. Looks like the experiment_id is being set to -1. Fixed. I wasn’t passing in the value to run_thread_query()
    • Back up the DB – done
    • Specify the current experiment somewhere, and indicate if there are threads (in label) – done
  • Continue with EmbeddingExplorer
Imbalanced threads because people like to talk about ivermectin

Phil 9.7.2022

Yay! Got a prescription!

My guess is that Trump made a deal with the Saudis for nuclear information on Israel and Iran

Book

  • Need to respond to Brenda’s email and set up a meeting on Friday?

SBIRs

  • Really good conversation with Aaron about CWoC. The idea that lower parts of the hierarchy could simulate higher levels is very cool. It could even be a separate model for each layer, trained on what that part of the hierarchy can be aware of and the commands that it gets. That way, it could “interpolate” across times when communication fails.
  • Need to set up a separate root document for research that has tasks broken down by people with a small introduction and then room for their documentation. Include a ToC. – done

GPT Agents

Phil 9.6.2022

Set up a monthly contribution to the UNHCR

Book

  • Adding a bit on beauty for diversity injection

GPT Agents

  • Start on the GPT and Embedding interfaces. Prompt the GPT with something like “Once upon a time there was” and set the number of times to run and the number of tokens. Split on sentences (r"\.|!|\?") and get the embeddings for each. Then cluster and extract topics (using EmbeddingExplorer pointing at a different db). Build maps!
  • Continue fleshing out the Twitter embedding app
  • Ok, what I really wound up doing was getting threading to work on TweetDownloader and fixing an interesting bug in the sampled day method. When I wrote it, I assumed that the number of tweets per day is reasonably constant. Not true. So as a bit of a hack, I moved the endpoint of the query to include the entire day and used REPLACE INTO rather than INSERT. Much better results so far. Will work on the other stuff tomorrow.
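
A sketch of why REPLACE INTO helps once the query window is widened to the whole day and pulls start overlapping. This uses an in-memory SQLite database as a stand-in, and the table and column names are hypothetical, not the real TweetDownloader schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at TEXT, text TEXT)")

# two overlapping pulls: widening the window to the whole day re-fetches tweet 2
first_pull = [(1, "2022-09-06T09:00", "tweet one"), (2, "2022-09-06T10:00", "tweet two")]
second_pull = [(2, "2022-09-06T10:00", "tweet two"), (3, "2022-09-06T23:30", "tweet three")]

for batch in (first_pull, second_pull):
    # REPLACE INTO overwrites the row with the same primary key; a plain
    # INSERT would raise sqlite3.IntegrityError on the repeated id
    conn.executemany("REPLACE INTO tweets (id, created_at, text) VALUES (?, ?, ?)", batch)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
```

The overlap costs a few redundant writes, but each tweet ends up stored exactly once regardless of how the day boundaries fall.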

SBIRs

  • Need to read this carefully. I like the fact that it uses MinGPT: Transformers are Sample Efficient World Models
    • Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems. Recently, many model-based methods have been designed to address this issue, with learning in the imagination of a world model being one of the most prominent approaches. However, while virtually unlimited interaction with a simulated environment sounds appealing, the world model has to be accurate over extended periods of time. Motivated by the success of Transformers in sequence modeling tasks, we introduce IRIS, a data-efficient agent that learns in a world model composed of a discrete autoencoder and an autoregressive Transformer. With the equivalent of only two hours of gameplay in the Atari 100k benchmark, IRIS achieves a mean human normalized score of 1.046, and outperforms humans on 10 out of 26 games. Our approach sets a new state of the art for methods without lookahead search, and even surpasses MuZero. To foster future research on Transformers and world models for sample-efficient reinforcement learning, we release our codebase at this https URL.
    • Delivered the quarterly report.
https://twitter.com/jerclifton/status/1565397169623797760

Book

  • Brenda has started a readthrough and will get back to me with comments
  • Need to add a reference to “beauty” in the diversity injection chapter and reference Transcendence

SBIRs

  • Finish first pass of quarterly report
  • MinGPT!!!!
https://twitter.com/hardmaru/status/1565569808548376576

GPT Agents

  • Added parens to the Twitter query. You can now do foo OR bar OR (Frodo AND Gandalf) OR (Sauron AND Saruman)
  • Thinking about combining the GPT and Embedding interfaces. Prompt the GPT with something like “Once upon a time there was” and set the number of times to run and the number of tokens. Split on sentences (r"\.|!|\?") and get the embeddings for each. Then cluster and extract topics (using EmbeddingExplorer pointing at a different db). Build maps!
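
A quick sketch of the sentence-splitting step from the pipeline above. The sample text is made up and the generation and embedding calls are left out. Note that `?` is a regex quantifier, so it has to be escaped or placed in a character class for the pattern to compile:

```python
import re
from typing import List

# character class covering '.', '!' and '?'; the '+' collapses runs such as '...'
SENTENCE_END = re.compile(r"[.!?]+")


def split_sentences(text: str) -> List[str]:
    """Split on runs of '.', '!' or '?' and drop empty fragments."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


# made-up stand-in for text generated from the prompt
generated = "Once upon a time there was a dragon. It was hungry! What did it eat? Nobody knows..."
sentences = split_sentences(generated)
```

Each entry in `sentences` would then be sent off for its own embedding before clustering.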

Phil 8.31.2022

A house made of bones in the middle of a desert

Heavenbanning is real in the world: We flooded our dating app with bots…to scam scammers

DALL·E Editor Guide

  • The DALL·E editor interface helps you edit images through inpainting and outpainting, giving you more control over your creative vision.

Book

  • I finished the new proposal last night, so now I need to send it off today – DONE!!!
  • Working on appt with Brenda

SBIRs

  • Tweak Ron’s stuff and add an intro
  • Tweak Rukan’s stuff and add an intro for interpolation and regression sections. Include an intro to Perceptrons and then Attention-MLPs.
  • Start the final pass through the doc

GPT Agents

  • Move “selected experiment” and “keyword” out of the tabs
  • Add a “Create Corpora” tab
  • General TODOs:
    • Implement threads, and make sure that the extended queries work
    • Implement calls to GPT embeddings and verify on small dataset. I could even try Aaron’s Wikipedia cats vs computers idea but use tweets
    • Need to treat AND like OR so that tweets containing multiple keywords work

Phil 8.30.2022

Back from more travels! It’s good to be home

SBIRs

  • Submit expenses
  • 9:00 Sprint planning – done
  • 2:00 MDA meeting
  • Quarterly report

GPT Agents

  • Working on EmbeddingExplorer
Progress for the last couple of days
  • 3:30 Meeting

Book

  • Finish proposal
  • Schedule Thursday meeting

Phil 8.26.2022

Umberto Eco: A Practical List for Identifying Fascists

Multi Dimensional and Domain Operations (MDDO)

  • Cebrowski referred to the traditional concept of combat power being measured in the movement of traditional physical forces through time and space in a physical domain. But to do this, military forces need information and control from the information domain. Winning a conflict happens in the intangible cognitive domain, in the mind of the individual war-fighter, with feelings of success or failure. Collectively these individual minds make up the social domain, the shared societal awareness and understandings referred to in culture, values, attitudes and beliefs.

SBIRs

  • Need to either pick up docs or get them printed – getting them printed
  • Quarterly report – Adjusting slots for Rukan’s work. Ron looks like he’s doing better

Book

  • Working on the Elsevier proposal. Finding reviewers

GPT Agents

  • Top2Vec is hanging on the import! Going to try it on another box and see if that will still happen. If it does, then I’m going to use OpenAI’s embeddings until it’s fixed
  • Started!

Phil 8.25.2022

SBIRs

  • Was at the Tech Summit for the last two days. Good to see people again!
    • Pinged Jennifer about Elicit
  • Trip
    • Tix! – done
    • Hotel! – done
    • Car! – done
    • Slides! – done, but not printed
  • Continue on Quarterly report
    • 9:00 Meeting with Ron

Book

  • Respond to Katy’s letter – figuring out who would be good to send this to for review
    • Ping Brenda. I think we’ll need to meet next week – done
    • Wound up writing a short python program to scan through my book to find what’s cited most. Mostly based on this. Handy!
from collections import Counter
from tkinter import filedialog
import re

import PyPDF2

# ask the user for the PDF to scan
filename = filedialog.askopenfilename(filetypes=(("pdf files", "*.pdf"),), title="Load pdf File")
if filename:
    print("opening {}".format(filename))
    # PdfReader / pages / extract_text() are the PyPDF2 >= 2.0 names;
    # older releases used PdfFileReader / getPage() / extractText()
    reader = PyPDF2.PdfReader(filename)
    counts = Counter()

    num_pages = len(reader.pages)
    print("There are {} pages".format(num_pages))
    # extract the text of each page and count numeric citations such as [12]
    for page in reader.pages:
        text = page.extract_text() or ""
        counts.update(re.findall(r"\[\d+\]", text))
    # print citations from most to least cited
    for k, v in counts.most_common():
        print("{} = {}".format(k, v))

GPT-Agents

  • Is today the day to try Top2Vec? Sure hope so!
  • Started poking around. It’s hanging on the import. That is really odd

Phil 8.22.2022

Book

  • I seem to have an encouraging nibble from Elsevier, via a side door from another author’s acquisition editor. We will see how that goes
  • Working on grammar
  • Setting up a meeting with my new copy editor. Thursday?

SBIRs

  • Watch Rukan’s talk
  • More quarterly report

GPT Agents

  • Bail on this week’s meeting – done
  • Still need to poke at Top2Vec

Phil 8.19.2022

Book

  • Moved the definitions to the front and tweaked a bit. I also discovered that MSWord will open a PDF and then you can use the grammar checking feature, which is nice
  • Send proposal to NYU. Looks pretty straightforward – done!
  • Assemble the annotated ToC + Book and send it to Brenda – done!
  • Set up a meeting to see Rukan walk through his slides

SBIRs

  • Might go to Huntsville on the 29th?
  • Quarterly report

GPT Agents