SBIRs
- More reading. Next is Toy Models of Superposition. I do want to check out the GitHub repo for Eliciting Latent Predictions from Transformers with the Tuned Lens; it looks like there are pretrained models.
- There is a follow-on paper to Toy Models: Superposition, Memorization, and Double Descent
- We extend our previous toy-model work to the finite-data regime, revealing how and when these models memorize training examples.
- This post from 2014 also looks helpful: Deep Learning, NLP, and Representations
- This post reviews some extremely remarkable results in applying deep neural networks to natural language processing (NLP). In doing so, I hope to make accessible one promising answer as to why deep neural networks work. I think it’s a very elegant perspective.
GPT Agents
- Created a ContextPromptAccuracy project and loaded it up with the code for the Wikipedia experiments and the Supabase data. Need to set up the MySQL schema so I can start making queries, tables, and charts (a hypothetical schema sketch is at the end of this entry).
- Ok, really happy with this bit of code:
from typing import Dict, List
# Assumes MSI (the MySqlInterface wrapper module) is imported elsewhere in the project.

def to_db(msi: MSI.MySqlInterface, table_name: str, dict_list: List[Dict]):
    # Insert each dict in dict_list as one row in table_name, using the
    # dict keys as column names and %s placeholders for the values.
    for d in dict_list:
        keys = d.keys()
        vals = d.values()
        s1 = "INSERT INTO {} (".format(table_name)
        s2 = " VALUES ("
        for k in keys:
            s1 += "{}, ".format(k)
            s2 += "%s, "
        # Trim the trailing ", " from both fragments and close the parentheses
        sql = "{}) {});".format(s1[:-2], s2[:-2])
        print(sql)
        msi.write_sql_values_get_row(sql, tuple(vals))
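- A minimal usage sketch, assuming msi is an already-connected MSI.MySqlInterface and a run_results table with matching columns exists (the table and column names here are hypothetical placeholders, not the real experiment schema):

# Hypothetical rows; each dict becomes one parameterized INSERT, e.g.:
# INSERT INTO run_results (run_id, prompt, accuracy) VALUES (%s, %s, %s);
rows = [
    {"run_id": 1, "prompt": "Who wrote Moby-Dick?", "accuracy": 0.92},
    {"run_id": 2, "prompt": "What is the capital of France?", "accuracy": 0.88},
]
to_db(msi, "run_results", rows)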

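- For the MySQL schema itself, a hypothetical starting point that matches the usage sketch above (none of these table or column names come from the actual project; the real schema will depend on what the Wikipedia-experiment code records):

# Hypothetical DDL for the results table, kept as a Python string so it can be
# sent through whatever execute method the MySqlInterface wrapper provides.
CREATE_RUN_RESULTS = """
CREATE TABLE IF NOT EXISTS run_results (
    id INT AUTO_INCREMENT PRIMARY KEY,
    run_id INT NOT NULL,
    prompt TEXT,
    accuracy FLOAT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""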