I’ve been wondering how to map the regions of a model that are reached through an extensive prompt, like the kind you see with RAG. The problem is that the prompt gets very large, and it can be difficult to see how the trajectory through the model’s space behaves. There appear to be several approaches to dealing with this, so here’s an ongoing list of things to try:
- Just plot the whole prompt, including context. This assumes that the model is big enough and public, like Llama or Gemma. I assume that as the prompt grows, the head will move to different regions. But the set of vectors we’re trying to plot keeps growing along with the prompt, and I’m not sure how best to plot that (a first cut at pulling out the trajectory is sketched after this list).
- Just save off the N vectors that are closest to the head. The full text can also be saved, so the vectors can be tied back to the text they came from (second sketch below).
- Use a prompt to fire off N responses and use those to finetune a small (e.g. GPT-2) model. Then create a “small” window of tokens that travels through the finetuned space. The nice thing is that this lets us indirectly explore closed-source models in a narrative context (third sketch below).
- I’d also like to see if there is any way to use dictionary learning on these narrative elements. There seems to be no mathematical reason that you can’t have “narrative features” like tropes (last sketch below).
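
A minimal sketch of the first idea, assuming a public Hugging Face checkpoint (google/gemma-2b here, which needs its license accepted) and taking the last hidden layer as the thing to plot. The rag_prompt.txt file name is just a placeholder.

```python
# Sketch: per-token hidden-state trajectory for a long RAG-style prompt.
# Assumes the last hidden layer is a reasonable proxy for "where the head is".
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

MODEL = "google/gemma-2b"  # assumption: any public model that returns hidden states

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

prompt = open("rag_prompt.txt").read()  # hypothetical file with context + question
inputs = tok(prompt, return_tensors="pt", truncation=True, max_length=4096)

with torch.no_grad():
    out = model(**inputs)

# [seq_len, hidden_dim] -- one vector per prompt token from the last layer
traj = out.hidden_states[-1][0].float().numpy()

# Project the whole trajectory to 2D so the path through the space is visible
xy = PCA(n_components=2).fit_transform(traj)
plt.plot(xy[:, 0], xy[:, 1], marker=".", linewidth=0.5)
plt.title("Token trajectory through the last hidden layer (PCA)")
plt.show()
```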
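For the second idea, a sketch that keeps only the N vectors closest to the head, treating the final token’s hidden state as the head (an assumption), and saves the token text alongside them. It reuses traj, tok, and inputs from the sketch above.

```python
# Sketch: keep only the N prompt vectors closest to the "head", plus the token
# text, so the vectors and the text stay linked.
import json
import numpy as np

N = 50  # assumption: how many nearest vectors are worth keeping
head = traj[-1]  # assumption: the final token's hidden state stands in for the head

# Cosine similarity of every token vector to the head (the head itself ranks first)
sims = traj @ head / (np.linalg.norm(traj, axis=1) * np.linalg.norm(head) + 1e-8)
nearest = np.argsort(-sims)[:N]

tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
records = [
    {"index": int(i), "token": tokens[i], "similarity": float(sims[i])}
    for i in nearest
]

np.save("nearest_vectors.npy", traj[nearest])  # the vectors themselves
with open("nearest_tokens.json", "w") as f:    # the text they came from
    json.dump(records, f, indent=2)
```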
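For the finetuning idea, a sketch that assumes the N responses from the closed-source model have already been generated and saved one per line in responses.txt (that file name and the hyperparameters are placeholders).

```python
# Sketch: fine-tune GPT-2 on responses pulled from a closed-source model, so a
# small open model can stand in for it when exploring the narrative space.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

lines = [l.strip() for l in open("responses.txt") if l.strip()]
ds = Dataset.from_dict({"text": lines})

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=256)

ds = ds.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-narrative", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()

# Later: slide a small window of tokens through this fine-tuned model and record
# its hidden states the same way as for the public model in the first sketch.
```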
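And for the dictionary-learning idea, a sketch using sklearn’s DictionaryLearning over whatever embedding vectors have been accumulated (here the file saved in the second sketch); the atom count and sparsity settings are guesses.

```python
# Sketch: sparse dictionary learning over saved narrative embeddings, to see
# whether any learned atoms line up with recognizable "narrative features"
# (tropes, scene boundaries, etc.).
import numpy as np
from sklearn.decomposition import DictionaryLearning

vectors = np.load("nearest_vectors.npy")  # embeddings saved in the earlier sketch

dl = DictionaryLearning(n_components=32,         # assumption: 32 candidate features
                        transform_algorithm="lasso_lars",
                        transform_alpha=0.1,
                        max_iter=500,
                        random_state=0)
codes = dl.fit_transform(vectors)  # sparse coefficients: which atoms each vector uses
atoms = dl.components_             # the learned dictionary atoms

# A first sanity check: which atoms fire most often across the saved vectors?
usage = (np.abs(codes) > 1e-6).sum(axis=0)
for idx in np.argsort(-usage)[:10]:
    print(f"atom {idx}: used by {usage[idx]} vectors")
```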
SBIRs
- 9:00 standup
- 4:30 Book club. Finish chapter 8!
GPT Agents
- Meet maybe at 3:00?
