- Add NULL tests – done
- Back up current db – done, several times
- Get level data from previous run if it exists. It should, even if only one summary has been generated. Reworked everything into two methods, one that summarizes the raw text and one that summarizes the summaries. Both are much more robust and should now survive being kicked off OpenAI for saturation.
- The summarizing is really quite good. Moby Dick goes from 10k lines to 938. The new lines are more uniform in length, which should produce better embeddings too. That’s for tomorrow.
- Pull out the embedding frame
- For the new app, have a “create db” tab that lets the user set the text and group, and find corpora. Probably regex too.
- 10:00 Artemis meeting
- 2:00 MDA Meeting
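
Rough sketch of the two-method summarizer from the note above, to pin down the shape. This is not the actual code: `summarize_fn` stands in for whatever calls the model, and `with_retry`, `group_size`, and `previous` are hypothetical names. It assumes that being kicked off for saturation shows up as an exception, and that results from a previous run can be passed back in as a list so the run picks up where it left off.

```python
import time
from typing import Callable, List, Optional

Summarizer = Callable[[str], str]  # hypothetical: wraps the real model call


def with_retry(call: Callable[[], str], max_tries: int = 5,
               base_delay: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Run call(); on failure wait 1x, 2x, 4x... base_delay and try again."""
    for attempt in range(max_tries):
        try:
            return call()
        except Exception:  # assumption: saturation surfaces as an exception
            if attempt == max_tries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_tries must be at least 1")


def _summarize_groups(items: List[str], summarize_fn: Summarizer,
                      group_size: int, previous: Optional[List[str]],
                      sleep: Callable[[float], None]) -> List[str]:
    """Summarize items in groups, skipping groups a previous run finished."""
    results = list(previous or [])
    groups = [items[i:i + group_size] for i in range(0, len(items), group_size)]
    for group in groups[len(results):]:
        text = " ".join(group)
        results.append(with_retry(lambda: summarize_fn(text), sleep=sleep))
    return results


def summarize_raw(lines: List[str], summarize_fn: Summarizer,
                  group_size: int = 10, previous: Optional[List[str]] = None,
                  sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Level 1: one summary per group of raw text lines."""
    return _summarize_groups(lines, summarize_fn, group_size, previous, sleep)


def summarize_summaries(summaries: List[str], summarize_fn: Summarizer,
                        group_size: int = 10,
                        previous: Optional[List[str]] = None,
                        sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Level 2 and up: same grouping, but the input is the level below."""
    return _summarize_groups(summaries, summarize_fn, group_size, previous, sleep)
```

Passing `sleep` in makes the backoff testable without actually waiting. In the real version `previous` would come from the db rather than a list.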