I took a rough stab at what tokens cost today, based on working out the cost of a token in watt-hours for a model like the 70B-parameter Llama 3 if it were run on an Nvidia GeForce RTX 4090. Here are my estimates for some pretty hefty books, if an LLM were to generate the same number of words:
- Moby Dick – 163,500 words – $1.44
- The Lord of the Rings trilogy – 733,022 words – $6.44
- War and Peace – 587,554 words – $5.16
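
As a quick sanity check on the list above, here is a minimal Python sketch that backs out the per-word rate implied by the three estimates and applies it back to each book. It only uses the word counts and dollar figures listed here; it is not the spreadsheet's actual model, and the function names and the example budget are made up for illustration.

```python
# Word counts and cost estimates copied from the list above.
books = {
    "Moby Dick": (163_500, 1.44),
    "The Lord of the Rings trilogy": (733_022, 6.44),
    "War and Peace": (587_554, 5.16),
}


def implied_cost_per_word(books):
    """Average dollars per generated word implied by the listed estimates."""
    total_words = sum(words for words, _ in books.values())
    total_cost = sum(cost for _, cost in books.values())
    return total_cost / total_words


def words_for_budget(budget_dollars, cost_per_word):
    """How many generated words a given budget would cover at that rate."""
    return budget_dollars / cost_per_word


rate = implied_cost_per_word(books)
print(f"Implied rate: ${rate:.2e} per word")

# Re-apply the single rate to each book to confirm the list is self-consistent.
for title, (words, cost) in books.items():
    print(f"{title}: listed ${cost:.2f}, recomputed ${words * rate:.2f}")

# Hypothetical budget, chosen only for illustration.
budget = 20.00
print(f"${budget:.2f} covers about {words_for_budget(budget, rate):,.0f} words")
```

The three books land on nearly the same rate, just under a thousandth of a cent per word, so the estimate is effectively a single constant multiplied by word count.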

It’s not much! My sense is that most interactions use a small fraction of a watt-hour, and a big GPU like the A100 is probably even more efficient than an RTX 4090. So if you are paying $20/month for a big model, unless you generate something like four War-and-Peace-like mountains of text a month, the companies are making a profit. The spreadsheet is here, if you’d like to play with it:
SBIRs
- More trade show demo. Good progress. I think the training data generation is mostly done; now I need to do random test data.
- Hmm. Looks like the entire 8(a) set-aside system may go away. By my estimate, 8(a) is a $50 billion target, so I think sooner rather than later.
GPT Agents
- 3:00 Alden meeting – done
- Missed Peter’s meeting somehow. I don’t think I was provided with a final date/time?
