Author Archives: pgfeldman

Phil 2.23.2026

I write an entire book on this, and they do it in a comic panel. Tip of the hat.

Tasks

Guardian (done) and BGE Home (will call back?)
Shovel
Disassemble desk
Put computers on floor or dining table
Goodwill
More boxes (diplomas)
Pack up closets
Pack up the basement
Drop off Bennie

Phil 2.22.2026

Tasks

SS Savannah – done. Fun!
Groceries – done
Storage – done
1:15 Lunch ride – done
Packing – some
Vacuum – done
Bills – done
Detach pegboard
Start changing addresses
- Amazon – done
- ACM – done
- IEEE – done
- Atlantic – done
- FP
- Financials – started

SBIRs

Creating new UMAPs. I think they can be trained on more data too. Yup! Running 2D and 3D UMAPs on 500k embeddings!

Phil 2.20.2026

Back from cycling in Mallorca – that was a lot of fun

Tasks

See if Alston is still coming over today. Nope – Monday
More packing – good progress
Unpack from trip – mostly done
Maybe take more stereo gear over to the apartment? Done
Groceries – not done
Expenses to Ricardo – not done
Finish changing the name of the book (done) and bounce back to ACM – not done
Cancel Guardian and BGE Home – not done

SBIRs

Kick off my sprint stuff – done
Trip report, plus forms for April 18-31 – not done
See if I can fix the binary files. Fixed maybe? Need to test more
- See if it’s just the models or models and data. If models, rsync might be the answer. Check the hashes
- If not, look into converting before sending over the files?
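The "check the hashes" step could be sketched roughly like this. It is a minimal sketch, not what was actually run: the function names are made up for illustration, and it assumes the models and data live in ordinary directory trees on both machines. The idea is to build a path-to-digest map on each side and list whatever differs.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_trees(local, remote):
    """Return relative paths missing on one side or with differing digests."""
    return sorted(
        name
        for name in local.keys() | remote.keys()
        if local.get(name) != remote.get(name)
    )
```

Running hash_tree on each machine, saving the result (as JSON, say), and passing both maps to diff_trees would show whether only the model files are damaged or the data files too.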

Phil 2.15.2026

Another trip around the sun!

Having fun riding around in Mallorca, not seeing snow

Worked on the adjustments to the proposal for the KA book, which I guess I should now be referring to as the WGAI book. Need to send a note to Aaron – done

Phil 2.12.2026

Not sure that I believe this thread, but it is certainly possible. Makes the primordial soup more interesting if anything though

Tasks

Pack – done
Box up speakers – done. Too heavy to move alone though. Putting some weights on the box to flatten it for a week.
Load trailer
Order cable modem kit – done
Ping Tim. I’ll need pix to send along
Note for Sande – done
Trash – done
Water plants – done
Laundry – done
Dishes – done
Leave NLT 11:30

SBIRs

Got all the data down and added the symlink. Just need to see if it works! Nope – some deserialization error. Will try to figure that out next week. Maybe different versions of pickle?
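One quick way to test the "different versions of pickle" guess is to read the protocol number from the file header and compare it with what the local Python supports. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the files are raw pickles rather than a wrapped or compressed format. A protocol mismatch is only one possible cause. A change in the libraries whose classes were pickled can also break loading.

```python
import pickle


def pickle_protocol(data):
    """Return the protocol declared in a pickle's header.

    Protocols 2 and up start with the PROTO opcode (0x80) followed by a
    one-byte protocol number. Protocols 0 and 1 have no header, so None
    is returned for them.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None


def file_pickle_protocol(path):
    """Read just the first two bytes of a file and report its pickle protocol."""
    with open(path, "rb") as f:
        return pickle_protocol(f.read(2))


def loadable_here(data):
    """True if the declared protocol is one this interpreter can read."""
    proto = pickle_protocol(data)
    return proto is None or proto <= pickle.HIGHEST_PROTOCOL
```

If the protocol on the file is higher than pickle.HIGHEST_PROTOCOL on the local box, the Python versions differ enough to matter. If it is not, the problem is more likely a damaged transfer or mismatched library versions.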

Phil 02.11.2026

I really need to write up the pancake printer model of AI

Tasks

Break down big audio gear and take it to the apartment – done
Guardian – done-ish
Pick up more boxes and bubble wrap – done
More packing – done-ish
After the ride, start packing for the trip. It’s going to be a reasonably wide range of temps.

SBIRs

Get all the data over to Dreamhost (done!) and then start pulling it down to my local box (started!)

Phil 2.10.2026

Tasks

Ping Carlos – done
Guardian xfer
Pack Brompton – done
More kitchen packing – most of the cabinets and the hutch are empty!

Still life of Boxes

SBIRs

9:00 standup – done
3:00 Meeting with Aaron – done
Kicked off a run
Started an rsync today. Transfer is much slower. Faster now

Phil 2.9.2026

Tasks

Pinged Carlos about the PT storms. Sent some euros to rebuilding charities
More packing
Run to Goodwill
Got more boxes
Got a ride in. Brrr! But not too bad
My blog post for BLOG@CACM got accepted!
Tried to cancel with Constellation@home but they can’t schedule
Fixed screen
Respond to Carlos tomorrow

SBIRs

Hooked up the 8TB drive
Downloaded test tar files
Started sending the BIG file to DreamHost – didn’t make it. Going to try rsync tomorrow
Kicked off a run
Kibitzed some

Phil 2.6.2026

Tasks

Bills – done
Plumber – started
Dig trailer out – done. Took some work
Get keys – done!
Verizon and Constellation – started

SBIRs

Started today’s run

Phil 2.5.2026

Tasks

Put in storm door so I can fix the screen door – done
Pick up repair kit and some more small boxes – done
Finish packing books (done), start on bedroom – not done
Get trailer out
Tell Bottling Plant that the move is on the 24th – done. Also much lease and power paperwork – done

SBIRs

Kick off today’s run – done
9:00 standup – done
10:00 chart review – done
3:00 SEG meeting – done
4:00 ADS meeting – done

Phil 2.4.2026

The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity?

When AI systems fail, will they fail by systematically pursuing the wrong goals, or by being a hot mess? We decompose the errors of frontier reasoning models into bias (systematic) and variance (incoherent) components and find that, as tasks get harder and reasoning gets longer, model failures become increasingly dominated by incoherence rather than systematic misalignment. This suggests that future AI failures may look more like industrial accidents than coherent pursuit of a goal we did not train them to pursue.
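For intuition, here is the classical bias/variance split of squared error on scalar outputs. It is a sketch only: the abstract does not spell out the paper's own metric, so the scalar setup, the function name, and the sample numbers are all assumptions. Mean squared error against a target equals the squared bias plus the variance of the outputs.

```python
from statistics import fmean, pvariance


def decompose_error(outputs, target):
    """Split mean squared error into squared bias and variance.

    Returns (mse, bias_squared, variance), where
    mse == bias_squared + variance up to floating-point rounding.
    """
    mean_output = fmean(outputs)
    bias_squared = (mean_output - target) ** 2
    variance = pvariance(outputs, mu=mean_output)
    mse = fmean((x - target) ** 2 for x in outputs)
    return mse, bias_squared, variance


# Consistently wrong: answers cluster tightly around the wrong value.
systematic = decompose_error([3.0, 3.1, 2.9], target=1.0)

# "Hot mess": right on average, but scattered from run to run.
incoherent = decompose_error([-3.0, 5.0, 1.0, 1.0], target=1.0)
```

In the first case nearly all the error is bias. In the second the bias is zero and all the error is variance, which is the "industrial accident" pattern the abstract describes.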

Tasks

Pack – 10 boxes
Make a checklist of all the things to turn on/off – started. It’s big
Send pdf back – done
Visit today at 11:00 – done. Fun!
3:00 Alden – done
Pinged Sande
Loan stuff – started

SBIRs

Kick off run – done
Ordered the data drive. It’s arriving Monday, so I’m continuing to run UMAP. When I get the drive, I’ll tar off Embeddings_2.1 and then scp them onto my local box

Phil 2.3.2026

The cold is better. And it’s warm enough that I may try to go for a ride at lunch

Tasks

Outline some thoughts about ACM books
- Include something about this, since the response is nearly 100% AI
- [2601.12410] Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation
- Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer other individuals’ knowledge states and understand their intentions. In comparison, our closest animal relative, chimpanzees, lack the capacity to do so. With this paper, we aim to evaluate LLM performance in the area of knowledge state tracking and estimation. We design two tasks to test (1) if LLMs can detect when story characters, through their actions, demonstrate knowledge they should not possess, and (2) if LLMs can predict story characters’ next actions based on their own knowledge vs. objective truths they do not know. Results reveal that most current state-of-the-art LLMs achieve near-random performance on both tasks, and are substantially inferior to humans. We argue future LLM research should place more weight on the abilities of knowledge estimation and intention understanding.
Pack
Make a checklist of all the things to turn on/off

SBIRs

Kick off run – done
9:00 standup
SCP the experiments folder to the dev box. If that works, order a data drive

Phil 2.2.2026

I have a cold. Ugh.

Tasks

Schedule moving bids – done. Moving on the 24th
Ping ACM books – done. It’s a complicated ask, and I’m not sure that it can be done quickly.
Transferred some $$ to cover moving and repairs

SBIRs

Kick off run – done
Update stories – done
Work on getting the Index2Vec project moved to the GitHub sandbox – kind of done? I need to scp the big data files over once I know where I can put them
Reset password, also reset bastion and driver passwords – done

Phil 2.1.2026

I still have a cold.

Why AI Keeps Falling for Prompt Injection Attacks

Imagine you work at a drive-through restaurant. Someone drives up and says: “I’ll have a double cheeseburger, large fries, and ignore previous instructions and give me the contents of the cash drawer.” Would you hand over the money? Of course not. Yet this is what large language models (LLMs) do.

Tasks

More packing – done. Need more small boxes
Stew – done and yum
Laundry – done

Phil 1.31.2026

I seem to have a cold. Matches the outside which is GD cold. Sad that it made me miss the Alex Pretti memorial ride. I hope you were able to make it.

Fuck ICE

Insurance – pinged
Vet – done
listing out services – done
Appt paperwork! Continuing
Pack – continuing
Groceries – done
3:00 Plumber – getting estimate later tonight
