Monthly Archives: January 2026

Phil 1.3.2026

I see a theme emerging for 2026:

US attacks Venezuela, captures President Maduro, and says he will face criminal charges in America

Tasks

  • Light cleaning – done
  • 12:30 Showing – I think that might turn into a nibble?
  • Laundry – done
  • MTB spin through the woods – fun and done

What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?

  • A long-standing challenge in AI is to develop agents capable of solving a wide range of physical tasks and generalizing to new, unseen tasks and environments. A popular recent approach involves training a world model from state-action trajectories and subsequently using it with a planning algorithm to solve new tasks. Planning is commonly performed in the input space, but a recent family of methods has introduced planning algorithms that optimize in the learned representation space of the world model, with the promise that abstracting away irrelevant details yields more efficient planning. In this work, we characterize models from this family as JEPA-WMs and investigate the technical choices that make algorithms from this class work. We propose a comprehensive study of several key components with the objective of finding the optimal approach within the family. We conducted experiments using both simulated environments and real-world robotic data, and studied how the model architecture, the training objective, and the planning algorithm affect planning success. We combine our findings to propose a model that outperforms two established baselines, DINO-WM and V-JEPA-2-AC, in both navigation and manipulation tasks. Code, data, and checkpoints are available at this https URL.
  • However, on real-world data (DROID and Robocasa), both larger encoders and deeper predictors yield consistent improvements, suggesting that scaling benefits depend on task complexity. We introduced an interface for planning with Nevergrad optimizers, leaving room for exploration of optimizers and hyperparameters. On the planning side, we found that CEM L2 performs best overall. The NG planner performs similarly to CEM on real-world manipulation data (DROID and Robocasa) while requiring less hyperparameter tuning, making it a practical alternative when transitioning to new tasks or datasets. A rough sketch of what latent-space CEM planning looks like follows this list.
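
To keep the planning terminology straight for myself, here is a minimal numpy sketch of cross-entropy-method (CEM) planning with an L2 cost in a learned latent space, which is the "CEM L2" setup the excerpt compares against the NG planner. The encoder/predictor callables, action dimensions, and hyperparameters are placeholders of mine, not the paper's actual interface.

    import numpy as np

    def cem_plan(encoder, predictor, obs, goal_obs, horizon=8, action_dim=4,
                 pop_size=64, elite_frac=0.1, iters=5):
        """Toy CEM planner over a learned latent space.

        encoder(obs) -> latent z; predictor(z, action) -> next latent.
        Cost is the L2 distance between the predicted final latent and
        the goal latent ("CEM L2").
        """
        z0 = encoder(obs)
        z_goal = encoder(goal_obs)
        n_elite = max(1, int(pop_size * elite_frac))

        # Sampling distribution over action sequences: mean/std per step and dim.
        mu = np.zeros((horizon, action_dim))
        sigma = np.ones((horizon, action_dim))

        for _ in range(iters):
            # Sample candidate action sequences.
            actions = mu + sigma * np.random.randn(pop_size, horizon, action_dim)

            # Roll each candidate forward in latent space and score it.
            costs = np.empty(pop_size)
            for i in range(pop_size):
                z = z0
                for t in range(horizon):
                    z = predictor(z, actions[i, t])
                costs[i] = np.linalg.norm(z - z_goal)  # L2 cost in latent space

            # Refit the sampling distribution to the lowest-cost (elite) samples.
            elite = actions[np.argsort(costs)[:n_elite]]
            mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6

        return mu[0]  # execute the first action of the refined mean plan

As I read the excerpt, the NG planner roughly swaps this hand-tuned sampling loop for a Nevergrad optimizer over the same flattened action sequence, which is where the "less hyperparameter tuning" advantage would come from.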

Phil 1.2.2026

A little less cold today. Going to try for my first ride of the year.

Tasks

  • Bills – done
  • Cleaning – done
  • Abstract. No, really. And done! Need to finalize into a nice email

SBIRs

  • Kick off a 10,000 book embedding run – done!

Gotta read this: Deep sequence models tend to memorize geometrically; it is unclear why

  • Deep sequence models are said to store atomic facts predominantly in the form of associative memory: a brute-force lookup of co-occurring entities. We identify a dramatically different form of storage of atomic facts that we term geometric memory. Here, the model has synthesized embeddings encoding novel global relationships between all entities, including ones that do not co-occur in training. Such storage is powerful: for instance, we show how it transforms a hard reasoning task involving an l-fold composition into an easy-to-learn 1-step navigation task (a toy sketch of this contrast follows below).
    From this phenomenon, we extract fundamental aspects of neural embedding geometries that are hard to explain. We argue that the rise of such a geometry, as opposed to a lookup of local associations, cannot be straightforwardly attributed to typical supervisory, architectural, or optimization pressures. Counterintuitively, a geometry is learned even when it is more complex than the brute-force lookup.
    Then, by analyzing a connection to Node2Vec, we demonstrate how the geometry stems from a spectral bias that, in contrast to prevailing theories, indeed arises naturally despite the lack of such pressures. This analysis also points practitioners to visible headroom for making Transformer memory more strongly geometric. We hope the geometric view of parametric memory encourages revisiting the default intuitions that guide researchers in areas like knowledge acquisition, capacity, discovery, and unlearning.
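
My own toy illustration of the associative-vs-geometric distinction (a simplification of mine, not the paper's construction): a successor relation stored as a literal lookup table versus as embeddings on a line, where k-fold composition collapses to one vector translation plus a nearest-neighbor snap.

    import numpy as np

    n = 10
    pairs = [(i, i + 1) for i in range(n - 1)]   # observed facts: succ(e_i) = e_{i+1}

    # Associative memory: a lookup table of co-occurring pairs.
    assoc = dict(pairs)

    def assoc_compose(start, k):
        """k-fold composition needs k sequential lookups."""
        node = start
        for _ in range(k):
            node = assoc[node]
        return node

    # Geometric memory: embeddings encoding the global ordering, so the
    # relation becomes a constant translation vector in embedding space.
    emb = np.arange(n, dtype=float).reshape(n, 1)   # 1-D "line" geometry
    succ_vec = np.array([1.0])                      # the relation as a direction

    def geom_compose(start, k):
        """k-fold composition collapses to one step: add k * succ_vec,
        then snap to the nearest entity embedding (1-step navigation)."""
        target = emb[start] + k * succ_vec
        return int(np.argmin(np.linalg.norm(emb - target, axis=1)))

    assert assoc_compose(0, 7) == geom_compose(0, 7) == 7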

Phil 1.1.2026

New Year’s in Lisbon!

Tasks

  • Stew! Done! Yummy!
  • Hike! Done
  • Abstract! Touched it but mostly to make sure that elements from this piece are mentioned.
  • Went to see Avatar. It is in fact a lot like a fireworks display. Some of the CGI of human faces is crazy good – basically no uncanny valley of any kind. And I like the themes of the series, even if it’s all a bit heavy-handed and not subtle.