Intent Understanding & Evolution
Transforming fuzzy human intent into precise behavioral specifications through conversation. This is where the human meets the machine.
Current Frontier
Intent-to-spec translation: converting natural language descriptions into formal behavioral contracts.
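To make "formal behavioral contract" concrete, here is a minimal sketch of what the target of such a translation might look like. All names (`BehavioralSpec`, its fields, the example values) are hypothetical illustrations, not drawn from any of the papers below.

```python
from dataclasses import dataclass, field

@dataclass
class BehavioralSpec:
    """A behavioral contract distilled from a natural-language request."""
    goal: str                                                # the request, restated precisely
    constraints: list[str] = field(default_factory=list)     # conditions that must hold
    defaults: dict[str, str] = field(default_factory=dict)   # assumptions the system filled in
    open_questions: list[str] = field(default_factory=list)  # ambiguities still unresolved

# Example: the fuzzy request "make the button nicer" might distill into:
spec = BehavioralSpec(
    goal="Restyle the primary call-to-action button",
    constraints=["minimum touch target 44px", "meet WCAG AA contrast"],
    defaults={"color": "brand blue", "corner_radius": "8px"},
    open_questions=["Should secondary buttons change too?"],
)
```

The point of the structure is that defaults and open questions are recorded explicitly rather than silently assumed, so they can be surfaced back to the user.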
Key Questions
Can intent be captured precisely enough through conversation alone?
How should the system handle ambiguity: ask for clarification, or apply reasonable defaults?
How does intent evolve over a session, and how does the system track that evolution?
What's the right interface between human intent and machine execution?
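The clarify-vs-default question above can be framed as an expected-regret tradeoff. This is a hypothetical policy sketch (the function name, threshold, and regret formula are illustrative assumptions, not an established method): ask only when the system is unsure *and* a wrong guess would be expensive to undo.

```python
def resolve_ambiguity(confidence: float, cost_of_wrong_default: float,
                      threshold: float = 0.5) -> str:
    """Decide whether to ask the user or proceed with a default.

    Expected regret = probability the default is wrong * cost of being wrong.
    Above the threshold, interrupting the user is worth it; below it, a
    recorded default keeps the conversation flowing.
    """
    expected_regret = (1.0 - confidence) * cost_of_wrong_default
    return "ask_clarification" if expected_regret > threshold else "apply_default"

print(resolve_ambiguity(confidence=0.9, cost_of_wrong_default=1.0))  # apply_default
print(resolve_ambiguity(confidence=0.3, cost_of_wrong_default=2.0))  # ask_clarification
```

A design consequence: cheap, easily reversed decisions (a color) should almost never trigger a question, while costly ones (deleting data) should, even at high confidence.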
Key Papers
PAN: Language-Conditioned World Actions
MBZUAI (Nov 2025)
Natural-language control of world actions ('turn left and speed up'). Claims the highest fidelity among open-source models.
FOUNDER: Bridging LLMs with World Models
ICML 2025
Bridges LLMs (intent/narrative) with world models (physics/dynamics). THE architecture paper for intent.
A2UI Protocol
Google (Dec 2025)
Declarative intent → rendered output protocol. Shows how to structure intent-to-experience pipelines.
UniSim: Universal Simulation
DeepMind (ICLR 2024)
Simulates both high-level instructions and low-level controls. Multi-modal intent understanding.
GameGen-X: Interactive Game Generation
ICLR 2025
InstructNet for interactive control of generated game content from natural language instructions.
Current Insights
Intent exists on a spectrum from vague ('make it nice') to precise ('the button should be 44px, blue, rounded'). The system needs to handle the full spectrum.
Intent evolution — the user changes their mind mid-session — is not an edge case. It's the normal case. The system must handle it gracefully.
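One way to treat evolution as the normal case is to represent session intent as an ordered log of revisions rather than a single snapshot, with the effective spec computed by merging deltas (latest wins). This is a minimal sketch; `IntentRevision`, `IntentLog`, and the delta-merge rule are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class IntentRevision:
    timestamp: str        # when the revision was made
    utterance: str        # what the user said
    delta: dict[str, str] # which spec fields this utterance changes

class IntentLog:
    """Track intent as an ordered series of revisions, not one snapshot."""
    def __init__(self) -> None:
        self.revisions: list[IntentRevision] = []

    def record(self, utterance: str, delta: dict[str, str]) -> None:
        now = datetime.now(timezone.utc).isoformat()
        self.revisions.append(IntentRevision(now, utterance, delta))

    def current_spec(self) -> dict[str, str]:
        # Effective spec = merge of all deltas in order; latest value wins.
        spec: dict[str, str] = {}
        for rev in self.revisions:
            spec.update(rev.delta)
        return spec

log = IntentLog()
log.record("make the button blue", {"color": "blue"})
log.record("actually, make it green", {"color": "green"})
print(log.current_spec())  # {'color': 'green'}
```

Keeping the full log (rather than only the merged result) also lets the system answer "why is it green?" by pointing at the utterance that last set each field.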