
The Last Computer

Toward the Eradication of Source Code: A Unified Framework for Intent-Driven, AI-Native Computation and the Emergence of Ephemeral Software

SeekHigherThings Research · April 2026

Abstract

This thesis presents a comprehensive theoretical and empirical framework for AI-native computation — a post-code computational paradigm in which artificial intelligence does not assist in writing software but constitutes the software itself. We argue that the stored-program architecture, the foundational model of computation since John von Neumann's 1945 formalization, is reaching the terminal phase of its relevance.

The emergence of real-time interactive world models, transformer-specific inference silicon, and structured intent languages provides the technological substrate for a fundamentally new architecture: one in which human intent serves as the sole input and continuously generated interactive experience serves as the sole output. No source code, intermediate representation, or static binary artifact exists at any layer.

We introduce the concept of ephemeral software — applications that are generated on demand, perfectly tailored to a single user's momentary need, and dissolved when no longer required. This inverts the $1 trillion global software industry's prevailing model of “one application serving millions” into “one intelligence generating millions of unique applications.”

Drawing on empirical evidence from Google Research's GameNGen, DeepMind's Genie 3, Decart's Oasis, the DIAMOND framework, and the Etched Sohu transformer ASIC, we demonstrate that every prerequisite technology for this paradigm shift either exists today or is under active, well-funded development.

Chapter 1: The End of the Stored Program

1.1 The Von Neumann Bottleneck

For eighty years, every digital computer on Earth has operated according to a single paradigm: the stored-program architecture. Formalized by John von Neumann in his seminal 1945 report on the EDVAC, this model dictates that human operators must compose deterministic instructions, encode them as static data within a memory system, and rely on a central processing unit to fetch, decode, and execute them sequentially. Every human desire must first be translated into this instruction stream, a constraint this thesis refers to, by extension, as the von Neumann bottleneck.

This bottleneck has defined the economics, sociology, and epistemology of the entire information age. It created a professional class of approximately 30 million software engineers worldwide who serve as translators between human desire and machine capability. It spawned a $1 trillion global industry organized around the management, distribution, debugging, and maintenance of text files.

This thesis argues that the von Neumann bottleneck is now being dissolved — not by making the human faster at writing instructions, but by eliminating the need for instructions entirely.

1.2 The Vibe Coding Inflection

The first empirical signal that the paradigm is ending arrived not from a research laboratory but from the everyday practice of software development. By late 2024, a phenomenon known colloquially as “vibe coding” had become widespread: developers prompting large language models to generate, modify, and deploy code that no human ever manually inspects.

This practice reveals a profound epistemological shift. When code is neither written nor read by a human, it ceases to be a communication medium and devolves into a legacy intermediate representation. The logical terminus of this trajectory is not “better AI-assisted coding.” It is the complete eradication of code.

1.3 Thesis Statement

We propose that the stored-program paradigm is entering its terminal decline. In its place, we introduce the concept of the AI-native runtime: a computational architecture in which a unified neural intelligence continuously generates interactive experience from structured human intent, with no source code, compiled binary, or static application artifact existing at any layer of the system.

Chapter 2: Theoretical Foundations

2.1 The Sculpture and the Fountain

Traditional software is a sculpture: an artifact chiseled from raw material (code), stored on a pedestal (a server), and observed by passersby (users). The sculpture is static. It does not change based on who views it. To modify it requires a sculptor to physically re-approach the artifact and re-carve it.

The AI-native runtime is a fountain: a dynamic shape made of flowing material (neural inference), held in place by continuous pressure (computational energy). The shape is real — you can see it, interact with it. But it is not stored. It is continuously generated. Stop the pressure and the shape dissolves.

2.2 From Compiler to Runtime

In classical computation theory, a compiler is a function C : S → T that maps programs in a source language S to a target language T. The architecture proposed in this thesis is not a compiler but a runtime — a continuously executing process that maintains no static intermediate artifact. The mathematical formulation is:

Experience(t+1) = WorldModel( Intent, State(t), Input(t) )

This function executes at 24–60 Hz. There is no stored program. The model's forward pass IS the computation. The software does not exist between interactions.
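As a concrete and deliberately simplified sketch, the formulation above can be written as a loop. The `world_model` function here is a hypothetical stub standing in for the neural forward pass, not an actual implementation:

```python
# Sketch of the AI-native runtime loop:
#   Experience(t+1) = WorldModel(Intent, State(t), Input(t))
# `world_model` is a hypothetical stub for the neural forward pass.

def world_model(intent, state, user_input):
    # One "forward pass": emit the next frame of experience plus the
    # deterministic state update it implies. Nothing is compiled or stored.
    frame = f"frame {state['t']} for intent={intent!r}, input={user_input!r}"
    return frame, {**state, "t": state["t"] + 1}

def run(intent, steps=3):
    """Continuously regenerate experience; no program exists between frames.
    A real system would pace this loop at 24-60 Hz."""
    state = {"t": 0}
    frames = []
    for _ in range(steps):
        frame, state = world_model(intent, state, user_input=None)
        frames.append(frame)
    return frames, state

frames, state = run("a pong game")
```

Stopping the loop leaves no artifact behind, which is precisely the fountain property of Section 2.1.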

2.3 The State-Experience Decoupling Principle

Our architecture resolves the coherence problem through what we term the State-Experience Decoupling Principle: deterministic state is held in a structured store outside the neural weights, while the neural model is responsible only for generating the experiential layer by reading from that store.

This mirrors human cognition. The brain does not store raw uncompressed video of past experiences. It stores compressed semantic facts and reconstructs the full experiential context on demand. The state store is the hippocampus. The world model is the neocortex.
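A minimal sketch of this decoupling follows; the `StateStore` API and the state schema are illustrative assumptions, not part of any specified architecture:

```python
# Sketch of the State-Experience Decoupling Principle. Deterministic facts
# live in a plain store outside the neural weights; the "world model" only
# reads the store to regenerate the experiential layer on demand.

class StateStore:
    """Deterministic state held outside the neural weights (the 'hippocampus')."""
    def __init__(self):
        self._facts = {}

    def write(self, key, value):
        self._facts[key] = value

    def read(self, key):
        return self._facts[key]

def render_experience(store: StateStore) -> str:
    # Stand-in for the neural model (the 'neocortex'): reconstructs the full
    # experiential context from compressed semantic facts.
    return f"score {store.read('score')} | ball at {store.read('ball_pos')}"

store = StateStore()
store.write("score", 3)
store.write("ball_pos", (0.4, 0.7))
frame_a = render_experience(store)
frame_b = render_experience(store)  # regenerated, not retrieved from a stored UI
```

Because the experience is a pure function of the store, regenerating a frame twice yields the same result: the coherence burden rests on the deterministic store, not on the model's memory.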

Chapter 3: Architecture of the AI-Native Substrate

3.1 The Tripartite Layer Model

| Layer | Function | Properties |
| --- | --- | --- |
| Intent Layer | Captures human desire via conversation. Resolves ambiguity. | Machine-verifiable. Contradiction-resolving. |
| World Model + State | Neural intelligence reads intent + state + input. Generates next frame. | 60 fps generation. Formal state contracts. |
| Experience Layer | Continuous visual, auditory, and interactive output. | Frame-by-frame. No stored UI. Fully adaptive. |

3.2 The One-Model Operating System

In the AI-native paradigm, the traditional boundaries between “applications” dissolve entirely. There is one unified neural intelligence that generates whatever experience the user's current intent demands. “Opening an application” becomes the model reallocating its attention.

The implications are transformative: zero interoperability friction, no installation or updates, universal accessibility by default, and context switching as attention reallocation.

Chapter 4: Empirical Evidence — Interactive World Models

Four research implementations have demonstrated that neural networks can generate interactive visual experiences in real-time without traditional game engines, rendering pipelines, or source code.

| System | Performance | Innovation | Limitation |
| --- | --- | --- | --- |
| GameNGen | 20 fps, PSNR 29.4 | Noise augmentation halts drift | Single game. No generalization. |
| Genie 3 | 720p, 24 fps | Latent action model. General physics. | Research prototype. Limited duration. |
| Oasis 2.0 | 1080p, 30 fps | State-renderer decoupling | Aesthetic layer only. |
| DIAMOND | 10 fps (RTX 3090) | Consumer GPU training. Open source. | Low resolution. |
| The Last Computer | Target: 1080p, 60 fps | Intent-driven. Any experience. | Under development. |

Chapter 5: The Hardware Imperative

If AI is the runtime, inference is the new electricity. The cost per million tokens of LLM inference has fallen from approximately $100 in 2022 to under $1 in 2026 — a 99%+ reduction in four years. On this trajectory, we project that by 2027 continuous neural rendering will cost less than equivalent cloud hosting.

Etched's Sohu chip eliminates GPU waste by physically burning the transformer dataflow into 4nm silicon. The result: 90% FLOPS utilization and over 500,000 tokens per second on Llama-70B from an 8-chip server — a 20x improvement over H100 configurations.

Chapter 6: Beyond Gaming — The Application Supernova

Games serve as the “fruit fly” of this research: strict rules, tight feedback loops, zero latency tolerance. But gaming is the proof of concept, not the product. The product is the total elimination of the “application” as a concept.

Ephemeral Enterprise Tools

A supply chain manager says what they need. A bespoke analysis workspace appears instantly. Used for ten minutes, then dissolved.

Radical Accessibility

The interface physically adapts to the user's exact biometric and cognitive profile. Accessibility is the default modality.

Crisis Management

A wildfire commander gets a 60fps interactive 3D simulation of the blaze, overlaid with live data. Summoned on demand.

Industrial Design

An engineer describes a drone chassis, virtually drop-tests it from 50 feet, reinforces the joints, and retests — all through conversation.

Immersive Education

A medical student explores an interactive 3D human heart with mitral valve prolapse, manipulating tissue in slow motion.

Interactive Cinema

A viewer directs a noir detective story set in 1940s Shanghai. The plot is imagined in real-time from their choices.

Personalized Medicine

An oncologist visualizes tumor progression with genomic overlays and simulates chemotherapy regimens interactively.

Legal Analysis

A general counsel compares contracts, highlights deviations, adjusts clauses, and sees risk models update instantly.

Creative Authorship

A teenager directs an animated short about a robot learning to dance — adjusting lighting, music, and emotion through conversation.

Adaptive Navigation

Navigation that adapts to scenic preferences, biometric data, and real-time conditions — generating a bespoke heads-up display.

Chapter 7: Behavioral Verification Without Code

The eradication of source code creates a profound verification challenge. Traditional QA relies on unit tests, integration tests, and code review — all of which presuppose a stable codebase. In the AI-native runtime, there is no codebase to test.

Our framework resolves this through intent-derived behavioral contracts: formal properties automatically extracted from the intent specification and enforced as mathematical invariants on the state store. When a user describes a Pong game, the system extracts implicit contracts: the ball must bounce at accurate reflection angles, and the score must be non-decreasing during active play.
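A hedged sketch of such contract checking on the Pong example; the contract names, the state schema, and the tolerance are illustrative assumptions:

```python
# Intent-derived behavioral contracts enforced as invariants on state
# transitions. Every name below is illustrative, not a specified API.

def score_non_decreasing(prev, curr):
    """The score may never drop during active play."""
    return curr["score"] >= prev["score"]

def reflection_angle_accurate(prev, curr, tol=1e-6):
    """On a wall bounce, the vertical velocity flips sign exactly."""
    if not curr.get("bounced"):
        return True
    return abs(curr["vy"] + prev["vy"]) <= tol

CONTRACTS = [score_non_decreasing, reflection_angle_accurate]

def check(prev, curr):
    """Return the names of all contracts a state transition violates."""
    return [c.__name__ for c in CONTRACTS if not c(prev, curr)]

prev = {"score": 2, "vy": 1.5, "bounced": False}
good = {"score": 2, "vy": -1.5, "bounced": True}
bad  = {"score": 1, "vy": -1.0, "bounced": True}
```

A transition that violates any extracted contract would be rejected and regenerated before it reaches the experience layer.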

For mission-critical applications, the runtime operates a parallel adversarial AI that continuously probes the primary system with edge cases. For high-stakes workflows, the system implements progressive autonomy: it evaluates its own confidence and demands human verification for uncertain actions.
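These two mechanisms can be sketched together; the probe generator, the clamping behavior, and the 0.9 confidence threshold are all illustrative assumptions rather than specified parts of the framework:

```python
import random

# Sketch of the parallel adversarial probe plus progressive autonomy.

def adversarial_probes(seed=0, n=5):
    """Stand-in for the adversarial AI: emit edge-case inputs."""
    rng = random.Random(seed)
    return [{"paddle_x": rng.choice([-1e9, 0.0, 1e9])} for _ in range(n)]

def primary_system(event):
    # The primary runtime must honor its contracts even on hostile inputs;
    # here the contract is simply that paddle_x stays within [0, 1].
    return {"paddle_x": max(0.0, min(1.0, event["paddle_x"]))}

def gate(action, confidence, threshold=0.9):
    """Progressive autonomy: act alone only when self-confidence is high."""
    return "execute" if confidence >= threshold else "ask_human"

survived = all(0.0 <= primary_system(p)["paddle_x"] <= 1.0
               for p in adversarial_probes())
```

The probe loop runs continuously in production; the gate decides, per action, whether the system may proceed without a human in the loop.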

Chapter 8: Economic Analysis

The global software development market is projected to exceed $1 trillion annually by 2030. This market is organized entirely around the creation, deployment, and maintenance of source code. The AI-native runtime invalidates that organizing assumption: that software is source code to be written, deployed, and maintained.

The more significant economic impact is not disruption but expansion. The total addressable market for software creation expands from approximately 30 million professional developers to the 7 billion humans capable of expressing an intention in natural language — a market expansion of over 200x.

In Q1 2026, AI startups attracted a record-breaking $242 billion in venture funding — approximately 80% of all global venture capital. The capital markets have clearly concluded that AI-native computation is the dominant investment thesis of the decade.

Chapter 9: Execution Roadmap

| Phase | Timeline | Objective | Investment |
| --- | --- | --- | --- |
| 0: Proof of Concept | Mo. 1–6 | Validate intent → experience loop | $5–8M |
| 1: Depth | Mo. 6–18 | Push each research pillar independently | $25–40M |
| 2: Integration | Mo. 18–30 | Non-programmer builds production app via conversation | $45–70M |
| 3: Generalization | Mo. 30–42 | Data pipelines, embedded, distributed | $55–80M |
| 4: The Platform | Mo. 42–60 | Any human creates any software through intent | $80–140M |

Total funding requirement: $210–338M over 5 years. Total team size at peak: 30–43 researchers and engineers.

Chapter 10: Open Problems and Risks

Intellectual honesty demands that we name the obstacles. A research program that does not confront its risks is not a research program — it is propaganda.

The Latency Wall: if real-time inference cannot achieve 60 fps for complex experiences, the vision collapses into a better code generator.
The Coherence Problem: whether formal state contracts scale to enterprise complexity (millions of state variables) remains open.
The Verification Gap: the “sweet spot” between formal and statistical verification — fast enough and rigorous enough at once — may not exist.
The Last 5%: if edge cases require dropping down to code, the paradigm leaks and practitioners end up maintaining two systems.
Adversarial Exploitation: if the system can generate any software from intent, it can generate malicious software from malicious intent.
Cultural Resistance: thirty million software engineers identify professionally with code. “No more code” is existentially threatening.

Chapter 11: Conclusion — The Last Computer

The stored-program computer defined the technological trajectory of the twentieth and early twenty-first centuries. It forced humans to meet machines on the machine's terms, requiring the nuanced, fuzzy, deeply contextual reality of human intent to be violently compressed into the rigid, unforgiving syntax of executable code. For eighty years, this compression was the price of computation.

The price is no longer necessary.

Through the convergence of interactive world models (Genie 3, GameNGen, Oasis, DIAMOND), extreme algorithmic efficiency optimizations (Diffusion Forcing, Consistency Distillation), hyper-specialized inference silicon (Etched Sohu, Groq LPU), and structured generation protocols (A2UI), the technological substrate now exists to instantiate software dynamically, in real time, from pure human intent.

We are building the last computer anyone will ever need. Not because it does everything. But because it can become anything.

SeekHigherThings · The Last Computer · Intent in. Reality out.

Confidential — For Discussion Purposes Only — April 2026