Context Images: Docker for Conversations

A sequel to WadTown Manifesto: The Miracle of Queryable Global Context.

At MLSys 2026, Charles Packer, Lianmin Zheng, Andrej Karpathy, and Solomon Hykes slowly realize they are all describing the same missing abstraction from different angles. Across fragmented hallway conversations and lossy retellings, the conference itself begins behaving like a distributed inference system. KV caches become compiled artifacts, agents become resumable runtimes, and transcripts become logs instead of memory. By the end, nobody remembers who first said wha, only that context images are Docker for conversations.
ChatGPT Prompt

SCENE 1 — COFFEE LINE

(Morning. Packer and Zheng waiting for espresso.)

Packer

The problem is that everyone treats conversations as transcripts instead of managed memory.

Zheng

The expensive part is recomputing KV state every request.

Packer

Right, because context windows are functioning like RAM.

Zheng

No, context windows are just the visible abstraction over attention locality.

Packer

…that’s basically memory management.

Zheng

It’s cache scheduling.

(Barista calls a name.)

Barista

“Charles?”

Packer

That’s me.

(He grabs coffee.)

Packer

Anyway, agents need virtual memory.

Zheng

We already built paging for KV blocks.

(They stare at each other briefly.)

Both

“Huh.”

(They leave in opposite directions.)

Blackout.

SCENE 2 — HALLWAY OUTSIDE “LLM SERVING AT SCALE”

(Late morning. Zheng runs into Karpathy.)

Karpathy

How’s the inference world?

Zheng

Everyone wants million-token contexts because they keep replaying conversations.

Karpathy

That feels architecturally wrong.

Zheng

It is. We should persist KV state directly.

Karpathy

So conversations are basically serialized process reconstruction.

Zheng

More or less.

Karpathy

Wait.

(He stops walking.)

Karpathy

Are we rebuilding working memory from logs every API call?

Zheng

Yes.

Karpathy

That’s horrifying.

Zheng

It’s stateless infrastructure.

Karpathy

No no no. That means chat history is just stdout.

(ZHENG laughs despite himself.)

Zheng

You sound like an operating systems person.

Karpathy

I think the industry accidentally became an operating systems field sometime last year.

(Beat.)

Zheng

Someone earlier told me agents need virtual memory.

Karpathy

That’s not even metaphorical anymore, is it?

(Conference volunteer interrupts.)

Volunteer

Panel starts in two minutes.

(They leave.)

Blackout.

SCENE 3 — SPEAKER DINNER

(Long table. Loud restaurant. Karpathy sitting beside Hykes.)

Hykes

So what are people obsessing over this year?

Karpathy

Context engineering.

Hykes

What does that mean?

Karpathy

Nobody knows.

(Beat.)

Karpathy

Officially it means prompts and retrieval.

Hykes

And unofficially?

Karpathy

Persistent cognitive runtimes with memory hierarchies and resumable inference state.

Hykes

…

Hykes

That just sounds like containers.

Karpathy

No, because these are conversations.

Hykes

Containers are also conversations. We just call them processes.

Karpathy

No, but these have memory and branching and checkpointing and resumability—

Hykes

So containers.

Karpathy

No, because they’re stochastic.

Hykes

So distributed systems.

(Long pause.)

Hykes

Wait. Are people replaying the entire conversation every request?

Karpathy

Yes.

Hykes

Why?

Karpathy

Because the APIs are stateless.

Hykes

That’s insane.

Karpathy

THANK YOU.

(Nearby attendees turn and stare.)

Hykes

Why don’t you snapshot runtime state?

Karpathy

Because the KV cache is architecture-specific and tied to tokenizer alignment and attention layout.

Hykes

Docker images are architecture-specific too.

(Silence.)

Karpathy

…

Hykes

…

Karpathy

Oh no.

Blackout.

SCENE 4 — HOTEL LOBBY, MIDNIGHT

(Packer typing furiously on laptop. Hykes enters carrying tea.)

Hykes

You’re the memory hierarchy guy.

Packer

That’s not what I said.

Hykes

Close enough.

Packer

I said agents need persistent working memory instead of replaying transcripts.

Hykes

Right. So why don’t you compile contexts into runnable artifacts?

Packer

Because contexts aren’t portable.

Hykes

Neither are containers.

Packer

…

Hykes

You rebuild from source when compatibility breaks.

Packer

…

Hykes

You HAVE source representations, right?

Packer

Transcripts. Retrieval indexes. Tool bindings. Memory stores.

Hykes

That’s a build context.

Packer

No, because—

(He stops.)

Hykes

You okay?

Packer

I think conversations are build logs.

(Long silence.)

Hykes

That sounds either very profound or deeply unhealthy.

Packer

Oh my god.

(He opens laptop again.)

Packer

Okay. Wait.

(Typing rapidly.)

Packer

Transcript equals source.

Hykes

Sure.

Packer

KV cache equals compiled artifact.

Hykes

Yep.

Packer

Running agent equals container.

Hykes

Obviously.

Packer

Context image.

Hykes

There it is.

Blackout.

SCENE 5 — AIRPORT SHUTTLE, FINAL MORNING

(All four accidentally end up together.)

Karpathy

Okay, apparently we’ve all been having the same conversation independently.

Zheng

Approximately the same conversation.

Packer

Lossily compressed.

Hykes

With poor cache locality.

(They nod.)

Karpathy

Let me see if I understand this.

(Counting on fingers.)

Karpathy

Conversations are not the runtime.

Zheng

Correct.

Karpathy

They are reconstruction artifacts for the runtime.

Packer

Yes.

Karpathy

The runtime is attention state plus tools plus working memory plus retrieval context.

Zheng

And KV locality.

Karpathy

Sure.

Karpathy

And the transcript is basically stdout.

Hykes

Exactly.

Packer

Which means agents are resumable cognitive containers.

Zheng

Backed by paged KV memory.

Karpathy

Compiled from conversational source code.

Hykes

Now you’re getting it.

(Silence.)

Shuttle Driver

You guys with the conference?

All Four

Yeah.

Shuttle Driver

What’s it about?

(Long pause.)

Karpathy

We think chatbots accidentally became operating systems.

(Driver nods like this explains nothing.)

Packer

No, no. Conversations are deployable runtime artifacts.

Zheng

No. Contexts are executable cache topologies.

Hykes

No. You reinvented Docker.

(Beat.)

Karpathy

Human conversation is just low-bandwidth distributed inference.

(Everyone goes quiet.)

Shuttle Driver

…

Shuttle Driver

So… computers?

Blackout.

Appendix I: Why These Four?

How did their past shape them for this epiphany?

Charles Packer arrives through memory.

His work on MemGPT frames LLMs as operating-system-like entities constrained by finite context windows, and proposes “virtual context management” inspired by hierarchical memory and paging. So he is primed to see the chat transcript not as the thing itself, but as an awkward mechanism for reconstructing a larger memory illusion.

Lianmin Zheng arrives through runtime pressure.

vLLM and PagedAttention treat the KV cache as the scarce, dynamic resource that makes LLM serving hard, borrowing virtual-memory ideas to manage KV blocks efficiently and share them across requests. So he is primed to see the “conversation” as less important than the underlying cache topology it induces.

Andrej Karpathy arrives through abstraction collapse.

His “Software 2.0” framing helped popularize the idea that neural networks are not just applications, but a new kind of software substrate; more recently, he has emphasized LLMs as a new computing platform. So he is primed to recognize when AI practice stops being prompt craft and starts becoming systems architecture.

Solomon Hykes arrives through artifact discipline.

Docker made the industry fluent in the distinction between source, image, running container, logs, volumes, registries, architecture tags, and rebuildability; Hykes is the person in the room most likely to hear “serialized, architecture-specific, resumable runtime environment” and ask why everyone is avoiding the word “image.”

Together, they form a clean four-part circuit: memory, cache, platform, container.

None of them needs to invent the whole idea alone; each only has to misunderstand the next person productively.

Appendix II: Why Docker?

Docker is the right analogy because it changed the unit of deployment.

Before Docker, people often talked about applications as if the source code was the thing. But in practice, the runnable thing was always larger. It included:

dependencies
filesystem layout
environment variables
startup commands
architecture assumptions
runtime permissions
mounted volumes
logs

Docker gave that whole messy bundle a name: an image.

Conversations have the same problem. A transcript looks like the thing, but the runnable thing is larger. It includes:

system prompt
tool schemas
retrieval bindings
memory state
tokenizer state
KV cache
model fingerprint
runtime layout
continuation point

A context image names that bundle.

The Analogy Works Because Portability Has Constraints

Docker images are not magically portable. They are portable within explicit constraints. They depend on:

architecture
OS assumptions
runtime compatibility
rebuildable source

Context images would have the same shape. They are tied to:

model
tokenizer
quantization
position encoding
attention layout
runtime implementation

But they remain rebuildable from:

transcript
memory graph
retrieval configuration
tool bindings
source documents

The Conceptual Mapping

Most importantly, Docker separated concepts that used to blur together:

source tree / Dockerfile → transcript, memory graph, retrieval config
image → serialized runnable context state
container → live agent or inference process
logs → chat transcript
volumes → external memory and tools
registry → shared context repository

The Real Insight

So “Docker for conversations” is not just a branding metaphor. It says the industry is using logs as runtimes, when it should be building, versioning, forking, running, and garbage-collecting context as infrastructure.

Appendix III: Context as Infrastructure

Once conversations become context images, context stops being prose and starts being infrastructure.

That means it needs infrastructure disciplines:

build
version
cache
fork
inspect
run
mount
garbage-collect
rebuild
audit

The Problem: Context as a Junk Drawer

Today, most AI systems treat context as a blob of text assembled at the last possible moment. The prompt becomes a junk drawer: instructions, examples, retrieved documents, user history, tool schemas, policies, scratchpad, and memory all crammed into one serialized message stream.

That works until the system becomes important.

That works until the system becomes important.

Then context needs the same maturity we expect from deployment artifacts. You need to know:

what source material produced this context
which model and tokenizer it was built for
which tools were mounted
which memory stores were included
which retrieval indexes were used
what changed since the last run
whether the image can be rebuilt
whether two images share layers
whether a branch can be resumed

The Shift: From Prompting to Context Operations

This is the shift from prompting to context operations.

A mature context system would treat the transcript as only one input among many. The runnable context might be assembled from:

system instructions
user preferences
project memory
relevant documents
tool definitions
retrieval indexes
prior decisions
active task state
KV cache layers
runtime metadata

The context image is the compiled artifact. The live agent is the running process. The transcript is the log.

The context image is the compiled artifact. The live agent is the running process. The transcript is the log.

That separation matters because it makes context governable. Teams could review context diffs, pin versions, reproduce failures, roll back bad memories, share known-good images, and isolate experimental branches.

Reusable Context Layers

It also makes context reusable. Instead of rebuilding the same expensive working set for every conversation, a system could maintain durable layers:

base assistant behavior
organization knowledge
project state
customer state
active incident state
personal working memory

Each layer could be rebuilt from source, cached when hot, invalidated when stale, and forked when exploration begins.

This is what “context as infrastructure” means: not better prompts, but better lifecycle management for the state that makes intelligence useful.

Appendix IV: The Context Image Spec v1.0

A Context Image is a rebuildable, runnable artifact for continuing an AI interaction from a known state.

It is not a chat transcript.
It is not a memory database.
It is not a model checkpoint.
It is not merely a prompt.

It is the compiled form of a context environment.

A Context Image contains enough information for a compatible runtime to resume, fork, inspect, or rebuild an active cognitive process.

1. Core Definition

A Context Image is composed of three layers:

source
compiled state
runtime manifest

The source is the human-readable and rebuildable material from which the context was created.

The compiled state is the optimized runtime representation, such as KV cache blocks, prefix-cache layers, retrieval bindings, tool schemas, and active working memory.

The runtime manifest explains what the image is, how it was built, what it depends on, and where it may safely run.

The source is canonical.
The compiled state is disposable.
The manifest is the contract.

2. Required Manifest Fields

Every Context Image must declare:

image name
image version
creation timestamp
parent image, if any
source hash
model fingerprint
tokenizer fingerprint
runtime fingerprint
context length
position encoding configuration
quantization or precision format
KV cache layout
tool schema version
retrieval configuration
memory mounts
security policy
rebuild instructions

A runtime may refuse to load a Context Image if any required compatibility field does not match.

This is not a failure of portability.
This is honest portability.

3. Source Bundle

The source bundle should contain the rebuildable ingredients of the image.

These may include:

system instructions
developer instructions
user instructions
chat transcript
summarized history
project memory
user preferences
source documents
retrieval indexes
tool definitions
environment variables
policy constraints
previous decisions
active task state

The source bundle should be inspectable, diffable, and versionable.

A Context Image without a source bundle is only a snapshot.
A Context Image with a source bundle is infrastructure.

4. Compiled State

The compiled state may contain runtime-specific artifacts.

These may include:

serialized KV cache blocks
prefix-cache layers
attention-position metadata
tokenized prompt segments
embedding handles
retrieval cache entries
tool-call state
working-memory slots
branch lineage
scheduler hints
cache locality hints

Compiled state is allowed to be architecture-specific.

It may depend on:

model weights
tokenizer
quantization
runtime implementation
attention layout
GPU architecture
CPU architecture
cache allocator
block size
position encoding scheme

A Context Image runtime should treat compiled state as an optimization, not as the source of truth.

If compiled state is invalid, stale, corrupt, or incompatible, the runtime should attempt to rebuild it from source.

5. Layers

A Context Image may be layered.

Each layer represents reusable context.

Common layers include:

base model behavior
organization context
team context
project context
customer context
task context
incident context
personal working memory
active continuation state

Layers should be immutable once published.

Mutable state should live in a writable top layer.

This enables:

reuse
branching
cache sharing
incremental rebuilds
rollback
provenance tracking
garbage collection

A good Context Image system should avoid duplicating expensive lower layers when forking.

6. Lifecycle Operations

A Context Image runtime should support the following operations:

build — Construct a Context Image from source material.
run — Start a live inference process from an image.
resume — Continue from a saved image state.
fork — Create a branch from an existing image.
commit — Save the current runtime state as a new image.
inspect — Show manifest, lineage, source hashes, mounted tools, and memory dependencies.
diff — Compare two images by source, manifest, memory, or transcript.
rebuild — Regenerate compiled state from canonical source.
evict — Remove compiled state while preserving source.
gc — Remove unreachable layers and unused cache blocks.
export — Package source and manifest for another environment.
import — Load an image, validating compatibility before execution.

The minimal viable runtime supports:

build
run
fork
commit
rebuild
inspect

Everything else is polish.

7. Compatibility

A Context Image is compatible with a runtime only if the declared execution environment matches.

Compatibility should be checked against:

model family
exact model weights
tokenizer
vocabulary
chat template
context length
RoPE or position encoding
quantization
attention implementation
KV layout
tool schema
memory API
retrieval API
safety policy

The runtime must distinguish between:

source-compatible
rebuild-compatible
binary-compatible
runtime-compatible

For example:

A transcript may be source-compatible across many models.
A tokenized prompt may be rebuild-compatible only with the same tokenizer.
A KV cache may be binary-compatible only with the same model and runtime.
A live continuation may be runtime-compatible only on the same local machine.

This distinction prevents false portability.

8. Runtime Identity

A running Context Image is not the same thing as the image itself.

The image is the artifact.
The running process is the instance.

A single image may produce many live instances.

Those instances may:

diverge
branch
mutate working memory
call different tools
produce different transcripts
commit different descendants

The transcript belongs to the instance.

The lineage belongs to the image.

The memory writes belong to the mounted volumes.

9. Logs and Transcripts

A transcript is a log.

It may be used for:

audit
replay
debugging
rebuilding
human review
summarization
provenance

But the transcript is not the runtime.

The runtime includes state that may not appear directly in the transcript, including:

cached attention state
tool handles
retrieval bindings
active memory mounts
hidden scheduler state
unresolved continuations
branch ancestry

A mature system should preserve transcripts, but should not confuse them with executable context.

10. Memory Mounts

Context Images may mount external memory.

Memory mounts may be:

read-only
read-write
ephemeral
persistent
local
remote
user-scoped
project-scoped
organization-scoped

Examples include:

vector databases
document stores
file systems
knowledge graphs
issue trackers
code repositories
user preference stores
tool histories
prior decision logs

A Context Image should record what memory was mounted, but should not necessarily copy all mounted memory into the image.

The image contains bindings.
The mount contains data.

11. Security and Audit

A Context Image may contain sensitive state.

It may encode information in:

transcripts
summaries
retrieved documents
memory handles
tool results
KV cache blocks
embeddings
latent continuation state

Therefore, a runtime should support:

manifest inspection
source inspection
redaction
access control
signature verification
provenance tracking
policy validation
encrypted storage
safe export modes
compiled-state eviction

A Context Image registry should not accept opaque runtime blobs without source, provenance, or compatibility metadata.

Opaque snapshots are convenient.
Auditable images are infrastructure.

Opaque snapshots are convenient.
Auditable images are infrastructure.

12. Rebuild Semantics

Every serious Context Image should answer one question:

Can this image be rebuilt from source?

A rebuildable image should declare:

source files
source hashes
build order
model dependency
tokenizer dependency
retrieval dependency
memory dependency
tool dependency
build parameters
deterministic settings, where available

Rebuilding may not reproduce stochastic outputs exactly.

But it should reproduce the runnable context environment closely enough to continue, inspect, debug, or validate the process.

The goal is not perfect determinism.

The goal is operational trust.

13. Branching

Forking is a first-class operation.

A forked Context Image should preserve:

parent image reference
fork timestamp
inherited layers
modified top layer
transcript divergence point
memory write policy
compatibility metadata

Branches should be cheap when lower layers are shared.

The runtime should support copy-on-write behavior for:

KV cache blocks
working memory
retrieved context
tool state
transcript logs

Merging branches is not required in v1.0.

Summarizing branches is allowed.
Diffing branches is encouraged.
Pretending semantic merge is solved is forbidden.

14. Example Manifest

schema: context-image/v1.0
name: quilt/customer-review
version: 2026.05.22
created_at: 2026-05-22T09:30:00-07:00
parent: quilt/base-engineering:2026.05
source:
  transcript: transcript.jsonl
  memory_graph: memory.yaml
  retrieval_config: retrieval.yaml
  tools: tools.yaml
  source_hash: sha256:...
model:
  name: llama-3.1-8b-instruct
  weights_hash: sha256:...
  tokenizer_hash: sha256:...
  chat_template_hash: sha256:...
  context_length: 131072
  position_encoding: rope
runtime:
  engine: llama.cpp
  engine_version: ...
  kv_layout: contiguous-v1
  quantization: q4_k_m
  compatible_arch:
    - arm64
    - x86_64
compiled_state:
  kv_cache: kv.bin
  token_prefix: tokens.bin
  prefix_layers:
    - base-assistant
    - quilt-engineering
    - customer-review
mounts:
  memory:
    - name: project-memory
      mode: read-write
  retrieval:
    - name: docs-index
      mode: read-only
  tools:
    - name: file-search
    - name: shell
policy:
  exportable: false
  allow_compiled_state_export: false
  allow_source_export: true
  redact_on_export:
    - secrets
    - credentials
    - private_documents
rebuild:
  command: ctx build .
  deterministic: partial

15. Example CLI

ctx build .
ctx inspect quilt/customer-review
ctx run quilt/customer-review
ctx fork quilt/customer-review quilt/customer-review-alt
ctx commit quilt/customer-review-alt:v2
ctx diff quilt/customer-review quilt/customer-review-alt
ctx evict quilt/customer-review --compiled-state
ctx rebuild quilt/customer-review
ctx gc

The CLI should make the distinction obvious:

ctx build creates an image
ctx run creates an instance
ctx commit saves an instance as a new image
ctx evict deletes acceleration state
ctx rebuild reconstructs acceleration state from source

16. Non-Goals for v1.0

Context Images v1.0 does not attempt to solve:

universal portability
semantic branch merging
cross-model KV translation
deterministic replay of stochastic generations
safe sharing of arbitrary opaque caches
replacing transcripts
replacing memory systems
replacing model checkpoints
replacing agent frameworks

The goal is narrower:

Define the missing artifact between transcript and runtime.

17. The One-Sentence Spec

A Context Image is a rebuildable, architecture-aware, runnable artifact that packages the state needed to continue an AI interaction, while preserving a clean separation between source, compiled context, live instance, and transcript log.

Context Images: Docker for Conversations

SCENE 1 — COFFEE LINE

SCENE 2 — HALLWAY OUTSIDE “LLM SERVING AT SCALE”

SCENE 3 — SPEAKER DINNER

SCENE 4 — HOTEL LOBBY, MIDNIGHT

SCENE 5 — AIRPORT SHUTTLE, FINAL MORNING

Appendix I: Why These Four?

How did their past shape them for this epiphany?

Appendix II: Why Docker?

The Analogy Works Because Portability Has Constraints

The Conceptual Mapping

The Real Insight

Appendix III: Context as Infrastructure

The Problem: Context as a Junk Drawer

The Shift: From Prompting to Context Operations

Reusable Context Layers

Appendix IV: The Context Image Spec v1.0

1. Core Definition

2. Required Manifest Fields

3. Source Bundle

4. Compiled State

5. Layers

6. Lifecycle Operations

7. Compatibility

8. Runtime Identity

9. Logs and Transcripts

10. Memory Mounts

11. Security and Audit

12. Rebuild Semantics

13. Branching

14. Example Manifest

15. Example CLI

16. Non-Goals for v1.0

17. The One-Sentence Spec

One thought on “Context Images: Docker for Conversations”

Add yours

Leave a comment Cancel reply

SCENE 1 — COFFEE LINE

SCENE 2 — HALLWAY OUTSIDE “LLM SERVING AT SCALE”

SCENE 3 — SPEAKER DINNER

SCENE 4 — HOTEL LOBBY, MIDNIGHT

SCENE 5 — AIRPORT SHUTTLE, FINAL MORNING

Appendix I: Why These Four?

How did their past shape them for this epiphany?

Appendix II: Why Docker?

The Analogy Works Because Portability Has Constraints

The Conceptual Mapping

The Real Insight

Appendix III: Context as Infrastructure

The Problem: Context as a Junk Drawer

The Shift: From Prompting to Context Operations

Reusable Context Layers

Appendix IV: The Context Image Spec v1.0

1. Core Definition

2. Required Manifest Fields

3. Source Bundle

4. Compiled State

5. Layers

6. Lifecycle Operations

7. Compatibility

8. Runtime Identity

9. Logs and Transcripts

10. Memory Mounts

11. Security and Audit

12. Rebuild Semantics

13. Branching

14. Example Manifest

15. Example CLI

16. Non-Goals for v1.0

17. The One-Sentence Spec

Share this:

One thought on “Context Images: Docker for Conversations”

Add yours

Leave a comment Cancel reply