robert@barcik.training
An LLM takes text in and produces text out, nothing more. But context windows are finite and tokens cost money, so modern systems dynamically assemble the context for each call. This demo shows exactly what gets packed into the window, and in what order, for different types of interactions.
Scenario
Window
Context Window Capacity 0 / 200,000 tokens
Context Window Contents
Select a scenario and click Build Context to see how the window fills up.
Lazy loading in action! The LLM invoked a skill, so the compact description (~500 tokens) was replaced with the full skill text (~4,000 tokens), consuming eight times as many tokens. This is why skills are stored as brief summaries until they are actually needed.
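The swap described above can be sketched in a few lines. This is a minimal illustration, not this demo's actual implementation; the `Skill` class and the `pdf-export` skill are hypothetical.

```python
# Hypothetical sketch of skill lazy loading: the context window holds
# only a brief summary until the skill is invoked, at which point the
# full text is swapped in (at a much higher token cost).

class Skill:
    def __init__(self, name: str, summary: str, full_text: str):
        self.name = name
        self.summary = summary      # cheap: always kept in the window
        self.full_text = full_text  # expensive: loaded only on demand
        self.loaded = False

    def render(self) -> str:
        # What actually gets packed into the context window.
        return self.full_text if self.loaded else self.summary

pdf_skill = Skill(
    "pdf-export",
    "pdf-export: renders documents to PDF.",
    "Full step-by-step instructions... " * 120,  # stands in for the long version
)

before = len(pdf_skill.render())  # compact summary only
pdf_skill.loaded = True           # simulate the LLM invoking the skill
after = len(pdf_skill.render())   # full text now occupies the window
```

Until `loaded` flips, every call pays only for the one-line summary; the full text is charged against the window only in the calls where the skill is actually used.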
What's Happening

Choose a scenario above, then click Build Context to watch the context window fill up step by step.

Each colored block represents a different type of content that gets injected into the window before the LLM sees it.
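The packing described above can be sketched as a simple loop. This is an illustrative model, not the demo's real code; the block labels, the 4-characters-per-token estimate, and the `build_context` function are assumptions for the sketch.

```python
# Hypothetical sketch of context assembly: blocks are packed into the
# window in priority order until the token budget runs out.

def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)

def build_context(blocks: list[tuple[str, str]], budget: int = 200_000) -> list[str]:
    """blocks: (label, text) pairs, highest priority first."""
    packed, used = [], 0
    for label, text in blocks:
        cost = count_tokens(text)
        if used + cost > budget:
            break  # window is full; lower-priority blocks are left out
        packed.append(f"[{label}]\n{text}")
        used += cost
    return packed

context = build_context([
    ("system", "You are a helpful assistant."),
    ("memory", "User prefers concise answers."),
    ("tools", "search(query), calculator(expr)"),
    ("conversation", "User: What's 2+2?"),
])
```

The key design choice is the ordering: whatever comes first in the list is guaranteed a place in the window, and whatever comes last is the first thing dropped when the budget is tight.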

Key Insight
The LLM doesn't "remember" anything between calls. Every single interaction starts from scratch — the system must reconstruct the full context each time, fitting memories, instructions, tools, and conversation into a finite window. Smart engineering is about choosing what to include and what to leave out.
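That statelessness can be made concrete with a short sketch. The `llm_call` function below is a hypothetical stand-in for a real model API; the point is that all memory lives in the calling system, which re-sends the history on every call.

```python
# Hypothetical sketch of statelessness: the "model" is a pure function
# of its input, so the caller must rebuild the full context every time.

def llm_call(context: str) -> str:
    # Stand-in for a real model API: no hidden state between calls.
    return f"response to {len(context)} chars of context"

history: list[str] = []  # lives in *your* system, not inside the model

def chat(user_msg: str) -> str:
    history.append(f"User: {user_msg}")
    # Reconstruct the entire context from scratch for this one call.
    context = "\n".join(["System: be concise."] + history)
    reply = llm_call(context)
    history.append(f"Assistant: {reply}")
    return reply

chat("hello")
chat("what did I just say?")  # only answerable because history was re-sent
```

Each turn makes the context longer, which is exactly why the packing and trimming shown in this demo matter: without them, the reconstructed context would eventually overflow the window.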