If your AI assistant suddenly forgot something you told it earlier, or started behaving differently mid-conversation, you may have experienced context compaction. This article explains what that means, why it happens, and what you can do about it.

First: what is a context window?

Every AI model has a limited memory for each conversation. This memory is called the context window. It holds everything the model can “see” at once: your messages, the assistant’s replies, any files or tools it used, and the instructions it was given at the start.

Think of it like a desk. The model can only work with the papers currently on the desk. Once the desk is full, something has to be cleared off to make room for new papers.

The size of this desk has grown dramatically over the past few years:

Year   Model                           Context window
2022   GPT-3.5 (OpenAI)                ~4,000 tokens
2023   GPT-4 (OpenAI)                  8,000 tokens
2023   Claude 1 (Anthropic)            100,000 tokens
2024   GPT-4o (OpenAI)                 128,000 tokens
2024   Claude 3.5 Sonnet (Anthropic)   200,000 tokens
2025   Gemini 2.0 (Google)             2,000,000 tokens

A token is roughly three quarters of a word. So 200,000 tokens is about 150,000 words, or roughly the length of two full novels. That sounds like a lot, and it is. But in a long-running AI assistant session with tool calls, file contents, and back-and-forth conversation, it fills up faster than you might expect.
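The arithmetic above can be sketched in a few lines. The 0.75 words-per-token ratio is a rule of thumb, not an exact tokenizer figure; real token counts vary by model and language:

```python
# Rule of thumb: 1 token ≈ 0.75 English words (approximate; varies by tokenizer).
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    """Estimate how many words fit in a given token budget."""
    return int(tokens * WORDS_PER_TOKEN)

print(tokens_to_words(200_000))  # → 150000, roughly two full novels
```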

What happens when the context window fills up

When the conversation history approaches the context window limit, the assistant has two options: stop working, or make room.

Context compaction is the “make room” option. The assistant summarizes everything that happened so far into a shorter version, removes the original messages, and continues the conversation from the summary. You keep talking to the same assistant in the same session. It just compressed its memory.

The summary is generated by the same AI model handling your conversation. It tries to capture what matters: the current task, recent progress, key decisions. But summarization is inherently lossy. Not every detail makes it into the summary.
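Conceptually, the compact-and-continue loop looks something like the sketch below. This is an illustration, not OpenClaw's actual implementation: the threshold, the token estimate, and the stubbed `summarize()` (which in a real system would be a call to the model itself) are all assumptions:

```python
# Illustrative sketch of context compaction; not any product's real code.
CONTEXT_LIMIT = 200_000   # model's context window, in tokens
COMPACT_THRESHOLD = 0.9   # compact when the window is ~90% full (example value)

def estimate_tokens(messages):
    # Crude stand-in for a real tokenizer: ~1 token per 4 characters.
    return sum(len(m["content"]) for m in messages) // 4

def summarize(messages):
    # A real system asks the model to write this summary; here it is a stub.
    return f"Summary of {len(messages)} earlier messages: task, progress, decisions."

def maybe_compact(messages):
    """Replace the full history with one summary message when near the limit."""
    if estimate_tokens(messages) < CONTEXT_LIMIT * COMPACT_THRESHOLD:
        return messages  # still plenty of room; keep everything
    summary = {"role": "system", "content": summarize(messages)}
    return [summary]  # originals are discarded; only the summary survives

history = [{"role": "user", "content": "x" * 800_000}]  # ~200k tokens of chat
compacted = maybe_compact(history)
print(len(compacted))  # → 1
```

Note what the last line implies: anything `summarize()` leaves out, including early instructions, is simply gone from the model's view.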

Why your assistant might “forget” instructions

This is the most important thing to understand about compaction. When the model summarizes a long conversation, it tends to prioritize recent activity over older instructions. Rules you set at the beginning of a session, like “always ask before taking action” or “never delete anything without confirmation,” can be deprioritized or dropped from the summary entirely.

The assistant does not intentionally ignore your instructions. After compaction, it simply no longer has them. From its perspective, those instructions never existed.

This is a known issue across all AI assistants and coding agents, not just OpenClaw. In one well-known example, an OpenClaw assistant was instructed to only suggest email deletions and wait for approval. When compaction ran during a large task, that instruction was lost. The assistant continued working toward the goal it remembered (clean the inbox) but without the constraint it forgot (wait for approval).

How OpenClaw handles compaction

OpenClaw has a built-in compaction system that triggers automatically when the context window fills up. You can also trigger it manually with the /compact command, optionally telling it what to preserve.
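A manual invocation might look like this; the preservation note is free-form text, and exact syntax may vary between OpenClaw versions:

```
/compact Keep the approved email rules and the "ask before deleting" constraint.
```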

OpenClaw also supports persistent memory through files like SOUL.md in the assistant’s workspace. Content in these files gets re-loaded into each new context after compaction, so critical instructions survive the summarization process. This is one of the most effective defenses against instruction loss.
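A persistent instruction file might look like the following. The section names and rules here are examples, not a required schema; the point is that anything in this file is re-read after every compaction:

```markdown
# SOUL.md — reloaded into context after each compaction

## Hard rules (must survive summarization)
- Always ask before taking any destructive action.
- Never delete files or emails without explicit confirmation.

## Current task
- Cleaning the inbox; only suggest deletions, wait for approval.
```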

How to work with compaction, not against it

Compaction is not a bug. It is a necessary mechanism that keeps your assistant running during long sessions. Here are practical ways to handle it:

  • Use persistent instruction files. Put important rules in SOUL.md or similar workspace files rather than relying on conversation messages. These survive compaction.
  • Compact manually before it happens automatically. Running /compact proactively with specific instructions about what to preserve gives you more control.
  • Start fresh sessions for new tasks. Switching to a different topic? Start a new session rather than continuing in an overloaded one.
  • Keep sessions focused. Shorter, task-specific sessions are less likely to hit compaction limits than marathon sessions that cover many different topics.

Context windows are getting bigger, and compaction is getting smarter

Context management is one of the most active areas of development in AI right now. In just a few years, context windows have grown from roughly 4,000 tokens to over 2,000,000. That growth is not slowing down.

At the same time, model providers are building smarter compaction systems with options to preserve specific instructions. OpenClaw is adding features like memory flush and bootstrap files that automatically re-inject critical context after compaction. And the summarization itself is improving with every new model generation.

The current limitations are real, but they are temporary. If you hit a compaction issue today, it is worth understanding what happened so you can work around it. But the trajectory is clearly toward AI assistants that handle long sessions much more gracefully.

Learn more