Memory Management: Understanding OpenClaw's 4-Layer System

Overview

OpenClaw's memory system is not a single mechanism. It is four separate memory layers working together. Understanding these layers is critical for building reliable agents that remember what matters and forget what doesn't. This guide explains each layer, how the layers interact, and how to optimize them for production use.

The Four Memory Layers

Think of OpenClaw's memory like a computer:

Bootstrap Files = Hard drive (permanent storage)
Session Transcript = Log file on disk (persistent, but summarized in the agent's view)
Context Window = RAM (active working memory)
Retrieval Index = Search index (queryable archive)

Why Four Layers?

Each layer serves a different purpose:

Durability - Bootstrap files survive restarts
History - Transcripts preserve conversation details
Performance - Context window enables real-time processing
Scalability - Retrieval index handles large datasets

Layer 1: Bootstrap Files

What They Are

Permanent identity files loaded from disk at every session start.

Location: ~/.openclaw/ or ~/.claude/

Common files:

soul.md - Agent personality and core instructions
memory.md - Long-term facts and preferences
agents.md - Sub-agent configuration
tools.md - Tool usage instructions

How They Work

Session starts (daily restart or manual)
Files are read from disk - Fresh copy every time
Content injected into context - Immediately available
Immune to compaction - Never summarized or lost

Critical Characteristics

Always loaded:

Every session start reads these files
No exceptions, no caching
Fresh from filesystem

Not in conversation history:

Separate from chat transcript
Changes take effect at the next session start
No need to "remind" the agent

Most durable layer:

Survives context compaction
Survives session restarts
Survives agent crashes

Size Limits

Default limits:

20,000 characters per file
150,000 characters total

Check your usage with /context list, which reports the size of each loaded file.

Truncation Warning

If a file exceeds 20,000 characters:

Content is truncated (cut off)
No warning given
Agent sees incomplete instructions

Solution:

Keep soul.md to 15-30 lines
Remove unnecessary biographical information
Focus on work-relevant instructions only
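
Because truncation happens without a warning, it can help to compare file sizes against the limits yourself. The following is a minimal sketch, not OpenClaw's own check: the function and class names are hypothetical, the limits are the defaults quoted in this guide, and it assumes "characters" means Python string length (OpenClaw may count differently).

```python
from dataclasses import dataclass

PER_FILE_LIMIT = 20_000   # characters per file (default quoted in this guide)
TOTAL_LIMIT = 150_000     # characters across all bootstrap files


@dataclass
class SizeReport:
    oversized: dict       # file name -> characters over the per-file limit
    total_chars: int
    total_exceeded: bool


def check_bootstrap_sizes(files: dict) -> SizeReport:
    """Compare file contents (name -> text) against the documented limits."""
    oversized = {
        name: len(text) - PER_FILE_LIMIT
        for name, text in files.items()
        if len(text) > PER_FILE_LIMIT
    }
    total = sum(len(text) for text in files.values())
    return SizeReport(oversized, total, total > TOTAL_LIMIT)


# Hypothetical contents; in practice, read each file from the bootstrap directory.
report = check_bootstrap_sizes({
    "soul.md": "x" * 25_000,    # 5,000 characters too long
    "memory.md": "y" * 3_000,
})
print(report.oversized)
print(report.total_chars, report.total_exceeded)
```

Any file listed in `oversized` would have its tail cut off, so the instructions at the end of that file are the ones at risk.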

Optimization Tips

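Bad soul.md (bloated): personal biography, irrelevant preferences, and redundant information mixed in with the instructions.

Good soul.md (focused): 15-30 lines of work-relevant instructions only.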

Sub-Agent Behavior

Important: Parallel sub-agents only read:

agents.md
tools.md

They do NOT read:

soul.md
memory.md
Other bootstrap files

Implication:

Sub-agents lack main agent's personality
Task instructions must be in agents.md or passed explicitly
Sub-agents are "dumber" by design (minimal context)
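
The visibility rule above can be modeled as a simple filter. This is an illustrative sketch only: `visible_files` and `build_sub_agent_task` are hypothetical helpers, not part of OpenClaw, and the file names are the ones listed in this guide.

```python
BOOTSTRAP_FILES = ["soul.md", "memory.md", "agents.md", "tools.md"]
SUB_AGENT_FILES = {"agents.md", "tools.md"}   # per this guide, all a sub-agent reads


def visible_files(is_sub_agent: bool) -> list:
    """Return the bootstrap files an agent of the given kind would see."""
    if is_sub_agent:
        return [name for name in BOOTSTRAP_FILES if name in SUB_AGENT_FILES]
    return list(BOOTSTRAP_FILES)


def build_sub_agent_task(task: str, required_facts: list) -> str:
    """Pass facts explicitly, since the sub-agent never sees soul.md or memory.md."""
    facts = "\n".join(f"- {fact}" for fact in required_facts)
    return f"{task}\n\nContext you need:\n{facts}"


print(visible_files(is_sub_agent=True))
print(build_sub_agent_task("Summarize the report.", ["Write in British English."]))
```

The second helper reflects the implication above: anything a sub-agent must know either lives in agents.md or travels with the task.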

Layer 2: Session Transcript

What It Is

Full conversation history saved to disk as a file.

Location: ~/.openclaw/sessions/ or similar

Contains:

User messages
Assistant messages
Tool calls and results
Timestamps

How It Works

Every message is appended to the transcript file
The transcript is rebuilt into context when a session is continued
Persists across restarts - Can resume conversations
Stored in vector database format - Not human-readable

The Compaction Problem

When the context window approaches its limit (typically 200K tokens):

Auto-compaction triggers
Old messages are summarized into a compact form
The summary replaces the detailed history in context
The original transcript still exists on disk (but the agent can't see it)

Critical distinction:

Raw transcript file = Still on disk, complete
Agent's view = Summarized version only

What Survives Compaction

Preserved:

Last 20,000 tokens (recent messages)
Anything written to bootstrap files
General themes and topics

Lost:

Exact wording of earlier instructions
Nuance and context from old messages
Specific constraints mentioned mid-conversation
Casual preferences stated in chat
Images from earlier in session
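
To make the preserved/lost split concrete, here is a toy simulation of compaction. It is a sketch under stated assumptions, not OpenClaw's algorithm: the `Message` type, the `compact` function, and the fixed 500-token placeholder summary are all hypothetical, and a real system would generate the summary with a model.

```python
from dataclasses import dataclass

KEEP_RECENT_TOKENS = 20_000   # "last 20,000 tokens" figure from this guide


@dataclass
class Message:
    text: str
    tokens: int


def compact(history: list, keep_recent: int = KEEP_RECENT_TOKENS) -> list:
    """Keep the newest messages that fit the budget; fold the rest into one summary."""
    kept = []
    budget = keep_recent
    # Walk backwards from the newest message until the budget is used up.
    for msg in reversed(history):
        if msg.tokens > budget:
            break
        kept.append(msg)
        budget -= msg.tokens
    kept.reverse()
    older = history[: len(history) - len(kept)]
    if not older:
        return kept
    # Placeholder summary: exact wording of the older messages is gone.
    summary = Message(f"[summary of {len(older)} earlier messages]", tokens=500)
    return [summary] + kept


history = [
    Message("Always answer in British English.", 12_000),
    Message("Long tool output", 15_000),
    Message("Recent question", 4_000),
]
view = compact(history)
print([m.text for m in view])
```

In this run the early instruction no longer appears verbatim in the agent's view, which is exactly the failure mode described above.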

The Walter White Problem

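Scenario: Specific details are given to the agent in chat, the session runs long enough to trigger compaction, and the agent later cannot recall those details.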

Why it happens:

Conversation details were in transcript only
Never saved to bootstrap files
Compaction summarized away the specifics

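Solution: Save any detail that must survive to a bootstrap file (for example memory.md) before compaction occurs.
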

Lifespan

Before compaction:

Full detailed history available
Agent remembers exact wording
Can reference specific earlier messages

After compaction:

Summary + recent 20K tokens only
General understanding remains
Specific details are lost

Layer 3: Context Window

What It Is

Active working memory - fixed-size container where everything competes for space.

Size by model:

Claude Opus/Sonnet: 200,000 tokens
GPT-4: 128,000 tokens
Gemini Pro: 1,000,000 tokens
MiniMax: 200,000 tokens

Conversion: 1 token ≈ 0.75 words (English)
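
The conversion can be turned into a rough estimator. This is only an approximation sketch: `estimate_tokens` and `words_that_fit` are hypothetical helpers based on the 0.75 words-per-token rule of thumb above, and real tokenizers vary by model and by text.

```python
WORDS_PER_TOKEN = 0.75   # rule of thumb for English text


def estimate_tokens(text: str) -> int:
    """Rough token count from a whitespace word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def words_that_fit(token_budget: int) -> int:
    """Rough number of English words that fit in a token budget."""
    return int(token_budget * WORDS_PER_TOKEN)


sample = "word " * 1_500
print(estimate_tokens(sample))      # 1,500 words -> about 2,000 tokens
print(words_that_fit(200_000))      # 200K tokens -> about 150,000 words
```

Code, non-English text, and structured tool output usually tokenize less efficiently than prose, so treat these numbers as a lower bound on token usage.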

What Fills It

System prompt - OpenClaw's instructions
Bootstrap files - Loaded at session start
Conversation history - Recent messages
Tool results - File reads, web fetches, API responses
Current message - Task being processed

Compaction Trigger

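Formula:

Compaction trigger = Context window - Reserve tokens floor - Soft threshold

Example (200K context):

200,000 - 40,000 - 4,000 = 156,000 tokens

Compaction fires at 156K, not 200K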

Reserve Tokens Floor

Purpose: Space reserved for agent's response

Default: 40,000 tokens

Configurable:

Large tasks: Reduce to 20,000
Small tasks: Keep at 40,000

Soft Threshold

Purpose: Additional buffer to prevent edge cases

Default: 4,000 tokens
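
The 156K figure is consistent with subtracting both defaults from the context window. The sketch below works through that arithmetic; `compaction_trigger` is a hypothetical helper, and the subtraction itself is inferred from the numbers in this guide rather than taken from OpenClaw's source.

```python
def compaction_trigger(
    context_window: int,
    reserve_floor: int = 40_000,    # default reserve tokens floor
    soft_threshold: int = 4_000,    # default soft threshold
) -> int:
    """Token count at which auto-compaction would fire, under the assumed formula."""
    return context_window - reserve_floor - soft_threshold


default_trigger = compaction_trigger(200_000)
large_task_trigger = compaction_trigger(200_000, reserve_floor=20_000)
print(default_trigger)       # 200,000 - 40,000 - 4,000
print(large_task_trigger)    # 200,000 - 20,000 - 4,000
```

Lowering the reserve floor to 20,000 moves the trigger from 156,000 to 176,000 tokens, which buys more room for history at the cost of less room for the response.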

What Competes for Space

Biggest consumers:

Tool results - File reads, web snapshots
Long conversations - Multi-turn back-and-forth
Code blocks - Full file contents
Bootstrap files - Loaded at session start and present in context on every turn

Optimization Strategy

Instead of having the agent fetch a full transcript via API (about 50K tokens), do this:

Get transcript manually
Save to text file
Upload file

Token savings: Up to 95%

Layer 4: Retrieval Index

What It Is

Searchable archive that sits beside or outside memory files.

Technology:

Vector database (SQLite)
Hybrid search (keyword + semantic)
Embeddings-based retrieval

How It Works

Write information to memory files
OpenClaw indexes the content automatically
Agent searches with memory_search tool
Index returns relevant snippets with file paths
Agent reads full context with memory_get

Two-step process: Search → Retrieve
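
The search-then-retrieve flow can be sketched with in-memory stand-ins. Everything here is hypothetical: `search_memory` and `get_memory` only imitate the roles of the memory_search and memory_get tools, the file names and contents are made up, and the real tools' signatures and matching logic are not documented in this guide.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    path: str   # file the match came from
    text: str   # matching line only, not the whole file


# Hypothetical memory directory: path -> full file text.
MEMORY_FILES = {
    "projects/website.md": "Launch date is 14 March.\nHosting: static site.\nOwner: Dana.",
    "preferences.md": "Prefers concise answers.\nFavourite drink: Pepsi.",
}


def search_memory(query: str) -> list:
    """Step 1 (search): return small snippets plus the paths they came from."""
    q = query.lower()
    return [
        Snippet(path, line)
        for path, text in MEMORY_FILES.items()
        for line in text.splitlines()
        if q in line.lower()
    ]


def get_memory(path: str) -> str:
    """Step 2 (retrieve): load the full file only for a path worth reading."""
    return MEMORY_FILES[path]


hits = search_memory("launch date")
full_text = get_memory(hits[0].path)
print(hits[0])
print(full_text)
```

Splitting the flow this way keeps context usage low: snippets are cheap, and full files enter the context window only when a snippet looks relevant.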

Enabling Embeddings

Requirement: OpenAI or Gemini API key

Check if enabled: ask the agent to describe how its memory search works.

If embeddings are active, the response should mention:

Vector database
Semantic search
SQLite file in memory directory

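If not enabled: configure an OpenAI or Gemini API key, then verify that the SQLite database exists in the memory directory.
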

Keyword vs. Semantic Search

Keyword search:

Exact word matching
"Pepsi" finds "Pepsi"
Fast but limited

Semantic search:

Concept matching
"soda" finds "Pepsi", "Coca-Cola", "soft drink"
Understands relationships

How Embeddings Work

Simplified explanation:

Text → Numbers - "Pepsi" becomes vector [0.23, 0.87, 0.45, ...]
Similar concepts = Similar numbers - "soda" becomes [0.25, 0.85, 0.43, ...]
Search by similarity - Find vectors close to query vector
Return relevant content - Matches based on meaning, not just words

Why it matters:

Computers are good with numbers, not words
Vector similarity enables semantic understanding
Scales to large memory archives
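
The similarity step can be illustrated with cosine similarity on toy vectors. This is a sketch, not OpenClaw's retrieval code: the "Pepsi" and "soda" vectors reuse only the three components shown above (real embeddings have hundreds or thousands of dimensions), and the "invoice" vector is made up to represent an unrelated concept.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


TOY_VECTORS = {
    "Pepsi": [0.23, 0.87, 0.45],
    "soda": [0.25, 0.85, 0.43],
    "invoice": [0.90, 0.10, 0.05],   # made-up unrelated concept
}


def most_similar(query: str, vectors: dict) -> str:
    """Return the stored term whose vector is closest to the query's vector."""
    others = {term: vec for term, vec in vectors.items() if term != query}
    return max(others, key=lambda t: cosine_similarity(vectors[query], others[t]))


keyword_hit = "soda" in "Pepsi".lower()          # keyword search: no shared text
semantic_hit = most_similar("soda", TOY_VECTORS)  # semantic search: nearest vector
print(keyword_hit, semantic_hit)
```

The keyword comparison finds nothing because the strings share no text, while the vector comparison ranks "Pepsi" closest to "soda".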

Storage Location

Check the memory directory for a SQLite database. Look for:

memory.db or similar SQLite file
Not human-readable
Contains vector embeddings
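
Since the file name may vary, one way to spot the database is by its header rather than its name: every SQLite 3 file begins with the same 16-byte signature. The sketch below uses that fact; the helper names are hypothetical, and the demo runs in a temporary directory rather than guessing where your memory directory lives.

```python
import sqlite3
import tempfile
from pathlib import Path

SQLITE_MAGIC = b"SQLite format 3\x00"   # first 16 bytes of any SQLite 3 database


def is_sqlite_file(path: Path) -> bool:
    """True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def find_sqlite_files(directory: Path) -> list:
    """Recursively list SQLite databases under a directory."""
    return sorted(p for p in directory.rglob("*") if p.is_file() and is_sqlite_file(p))


# Demo: build a throwaway directory with one database and one text file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    conn = sqlite3.connect(root / "memory.db")
    conn.execute("CREATE TABLE chunks (id INTEGER)")   # forces the header to be written
    conn.commit()
    conn.close()
    (root / "notes.md").write_text("plain text")
    found = [p.name for p in find_sqlite_files(root)]

print(found)
```

To use it on a real installation, call `find_sqlite_files` with the path to your own memory directory.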

Use Cases

Scenario 1: Long-term project memory

Scenario 2: Offloading large datasets

Scenario 3: Cross-session knowledge

Integration with External Tools

Obsidian + GitHub pattern:

Store large datasets in Obsidian (outside OpenClaw memory)
Sync to GitHub for backup and version control
Agent searches via retrieval index
Fetches relevant content on-demand

Benefits:

No memory directory bloat
Version-controlled knowledge base
Accessible outside OpenClaw

Memory Priority

OpenClaw prioritizes recent memory:

Yesterday's work: Easily accessible
Last week: Requires search
Last month: Needs retrieval index

Without embeddings:

Agent may not find old information
Relies on bootstrap files and recent transcript

With embeddings:

Semantic search finds relevant content regardless of age
Scales to months or years of history

How the Layers Work Together

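Session Start Flow

Bootstrap files are read from disk
Their content is injected into the context window
If a session is being continued, its transcript is rebuilt into context

During Conversation

Every message is appended to the session transcript
Messages and tool results accumulate in the context window
The agent queries the retrieval index on demand with memory_search and memory_get

When Context Fills

Auto-compaction triggers before the window is completely full
Older messages are summarized; the most recent 20,000 tokens are kept
Bootstrap files are unaffected, and the raw transcript stays on disk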

Memory Failures: Three Common Types

Failure 1: Bootstrap File Truncation

Symptom: Agent forgets core instructions

Cause: File exceeded 20,000 character limit

Solution:

Check file sizes: /context list
Trim to under 20,000 characters
Remove unnecessary content

Failure 2: Chat Instructions Lost

Symptom: Agent forgets instructions given in conversation

Cause: Instructions never saved to file, lost in compaction

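Solution: Save the instruction to a bootstrap file (for example memory.md) as soon as it is given, so it no longer depends on the conversation history.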

Failure 3: Retrieval Index Not Enabled

Symptom: Agent can't find old information

Cause: No OpenAI/Gemini API key configured

Solution:

Set up API key
Verify SQLite database exists
Test with memory search

Best Practices

1. Save Important Information to Files

Rule: If it's not in a file, it doesn't exist long-term

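Good: Ask the agent to write the information to a file such as memory.md.

Bad: Mention the information only in chat. (It will be lost after compaction.)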
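
The rule can be put into practice with a small helper that appends a fact to a memory file while respecting the per-file limit. This is a hedged sketch: `remember` is a hypothetical function, the bullet format is an arbitrary choice, and the 20,000-character limit is the default quoted in this guide.

```python
import tempfile
from pathlib import Path

PER_FILE_LIMIT = 20_000   # characters; default per-file limit from this guide


def remember(memory_file: Path, fact: str) -> bool:
    """Append a fact as a bullet line; refuse if the file would exceed the limit."""
    existing = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    entry = f"- {fact.strip()}\n"
    if existing and not existing.endswith("\n"):
        entry = "\n" + entry            # avoid gluing onto an unterminated last line
    if len(existing) + len(entry) > PER_FILE_LIMIT:
        return False                    # caller should trim the file first
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return True


# Demo in a temporary directory rather than a real bootstrap directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memory.md"
    ok = remember(path, "Reports are due every Friday.")
    saved = path.read_text(encoding="utf-8")

print(ok)
print(saved)
```

Returning `False` instead of writing keeps the file under the limit, so a new fact never pushes older instructions into the truncated region.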

2. Keep Bootstrap Files Minimal

Target:

soul.md: 15-30 lines
memory.md: Key facts only
Total: Under 150,000 characters

Remove:

Personal biography
Irrelevant preferences
Redundant information

3. Use Retrieval Index for Large Datasets

Don't:

Store all data in bootstrap files
Load everything into context

Do:

Store in memory directory
Enable embeddings
Search on-demand

4. Organize Memory Directory

Structure:

Benefits:

Easy to navigate
Clear organization
Scalable structure

5. Distinguish Evergreen vs. Ephemeral

Evergreen (store in memory directory):

Trading system rules
Project documentation
Standard operating procedures

Ephemeral (store externally):

Daily news scraping
Temporary research
One-time analysis

Troubleshooting

"Agent doesn't remember conversation from yesterday"

Cause: Context compacted, details summarized

Solution:

Check if important info was saved to file
If not, re-provide and save to bootstrap file
Enable retrieval index for better recall

"Agent forgets core instructions"

Cause: Bootstrap file truncated or not loaded

Solution:

Check file size: /context list
Verify file is in correct directory
Restart session to reload files

"Can't find information from last month"

Cause: Retrieval index not enabled or not working

Solution:

Verify OpenAI/Gemini API key is set
Check for SQLite database file
Test memory search functionality

"Memory directory is huge"

Cause: Too much data stored locally

Solution:

Move large datasets to external storage (Obsidian, GitHub)
Archive old daily memory files
Keep only evergreen content in memory directory
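
Finding archive candidates can be automated by file age. The following is a sketch under assumptions: `archive_candidates` is a hypothetical helper, it assumes memory files are Markdown, it uses modification time as a proxy for "old", and the file names in the demo are invented.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400


def archive_candidates(
    directory: Path, max_age_days: int, now: Optional[float] = None
) -> list:
    """List .md files whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in directory.rglob("*.md")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


# Demo: one file backdated by 90 days, one fresh file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old_file, new_file = root / "2024-01-01.md", root / "today.md"
    old_file.write_text("old notes")
    new_file.write_text("fresh notes")
    ninety_days_ago = time.time() - 90 * SECONDS_PER_DAY
    os.utime(old_file, (ninety_days_ago, ninety_days_ago))
    stale = [p.name for p in archive_candidates(root, max_age_days=30)]

print(stale)
```

The function only reports candidates. Moving or deleting them is left as a manual step so that evergreen files with old timestamps are not archived by accident.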

Advanced Patterns

Hybrid Storage Strategy

Local memory (OpenClaw):

Core instructions
Current project context
Frequently accessed data

External storage (Obsidian/GitHub):

Historical data
Large datasets
Archived projects

Access pattern: The agent searches via the retrieval index, then fetches only the relevant content on demand.


Memory Compaction Strategy

Proactive approach:

Monitor context usage regularly
Trigger manual compaction at 120K tokens
Review and save important context before compacting

Reactive approach:

Let auto-compaction handle it
Accept some information loss
Rely on bootstrap files for critical data

Related Resources

Context Window Management
Skills Optimization
Sub-Agents

Duration: 18 minutes
Difficulty: Intermediate
Video Reference: How OpenClaw Memory ACTUALLY Works