April 26, 2026 β€’ Version: v0.12.0

Memory v2 Enhancement Guide: Associative Traversal, Salience Weighting, and Access-Based Forgetting

Architectural guide for extending OpenClaw's Memory v2 with entity co-occurrence traversal, salience-weighted retention, and access-based decay to improve retrieval precision in long-running agent deployments.


πŸ” Symptoms

Current Memory v2 Retrieval Limitations

Agents running for extended periods (days to weeks) exhibit degraded contextual coherence when using existing retrieval mechanisms. The following symptoms manifest in production deployments:

Symptom 1: Shallow Lexical Retrieval

When querying for conceptually-related information across time, the agent retrieves only surface-level matches:

$ openclaw memory recall "app performance improvements"
---
RETRIEVED FACTS (3):
- W(s=0.3) @config: Updated heartbeat interval from 5m to 30m.
- W(s=0.3) @config: Increased worker pool size to 4.
- W(s=0.3) @api: Added rate limiting middleware.

EXPECTED: Connection to Week 2 debugging session about slow database queries
ACTUAL: Generic config changes only

The agent cannot traverse the implicit chain: “performance” β†’ “slow endpoint” β†’ “database query” β†’ “Sarah’s expertise.”

Symptom 2: Equal Weighting of Disparate Memories

All stored facts compete equally for context budget regardless of significance:

$ openclaw memory recall "any recent updates"
---
RETRIEVED (k=10, context budget: 4KB):

1. W(s=0.3) @config: Updated heartbeat interval from 5m to 30m.
2. W(s=0.3) @config: Increased worker pool size to 4.
3. W(s=0.3) @config: Set log level to INFO.
4. W(s=0.3) @config: Disabled telemetry opt-in.
5. B(s=0.3) @Sarah @project: Sarah announced she's leaving next month.
6. B(s=0.3) @user @identity: User prefers morning standups.
...

CRITICAL GAP: No salience differentiation. Sarah's departure competes equally with log level changes.

Symptom 3: Unbounded Index Growth Without Decay

After 30+ days of continuous operation:

$ sqlite3 ~/.openclaw/memory.db "SELECT COUNT(*) FROM facts;"
487

$ sqlite3 ~/.openclaw/memory.db "SELECT COUNT(*) FROM facts WHERE last_accessed > datetime('now', '-7 days');"
12

Only 2.5% of facts were accessed in the past week, yet all 487 compete in retrieval scoring. The reflect job must process an ever-growing set with no prioritization signal.

Symptom 4: Hub Node Pollution (Reference from CLS-M Benchmark)

Entities appearing across many facts absorb retrieval activation:

$ sqlite3 ~/.openclaw/memory.db "SELECT entity, COUNT(*) as cnt FROM fact_entities GROUP BY entity ORDER BY cnt DESC LIMIT 5;"
entity|cnt
@Peter|203
@heartbeat|57
@api|89
@config|112
@system|78

Direct entity traversal through @Peter (203 facts) dilutes signal for specific, relevant connections.

🧠 Root Cause

Architectural Gaps in Current Memory v2 Design

The current retrieval system lacks three critical mechanisms that are essential for maintaining precision in long-running deployments:

Gap 1: Single-Hop Entity Retrieval

The existing entity-aware retrieval model returns facts directly tagged with the query entity but does not recursively traverse co-occurring entities:

-- Current query (single-hop)
SELECT f.content, f.salience 
FROM facts f
JOIN fact_entities fe ON f.id = fe.fact_id
JOIN entities e ON fe.entity_id = e.id
WHERE e.name = 'performance';

-- Returns only: facts explicitly tagged @performance
-- Misses: facts about @database that co-occur with @performance across the corpus

This is architecturally correct for exact entity lookup (“tell me about X”) but insufficient for exploratory queries where the agent discovers implicit connections.

Gap 2: Absence of Salience Tracking at Retain Time

The Letta control loop’s fundamental insight is that the agent that has the experience must decide what to retain. However, without a salience parameter on retain calls, this decision is binary (keep/discard) rather than graduated:

-- Current (binary)
openclaw memory retain "Sarah is leaving the company next month"

-- Missing salience metadata that would distinguish:
-- A config file tweak (s=0.2)
-- A critical team change (s=0.95)

Without salience, the reflect job cannot distinguish signal from noise; it must fall back on recency or access frequency, which are poor proxies for actual significance.

Gap 3: No Access-Based Decay Mechanism

The current design treats all historical facts as equally retrievable regardless of engagement patterns:

-- No temporal or access-based scoring
SELECT content FROM facts 
ORDER BY created_at DESC  -- Only recency, not relevance
LIMIT 10;

This creates three cascading problems:

  1. Precision degradation: As the index grows, the ratio of relevant-to-irrelevant facts decreases
  2. Reflect job inefficiency: The reflection processor must evaluate an ever-larger corpus with no prioritization
  3. Hub noise amplification: High-degree entities (appearing in 100+ facts) dominate traversal without decay

Root Cause Analysis from CLS-M Prototype

The CLS-M prototype (132 nodes, 802 edges) validated these gaps empirically:

  • Recall was acceptable (65%) but precision was poor (35%)β€”meaning 65% of retrieved content was noise
  • Hub nodes destroyed precision: The heartbeat node had 57 edges, absorbing activation that should have gone to specific nodes
  • Time-based decay failed: A fact from 3 months ago that is accessed weekly should remain prominent; age alone is not a relevance signal
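As a rough sanity check, the reported precision and recall line up with the prototype's overall F1 score (the small gap to the reported ~44% is plausibly due to per-query averaging in the benchmark):

```python
# Reported CLS-M aggregate numbers
precision, recall = 0.35, 0.65

f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2f}")  # F1 = 0.46
```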

The fix is not to build a separate knowledge graph but to extend the existing SQLite index with:

  1. Entity co-occurrence tracking via inverse document frequency (IDF) weighting
  2. Salience as a first-class parameter on retain operations
  3. Access-based decay that resets on retrieval (not pure age-based decay)

πŸ› οΈ Step-by-Step Fix

Phase 1: Schema Extensions for SQLite Index

Add salience and access tracking columns to the existing schema:

-- Migration: add_salience_and_access_tracking.sql

-- 1. Add salience column (0.0 to 1.0, default 0.5)
ALTER TABLE facts ADD COLUMN salience REAL DEFAULT 0.5;

-- 2. Add access tracking columns
ALTER TABLE facts ADD COLUMN last_accessed_at DATETIME DEFAULT NULL;
ALTER TABLE facts ADD COLUMN access_count INTEGER DEFAULT 0;

-- 3. Create index for access-based queries
CREATE INDEX idx_facts_last_accessed ON facts(last_accessed_at);
CREATE INDEX idx_facts_salience ON facts(salience);

-- 4. Precompute entity frequencies for IDF weighting
--    (LN() needs SQLite >= 3.35 with math functions enabled; note that
--     SQLite's LOG() is base-10, so LN is used for the natural log.
--     Entities with zero facts get a NULL weight, which is harmless
--     because they never join any fact.)
CREATE TABLE entity_stats AS
SELECT 
    e.id,
    e.name,
    COUNT(fe.fact_id) as fact_count,
    1.0 / LN(COUNT(fe.fact_id) + 1) as idf_weight
FROM entities e
LEFT JOIN fact_entities fe ON e.id = fe.entity_id
GROUP BY e.id;

CREATE INDEX idx_entity_stats_fact_count ON entity_stats(fact_count);

Phase 2: Entity Co-Occurrence Table

Build co-occurrence matrix from existing fact index:

-- Migration: build_entity_cooccurrence.sql

-- 1. Create co-occurrence table
CREATE TABLE entity_cooccurrence (
    entity_id_1 INTEGER NOT NULL,
    entity_id_2 INTEGER NOT NULL,
    cooccur_count INTEGER DEFAULT 1,
    cooccur_weight REAL DEFAULT 0.0,
    PRIMARY KEY (entity_id_1, entity_id_2),
    FOREIGN KEY (entity_id_1) REFERENCES entities(id),
    FOREIGN KEY (entity_id_2) REFERENCES entities(id)
);

-- 2. Populate from existing fact_entities (facts with 2+ entities)
INSERT INTO entity_cooccurrence (entity_id_1, entity_id_2, cooccur_count)
SELECT 
    fe1.entity_id,
    fe2.entity_id,
    COUNT(DISTINCT fe1.fact_id)
FROM fact_entities fe1
JOIN fact_entities fe2 ON fe1.fact_id = fe2.fact_id
WHERE fe1.entity_id < fe2.entity_id  -- Avoid duplicates
GROUP BY fe1.entity_id, fe2.entity_id;

-- 3. Compute weighted co-occurrence using IDF
UPDATE entity_cooccurrence SET cooccur_weight =
    CAST(cooccur_count AS REAL)
    * (SELECT idf_weight FROM entity_stats WHERE id = entity_id_1)
    * (SELECT idf_weight FROM entity_stats WHERE id = entity_id_2);

-- 4. Create index for fast co-occurrence lookups
CREATE INDEX idx_cooccur_lookup ON entity_cooccurrence(entity_id_1, cooccur_weight DESC);
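To see what step 3 produces, the pair weight can be computed by hand. The @Peter and @heartbeat fact counts come from symptom 4; the rare-entity counts (3 and 5) are illustrative assumptions:

```python
import math

def idf(fact_count: int) -> float:
    # 1 / ln(count + 1), matching the entity_stats migration
    return 1.0 / math.log(fact_count + 1)

def cooccur_weight(pair_count: int, count_a: int, count_b: int) -> float:
    # cooccur_count * idf(a) * idf(b), mirroring the UPDATE above
    return pair_count * idf(count_a) * idf(count_b)

# Same raw co-occurrence count, very different weights:
print(round(cooccur_weight(10, 203, 57), 3))  # 0.463 -- hub pair (@Peter, @heartbeat)
print(round(cooccur_weight(10, 3, 5), 3))     # 4.026 -- rare pair
```

The rare pair ends up weighted roughly nine times higher than the hub pair despite an identical raw count, which is exactly the damping effect the migration is after.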

Phase 3: CLI Command Updates

Extend the retain command with salience parameter:

# Before
openclaw memory retain "Sarah is leaving the company next month"

# After (with salience)
openclaw memory retain "Sarah is leaving the company next month" \
  --type B \
  --entity Sarah \
  --entity project \
  --salience 0.95

Extend the recall command with salience filter and associative traversal:

# Before
openclaw memory recall "performance improvements"

# After (with enhanced options)
openclaw memory recall "performance improvements" \
  --k 10 \
  --min-salience 0.3 \
  --associative-depth 2 \
  --activation-decay 0.5

Phase 4: Associative Traversal Algorithm

Implement depth-limited traversal with activation decay:

def associative_traverse(seed_entities: list[str], depth: int = 2, decay: float = 0.5) -> dict:
    """
    Traverse entity co-occurrence graph with depth limiting and activation decay.
    
    Returns:
        dict: {entity_name: accumulated_activation_score}
    """
    activation = {}
    visited = set()
    
    # Initialize seed entities with full activation
    for entity_name in seed_entities:
        activation[entity_name] = 1.0
        visited.add(entity_name)
    
    current_entities = seed_entities
    current_activation = 1.0
    
    for hop in range(depth):
        next_entities = []
        next_activation = current_activation * decay
        
        for entity_name in current_entities:
            # Query co-occurring entities with IDF weighting, excluding
            # already-visited names via bound placeholders. Note: the
            # co-occurrence table stores each pair once (entity_id_1 <
            # entity_id_2), so production code should also match the
            # symmetric case entity_id_2 = source.
            placeholders = ",".join("?" * len(visited))
            cooccurring = query(f"""
                SELECT e.name, c.cooccur_weight, es.idf_weight
                FROM entity_cooccurrence c
                JOIN entities e ON c.entity_id_2 = e.id
                JOIN entity_stats es ON e.id = es.id
                WHERE c.entity_id_1 = (
                    SELECT id FROM entities WHERE name = ?
                )
                AND e.name NOT IN ({placeholders})
                ORDER BY c.cooccur_weight * es.idf_weight DESC
                LIMIT 10
            """, (entity_name, *visited))
            
            for coentity_name, cooccur_weight, idf_weight in cooccurring:
                if coentity_name not in visited:
                    contribution = next_activation * cooccur_weight * idf_weight
                    activation[coentity_name] = activation.get(coentity_name, 0) + contribution
                    next_entities.append(coentity_name)
                    visited.add(coentity_name)
        
        current_entities = next_entities
        current_activation = next_activation
    
    return activation

Phase 5: Access-Based Decay Implementation

Implement exponential (half-life) decay on the retrieval score:

import math
from datetime import datetime

def compute_retrieval_score(fact: dict, query_entities: list[str], 
                            now: datetime = None) -> float:
    """
    Compute composite retrieval score including salience and access-based decay.
    
    Components:
    - Base match score (lexical/semantic/associative)
    - Salience weight (from retain call)
    - Access decay (exponential half-life, reset on retrieval)
    """
    if now is None:
        now = datetime.utcnow()
    
    base_score = compute_base_match_score(fact, query_entities)
    salience_score = fact.get('salience', 0.5)
    
    # Access-based decay (exponential, halves every 7 days)
    last_accessed = fact.get('last_accessed_at')
    if last_accessed:
        days_since_access = (now - last_accessed).days
        access_decay = 0.5 ** (days_since_access / 7.0)
    else:
        access_decay = 0.25  # Never-accessed facts start quieter
    
    # Boost for frequent access (logarithmic to prevent hub dominance)
    access_count = fact.get('access_count', 0)
    access_boost = 1.0 + (0.1 * math.log1p(access_count))
    
    composite_score = (
        base_score * 0.4 +
        salience_score * 0.35 +
        access_decay * access_boost * 0.25
    )
    
    return composite_score

def on_fact_retrieved(fact_id: int) -> None:
    """Update access tracking when a fact is retrieved."""
    execute("""
        UPDATE facts 
        SET last_accessed_at = ?,
            access_count = access_count + 1
        WHERE id = ?
    """, (datetime.utcnow(), fact_id))

Phase 6: Reflect Loop Integration

Update the reflect job to prioritize recently-accessed facts:

# In reflect job processor
def reflect_on_memories(agent_id: str, core_memory_max_tokens: int = 2048) -> None:
    # Query recently-accessed facts weighted by salience
    recent_facts = query("""
        SELECT f.*, 
               COALESCE(f.salience, 0.5) * 
               (1.0 + 0.1 * LN(COALESCE(f.access_count, 0) + 1)) as priority_score  -- SQLite has no LOG1P; LN(x + 1) is equivalent
        FROM facts f
        WHERE f.agent_id = ?
        AND (
            f.last_accessed_at > datetime('now', '-30 days')
            OR f.salience > 0.8
        )
        ORDER BY priority_score DESC, f.last_accessed_at DESC
        LIMIT 100
    """, agent_id)
    
    # Existing reflect logic operates on priority-filtered set
    consolidated = consolidate_memories(recent_facts)
    update_core_memory(consolidated, max_tokens=core_memory_max_tokens)
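The SQL priority expression can be mirrored in Python to see why a high-salience fact outranks a heavily-accessed routine one (the access counts here are illustrative):

```python
import math

def priority_score(salience: float, access_count: int) -> float:
    # Mirrors: salience * (1 + 0.1 * ln(1 + access_count)) from the reflect query
    return salience * (1.0 + 0.1 * math.log1p(access_count))

print(round(priority_score(0.95, 5), 3))   # 1.12  (critical fact, lightly accessed)
print(round(priority_score(0.30, 50), 3))  # 0.418 (routine fact, heavily accessed)
```

Because the boost is logarithmic, even 50 accesses cannot lift a low-salience fact past a genuinely important one.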

πŸ§ͺ Verification

Verification Test Suite

Execute the following commands to validate each enhancement:

Test 1: Schema Migration

$ sqlite3 ~/.openclaw/memory.db ".schema facts"
--- Expected output ---
CREATE TABLE facts (
    ...
    salience REAL DEFAULT 0.5,
    last_accessed_at DATETIME,
    access_count INTEGER DEFAULT 0
);

$ sqlite3 ~/.openclaw/memory.db "SELECT COUNT(*) FROM entity_cooccurrence;"
--- Expected output ---
0 before population; typically > 100 after backfill on a populated index

Test 2: Salience-Aware Retain and Recall

# Retain with salience
$ openclaw memory retain "Sarah is leaving the company next month" \
  --type B \
  --entity Sarah \
  --entity project \
  --salience 0.95

--- Expected output ---
βœ“ Retained: B(s=0.95) @Sarah @project: Sarah is leaving...

# Verify in database
$ sqlite3 ~/.openclaw/memory.db \
  "SELECT content, salience FROM facts WHERE content LIKE '%Sarah%';"
--- Expected output ---
Sarah is leaving the company next month|0.95

Test 3: Access Tracking

# Query a fact (simulated)
$ openclaw memory recall "heartbeat configuration"

# Verify access tracking updated
$ sqlite3 ~/.openclaw/memory.db \
  "SELECT content, last_accessed_at, access_count FROM facts ORDER BY access_count DESC LIMIT 3;"
--- Expected output ---
Updated heartbeat interval from 5m to 30m.|2025-01-15 10:30:00|5
Increased worker pool size to 4.|2025-01-15 09:15:00|3
Rate limiting middleware added.|2025-01-14 14:22:00|1

Test 4: Associative Traversal Query

# Query with associative depth
$ openclaw memory recall "app performance" \
  --associative-depth 2 \
  --min-salience 0.3

--- Expected output ---
RETRIEVED (associative, depth=2):

Direct matches:
- W(s=0.2) @config: Updated heartbeat interval from 5m to 30m.

2-hop connections:
- B(s=0.95) @Sarah @project: Sarah is leaving... (via @database β†’ @slow-endpoint)
- W(s=0.3) @api: Rate limiting middleware added. (via @slow-endpoint)

# Verify traversal path in debug mode
$ openclaw memory recall "app performance" --associative-depth 2 --debug
--- Expected output ---
Traversal: performance β†’ {database, slow-endpoint, api} 
           β†’ database β†’ {Sarah, PostgreSQL, indexing}
           β†’ Final activation: {Sarah: 0.42, indexing: 0.31, ...}

Test 5: Composite Scoring Validation

$ python3 -c "
from openclaw.memory.scoring import compute_retrieval_score
import datetime, math

test_fact = {
    'content': 'Sarah is leaving next month',
    'salience': 0.95,
    'last_accessed_at': datetime.datetime.now() - datetime.timedelta(days=2),
    'access_count': 5
}

score = compute_retrieval_score(test_fact, query_entities=['personnel'])
print(f'Composite score: {score:.3f}')
print(f'  - Salience contribution: {0.95 * 0.35:.3f}')
print(f'  - Access decay (2 days): {0.5 ** (2/7) * (1.0 + 0.1 * math.log1p(5)) * 0.25:.3f}')
"
--- Expected output (assuming a near-zero base match for the unmatched entity) ---
Composite score: 0.574
  - Salience contribution: 0.333
  - Access decay (2 days): 0.242

Test 6: Reflect Job Prioritization

# Run reflect with debug output
$ openclaw memory reflect --agent-id test-agent --debug

--- Expected output ---
Processing 47 facts (filtered from 487 total by priority)
Top priority facts:
1. B(s=0.95) @Sarah @project: Sarah is leaving... (priority: 1.23)
2. B(s=0.9) @user @identity: User prefers morning standups... (priority: 1.19)
3. W(s=0.8) @Peter @deadline: Q1 deadline is March 15... (priority: 1.08)

Core memory updated: 1,847 tokens (was 2,103)

⚠️ Common Pitfalls

Implementation Traps and Environment-Specific Considerations

Pitfall 1: Hub Node Dominance Without IDF Weighting

Symptom: Associative traversal returns nearly identical results regardless of queryβ€”high-degree entities (Peter, config, system) dominate all paths.

Cause: Raw co-occurrence counts without inverse entity frequency weighting.

Fix: Ensure the entity_stats.idf_weight = 1 / ln(fact_count + 1) formula is applied in all co-occurrence queries:

-- Wrong (hub dominance)
SELECT e.name FROM entities e
JOIN fact_entities fe ON e.id = fe.entity_id
WHERE fe.fact_id IN (
    SELECT fact_id FROM fact_entities WHERE entity_id = ?
)
GROUP BY e.id
ORDER BY COUNT(*) DESC;

-- Correct (IDF-weighted)
SELECT e.name FROM entities e
JOIN entity_stats es ON e.id = es.id
JOIN fact_entities fe ON e.id = fe.entity_id
WHERE fe.fact_id IN (
    SELECT fact_id FROM fact_entities WHERE entity_id = ?
)
GROUP BY e.id
ORDER BY es.idf_weight * COUNT(*) DESC;

Pitfall 2: Confusing Time-Based and Access-Based Decay

Symptom: Old but frequently-accessed facts receive low scores; fresh but never-accessed facts receive high scores.

Cause: Using last_accessed_at age alone instead of access-based decay with boost.

Rule: Access-based decay (reset on retrieval) outperforms time-based decay. A 3-month-old fact accessed weekly should outrank a 1-day-old fact never accessed:

# Wrong: Pure age decay
score = salience * (0.5 ** (age_in_days / 30))

# Correct: Access-based decay with boost
access_decay = 0.5 ** (days_since_last_access / 7)  # Halves every 7 days
access_boost = 1.0 + (0.1 * log1p(access_count))     # Logarithmic, prevents hub dominance
score = salience * access_decay * access_boost
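Plugging numbers into the access-based formula makes the rule concrete: a 3-month-old fact accessed yesterday keeps most of its decay term, while a fresh but never-accessed fact sits at the 0.25 floor from Phase 5:

```python
# 3-month-old fact, last accessed 1 day ago: age is irrelevant,
# only days since last access enters the formula
weekly_accessed = 0.5 ** (1 / 7)
# 1-day-old fact, never accessed: floor from compute_retrieval_score
never_accessed = 0.25

print(round(weekly_accessed, 3))  # 0.906
print(never_accessed)             # 0.25
```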

Pitfall 3: Associative Depth Too Deep

Symptom: Retrieval latency exceeds 500ms; output contains seemingly random facts.

Cause: Depth > 3 without activation cutoff floods the traversal.

Fix: Implement both depth limit AND minimum activation threshold:

MAX_DEPTH = 3
MIN_ACTIVATION = 0.05
INITIAL_ACTIVATION = 1.0
DECAY_PER_HOP = 0.5

# Traversal stops when:
# - Depth limit reached, OR
# - No entities exceed MIN_ACTIVATION threshold
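Working through the decay arithmetic shows why both limits are needed: with a per-hop decay of 0.5 and no edge weights, activation alone would permit four hops before falling below 0.05, so MAX_DEPTH is the binding limit; real co-occurrence weights (below 1.0 after IDF damping) push activation under the threshold much sooner:

```python
MIN_ACTIVATION = 0.05
DECAY_PER_HOP = 0.5

# Count hops until pure decay would drop below the activation floor
activation, hops = 1.0, 0
while activation * DECAY_PER_HOP >= MIN_ACTIVATION:
    activation *= DECAY_PER_HOP
    hops += 1

print(hops, activation)  # 4 0.0625
```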

Pitfall 4: Salience Estimation Failure at Retain Time

Symptom: All facts receive similar salience scores (0.4-0.6); differentiation is lost.

Cause: LLM estimation is too conservative; defaults to middle values.

Fix: Implement prompt-based salience estimation with explicit anchors:

SYSTEM_PROMPT = """
Estimate salience (0.0-1.0) for this memory:
- 0.9-1.0: Identity-defining, relationship-changing, career-affecting
- 0.7-0.9: Important project decisions, team changes, deadlines
- 0.4-0.7: Routine work, configurations, bug fixes
- 0.1-0.4: Minor preferences, temp states, easily reconstructed

Memory: {fact_content}

Respond ONLY with a number between 0.0 and 1.0.
"""

Always allow human override via --salience CLI flag or direct file editing.
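On the consuming side, the model's reply should be validated rather than trusted. A minimal sketch of a parsing helper (the name and signature are illustrative, not part of the OpenClaw API) that clamps to range and falls back to the 0.5 default on garbage:

```python
def parse_salience(raw: str, default: float = 0.5) -> float:
    """Parse an LLM salience reply; clamp to [0, 1], fall back on garbage."""
    try:
        value = float(raw.strip())
    except ValueError:
        # Model returned prose instead of a number
        return default
    return min(1.0, max(0.0, value))

print(parse_salience("0.95"))      # 0.95
print(parse_salience("1.7"))       # 1.0
print(parse_salience("not sure"))  # 0.5
```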

Pitfall 5: Docker/Container Environment Permissions

Symptom: sqlite3: unable to open database file when running in Docker.

Cause: SQLite database mounted at volume with incorrect permissions or path.

Fix: Ensure volume mount preserves directory structure:

# Wrong
docker run -v /host/memory:/container/memory image

# Correct (bind mount the parent directory)
docker run -v /host/.openclaw:/root/.openclaw image

# Verify permissions
docker exec container ls -la /root/.openclaw/memory.db
# Should show: -rw-r--r-- 1 root root ...

Pitfall 6: Raspberry Pi 5 Resource Constraints

Symptom: Associative traversal causes memory pressure on ARM device.

Cause: Python dictionaries for activation tracking + recursive queries exceed available RAM.

Fix: Limit traversal scope and use cursor-based iteration:

# Limit activation dict size and prune weak activations
MAX_ACTIVATION_ENTITIES = 50
MIN_ACTIVATION = 0.05

# Use a generator for memory efficiency
def associative_traverse_stream(seed, depth, decay):
    frontier = {seed: 1.0}
    visited = {seed}
    
    for _ in range(depth):
        next_frontier = {}
        for entity, activation in frontier.items():
            if activation < MIN_ACTIVATION:
                continue
            for coentity in fetch_cooccurring(entity, limit=5):
                if coentity not in visited:
                    next_frontier[coentity] = next_frontier.get(coentity, 0) + \
                        activation * decay
                    visited.add(coentity)
        # Keep only the strongest activations to bound memory use
        frontier = dict(sorted(next_frontier.items(),
                               key=lambda kv: kv[1],
                               reverse=True)[:MAX_ACTIVATION_ENTITIES])
        yield from frontier.items()

Contextually Connected Issues and Historical Reference

Related Design Documents

  • Workspace Memory v2 Research Doc β€” The baseline architecture this guide extends. Key sections: "Entity-Aware Retrieval," "Incremental Indexing," "Reflect Loop"
  • Hindsight Γ— Letta Integration β€” Typed facts with confidence-bearing opinions provide the substrate for salience weighting
  • CLS-M Prototype Analysis β€” Empirical validation (132 nodes, 802 edges, F1=44%) demonstrating precision challenges with naive spreading activation

Common Error Codes in Memory Systems

| Error Code | Description | Related To |
| --- | --- | --- |
| E2BIG | Context assembled exceeds token budget; reflect job cannot compress | Salience weighting, access decay |
| ENOENTITY | Entity lookup returns empty but semantic search finds results | Entity extraction gap, FTS fallback |
| EDUPFACTS | Near-duplicate facts accumulated without consolidation | Reflect loop limitations |
| EHUBNODES | Retrieval dominated by high-frequency entities (Peter, system, config) | IDF weighting absence |
| ECOLDSTART | New deployment has insufficient fact density for associative traversal | Entity co-occurrence density threshold |
| EDECAYTOOFAST | Time-based decay erases useful old memories prematurely | Access-based vs. time-based decay |

Historical Context from CLS-M

The CLS-M prototype identified failure modes that informed these recommendations:

  • F1=44% on 45-query benchmark β€” Precision (35%) was the bottleneck, not recall (65%)
  • Hub noise kill: heartbeat node with 57 edges absorbed 15% of total activation on every query
  • Delegation failure: Sub-agent memory extraction failed consistently; the experiencing agent must own retention
  • Spread too thin: Activation across 800+ edges diluted signal below useful thresholds

These findings validate the incremental approach: start with FTS5, add embeddings, then entity co-occurrence only after sufficient index density is reached.

OpenClaw Version Compatibility

| Version | Required Features | Migration Path |
| --- | --- | --- |
| v0.11.x | Basic fact storage, FTS5 | Apply Phase 1-2 migrations |
| v0.12.0 | Entity extraction, salience fields | Apply Phase 1-6 incrementally |
| v0.13.0 (planned) | Associative traversal, access tracking | Full implementation |

Evidence & Sources

This troubleshooting guide was automatically synthesized by the FixClaw Intelligence Pipeline from community discussions.