May 01, 2026 • Version: 2026.3.2

Claude Opus Reasoning Loop — /new Session Reset Insufficient, Model Swap Required

When Claude Opus enters a reasoning/thinking loop, the /new command clears conversation context but cannot break the self-reinforcing loop at the model state level. Only a full model swap (e.g., Opus → Sonnet) restores normal operation.

🔍 Symptoms

Behavioral Manifestation: The agent enters an uncontrolled reasoning/thinking cycle where it repeatedly processes the same or similar thoughts without converging on a response.
Command Ineffectiveness: The /new session reset command executes successfully but does not break the loop behavior.
Model-Specific Trigger: The issue occurs exclusively with Claude Opus (which has extended thinking enabled by default) and does not manifest on Sonnet or Haiku models.

CLI Reproduction Pattern:

# User issues /new command - context clears but loop persists
> /new
[System] New session started. Context cleared.

# Agent continues to loop - no recovery
# (Agent keeps emitting repeated, non-converging reasoning output)

# Only model swap resolves the issue
> /model anthropic/claude-sonnet-4-20250514
[System] Model switched to claude-sonnet-4-20250514

# Agent immediately responds normally
> hello
[Agent responds coherently]

Observable Symptoms in Logs:

# Extended thinking loop indicator in OpenClaw logs
[loop-detection] Agent generating >10 consecutive thinking blocks
[loop-detection] No meaningful state change between iterations
[extended-thinking] Internal reasoning cycle not converging
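
The log indicators above can be tallied with a short script. This is a minimal sketch, assuming log lines begin with a bracketed tag exactly as in the excerpt; the tag list and the function name `count_loop_indicators` are illustrative and not part of OpenClaw.

```python
import re
from collections import Counter

# Matches a bracketed tag at the start of a line, e.g. "[loop-detection]".
TAG_PATTERN = re.compile(r"^\[(?P<tag>[a-z-]+)\]")

# Tags copied from the log excerpt; treat this set as an assumption.
LOOP_TAGS = {"loop-detection", "extended-thinking"}


def count_loop_indicators(log_lines):
    """Count log lines per loop-related tag, ignoring all other lines."""
    counts = Counter()
    for line in log_lines:
        match = TAG_PATTERN.match(line.strip())
        if match and match.group("tag") in LOOP_TAGS:
            counts[match.group("tag")] += 1
    return counts


sample = [
    "[loop-detection] Agent generating >10 consecutive thinking blocks",
    "[loop-detection] No meaningful state change between iterations",
    "[extended-thinking] Internal reasoning cycle not converging",
    "[gateway] unrelated line",
]
loop_counts = count_loop_indicators(sample)
print(dict(loop_counts))
```

A non-zero count for either tag suggests the session is in the state described in this guide.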

🧠 Root Cause

Architectural Analysis

The root cause lies in a fundamental architectural distinction between conversation context and model internal state.

What /new Resets: The conversation history, message buffer, and turn counter. This is a session-level reset stored in OpenClaw's context management.
What /new Does NOT Reset: The model's internal reasoning state, attention patterns, and extended thinking chain. These exist at the inference layer, outside OpenClaw's context scope.
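
The split between what /new clears and what it leaves alone can be pictured with a toy model. This is a sketch only: the `Session` class and its field names are hypothetical and do not reflect OpenClaw's actual implementation. It shows the session-level side of the distinction; the inference-layer state described above is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical model of session-level state (not OpenClaw's real class)."""
    model: str
    history: list = field(default_factory=list)
    message_buffer: list = field(default_factory=list)
    turn_counter: int = 0

    def new(self):
        """Mimic /new: clear conversation-level fields only."""
        self.history.clear()
        self.message_buffer.clear()
        self.turn_counter = 0
        # self.model is deliberately untouched: /new does not change the model.

    def switch_model(self, model):
        """Mimic /model: point the session at a different model."""
        self.model = model


session = Session(model="claude-opus-4-5")
session.history.append("hello")
session.turn_counter = 1
session.new()           # conversation is gone, model is unchanged
session.switch_model("claude-sonnet-4-20250514")  # only this changes the model
```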

Self-Reinforcing Loop Mechanism

When Opus enters a reasoning loop:

1. The extended thinking module generates a chain of reasoning thoughts.
2. If the chain encounters internal contradictions or suboptimal paths, the model generates compensating thoughts.
3. These compensating thoughts create new contradictions, triggering additional reasoning.
4. The loop self-reinforces because each iteration is locally coherent but globally non-convergent.
5. Context reset does not break the loop because the model's attention mechanisms have already biased toward the problematic reasoning path.
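
One way to spot "locally coherent but globally non-convergent" output from the outside is to compare consecutive thoughts for near-duplication. The sketch below uses Python's standard `difflib.SequenceMatcher`; the function name, the 0.9 threshold, and the window of 3 are assumptions chosen for illustration, not values taken from OpenClaw's loop detection.

```python
from difflib import SequenceMatcher


def is_stalled(thoughts, threshold=0.9, window=3):
    """Return True if the last `window` consecutive pairs are near-duplicates.

    `thoughts` is a list of strings, oldest first. The threshold and window
    are illustrative defaults, not documented OpenClaw settings.
    """
    if len(thoughts) < window + 1:
        return False  # not enough iterations to judge
    recent = thoughts[-(window + 1):]
    return all(
        SequenceMatcher(None, earlier, later).ratio() >= threshold
        for earlier, later in zip(recent, recent[1:])
    )
```

Calling `is_stalled` on the text of the most recent thinking blocks gives a rough signal that the chain has stopped making progress, which is the point at which a model swap is worth trying.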

Why Sonnet Does Not Exhibit This Behavior

# Model Configuration Differences
Opus:  extended_thinking { enabled: true, budget: "high" }
Sonnet: extended_thinking { enabled: false }  # Standard inference only

# Impact:
# - Opus: Reasoning occurs in persistent internal state that survives context resets
# - Sonnet: Thinking is implicit and ephemeral, embedded in final response generation
# - Sonnet: No persistent reasoning chain to enter a self-referential loop

Technical Failure Sequence

1. User prompt triggers Opus extended thinking
2. Thinking chain begins generating intermediate reasoning
3. Reasoning encounters local inconsistency
4. Model generates compensating thought to resolve inconsistency
5. Compensation creates new inconsistency in thinking chain
6. Loop continues: each correction creates new problem
7. /new clears conversation → model still holds reasoning state
8. New prompt references prior reasoning pattern → loop re-engages
9. Only model swap clears the reasoning state completely

🛠️ Step-by-Step Fix

Immediate Resolution: Model Swap

When Opus enters a reasoning loop that /new cannot break:

# Step 1: Switch to Sonnet (or any non-Opus model)
# In Telegram or supported interface:
/model claude-sonnet-4-20250514

# Or via OpenClaw CLI:
openclaw model set claude-sonnet-4-20250514

# Step 2: Verify recovery
# Issue a simple test prompt
hello, are you responding normally?

# Step 3: (Optional) Return to Opus
# Once stabilized, switch back if needed
/model claude-opus-4-5-20251101

Configuration-Based Prevention

To prevent Opus reasoning loops, disable extended thinking in the configuration:

# openclaw.yaml configuration
models:
  anthropic/claude-opus-4-5:
    parameters:
      extended_thinking:
        enabled: false  # Disable to prevent reasoning loops

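# Alternative (use instead of the block above, not alongside it):
# keep thinking enabled but cap its budget for faster convergence
models:
  anthropic/claude-opus-4-5:
    parameters:
      extended_thinking:
        enabled: true
        budget_tokens: 1024  # Small budget limits loop depth (1024 is the Anthropic API minimum)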

Before vs After Configuration

Before (Default — Loop-Prone):

# openclaw.yaml
models:
  anthropic/claude-opus-4-5:
    provider: anthropic
    parameters:
      max_tokens: 8192
      # extended_thinking: not specified, defaults to enabled

After (Loop-Resistant):

# openclaw.yaml
models:
  anthropic/claude-opus-4-5:
    provider: anthropic
    parameters:
      max_tokens: 8192
      extended_thinking:
        enabled: false
        # or for limited thinking:
        # enabled: true
        # budget_tokens: 2000
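
The before/after distinction can be checked mechanically once the configuration is loaded into a dictionary. The following is a minimal sketch, assuming the YAML has already been parsed into nested dicts with the same keys as the examples above; the function name `loop_risk` and its three labels are hypothetical.

```python
def loop_risk(model_config):
    """Classify one model entry as 'loop-prone', 'limited', or 'loop-resistant'.

    `model_config` mirrors a single entry under `models:` in openclaw.yaml.
    """
    params = model_config.get("parameters", {})
    thinking = params.get("extended_thinking")
    if thinking is None:
        # Per the "Before" example, an unspecified block defaults to enabled.
        return "loop-prone"
    if not thinking.get("enabled", True):
        return "loop-resistant"
    if "budget_tokens" in thinking:
        return "limited"
    return "loop-prone"


before = {"provider": "anthropic", "parameters": {"max_tokens": 8192}}
after = {
    "provider": "anthropic",
    "parameters": {"max_tokens": 8192, "extended_thinking": {"enabled": False}},
}
print(loop_risk(before), loop_risk(after))
```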

Runtime Workaround (No Configuration Change)

# Use system prompt to instruct Opus against extended reasoning loops
/system
You are a direct assistant. Provide concise responses. If you find yourself repeating similar thoughts, stop and deliver your answer immediately.

🧪 Verification

After applying the fix, verify recovery using these steps:

Step 1: Confirm Model Switch

# Check current model is NOT Opus
> /model
Current model: claude-sonnet-4-20250514

# Or via CLI:
openclaw model current
# Expected output: claude-sonnet-4-20250514 or similar non-Opus model

Step 2: Test Basic Responsiveness

# Send a simple, unambiguous prompt
> what is 2 + 2?

# Expected: Direct, concise answer within 5 seconds
# Response should not include <thinking> tags or extended reasoning chains

# If the prompt was sent through a shell command rather than the chat interface, check its exit code:
echo $?
# Expected: 0 (success)

Step 3: Test Session Reset Functionality

# Issue /new while on Sonnet
> /new
[System] New session started.

# Verify agent responds immediately after reset
> hello
[Agent responds normally - no stuck state]

# For turns sent through a shell command, the exit code should be 0
echo $?
# Expected: 0

Step 4: (Optional) Verify Opus Without Extended Thinking

# Switch back to Opus (with extended thinking disabled or budget-capped in the configuration)
/model claude-opus-4-5-20251101

# Issue the same prompt that previously caused the loop
# Should now respond normally without entering extended thinking cycle

> [original problematic prompt]

# Expected: Direct response, <thinking> block limited to single pass
# Verify in logs: no repeated thinking block generations

Success Criteria:

Agent responds to simple prompts within 10 seconds
No <thinking> blocks with recursive patterns in logs
Session reset (/new) clears the conversation and the agent responds normally afterward
No repeated messages from the agent on identical inputs
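
Two of the criteria above, response time and absence of repeated messages, are easy to check in a script. This is a rough sketch under stated assumptions: `send_prompt` stands in for whatever function delivers a prompt to the agent and returns its reply, and both helper names are hypothetical.

```python
import time


def check_turn(send_prompt, prompt, max_latency=10.0):
    """Send one prompt and report whether the reply arrived within the limit."""
    start = time.monotonic()
    reply = send_prompt(prompt)
    elapsed = time.monotonic() - start
    return {"reply": reply, "elapsed": elapsed, "in_time": elapsed <= max_latency}


def has_repeats(replies):
    """True if the same reply text appears more than once."""
    return len(set(replies)) != len(replies)


# Stub transport used only for illustration; replace with a real client call.
result = check_turn(lambda prompt: "4", "what is 2 + 2?")
print(result["in_time"], has_repeats([result["reply"]]))
```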

⚠️ Common Pitfalls

Environment-Specific Traps

macOS (Darwin arm64): Extended thinking can cause high CPU utilization. Monitor with Activity Monitor or top if the loop persists after model switch.
Docker: Ensure container has sufficient memory. Extended thinking in Opus can consume 2-4GB RAM during deep reasoning loops.
Cloud Environments: Extended thinking may hit token rate limits. Implement exponential backoff in API retry logic.
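
The exponential backoff mentioned for cloud environments can be sketched as follows. The retry counts, delays, and the `RateLimitError` class are assumptions for illustration; substitute the exception your API client actually raises on a rate limit.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client library's rate-limit exception."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` and retry on RateLimitError with exponential backoff.

    The delay doubles on each attempt, is capped at `max_delay`, and gets up
    to 10% random jitter. The final failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


# Demonstration with a stub that fails twice, then succeeds.
# `delays.append` replaces time.sleep so the demo does not actually wait.
attempts = {"count": 0}
delays = []


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("rate limited")
    return "ok"


outcome = retry_with_backoff(flaky_request, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the helper testable; in production the default `time.sleep` applies.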

User Misconfiguration Patterns

Assuming /new Resets Everything — This is the primary pitfall. /new only clears conversation context, not model state.
Not Checking Model-Specific Features — Opus, Sonnet, and Haiku have different internal architectures. Always consider model capabilities when troubleshooting.
Forgetting to Disable Extended Thinking After Testing — Development environments may leave extended thinking enabled, causing production issues.
Configuration File Syntax Errors — YAML indentation matters. Validate with openclaw config validate.

Edge Cases

# Edge Case 1: Partial Loop Recovery
# User switches models but forgets to /new
# Old context may still influence new model

# Solution: Always /new after model switch
/model claude-sonnet-4-20250514
/new
hello

# Edge Case 2: Rapid Model Switching
# Some providers rate-limit model switches
# Implement cooldown between switches

# Recommended: 5-second minimum between /model commands
# Monitor API quota with: openclaw status --verbose

# Edge Case 3: Persisted Session State
# In multi-agent configurations, session state may persist
# across the /new boundary

# Verify: Check openclaw logs for session IDs
# Resolution: Restart openclaw service after model loop detection
sudo systemctl restart openclaw
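
Edge cases 1 and 2 can be combined into one small helper that always pairs a model switch with /new and refuses to switch again inside the cooldown. This is a sketch under assumptions: the `ModelSwitcher` class is hypothetical, it only builds the chat commands shown in this guide, and sending them is left to the caller.

```python
import time


class ModelSwitcher:
    """Hypothetical helper: builds chat commands and enforces a cooldown."""

    def __init__(self, cooldown_seconds=5.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_switch = None

    def switch(self, model):
        """Return the commands to send, or raise if called too soon."""
        now = self._clock()
        if self._last_switch is not None:
            elapsed = now - self._last_switch
            if elapsed < self.cooldown_seconds:
                remaining = self.cooldown_seconds - elapsed
                raise RuntimeError(f"wait {remaining:.1f}s before switching again")
        self._last_switch = now
        # Edge Case 1: always follow the model switch with /new.
        return [f"/model {model}", "/new"]


switcher = ModelSwitcher()
commands = switcher.switch("claude-sonnet-4-20250514")
print(commands)
```

The clock is injectable so the cooldown can be tested without waiting; the 5-second default follows the recommendation above.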

Diagnostic Commands

# Verify extended thinking status per model
openclaw models list --extended-thinking

# Check for loop detection warnings
openclaw logs --level warn | grep -i loop

# Monitor active thinking processes
openclaw debug thinking-stats

Related Errors

ERR_LOOP_DETECTED — Internal OpenClaw flag raised when agent generates more than 10 consecutive thinking blocks without producing output. Correlates with this Opus reasoning loop issue.
messages.N.content.X thinking block that cannot be modified — Related but separate issue involving corrupted thinking block persistence in message history. Shares overlapping symptoms but different root cause (message layer vs. model state layer).
ERR_MODEL_STATE_CORRUPTED — Raised when model internal state appears inconsistent. Model swap is the documented recovery procedure.
WARN_EXTENDED_THINKING_TIMEOUT — Extended thinking exceeding configured budget. May precede or co-occur with reasoning loops.
ERR_CONTEXT_WINDOW_EXCEEDED — Can co-occur with reasoning loops when the thinking chain fills context. Distinguishable by checking thinking block count vs. total token usage.
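
The ERR_LOOP_DETECTED condition, more than 10 consecutive thinking blocks without output, amounts to a run-length check. The sketch below is an illustration of that rule and not OpenClaw's actual detector; the block type names "thinking" and "text" and the function name are assumptions.

```python
def exceeds_thinking_limit(block_types, limit=10):
    """True if more than `limit` thinking blocks occur in a row.

    `block_types` is a sequence of strings such as "thinking" or "text".
    Any non-thinking block counts as output and resets the run.
    """
    run = 0
    for block_type in block_types:
        if block_type == "thinking":
            run += 1
            if run > limit:
                return True
        else:
            run = 0
    return False
```

For example, eleven "thinking" entries in a row trip the check, while two runs of six separated by a "text" block do not.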

Historical Context

2026.2.x: Initial extended thinking feature released; reasoning loop issue first reported
2026.3.0: Loop detection heuristics added, but /new behavior not updated
2026.3.2 (Current): /new resets conversation but not model state — this is the known limitation documented in this guide