April 17, 2026 • Version: 2026.4.9

Billing Cooldown Skip Path Bypasses isBillingErrorMessage(): Generic Error Shown Instead of Billing Message

When all model-fallback candidates are skipped due to billing cooldown, users see a generic 'Something went wrong' error instead of the actionable BILLING_ERROR_USER_MESSAGE, because the cooldown-generated skip message does not match any pattern in isBillingErrorMessage().

🔍 Symptoms

User-Facing Symptom

After the first billing failure with an Anthropic OAuth-authenticated account, all subsequent retry attempts display a generic, non-actionable error message:

⚠️ Something went wrong while processing your request. Please try again, or use /new to start a fresh session.

This repeats on a ~30-minute cadence for hours, even though the underlying cause is billing quota exhaustion on the Anthropic side.

Developer-Visible Symptom (Agent Logs)

The first failure correctly surfaces the billing error:

[agent] embedded run agent end: runId=e8520f5d-... isError=true model=claude-opus-4-6 provider=anthropic error=LLM request rejected: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
[agent] auth profile failure state updated: runId=e8520f5d-... profile=sha256:154a23a3efe6 provider=anthropic reason=billing window=disabled

All subsequent failures produce the cooldown skip path:

[model-fallback] model fallback decision: decision=skip_candidate requested=anthropic/claude-opus-4-6 candidate=anthropic/claude-opus-4-6 reason=billing next=anthropic/claude-sonnet-4-6 detail=Provider anthropic has billing issue (skipping all models)
[model-fallback] model fallback decision: decision=skip_candidate requested=anthropic/claude-opus-4-6 candidate=anthropic/claude-sonnet-4-6 reason=billing next=none detail=Provider anthropic has billing issue (skipping all models)
Embedded agent failed before reply: All models failed (2): anthropic/claude-opus-4-6: Provider anthropic has billing issue (skipping all models) (billing) | anthropic/claude-sonnet-4-6: Provider anthropic has billing issue (skipping all models) (billing)

Structured Data Confirmation

The FallbackSummaryError carries attempt.reason=“billing” for every attempt, but the isBillingErrorMessage() check in agent-runner-execution.ts ignores that field and string-matches the message against ERROR_PATTERNS.billing in failover-matches.ts, which does not include the pattern “has billing issue”.

Frequency Pattern

The cycle repeats every 30 minutes for extended durations:

2026-04-13T22:41:05 ... Embedded agent failed before reply: All models failed (2): ... (billing) | ... (billing)
2026-04-13T23:11:05 ... Embedded agent failed before reply: All models failed (2): ... (billing) | ... (billing)
2026-04-13T23:41:05 ... Embedded agent failed before reply: All models failed (2): ... (billing) | ... (billing)

🧠 Root Cause

Architecture: Two Error Classification Strategies

OpenClaw uses two distinct strategies to classify errors: string matching on the error message and structural inspection of attempt.reason. The raw API error path, the rate-limit path, and the billing cooldown skip path apply them inconsistently:

  • Raw API error path: isBillingErrorMessage(message: string), which performs regex/string matching against ERROR_PATTERNS.billing in failover-matches.ts.
  • Rate-limit path (already correct): isPureTransientRateLimitSummary(failure: FallbackSummaryError), which performs a structural check against attempt.reason === 'rate_limit'.
  • Billing cooldown skip path (broken): No equivalent structural check; relies solely on isBillingErrorMessage() string-matching, which fails to match "has billing issue (skipping all models)".

The Failure Sequence

  1. User's Anthropic OAuth personal "extra usage" quota is exhausted.
  2. First LLM request receives a raw API error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"You're out of extra usage. Add more at claude.ai/settings/usage and keep going."}}
  3. The raw error message matches ERROR_PATTERNS.billing → auth profile enters billing cooldown window=disabled.
  4. Subsequent requests trigger the model-fallback skip logic in model-fallback.ts.
  5. Every candidate model is skipped with detail: "Provider anthropic has billing issue (skipping all models)".
  6. A FallbackSummaryError is constructed with attempt.reason="billing" for each failed attempt.
  7. In agent-runner-execution.ts, the code calls isBillingErrorMessage(error.message), but "has billing issue" is not in ERROR_PATTERNS.billing.
  8. The billing check fails, so the generic fallback error path is taken, producing "Something went wrong".
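
Steps 5 to 8 can be reproduced in isolation. The sketch below is a minimal model, not OpenClaw's actual source: the Attempt shape, buildSummaryMessage(), and the body of isBillingErrorMessage() are assumptions made for illustration, while the patterns and message strings are the ones quoted in this guide.

```typescript
// Hypothetical minimal model of the pre-fix classification path.
type Attempt = { provider: string; model: string; reason: string; message: string };

// Patterns as listed for ERROR_PATTERNS.billing in this guide.
const BILLING_PATTERNS: RegExp[] = [
  /out of extra usage/i,
  /insufficient balance/i,
  /billing error/i,
  /api key (has|runs out).*credit/i,
  /add more at.*usage/i,
  /out of credits/i,
];

// Assumed implementation: true if any billing pattern matches the message.
function isBillingErrorMessage(message: string): boolean {
  return BILLING_PATTERNS.some((p) => p.test(message));
}

// Assumed helper that mirrors the "All models failed (N): ..." log format.
function buildSummaryMessage(attempts: Attempt[]): string {
  const parts = attempts.map((a) => `${a.provider}/${a.model}: ${a.message} (${a.reason})`);
  return `All models failed (${attempts.length}): ${parts.join(" | ")}`;
}

const skipDetail = "Provider anthropic has billing issue (skipping all models)";
const attempts: Attempt[] = [
  { provider: "anthropic", model: "claude-opus-4-6", reason: "billing", message: skipDetail },
  { provider: "anthropic", model: "claude-sonnet-4-6", reason: "billing", message: skipDetail },
];

const rawApiError =
  "LLM request rejected: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.";
const summaryMessage = buildSummaryMessage(attempts);

// The raw API error matches a pattern; the cooldown summary does not,
// even though every attempt is structurally tagged as billing.
const rawMatches = isBillingErrorMessage(rawApiError);
const summaryMatches = isBillingErrorMessage(summaryMessage);
const everyReasonIsBilling = attempts.every((a) => a.reason === "billing");

console.log({ rawMatches, summaryMatches, everyReasonIsBilling });
```

The information needed to classify the failure is present in the reason field the whole time; only the string-based gate fails to see it.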

Relevant Code Locations

  • src/core/failover/failover-matches.ts: ERROR_PATTERNS.billing contains patterns like "out of extra usage", "insufficient balance", "billing error", but not "has billing issue".
  • src/core/agent-runner-execution.ts: calls isBillingErrorMessage(message) as the sole billing classification gate for the error rendering path.
  • src/core/model-fallback/model-fallback.ts: produces "Provider X has billing issue (skipping all models)" messages when billing cooldown is active.
  • src/core/failover/failover-matches.ts: isPureTransientRateLimitSummary() correctly inspects attempt.reason === 'rate_limit' as a structural field, demonstrating the correct pattern that the billing path is missing.

Why the Rate-Limit Path Works Correctly

The rate-limit path already uses the structural attempt.reason field:

export function isPureTransientRateLimitSummary(failure: FallbackSummaryError): boolean {
  return failure.attempts.every(a => a.reason === 'rate_limit');
}

This approach is immune to message string changes because it inspects the semantic classification field, not the human-readable message.
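
To make the contrast concrete, here is a small sketch comparing a structural check with a string-based one when the wording of a message changes. The type names, the looksLikeRateLimit() helper, and both message strings are invented for illustration; only the every(a => a.reason === 'rate_limit') logic comes from the function above.

```typescript
// Hypothetical shapes for illustration; only `reason` matters to the structural check.
type FallbackReason = "billing" | "rate_limit" | "timeout";
interface AttemptSketch { reason: FallbackReason; message: string }
interface SummarySketch { attempts: AttemptSketch[] }

// Same logic as isPureTransientRateLimitSummary() above.
function isPureRateLimit(failure: SummarySketch): boolean {
  return failure.attempts.every((a) => a.reason === "rate_limit");
}

// A string-based alternative, shown only for contrast (hypothetical pattern).
function looksLikeRateLimit(failure: SummarySketch): boolean {
  return failure.attempts.every((a) => /rate limit/i.test(a.message));
}

// The same failure with two wordings; both message strings are invented.
const originalWording: SummarySketch = {
  attempts: [{ reason: "rate_limit", message: "Rate limit exceeded for provider" }],
};
const rewordedSummary: SummarySketch = {
  attempts: [{ reason: "rate_limit", message: "Too many requests, retry later" }],
};

// The structural check gives the same answer for both wordings;
// the string check silently stops matching once the text is reworded.
const structuralResults = [isPureRateLimit(originalWording), isPureRateLimit(rewordedSummary)];
const stringResults = [looksLikeRateLimit(originalWording), looksLikeRateLimit(rewordedSummary)];

console.log({ structuralResults, stringResults });
```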

Why OAuth Exacerbates the Issue

Personal “extra usage” quotas on claude.ai are generally smaller than organizational API budgets. OAuth-authenticated accounts hit their personal quota more frequently than API-key-authenticated org accounts, making this bug a common user experience issue for the OAuth install path.

🛠️ Step-by-Step Fix

Fix Strategy

Add a structural billing check function isPureBillingSummary() to failover-matches.ts, mirroring the existing isPureTransientRateLimitSummary() pattern. Update agent-runner-execution.ts to use this structural check as the primary gate for billing classification, falling back to string-matching only for legacy raw API errors.

Step 1: Add Structural Billing Check to failover-matches.ts

Add the following function to src/core/failover/failover-matches.ts alongside the existing isPureTransientRateLimitSummary():

Before:

export function isPureTransientRateLimitSummary(failure: FallbackSummaryError): boolean {
  return failure.attempts.every(a => a.reason === 'rate_limit');
}

After:

export function isPureTransientRateLimitSummary(failure: FallbackSummaryError): boolean {
  return failure.attempts.every(a => a.reason === 'rate_limit');
}

/**
 * Structural check: true when every attempt in the FallbackSummaryError
 * is classified as billing-cooldown. This correctly handles the
 * "Provider X has billing issue (skipping all models)" skip path,
 * which is not matched by isBillingErrorMessage() string patterns.
 */
export function isPureBillingSummary(failure: FallbackSummaryError): boolean {
  return failure.attempts.every(a => a.reason === 'billing');
}
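
One edge case may be worth checking before adopting the function as written: Array.prototype.every() returns true for an empty array, so a summary with zero attempts would be classified as pure billing. Whether a FallbackSummaryError can ever be constructed with no attempts is not stated in the source, so the guarded variant below is only a sketch under that assumption; SummaryShape and isPureBillingSummaryGuarded() are hypothetical names.

```typescript
// Hypothetical minimal shape; the real FallbackSummaryError carries more fields.
interface SummaryShape { attempts: { reason: string }[] }

// The check as proposed in Step 1.
function isPureBillingSummary(failure: SummaryShape): boolean {
  return failure.attempts.every((a) => a.reason === "billing");
}

// Hypothetical guarded variant: an empty attempt list is not treated as billing.
function isPureBillingSummaryGuarded(failure: SummaryShape): boolean {
  return failure.attempts.length > 0 && failure.attempts.every((a) => a.reason === "billing");
}

const emptySummary: SummaryShape = { attempts: [] };
const billingOnly: SummaryShape = { attempts: [{ reason: "billing" }, { reason: "billing" }] };

const unguardedOnEmpty = isPureBillingSummary(emptySummary); // vacuously true
const guardedOnEmpty = isPureBillingSummaryGuarded(emptySummary);
const guardedOnBilling = isPureBillingSummaryGuarded(billingOnly);

console.log({ unguardedOnEmpty, guardedOnEmpty, guardedOnBilling });
```

The same consideration would apply to isPureTransientRateLimitSummary(), so any guard is probably best applied to both functions or neither.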

Step 2: Update agent-runner-execution.ts Billing Classification Gate

Locate the billing error classification logic in src/core/agent-runner-execution.ts. Replace the string-only check with a structural-first approach:

Before:

const isBilling = isBillingErrorMessage(message);

After:

// Prefer structural classification (cooldown skip path) over string matching.
const isBilling = error instanceof FallbackSummaryError
  ? isPureBillingSummary(error)
  : isBillingErrorMessage(message);

Ensure FallbackSummaryError and isPureBillingSummary are imported:

// Paths are relative to src/core/agent-runner-execution.ts.
import { FallbackSummaryError } from './model-fallback/types';
import { isPureBillingSummary } from './failover/failover-matches';
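
The gate can be exercised on its own before wiring it into the runner. The following is a sketch under stated assumptions: the FallbackSummaryError class, the GateAttempt shape, the simplified isBillingErrorMessage(), and the classifyIsBilling() wrapper are stand-ins written for this example, not the real OpenClaw definitions.

```typescript
// Hypothetical stand-ins so the gate can run in isolation.
interface GateAttempt { reason: string; message: string }

class FallbackSummaryError extends Error {
  readonly attempts: GateAttempt[];
  constructor(message: string, attempts: GateAttempt[]) {
    super(message);
    this.name = "FallbackSummaryError";
    this.attempts = attempts;
    // Keeps instanceof working when compiled to ES5 targets.
    Object.setPrototypeOf(this, FallbackSummaryError.prototype);
  }
}

// Assumed, simplified string matcher using three patterns listed in this guide.
function isBillingErrorMessage(message: string): boolean {
  return /out of extra usage|insufficient balance|billing error/i.test(message);
}

function isPureBillingSummary(failure: FallbackSummaryError): boolean {
  return failure.attempts.every((a) => a.reason === "billing");
}

// The gate from Step 2, wrapped in a hypothetical helper for testing.
function classifyIsBilling(error: unknown, message: string): boolean {
  return error instanceof FallbackSummaryError
    ? isPureBillingSummary(error)
    : isBillingErrorMessage(message);
}

const skip = "Provider anthropic has billing issue (skipping all models)";
const pureSummary = new FallbackSummaryError("All models failed (2)", [
  { reason: "billing", message: skip },
  { reason: "billing", message: skip },
]);
const mixedSummary = new FallbackSummaryError("All models failed (2)", [
  { reason: "billing", message: skip },
  { reason: "timeout", message: "request timed out" },
]);
const rawError = new Error("You're out of extra usage. Add more at claude.ai/settings/usage and keep going.");
const unrelated = new Error("socket hang up");

const gateResults = {
  pureSummary: classifyIsBilling(pureSummary, pureSummary.message),
  mixedSummary: classifyIsBilling(mixedSummary, mixedSummary.message),
  rawError: classifyIsBilling(rawError, rawError.message),
  unrelated: classifyIsBilling(unrelated, unrelated.message),
};

console.log(gateResults);
```

Note the design consequence: with this gate, a FallbackSummaryError is never string-matched, so a mixed summary is classified as non-billing even if one of its messages would have matched a billing pattern.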

Step 3: (Optional Enhancement) Extend ERROR_PATTERNS.billing

To make the string-matching path recognize the cooldown skip message as well (for example, in code that only has the message string and not the FallbackSummaryError), extend the billing patterns in src/core/failover/failover-matches.ts to include the cooldown skip phrase:

Before:

export const ERROR_PATTERNS = {
  billing: [
    /out of extra usage/i,
    /insufficient balance/i,
    /billing error/i,
    /api key (has|runs out).*credit/i,
    /add more at.*usage/i,
    /out of credits/i,
  ],
  // ...
};

After:

export const ERROR_PATTERNS = {
  billing: [
    /out of extra usage/i,
    /insufficient balance/i,
    /billing error/i,
    /api key (has|runs out).*credit/i,
    /add more at.*usage/i,
    /out of credits/i,
    /has billing issue \(skipping all models\)/i, // cooldown skip path
  ],
  // ...
};

This third step is defensive; the primary fix in Steps 1–2 is sufficient because, for a FallbackSummaryError, isPureBillingSummary() decides the classification and string matching is never consulted.
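
A quick check of the added pattern shows why the parentheses are escaped and how brittle the match is. This is a standalone sketch: the unescaped variant and the reworded message are hypothetical and exist only to show the failure modes.

```typescript
// Pattern added in Step 3.
const cooldownSkipPattern = /has billing issue \(skipping all models\)/i;

// Hypothetical mistake: unescaped parentheses form a capture group,
// so the regex no longer expects literal "(" and ")" characters.
const unescapedPattern = /has billing issue (skipping all models)/i;

const skipMessage = "Provider anthropic has billing issue (skipping all models)";
// Invented future wording, used only to show that string matching is fragile.
const rewordedSkipMessage = "Provider anthropic billing cooldown: skipping all models";

const matchesSkip = cooldownSkipPattern.test(skipMessage);
const unescapedMatchesSkip = unescapedPattern.test(skipMessage);
const matchesReworded = cooldownSkipPattern.test(rewordedSkipMessage);

console.log({ matchesSkip, unescapedMatchesSkip, matchesReworded });
```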

Step 4: Rebuild and Deploy

npm run build
# or for Docker deployments:
docker build -t openclaw:fixed .

🧪 Verification

Unit Test: isPureBillingSummary()

Add a test case to src/core/failover/failover-matches.test.ts:

import { isPureBillingSummary } from './failover-matches';
import { FallbackSummaryError, FallbackAttempt } from '../model-fallback/types';

describe('isPureBillingSummary', () => {
  it('returns true when all attempts have reason=billing', () => {
    const attempts: FallbackAttempt[] = [
      {
        provider: 'anthropic',
        model: 'claude-opus-4-6',
        reason: 'billing',
        message: 'Provider anthropic has billing issue (skipping all models)',
        durationMs: 0,
        startTime: 0,
        endTime: 0,
      },
      {
        provider: 'anthropic',
        model: 'claude-sonnet-4-6',
        reason: 'billing',
        message: 'Provider anthropic has billing issue (skipping all models)',
        durationMs: 0,
        startTime: 0,
        endTime: 0,
      },
    ];
    const error = new FallbackSummaryError('All models failed', attempts);
    expect(isPureBillingSummary(error)).toBe(true);
  });

  it('returns false when attempts contain mixed reasons', () => {
    const attempts: FallbackAttempt[] = [
      { provider: 'anthropic', model: 'claude-opus-4-6', reason: 'billing', message: '', durationMs: 0, startTime: 0, endTime: 0 },
      { provider: 'anthropic', model: 'claude-sonnet-4-6', reason: 'rate_limit', message: '', durationMs: 0, startTime: 0, endTime: 0 },
    ];
    const error = new FallbackSummaryError('All models failed', attempts);
    expect(isPureBillingSummary(error)).toBe(false);
  });

  it('returns false when no attempt has reason=billing', () => {
    const attempts: FallbackAttempt[] = [
      { provider: 'anthropic', model: 'claude-opus-4-6', reason: 'rate_limit', message: '', durationMs: 0, startTime: 0, endTime: 0 },
    ];
    const error = new FallbackSummaryError('All models failed', attempts);
    expect(isPureBillingSummary(error)).toBe(false);
  });
});

Run the test suite:

npm test -- --testPathPattern="failover-matches"
# Expected: isPureBillingSummary tests pass

Integration Test: Billing Cooldown Error Rendering

Simulate a billing cooldown scenario using a test provider or mocked AuthProfile:

# Using the OpenClaw CLI test harness (if available):
openclaw test:integration --scenario=billing-cooldown --auth-type=oauth

# Expected output in user-facing message channel:
# "⚠️ API provider returned a billing error — your API key has run out of credits
#  or has an insufficient balance. Check your provider's billing dashboard and
#  top up or switch to a different API key."
# (i.e., BILLING_ERROR_USER_MESSAGE, not "Something went wrong")

Manual Verification: Log Inspection

Trigger the billing cooldown and inspect agent logs for the corrected classification:

# Trigger a billing exhaustion scenario, then observe subsequent failures:
grep -E "(billing|isBilling|Something went wrong)" /var/log/openclaw/agent.log

# Before fix: "Something went wrong" appears repeatedly:
# Embedded agent failed before reply: ... (Something went wrong)
# Embedded agent failed before reply: ... (Something went wrong)

# After fix: BILLING_ERROR_USER_MESSAGE appears:
# [agent] embedded run agent end: ... userMessage=⚠️ API provider returned a billing error...
# Embedded agent failed before reply: ... (billing)

Exit Code Verification

# Verify graceful degradation with billing error exit code
openclaw run --prompt="Hello" --model=anthropic/claude-opus-4-6
echo "Exit code: $?"
# Expected: non-zero exit (indicating error state was properly surfaced), NOT a crash

⚠️ Common Pitfalls

  • Only extending ERROR_PATTERNS without adding isPureBillingSummary(): Adding "has billing issue" to the string patterns works as a workaround but is fragile. If the model-fallback message format changes in a future release (e.g., "Provider X billing cooldown: skipping all models"), the pattern breaks again. The structural isPureBillingSummary() approach is resilient to message string changes.
  • Applying isPureBillingSummary() unconditionally: The check must be guarded with instanceof FallbackSummaryError. Calling it on a raw string or other error type throws a TypeError. The fallback to isBillingErrorMessage(message) for non-FallbackSummaryError types preserves backward compatibility with raw API errors.
  • OAuth vs API key asymmetry in testing: The bug manifests more readily with OAuth-authenticated accounts because personal "extra usage" quotas are smaller. Testing with org-level API keys may not reproduce the issue, leading to false confidence that a fix is working. Always test with both OAuth personal quota scenarios and API key exhaustion scenarios.
  • Cooldown window state persistence: The billing cooldown state persists across restarts if backed by a persistent store (Redis, SQLite). Ensure test environments reset the auth profile failure state between runs, or the cooldown will still block requests even after fixing the error message path.
  • Partial model-fallback coverage: If only some providers in a routing chain enter billing cooldown, the FallbackSummaryError contains a mix of reason: 'billing' and other reasons (e.g., reason: 'timeout'). isPureBillingSummary() returns false in this mixed case. Consider adding isMostlyBillingSummary() as a secondary heuristic if mixed failures are common.
  • Stale Docker image: In Docker deployments, make sure the container actually runs the rebuilt image. A common mistake is editing source files and then running only docker-compose up -d, which reuses the existing image without the fix. Run docker build (or docker-compose up -d --build) first, and confirm the build context includes the changed files.
  • Log verbosity masking the issue: If LOG_LEVEL=error is set in production, the detailed [model-fallback] model fallback decision lines may be suppressed, making it harder to diagnose whether the cooldown skip path or raw error path was taken. Set LOG_LEVEL=debug during troubleshooting.

Related Errors

  • FallbackSummaryError with "Something went wrong" message: The generic fallback error that users see when neither isBillingErrorMessage() nor isPureTransientRateLimitSummary() matches. Indicates a classification gap in the error routing logic.
  • ERROR_PATTERNS.billing pattern mismatch: The set of regex patterns in failover-matches.ts that isBillingErrorMessage() uses for string-based billing detection. Missing the cooldown skip phrase was the root cause here.
  • PR #61608 (partial billing pattern fix): Added "out of extra usage" to ERROR_PATTERNS.billing for raw API errors only; did not address the cooldown-generated skip path.
  • Issue #48526: Related billing error classification gap (possibly an earlier instance of the same pattern-matching vs structural classification problem).
  • Issue #64224: OAuth authentication billing quota exhaustion causing repeated errors (likely same root cause, different manifestation).
  • Issue #64308: Model-fallback all-models-skipped scenario with generic error output.
  • Issue #62375: Anthropic provider billing error handling edge case with OAuth personal usage quotas.
  • isPureTransientRateLimitSummary(): The correct pattern that the billing path should mirror. Demonstrates the structural attempt.reason inspection approach that was already implemented for rate-limit errors.
  • BILLING_ERROR_USER_MESSAGE: The expected user-facing message that was not being shown: "⚠️ API provider returned a billing error — your API key has run out of credits or has an insufficient balance. Check your provider's billing dashboard and top up or switch to a different API key."
  • auth profile failure state: reason=billing window=disabled: The auth profile metadata indicating that a billing cooldown has been activated, preventing further attempts for a defined cooldown period.
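
For the mixed-failure case noted under Common Pitfalls, a secondary heuristic along the lines of isMostlyBillingSummary() could look like the sketch below. Nothing here is existing OpenClaw code: the function body, the MixedSummary shape, and the strict-majority default threshold are all assumptions chosen for illustration.

```typescript
// Hypothetical minimal shape; the real FallbackSummaryError carries more fields.
interface MixedSummary { attempts: { reason: string }[] }

// Hypothetical heuristic: true when the share of billing attempts
// strictly exceeds `threshold` (default: a strict majority).
function isMostlyBillingSummary(failure: MixedSummary, threshold = 0.5): boolean {
  const total = failure.attempts.length;
  if (total === 0) return false; // no attempts, nothing to classify
  const billingCount = failure.attempts.filter((a) => a.reason === "billing").length;
  return billingCount / total > threshold;
}

const twoOfThree: MixedSummary = {
  attempts: [{ reason: "billing" }, { reason: "billing" }, { reason: "timeout" }],
};
const oneOfTwo: MixedSummary = {
  attempts: [{ reason: "billing" }, { reason: "timeout" }],
};
const noAttempts: MixedSummary = { attempts: [] };

const mostlyResults = {
  twoOfThree: isMostlyBillingSummary(twoOfThree), // 2/3 exceeds 0.5
  oneOfTwo: isMostlyBillingSummary(oneOfTwo),     // 1/2 does not exceed 0.5
  noAttempts: isMostlyBillingSummary(noAttempts),
};

console.log(mostlyResults);
```

A heuristic like this trades precision for coverage: showing the billing message when one provider merely timed out may mislead users, so the threshold deserves a deliberate choice.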

Evidence & Sources

This troubleshooting guide was automatically synthesized by the FixClaw Intelligence Pipeline from community discussions.