<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Shutdown on FixClaw</title>
        <link>https://fixclaw.dev/tags/shutdown/</link>
        <description>Recent content in Shutdown on FixClaw</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en-us</language>
        <lastBuildDate>Mon, 01 Jan 0001 00:00:00 +0000</lastBuildDate><atom:link href="https://fixclaw.dev/tags/shutdown/index.xml" rel="self" type="application/rss+xml" /><item>
            <title>GGML Metal Assertion Crash on Apple Silicon During Shutdown with Local Embeddings</title>
            <link>https://fixclaw.dev/troubleshooting/ggml-metal-assertion-crash-on-apple-silicon-during-shutdown-with-local-embedding/</link>
            <pubDate>Mon, 02 Mar 2026 00:00:00 +0000</pubDate>
            <guid>https://fixclaw.dev/troubleshooting/ggml-metal-assertion-crash-on-apple-silicon-during-shutdown-with-local-embedding/</guid>
            <description>&lt;h2 id=&#34;symptom&#34;&gt;Symptom&#xA;&lt;/h2&gt;&lt;p&gt;When running OpenClaw on Apple Silicon (M-series Macs) with local embeddings configured for memory search, the application crashes during shutdown (e.g., on Ctrl+C / SIGINT or during an autoupdate). The crash produces a GGML Metal assertion failure in the logs:&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;/Users/runner/work/node-llama-cpp/node-llama-cpp/llama/llama.cpp/ggml/src/ggml-metal/ggml-metal-device.m:608: GGML_ASSERT([rsets-&amp;gt;data count] == 0) failed&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;p&gt;The full stack trace shows that the crash originates from &lt;code&gt;ggml_metal_device_free&lt;/code&gt; being called during process exit, indicating that Metal resources were not released before shutdown.&lt;/p&gt;&#xA;&lt;p&gt;Additionally, the following error may appear in the logs before the crash:&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;Unhandled promise rejection: AssertionError [ERR_ASSERTION]: Reached illegal state! IPV4 address change from defined to undefined!&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;p&gt;This network-related error is a secondary symptom of the same shutdown sequence.&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Affected Configurations:&lt;/strong&gt;&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;macOS (all versions with Apple Silicon)&lt;/li&gt;&#xA;&lt;li&gt;Local embeddings provider enabled via &lt;code&gt;node-llama-cpp&lt;/code&gt;&lt;/li&gt;&#xA;&lt;li&gt;Example configuration:&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;pre&gt;&lt;code&gt;&#34;agents&#34;: {&#xA;  &#34;defaults&#34;: {&#xA;    &#34;memorySearch&#34;: {&#xA;      &#34;provider&#34;: &#34;local&#34;,&#xA;      &#34;local&#34;: {&#xA;        &#34;modelPath&#34;: &#34;/path/to/embeddinggemma-&amp;hellip;gguf&#34;&#xA;      }&#xA;    }&#xA;  }&#xA;}&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;h2 id=&#34;root-cause-analysis&#34;&gt;Root Cause Analysis&#xA;&lt;/h2&gt;&lt;p&gt;The crash is caused by a 
&lt;strong&gt;resource leak&lt;/strong&gt; in the interaction between OpenClaw and the &lt;code&gt;node-llama-cpp&lt;/code&gt; library when using Metal GPU acceleration on Apple Silicon.&lt;/p&gt;&#xA;&lt;h3 id=&#34;technical-details&#34;&gt;Technical Details&#xA;&lt;/h3&gt;&lt;ol&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Embedding Context Lifecycle&lt;/strong&gt;: When local embeddings are used, &lt;code&gt;node-llama-cpp&lt;/code&gt; creates embedding contexts that hold Metal GPU resources (textures, buffers, and command queues managed by GGML Metal).&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Missing Disposal on Shutdown&lt;/strong&gt;: During normal process exit (SIGINT, SIGTERM, or autoupdate restart), these embedding contexts are not explicitly disposed before the Node.js event loop terminates. This leaves Metal resources in an active state.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;GGML Metal Unload Assertion&lt;/strong&gt;: When the process exits, &lt;code&gt;llama.cpp&lt;/code&gt;&amp;rsquo;s &lt;code&gt;ggml_metal_device_free()&lt;/code&gt; function runs during the &lt;code&gt;atexit()&lt;/code&gt; phase. This function asserts that all Metal resource sets (&lt;code&gt;rsets-&amp;gt;data&lt;/code&gt;) have been released. 
Since the embedding contexts were not disposed, this assertion fails: &lt;code&gt;GGML_ASSERT([rsets-&amp;gt;data count] == 0)&lt;/code&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Regression Indicator&lt;/strong&gt;: This issue is classified as a regression because shutdown worked in previous versions, pointing to a change in OpenClaw&amp;rsquo;s shutdown handling, in &lt;code&gt;node-llama-cpp&lt;/code&gt;&amp;rsquo;s behavior, or both.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;h3 id=&#34;call-chain&#34;&gt;Call Chain&#xA;&lt;/h3&gt;&lt;pre&gt;&lt;code&gt;process.exit()&#xA;→ exit() [libsystem_c.dylib]&#xA;→ __cxa_finalize_ranges()&#xA;→ ggml_metal_device_free() [libggml-metal.so]&#xA;→ GGML_ASSERT([rsets-&amp;gt;data count] == 0)  // FAILS HERE&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;h2 id=&#34;solution&#34;&gt;Solution&#xA;&lt;/h2&gt;&lt;p&gt;Implement a cleanup patch that tracks embedding contexts and explicitly disposes of them before process exit.&lt;/p&gt;&#xA;&lt;h3 id=&#34;step-1-create-the-cleanup-patch-file&#34;&gt;Step 1: Create the Cleanup Patch File&#xA;&lt;/h3&gt;&lt;p&gt;Create a new file at &lt;code&gt;src/memory/local-cleanup-patch.ts&lt;/code&gt;:&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;import { LlamaModel } from &#34;node-llama-cpp&#34;;&#xA;&#xA;// Track every embedding context so it can be disposed before exit.&#xA;const trackedContexts: any[] = [];&#xA;&#xA;// createEmbeddingContext lives on LlamaModel instances, so wrap it on the class prototype.&#xA;const originalCreate = (LlamaModel.prototype as any).createEmbeddingContext;&#xA;(LlamaModel.prototype as any).createEmbeddingContext = async function (...args: any[]) {&#xA;  const ctx = await originalCreate.apply(this, args);&#xA;  trackedContexts.push(ctx);&#xA;  return ctx;&#xA;};&#xA;&#xA;async function cleanup() {&#xA;  if (trackedContexts.length === 0) return;&#xA;  console.log(`[cleanup] Disposing ${trackedContexts.length} embedding context(s)`);&#xA;  for (const ctx of trackedContexts) {&#xA;    if (ctx?.dispose) {&#xA;      await ctx.dispose().catch((e: unknown) =&amp;gt; console.warn(&#34;[cleanup] Dispose failed:&#34;, e));&#xA;    }&#xA;  }&#xA;  trackedContexts.length = 0;&#xA;}&#xA;&#xA;// Custom SIGINT/SIGTERM handlers suppress Node&#39;s default exit, so exit&#xA;// explicitly once cleanup has finished.&#xA;process.once(&#34;SIGINT&#34;, async () =&amp;gt; { await cleanup(); process.exit(130); });&#xA;process.once(&#34;SIGTERM&#34;, async () =&amp;gt; { await cleanup(); process.exit(143); });&#xA;process.on(&#34;beforeExit&#34;, () =&amp;gt; { void cleanup(); });&#xA;&#xA;export { cleanup };&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;h3 id=&#34;step-2-modify-the-import-function&#34;&gt;Step 2: Modify the Import Function&#xA;&lt;/h3&gt;&lt;p&gt;Update &lt;code&gt;src/memory/node-llama.ts&lt;/code&gt; to apply the cleanup patch automatically whenever local embeddings are used:&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;export async function importNodeLlamaCpp() {&#xA;  // Apply the shutdown fix before the library is loaded.&#xA;  await import(&#34;./local-cleanup-patch&#34;);&#xA;  return import(&#34;node-llama-cpp&#34;);&#xA;}&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;h3 id=&#34;step-3-verify-the-fix&#34;&gt;Step 3: Verify the Fix&#xA;&lt;/h3&gt;&lt;p&gt;After implementing the changes:&lt;/p&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;Restart the OpenClaw service&lt;/li&gt;&#xA;&lt;li&gt;Trigger an autoupdate or send SIGINT (Ctrl+C)&lt;/li&gt;&#xA;&lt;li&gt;Confirm the application shuts down gracefully without GGML Metal assertion failures&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;h2 id=&#34;prevention&#34;&gt;Prevention&#xA;&lt;/h2&gt;&lt;p&gt;To prevent this issue from recurring:&lt;/p&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Always Test Shutdown Sequences&lt;/strong&gt;: When integrating libraries that manage native resources (GPU, CUDA, Metal), always test graceful shutdown scenarios, including SIGINT, SIGTERM, and forced exits.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Implement Resource Tracking&lt;/strong&gt;: For any async resource allocation (embedding contexts, model instances), implement tracking mechanisms that can be flushed during shutdown.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Register Cleanup Handlers Early&lt;/strong&gt;: Register 
shutdown cleanup handlers (&lt;code&gt;process.once(&#39;SIGINT&#39;)&lt;/code&gt;, &lt;code&gt;process.once(&#39;SIGTERM&#39;)&lt;/code&gt;, &lt;code&gt;process.on(&#39;beforeExit&#39;)&lt;/code&gt;) as early as possible in the application lifecycle.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Use Graceful Shutdown Patterns&lt;/strong&gt;: Implement a unified shutdown manager that coordinates cleanup across all subsystems, ensuring native resources are released before the event loop terminates.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Add Regression Tests&lt;/strong&gt;: Consider adding automated tests that verify graceful shutdown behavior on all supported platforms, particularly Apple Silicon with Metal GPU acceleration.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;h2 id=&#34;additional-information&#34;&gt;Additional Information&#xA;&lt;/h2&gt;&lt;h3 id=&#34;temporary-workarounds&#34;&gt;Temporary Workarounds&#xA;&lt;/h3&gt;&lt;p&gt;If the fix cannot be applied immediately, use one of the following workarounds to disable Metal GPU acceleration:&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Option 1&lt;/strong&gt;: Disable GPU layers entirely:&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;export NODE_LLAMA_CPP_GPU_LAYERS=0&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;p&gt;&lt;strong&gt;Option 2&lt;/strong&gt;: Disable Metal specifically (version-dependent):&lt;/p&gt;&#xA;&lt;pre&gt;&lt;code&gt;export NODE_LLAMA_CPP_METAL=false&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;p&gt;Note: these workarounds reduce embedding performance but prevent the crash.&lt;/p&gt;&#xA;&lt;h3 id=&#34;affected-versions&#34;&gt;Affected Versions&#xA;&lt;/h3&gt;&lt;ul&gt;&#xA;&lt;li&gt;OpenClaw &lt;strong&gt;2026.3.1&lt;/strong&gt; is confirmed affected&lt;/li&gt;&#xA;&lt;li&gt;Earlier versions may also be affected, depending on the &lt;code&gt;node-llama-cpp&lt;/code&gt; version in use&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 id=&#34;related-dependencies&#34;&gt;Related Dependencies&#xA;&lt;/h3&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;code&gt;node-llama-cpp&lt;/code&gt;: 
Embedding context management&lt;/li&gt;&#xA;&lt;li&gt;&lt;code&gt;ggml-metal&lt;/code&gt; (llama.cpp): Metal GPU resource management&lt;/li&gt;&#xA;&lt;li&gt;&lt;code&gt;@homebridge/ciao&lt;/code&gt;: mDNS/network management (secondary error source during shutdown)&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h3 id=&#34;external-references&#34;&gt;External References&#xA;&lt;/h3&gt;&lt;ul&gt;&#xA;&lt;li&gt;GGML Metal device cleanup: &lt;a class=&#34;link&#34; href=&#34;https://github.com/ggml-org/llama.cpp/pull/17869&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&#xA;    &gt;llama.cpp PR #17869&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;Setting &lt;code&gt;GGML_BACKTRACE_LLDB&lt;/code&gt; may cause the macOS Terminal.app to crash; use it with caution&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;sources&#34;&gt;Sources&#xA;&lt;/h2&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/openclaw/openclaw/issues/32452&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub Issue #32452&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;</description>
        </item></channel>
</rss>
