[upstream PR 782] fix(agent-sdk): scope recursion guard to AsyncLocalStorage, memoize SDK import #468

@wbugitlab1

Description

Source: rohitg00#782
Title: fix(agent-sdk): scope recursion guard to AsyncLocalStorage, memoize SDK import
Author: rohitg00
State: closed
Draft: no
Merged: yes
Head: rohitg00/agentmemory:fix/agent-sdk-recursion-guard-als @ 75264f2
Base: main @ de95403
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-06-02T11:57:35Z
Updated: 2026-06-02T16:20:53Z
Closed: 2026-06-02T16:20:49Z
Merged at: 2026-06-02T16:20:49Z

Original PR body:

Problem

Reported in #781. mem::summarize fails with too_many_chunks_skipped: 4/4 chunks failed to parse after retry whenever the agent-sdk provider is active (AGENTMEMORY_ALLOW_AGENT_SDK=true) and a session is large enough to split into ≥2 chunks.

Two earlier PRs interact badly:

  • #181 (Stop-hook → /summarize → agent-sdk infinite recursion fix) added the guard:
    if (process.env.AGENTMEMORY_SDK_CHILD === '1') return ''
    process.env.AGENTMEMORY_SDK_CHILD = '1'
    // ...await SDK...
    // restore in finally
  • #472 (chunk large sessions to fit LLM context) added chunk concurrency:
    for (let i = 0; i < chunks.length; i += concurrency) {
      await Promise.all(batch.map(/* provider.summarize */));
    }

The first chunk in a batch sets the env flag synchronously, before its first await. Sibling chunks in the same batch then enter query(), see the flag, and return "". An empty string has no <title>, so parseSummaryXml returns null and the chunk is counted as skipped; once more than 50% of chunks are skipped, summarize throws.

The guard (cross-process) and the chunk concurrency (in-process) were never reconciled.
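The interaction can be reproduced in isolation. The following is a minimal sketch, not the project's code: guardedQuery and demoRace are hypothetical names, the timer stands in for the real SDK call, and only the AGENTMEMORY_SDK_CHILD variable name comes from the PR.

```typescript
// Hypothetical stand-in for the pre-fix guard; only the env var name is from the PR.
const FLAG = "AGENTMEMORY_SDK_CHILD";

async function guardedQuery(prompt: string): Promise<string> {
  // Process-global check: every concurrent caller reads the same value.
  if (process.env[FLAG] === "1") return "";

  const previous = process.env[FLAG];
  process.env[FLAG] = "1"; // set synchronously, before the first await
  try {
    // Stand-in for the real SDK call.
    await new Promise((resolve) => setTimeout(resolve, 10));
    return `<title>${prompt}</title>`;
  } finally {
    // Restore the prior value on exit.
    if (previous === undefined) delete process.env[FLAG];
    else process.env[FLAG] = previous;
  }
}

// All four calls start in the same tick, as in one Promise.all batch.
async function demoRace(): Promise<string[]> {
  return Promise.all(["a", "b", "c", "d"].map((p) => guardedQuery(p)));
}
```

Assuming the variable is unset when demoRace starts, only the first call reaches the simulated SDK; the other three see the flag and resolve to an empty string, which mirrors the skipped chunks described above.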

Fix — split the guard by scope

Each concern uses the right primitive:

| Concern | Primitive | Why |
| --- | --- | --- |
| In-process recursion (concurrent siblings) | AsyncLocalStorage | Scoped to the async call tree of the SDK query. Concurrent siblings have separate ALS frames, no longer see each other's marker. |
| Cross-process recursion (hook scripts) | process.env.AGENTMEMORY_SDK_CHILD = '1' around the SDK call | Hook scripts run as separate processes spawned by the Claude SDK; they inherit process.env at spawn time. ALS does not cross process boundaries. |

The check now reads sdkChildContext.getStore() instead of process.env. The env var is still set and restored around the SDK call, so spawned hook subprocesses still see the marker and short-circuit their REST callbacks (the original #149/#181 fix still holds).

The race on the env between concurrent in-process siblings is benign because every sibling wants "1" during its own SDK call anyway.
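The split guard might look roughly like the sketch below. Only sdkChildContext and AGENTMEMORY_SDK_CHILD are names from the PR text; scopedQuery, enterEnv, exitEnv, and the run callback are hypothetical. The sketch also counts active calls so that the env var is restored only when the last overlapping call exits. That counter is an assumption of this example, and the source does not say whether the actual patch does the same.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Name from the PR text; the boolean store type is an assumption.
const sdkChildContext = new AsyncLocalStorage<boolean>();
const FLAG = "AGENTMEMORY_SDK_CHILD";

// Assumed bookkeeping: save the env value on the first entry, restore on the last exit.
let activeCalls = 0;
let savedValue: string | undefined;

function enterEnv(): void {
  if (activeCalls++ === 0) {
    savedValue = process.env[FLAG];
    process.env[FLAG] = "1"; // inherited by hook subprocesses spawned during the call
  }
}

function exitEnv(): void {
  if (--activeCalls === 0) {
    if (savedValue === undefined) delete process.env[FLAG];
    else process.env[FLAG] = savedValue;
  }
}

// `run` is a hypothetical stand-in for the SDK call.
async function scopedQuery(run: () => Promise<string>): Promise<string> {
  // In-process check: truthy only inside the async tree of an active SDK call.
  if (sdkChildContext.getStore()) return "";
  enterEnv();
  try {
    return await sdkChildContext.run(true, run);
  } finally {
    exitEnv();
  }
}

async function demoScoped(): Promise<{ siblings: string[]; nested: string }> {
  const work = async (label: string): Promise<string> => {
    await new Promise((resolve) => setTimeout(resolve, 10));
    return `<title>${label}</title>`;
  };
  // Concurrent siblings: each call gets its own ALS frame.
  const siblings = await Promise.all(
    ["a", "b", "c", "d"].map((label) => scopedQuery(() => work(label))),
  );
  // Genuine re-entry: the inner call runs inside the outer call's async tree.
  const nested = await scopedQuery(() => scopedQuery(() => work("inner")));
  return { siblings, nested };
}
```

In demoScoped, the four siblings all return their own result because none of them runs inside another's ALS frame, while the nested call sees the marker and degrades to an empty string.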

Bonus: memoize the dynamic import

Refactored await import('@anthropic-ai/claude-agent-sdk') into a per-instance memoized promise. Concurrent callers share one module resolution. Two benefits:

  • Production: one resolution per provider lifetime instead of N
  • Test: vi.mock factories apply uniformly across concurrent imports (without this, the test mock raced with real module references under Promise.all, see commit body)
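A per-instance memoized promise can be sketched as follows. MemoizedLoader and demoMemo are hypothetical names, and the loader is injected so the example runs without the real SDK installed; it illustrates the pattern, not the provider's actual code.

```typescript
// Hypothetical helper: caches the promise returned by the first load.
class MemoizedLoader<M> {
  private pending: Promise<M> | undefined;
  private readonly load: () => Promise<M>;

  constructor(load: () => Promise<M>) {
    this.load = load;
  }

  get(): Promise<M> {
    // Cache the promise itself, not the resolved module, so callers that
    // arrive while the load is still in flight share the same resolution.
    this.pending ??= this.load();
    return this.pending;
  }
}

async function demoMemo(): Promise<number> {
  let loads = 0;
  // Fake loader standing in for the dynamic import.
  const loader = new MemoizedLoader(async () => {
    loads += 1;
    await new Promise((resolve) => setTimeout(resolve, 5));
    return { query: (prompt: string) => prompt.toUpperCase() };
  });
  // Three concurrent callers trigger a single load.
  await Promise.all([loader.get(), loader.get(), loader.get()]);
  return loads;
}
```

In a provider, the injected loader would presumably be () => import('@anthropic-ai/claude-agent-sdk'). One trade-off of this pattern is that a rejected load stays cached as well, so a real implementation may want to clear the cached promise on failure.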

Tests

5 cases in test/agent-sdk-provider.test.ts:

  • 4 concurrent summarize calls each return the real SDK result (no empty siblings) — direct #781 regression
  • Mixed concurrent summarize + compress on the same provider
  • AGENTMEMORY_SDK_CHILD is set to "1" during the SDK call (verified by reading the env from inside the mocked query) and restored to its prior value on exit
  • Genuine re-entry inside the same async tree still degrades to "" so the #149 / #181 recursion guard stays armed

Full suite: 122 files / 1331 tests pass.

Recommended user action

Users on 0.9.24 can remove the SUMMARIZE_CHUNK_CONCURRENCY=1 workaround after this lands. No env or config changes required.

Closes #781.

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of concurrent memory operations to prevent re-entrancy and state leakage during overlapping calls.
  • Performance Improvements

    • Reduced repeated SDK initialization by caching the SDK load for faster subsequent calls.
  • Tests

    • Added coverage validating concurrency behavior and guard correctness.

Local branch:
Fork PR:
Fork decision:
Verification:
Notes:

Metadata

    Labels

    decision-candidate: Fork decision has not been made
    upstream-merged: Upstream pull request is merged upstream
    upstream-pr: Tracks an upstream pull request
