memory_consolidate (semantic) and memory_reflect timeout due to sequential KV operations #655

@ftr5672227

Bug Description

memory_consolidate with tier: "semantic" and memory_reflect both time out consistently. The episodic tier works fine, and basic operations (memory_save,
memory_recall, memory_smart_search) all work.

Environment

  • agentmemory version: 0.9.21
  • Platform: Windows 11, Claude Code CLI
  • LLM provider: Anthropic SDK → DeepSeek API (ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic)
  • Claude Code default MCP tool timeout: 120 seconds

Root Cause Analysis

I traced through the bundled source (dist/index.mjs) and found 4 specific issues:

1. Sequential KV writes in semantic consolidate (lines ~7498-7540)

Each extracted <fact> from the LLM response triggers an individual await kv.set(), executed one at a time in a while loop. With 20+ facts, this accumulates significant
latency:

while ((match = factRegex.exec(response)) !== null) {
    // ...
    await kv.set(KV.semantic, existing.id, existing);  // sequential!
}
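
A minimal sketch of the batched alternative to the loop above, assuming the writes for different facts are independent of each other and that kv.set(scope, id, value) returns a promise, as the excerpt suggests. parseFacts, the <fact> regex, the generated ids, and the in-memory makeKv stub are hypothetical stand-ins, not agentmemory's actual code.

```javascript
// Hypothetical stand-in for the real KV store; each set() simulates I/O latency.
function makeKv(latencyMs = 20) {
  const store = new Map();
  return {
    store,
    async set(scope, id, value) {
      await new Promise((resolve) => setTimeout(resolve, latencyMs));
      store.set(`${scope}:${id}`, value);
    },
  };
}

// Hypothetical parser: collects the text of every <fact>...</fact> element.
function parseFacts(response) {
  const factRegex = /<fact>([\s\S]*?)<\/fact>/g;
  const facts = [];
  let match;
  while ((match = factRegex.exec(response)) !== null) {
    facts.push({ id: `fact-${facts.length}`, text: match[1].trim() });
  }
  return facts;
}

// Parse first (synchronous), then start every write and await them together.
async function saveFactsBatched(kv, scope, response) {
  const facts = parseFacts(response);
  await Promise.all(facts.map((fact) => kv.set(scope, fact.id, fact)));
  return facts.length;
}
```

If several facts can resolve to the same existing record, the pending writes would need to be merged by id before the Promise.all(), since concurrent writes to one key have no guaranteed order.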

2. N+1 KV operations in reflect (lines ~12010-12102)

Reflect has nested loops: an outer loop over concept clusters (up to 10-20), each making an LLM call, and an inner loop over the parsed insights that does an await kv.get() plus an await kv.set() per insight, all sequential:

for (const conceptNames of conceptClusters) {        // 10-20 iterations
    const response = await provider.summarize(...);    // LLM call per cluster
    while ((match = insightRegex.exec(response)) !== null) {
        const existing = await kv.get(KV.insights, fp);   // sequential read
        await kv.set(KV.insights, insight.id, insight);    // sequential write
    }
}

10 clusters × 5 insights × 2 KV ops = 100 sequential awaits, on top of 10 sequential LLM calls, which can easily exceed the 120-second tool timeout.
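
To make that estimate concrete, here is a rough back-of-the-envelope calculation. The cluster and insight counts come from the figures above; the per-operation latencies are placeholder assumptions, not measurements from agentmemory.

```javascript
// Rough lower bound on wall-clock time when every step is awaited in sequence.
// All latency figures passed in are illustrative assumptions, not measurements.
function estimateSequentialReflect({
  clusters,
  insightsPerCluster,
  kvOpsPerInsight,
  kvLatencyMs,
  llmLatencyMs,
}) {
  const kvAwaits = clusters * insightsPerCluster * kvOpsPerInsight;
  const totalMs = clusters * llmLatencyMs + kvAwaits * kvLatencyMs;
  return { kvAwaits, llmCalls: clusters, totalMs };
}

const estimate = estimateSequentialReflect({
  clusters: 10,
  insightsPerCluster: 5,
  kvOpsPerInsight: 2,
  kvLatencyMs: 50,      // assumed KV round trip
  llmLatencyMs: 12_000, // assumed time per summarize() call
});
console.log(estimate); // { kvAwaits: 100, llmCalls: 10, totalMs: 125000 }
```

With these assumed numbers the total lands at 125 seconds, just past the 120-second limit. The split between LLM time and KV time depends entirely on the placeholder latencies, so real measurements would be needed to tell which one dominates.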

3. No explicit timeout on Anthropic SDK initialization (lines ~285-288)

this.client = new Anthropic({
    apiKey,
    ...baseURL ? { baseURL } : {}
    // no timeout specified, so the SDK default applies
});

When using alternative base URLs (e.g., DeepSeek), complex summarization prompts can exceed the default timeout, causing a hard failure with no retry.
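
One possible shape for an explicit timeout, as a sketch only. timeout (in milliseconds) and maxRetries are constructor options of the official Anthropic TypeScript SDK; the helper name buildClientOptions, the environment variable AGENTMEMORY_LLM_TIMEOUT_MS, and the 120_000 fallback are hypothetical choices made for illustration.

```javascript
// Build constructor options for the Anthropic client with an explicit timeout.
// The env var name and the fallback value are hypothetical, not agentmemory settings.
function buildClientOptions({ apiKey, baseURL, env = {} }) {
  const parsed = Number.parseInt(env.AGENTMEMORY_LLM_TIMEOUT_MS ?? "", 10);
  const timeout = Number.isFinite(parsed) && parsed > 0 ? parsed : 120_000;
  return {
    apiKey,
    ...(baseURL ? { baseURL } : {}),
    timeout,       // per-request timeout in milliseconds
    maxRetries: 2, // explicit retry budget for failed or timed-out requests
  };
}

// Usage inside the provider (not executed here):
//   this.client = new Anthropic(buildClientOptions({ apiKey, baseURL, env: process.env }));
```

Whatever value is chosen, it probably needs to be weighed against the 120-second MCP tool timeout, since a single request that is allowed to run that long leaves no room for the rest of the operation.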

4. Inconsistent use of Promise.all()

The decay loop at line ~7620 already uses Promise.all() for batch KV writes, showing the pattern is known. But consolidate and reflect don't use it.

Suggested Fix

1. Batch KV writes: collect writes into an array, then await Promise.all(writes) (same pattern already used in the decay loop)
2. Add a configurable timeout to the Anthropic client initialization, e.g., timeout: 120_000 or read from env
3. Parallelize cluster processing in reflect: process independent clusters with Promise.allSettled() instead of a sequential for-loop
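
A rough sketch of how the third suggestion could look, assuming clusters really are independent. The provider.summarize() and kv.get()/kv.set() shapes follow the excerpts above; parseInsights, the <insight> regex, the placeholder fingerprint, and the "skip if it already exists" rule are hypothetical, since the issue does not show what the real code does with an existing insight.

```javascript
// Hypothetical parser for <insight>...</insight> elements in an LLM response.
function parseInsights(response) {
  const insightRegex = /<insight>([\s\S]*?)<\/insight>/g;
  const insights = [];
  let match;
  while ((match = insightRegex.exec(response)) !== null) {
    insights.push(match[1].trim());
  }
  return insights;
}

// One cluster: a single LLM call, then its KV reads and writes run concurrently.
async function reflectCluster(provider, kv, scope, conceptNames) {
  const response = await provider.summarize(conceptNames);
  const written = await Promise.all(
    parseInsights(response).map(async (text) => {
      const id = `${conceptNames.join("+")}:${text}`; // placeholder fingerprint
      const existing = await kv.get(scope, id);
      if (existing) return false; // assumed rule: keep the stored insight
      await kv.set(scope, id, { id, text });
      return true;
    })
  );
  return written.filter(Boolean).length;
}

// Run independent clusters concurrently; a failed cluster does not abort the rest.
async function reflectAll(provider, kv, scope, conceptClusters) {
  const results = await Promise.allSettled(
    conceptClusters.map((names) => reflectCluster(provider, kv, scope, names))
  );
  let written = 0;
  let failed = 0;
  for (const result of results) {
    if (result.status === "fulfilled") written += result.value;
    else failed += 1;
  }
  return { written, failed };
}
```

Firing every cluster's LLM call at once may run into provider rate limits, so a real implementation might want to cap how many clusters run concurrently.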

Reproduction

# In Claude Code with agentmemory plugin installed
# 1. Accumulate some sessions worth of memory_save calls
# 2. Run:
memory_consolidate({ tier: "episodic" })   # works
memory_consolidate({ tier: "semantic" })   # times out
memory_reflect({ maxClusters: 10 })        # times out
