[BUG] basic-memory-doctor-event-loop-bug-report #1027

Description

@jason41391

Bug Description

bm doctor (the basic-memory doctor CLI command) fails on a nested asyncio
event loop during its database migration check. The failure signature
differs by Python version, but neither version actually works:

  • On Python 3.12, it fails immediately (~0.7s) with
    Doctor failed: this event loop is already running, followed by an
    unretrieved RuntimeError: cannot reuse already awaited coroutine.
  • On Python 3.14, it doesn't fail fast — it hangs for almost exactly
    60 seconds, then fails with a completely empty error message
    (Doctor failed: with nothing after the colon).

Both point to the same root cause: the doctor command's migration check
tries to start a second, nested asyncio event loop while already running
inside one (the CLI's own asyncio.run() entry point), and the
fallback/recovery logic for that situation is broken. The project's
changelog (v0.17.5: "Skip nest_asyncio on Python 3.14+ where it causes
event loop issues") avoids the fast-failing code path on 3.14, but only
trades it for a silent hang-then-timeout with no useful message.
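
The two 3.12 error messages can be reproduced with plain asyncio, independent of basic-memory. The following is a minimal sketch, not the project's code: migrate() is a hypothetical stand-in for an async migration routine, and the way the loop is re-entered here is an assumption about the general failure pattern rather than a copy of what env.py does.

```python
import asyncio


async def migrate() -> str:
    """Hypothetical stand-in for an async migration routine."""
    await asyncio.sleep(0)
    return "migrated"


def nested_run_error() -> str:
    """Try to drive a coroutine on a loop that is already running."""

    async def outer() -> str:
        loop = asyncio.get_running_loop()
        coro = migrate()
        try:
            # Re-entering the running loop is rejected before coro starts.
            loop.run_until_complete(coro)
        except RuntimeError as exc:
            return str(exc)
        finally:
            coro.close()  # coro never ran; close it to avoid a warning
        return "no error"

    return asyncio.run(outer())


def reuse_error() -> str:
    """Run the same coroutine object twice."""
    coro = migrate()
    asyncio.run(coro)  # the first run consumes the coroutine object
    try:
        asyncio.run(coro)  # the second run cannot restart it
    except RuntimeError as exc:
        return str(exc)
    return "no error"


print(nested_run_error())
print(reuse_error())
```

If a fallback path hands an already-run coroutine object to a new task, the second message is what that task ends up holding, which would match the Task-2 output in the 3.12 log.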

Steps To Reproduce

  1. Install version 0.22.1 via uv tool install basic-memory
  2. Run bm doctor
  3. See error (empty message on 3.14, RuntimeError on 3.12)
  4. For more detail, run PYTHONASYNCIODEBUG=1 bm doctor 2>&1 | tee bm_doctor_debug.log instead — this surfaces the underlying asyncio task/exception info shown below.

Already tried, same result:

  • bm doctor --local (forces local API routing, ignoring cloud mode) — identical empty-message failure on 3.14.
  • Killing all background basic-memory mcp server processes and restarting Claude Desktop cleanly first, to rule out a multi-process conflict — failure reproduced identically. (Separate OS processes don't share a single asyncio event loop, so this was never a plausible cause, and the clean test confirms it.)
  • bm doctor --help exposes only --local, --cloud, --help — no --verbose/--debug flag is available to get a fuller traceback from the CLI itself.

Expected Behavior

bm doctor should run its local consistency checks and report a clear
pass/fail result. It should not crash with an internal RuntimeError about
coroutine reuse, and it should not hang for ~60 seconds before failing with
an empty, uninformative error message.

Actual Behavior

Python 3.12.13 — fails fast (~0.7s)

Running Basic Memory doctor checks...
Executing <Task pending name='Task-1' coro=<run_with_cleanup.<locals>._with_cleanup() running at /Users/jason/.local/share/uv/tools/basic-memory/lib/python3.12/site-packages/basic_memory/cli/commands/command_utils.py:40> wait_for=<Future pending cb=[shield.<locals>._outer_done_callback() at /opt/homebrew/Cellar/python@3.12/3.12.13_2/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/tasks.py:922, Task.task_wakeup()] cb=[run_until_complete.<locals>.done_cb()] created at /opt/homebrew/Cellar/python@3.12/3.12.13_2/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/runners.py:100> took 0.731 seconds
Doctor failed: this event loop is already running.
Task exception was never retrieved
future: <Task finished name='Task-2' coro=<run_async_migrations() done, defined at /Users/jason/.local/share/uv/tools/basic-memory/lib/python3.12/site-packages/basic_memory/alembic/env.py:123> exception=RuntimeError('cannot reuse already awaited coroutine')>
RuntimeError: cannot reuse already awaited coroutine

Python 3.14.5 — hangs ~60s, fails with empty message

Running Basic Memory doctor checks...
Executing <Task pending name='Task-1' coro=<run_with_cleanup.<locals>._with_cleanup() running at /Users/jason/.local/share/uv/tools/basic-memory/lib/python3.14/site-packages/basic_memory/cli/commands/command_utils.py:40> wait_for=<Future pending cb=[shield.<locals>._outer_done_callback() at /opt/homebrew/Cellar/python@3.14/3.14.5/Frameworks/Python.framework/Versions/3.14/lib/python3.14/asyncio/tasks.py:991, Task.task_wakeup()] created at /opt/homebrew/Cellar/python@3.14/3.14.5/Frameworks/Python.framework/Versions/3.14/lib/python3.14/asyncio/tasks.py:968> cb=[run_until_complete.<locals>.done_cb()] created at /opt/homebrew/Cellar/python@3.14/3.14.5/Frameworks/Python.framework/Versions/3.14/lib/python3.14/asyncio/runners.py:109> took 60.776 seconds
Doctor failed:

(Plain bm doctor and bm doctor --local without the debug env var just print Doctor failed: with nothing after the colon — no further output.)

Environment

  • OS: macOS, Apple Silicon (aarch64)
  • Python version: tested on both 3.12.13 and 3.14.5 (both standard Homebrew GIL builds: /opt/homebrew/Cellar/python@3.12/3.12.13_2/... and /opt/homebrew/Cellar/python@3.14/3.14.5/...). Confirmed via each install's uv-receipt.toml.
  • Basic Memory version: 0.22.1
  • Installation method: uv (uv tool install basic-memory, uv version 0.6.12, e4e03833f 2025-04-02)
  • Claude Desktop version: n/a — confirmed independent of Claude Desktop's MCP server process lifecycle (see "Already tried" above)

Additional Context

  • This machine also has a cpython-3.14.2+freethreaded interpreter managed by pyenv, shadowing python3/python on PATH. This was deliberately not used for either test — both basic-memory tool venvs were explicitly built against the standard (non-free-threaded) Homebrew interpreters, confirmed via uv-receipt.toml, to avoid conflating this bug with free-threaded-ABI wheel-availability issues.
  • Relevant source locations (paths from the installed package, line numbers from the traceback above):
    • basic_memory/cli/commands/command_utils.py:40, run_with_cleanup.<locals>._with_cleanup(). This is the CLI's outer task (Task-1) for every command, created via the top-level asyncio.run() and wrapped with asyncio.shield(...). The hang on 3.14 lasts 60.776 seconds, close enough to a round 60 to strongly suggest a hardcoded asyncio.wait_for(asyncio.shield(coro), timeout=60) (or similar) here.
    • basic_memory/alembic/env.py:123, run_async_migrations(). On 3.12, this becomes a second task (Task-2) whose underlying coroutine object appears to have already been consumed by a prior failed attempt, hence "cannot reuse already awaited coroutine."
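
If the timeout hypothesis is right, it would also explain the empty message: the TimeoutError raised by asyncio.wait_for() carries no message text, so formatting it with str() yields nothing after the colon. A small sketch under that assumption follows. stuck() and doctor() are hypothetical names, and the timeout is shortened from the suspected 60 seconds so the example finishes quickly.

```python
import asyncio
from typing import Tuple


async def stuck() -> None:
    """Hypothetical stand-in for a migration step that never finishes."""
    await asyncio.sleep(3600)


async def doctor(timeout: float) -> Tuple[str, str]:
    inner = asyncio.ensure_future(stuck())
    try:
        # Assumed pattern: a shielded task awaited with a fixed timeout.
        await asyncio.wait_for(asyncio.shield(inner), timeout=timeout)
    except asyncio.TimeoutError as exc:
        # str(exc) is empty for this exception; repr(exc) names the type.
        return f"Doctor failed: {exc}", f"Doctor failed: {exc!r}"
    finally:
        inner.cancel()  # shield kept the inner task alive; stop it here
    return "ok", "ok"


plain, with_repr = asyncio.run(doctor(timeout=0.05))
print(repr(plain))      # message formatted with str()
print(repr(with_repr))  # message formatted with repr()
```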

Possible Solution

  • In basic_memory/alembic/env.py, detect whether an event loop is already running (asyncio.get_running_loop() inside try/except RuntimeError) before deciding how to drive run_async_migrations(). If one is already running, await the coroutine directly (or schedule it against the existing loop) instead of calling asyncio.run() on it from within run_migrations_online().
  • Never reuse the same coroutine object across a failed attempt and a retry/fallback path — call the coroutine function again to get a fresh coroutine object if a retry is genuinely needed. This is what causes the 3.12 RuntimeError: cannot reuse already awaited coroutine.
  • Whatever currently formats the error in command_utils.py (run_with_cleanup) should include at least the exception type when the message body is empty, e.g. f"Doctor failed: {e!r}" instead of f"Doctor failed: {e}", so users aren't left looking at a bare colon.
  • A --verbose/--debug flag on bm doctor (and ideally other bm commands) that prints the full Python traceback would make this whole class of bug much faster for users to diagnose without needing PYTHONASYNCIODEBUG=1.
  • Happy to test a candidate fix locally — this is reproducible on demand on both Python versions.
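
To make the first three suggestions concrete, here is a rough sketch of one possible shape for them. It is not a patch against basic-memory: run_sync() and format_failure() are hypothetical helpers, and running the coroutine on a fresh loop in a worker thread is only one way to handle the "loop already running" case from synchronous code such as run_migrations_online(). It blocks the calling loop until the worker finishes, and it assumes the coroutine does not depend on resources bound to the outer loop.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Coroutine


def run_sync(coro_factory: Callable[[], Coroutine[Any, Any, Any]]) -> Any:
    """Run an async function to completion from synchronous code.

    Takes a factory rather than a coroutine object, so every attempt
    gets a fresh coroutine and nothing is ever awaited twice.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: asyncio.run() is safe.
        return asyncio.run(coro_factory())
    # A loop is already running here: use a new loop in a worker thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(coro_factory())).result()


def format_failure(exc: BaseException) -> str:
    """Always show the exception type, even when its message is empty."""
    message = str(exc)
    if message:
        return f"Doctor failed: {type(exc).__name__}: {message}"
    return f"Doctor failed: {exc!r}"
```

With helpers like these, a caller would pass the coroutine function itself (for example run_sync(run_async_migrations), if that function takes no arguments) rather than an already-created coroutine object, and a bare timeout would print as Doctor failed: TimeoutError() instead of an empty message.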
