Summary
This issue operationalizes the architecture proposed in #688. Where #688 establishes
what the typed-operation and engine substrate should be, this issue commits to how
it gets built: a phased, independently-mergeable sequence, a concrete reuse plan that
graduates the strongest pieces of the existing prototype branches, and a set of resolved
decisions.
It also records one deliberate revision to #688 (see "Revision: mode lives in the
type"), the analysis requested alongside the design — shortcomings, sunset candidates,
concepts to revisit, a typing-for-transparency plan, and a documentation plan — and the
trunk debt the study surfaced.
All of this lands under libtmux.experimental.{ops,engines} first and touches no
existing public API. Promotion to libtmux.{ops,engines} happens only after the
cross-engine contract suite and a downstream green run (tmuxp + libtmux-mcp) pass.
Relationship to #688
Inherited from #688 unchanged:
- The split into inert typed operations (libtmux.ops) and execution engines
(libtmux.engines), with the object API as a compatibility facade.
- Operations are immutable, serializable, version-aware, and carry their result type.
- Engine interfaces are typing.Protocols, not base classes.
- The core stays stdlib-dataclass-only; Pydantic/MCP schemas live at the edges.
- The engine families: classic subprocess, control mode, asyncio, async control, lazy,
async-lazy, concrete.
Revision: mode lives in the type, not a runtime-bound attribute
#688 states "Control mode should be an engine choice rather than a separate object
hierarchy … a sync Server can be bound to ControlModeEngine." This issue revises
that: the execution mode is encoded in the facade type, e.g. AsyncSession,
LazyControlWindow, AsyncLazyControlWindow, rather than carried as a runtime engine
attribute on a single Server class.
The reason is return-type honesty. A single Session.new_window() cannot be
statically typed to return a live Window and a deferred LazyWindow plan node and
an Awaitable[AsyncWindow] depending on a runtime flag — the type system cannot express
that. Encoding the mode in the class gives each method exactly one statically-known
return type, and async's def → async def coloring stops being a codegen problem and
becomes "a different facade tree over the same spine." Every facade family — sync/async ×
eager/lazy × classic/control — is a thin layer over one shared spine of pure typed
operations. Nothing is generated at runtime; everything is statically typed so checkers
and IDEs see real return types.
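As a rough illustration of the return-type argument, the sketch below gives each facade class exactly one statically known return type for new_window(). The class bodies, the placeholder ids, and the LazySession name are assumptions made for this example; only the idea of one facade family per mode comes from the text above.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Live window handle returned by the eager sync facade."""

    window_id: str


@dataclass(frozen=True)
class LazyWindow:
    """Deferred plan node: nothing has been sent to tmux yet."""

    slot: str


@dataclass(frozen=True)
class AsyncWindow:
    """Live window handle returned by the async facade."""

    window_id: str


class Session:
    def new_window(self) -> Window:
        # Placeholder id; a real facade would ask its engine.
        return Window("@1")


class LazySession:  # hypothetical name, used only for this sketch
    def new_window(self) -> LazyWindow:
        # Records intent only; the slot is resolved when the plan runs.
        return LazyWindow(slot="w0")


class AsyncSession:
    async def new_window(self) -> AsyncWindow:
        # Placeholder id; a real facade would await its engine.
        return AsyncWindow("@1")
```

A type checker sees Window, LazyWindow, and a coroutine yielding AsyncWindow respectively, with no union and no runtime flag to inspect.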
Architecture
Three layers, bottom-up:
- The spine: libtmux.experimental.ops (pure, no I/O). Inert typed operation
values, a typed result hierarchy, a registry, and serialization. This is the single
source of truth that static checkers consume and that fixture-based, parametrizable,
sync- and async-agnostic tests run against without a tmux server.
- Engines: libtmux.experimental.engines. Each engine owns a transport, an
execution policy, and its own error policy. Engines execute rendered operations and
return typed results. A TmuxEngine Protocol keeps them interchangeable.
- Facade families. Per-mode object families (Server/Session/Window/Pane/Client)
built on the spine. The classic family reproduces today's behavior exactly;
newer families add async, lazy, and control-mode variants.
The spine: operations as the source of truth
An operation is a frozen stdlib dataclass carrying everything an engine needs to render,
validate, and type a command — but it never dispatches:
- kind: a Literal[...] discriminator that is simultaneously the registry key and the
result-type link (a discriminated union static checkers can narrow).
- scope: one of server, session, window, pane, client.
- target: a closed sum: SessionId('$N'), WindowId('@N'), PaneId('%N'),
Special(token) over tmux's enumerated target tokens, and SlotRef(slot, suffix) for
deferred refs to values that do not exist yet.
- args, effects, safety: Literal['readonly', 'mutating', 'destructive'].
- min_version and a flag_version_map consulted at render time, replacing the ~49
inline has_gte_version(...) literals scattered across the object methods today.
Operation[ResultT] is generic over its result type, so engine.run(op) is statically
known to return that operation's result.
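One way the operation value could look, as a sketch under stated assumptions: the exact field set, the result_type attribute, the new_pane_id payload field, and the split_window() builder are hypothetical, and the target union is trimmed to three members for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, Literal, TypeVar, Union


@dataclass(frozen=True)
class PaneId:
    value: str  # e.g. "%3"


@dataclass(frozen=True)
class WindowId:
    value: str  # e.g. "@1"


@dataclass(frozen=True)
class SlotRef:
    """Deferred reference to a value that does not exist yet."""

    slot: str
    suffix: str = ""


# Closed sum of legal targets (trimmed for the sketch).
Target = Union[PaneId, WindowId, SlotRef]


@dataclass(frozen=True)
class SplitWindowResult:
    new_pane_id: str  # hypothetical payload field


ResultT = TypeVar("ResultT")


@dataclass(frozen=True)
class Operation(Generic[ResultT]):
    """Inert description of one tmux command; it never dispatches."""

    kind: str
    scope: Literal["server", "session", "window", "pane", "client"]
    target: Target
    result_type: type[ResultT]
    args: tuple[str, ...] = ()
    safety: Literal["readonly", "mutating", "destructive"] = "readonly"
    min_version: str | None = None


def split_window(
    target: PaneId | SlotRef, *, horizontal: bool = False
) -> Operation[SplitWindowResult]:
    """Build the value only; an engine decides when and how to run it."""
    return Operation(
        kind="split-window",
        scope="pane",
        target=target,
        result_type=SplitWindowResult,
        args=("-h",) if horizontal else ("-v",),
        safety="mutating",
    )
```

Since the value is frozen and holds only plain data, two operations built from the same inputs compare equal and can be hashed, logged, or batched before any engine is involved.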
Results and the failure model (per-engine)
Result types share one shape (operation, argv, status, returncode, stdout,
stderr, payload), with specialized payloads (SplitWindowResult,
CapturePaneResult, etc.). Error policy is owned by the engine, not global:
- The classic engine reproduces today's libtmux.{Server,Session,Window,Pane}
behavior exactly: it returns live ORM objects and raises in-facade as it does now.
This is the compatibility contract downstreams depend on.
- Newer engines return typed result objects (Go/Rust-style: the error is data on the
result) with an opt-in result.raise_for_status() that raises a typed
TmuxCommandError.
This mirrors CPython's subprocess.CompletedProcess — a plain value plus a short
check_returncode() — and matches libtmux's existing reality: tmux_cmd already never
raises on a tmux-side error; raising is layered on by ~92 explicit raise_if_stderr()
call sites today. Engine-broken conditions (missing binary, lost socket, protocol
desync) always raise, on every engine — those are distinct from a tmux command failing,
which is data on the returned result.
One tmux-specific wrinkle to settle before the result type is frozen: tmux frequently
signals failure via stderr text while exiting 0, and the has-session path folds
stderr into stdout. raise_for_status() for tmux therefore considers non-empty stderr,
not returncode alone — a deliberate divergence from CPython's returncode-only test, to be
documented on the method.
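The stderr-aware check described above might be sketched as follows. This is a simplification under assumptions: status is derived from returncode and stderr rather than stored, only two of the four status values appear, and raise_for_status() returning None is a guess modelled on check_returncode().

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal


class TmuxCommandError(Exception):
    """Raised only when the caller opts in via raise_for_status()."""

    def __init__(self, result: Result) -> None:
        detail = "\n".join(result.stderr) or f"exit status {result.returncode}"
        super().__init__(" ".join(result.argv) + ": " + detail)
        self.result = result


@dataclass(frozen=True)
class Result:
    argv: tuple[str, ...]
    returncode: int
    stdout: tuple[str, ...] = ()
    stderr: tuple[str, ...] = ()

    @property
    def status(self) -> Literal["complete", "failed"]:
        # tmux can exit 0 and still report failure on stderr, so both count.
        failed = self.returncode != 0 or bool(self.stderr)
        return "failed" if failed else "complete"

    def raise_for_status(self) -> None:
        if self.status == "failed":
            raise TmuxCommandError(self)
```

A caller that prefers exceptions writes result.raise_for_status() after each call; a caller that prefers data inspects result.status and never sees a raise for a tmux-side failure.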
Engine seam: no injection
There is no runtime engine injected into the existing Server. The existing
libtmux.{Server,Session,Window,Pane} are untouched. New work is parallel,
engine-typed facade families that you import directly; each binds the shared spine to its
engine. This cleanly sidesteps the dual-dispatch hazard (today neo.fetch_objs and
get_version bypass Server.cmd) because the new families route reads, writes, and
version queries through their engine from the start.
Prior art to graduate
The study found the two halves of #688 already built — on different branches — and the
typed-operation substrate (the actual heart of #688) absent everywhere. Plan:
| Asset | Source branch | Disposition |
| --- | --- | --- |
| TmuxEngine Protocol, CommandRequest/CommandResult, EngineSpec + name-keyed registry | libtmux-protocol-engines (src/libtmux/engines/{base,registry,subprocess}.py) | Graduate as the canonical engine lineage |
| Bytes-based ControlParser (typed notifications, octal unescape), persistent control engine with weakref.finalize lifecycle, Subscription | libtmux-protocol-engines (src/libtmux/engines/control_mode/*) | Graduate near-verbatim in Phase 4 |
| Native imsg client (v8) | libtmux-protocol-engines (src/libtmux/engines/imsg/*) | Easter-egg engine (Phase 6): opt-in, v8-only, no-attach, separately test-gated |
| Sans-I/O drive() resolution generator (sync and async drivers diverge only at runner.cmd vs await runner.cmd), forward-ref/SlotRef capture, fail-closed COMMAND_SPECS scope/chainability registry | chainable-commands-experiment-00 (src/libtmux/_experimental/chain/{_resolve,ir,plan,chain}.py) | Graduate drive() + the registry concept; build the typed-op values net-new |
| Persistent tmux -C runner with %begin/%end/%error parsing | chainable-commands-experiment-00 (src/libtmux/_experimental/chain/control.py) | Superseded by the protocol-engines control parser |
| Async subprocess via create_subprocess_exec | asyncio branch | Basis for the real asyncio engine (not to_thread) |
What the spine adds net-new: the typed Operation value (kind/scope/effects/result-type/
version metadata), the typed Result hierarchy with raise_for_status(), the operation
registry keyed by kind, and stdlib serialization. None of these exist on any branch.
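The sans-I/O drive() idea listed in the table can be illustrated with a small generator that yields argv tuples and receives stdout back, so the sync and async drivers differ only in how they call the runner. Everything here is a hypothetical reconstruction for illustration: the plan shape, the single-field SlotRef, the fake runners, and the run_sync/run_async names are not taken from the prototype branch.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Generator


@dataclass(frozen=True)
class SlotRef:
    """Forward reference to the captured output of an earlier step."""

    slot: str


def drive(plan) -> Generator[tuple[str, ...], str, dict[str, str]]:
    """Resolve a plan without doing I/O: yield argv, receive stdout via send()."""
    slots: dict[str, str] = {}
    for capture_as, argv in plan:
        resolved = tuple(
            slots[a.slot] if isinstance(a, SlotRef) else str(a) for a in argv
        )
        stdout = yield resolved
        if capture_as is not None:
            slots[capture_as] = stdout.strip()
    return slots


def run_sync(plan, runner) -> dict[str, str]:
    gen = drive(plan)
    try:
        argv = next(gen)
        while True:
            argv = gen.send(runner.cmd(argv))  # the only sync-specific line
    except StopIteration as stop:
        return stop.value


async def run_async(plan, runner) -> dict[str, str]:
    gen = drive(plan)
    try:
        argv = next(gen)
        while True:
            argv = gen.send(await runner.cmd(argv))  # the only async-specific line
    except StopIteration as stop:
        return stop.value


class FakeRunner:
    """Stand-in transport that pretends new-window created window @7."""

    def __init__(self) -> None:
        self.seen: list[tuple[str, ...]] = []

    def cmd(self, argv: tuple[str, ...]) -> str:
        self.seen.append(argv)
        return "@7\n" if argv[0] == "new-window" else ""


class AsyncFakeRunner(FakeRunner):
    async def cmd(self, argv: tuple[str, ...]) -> str:
        return FakeRunner.cmd(self, argv)
```

With this split, the resolution logic inside drive() is exercised once and both drivers stay a handful of lines each.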
Typing for transparency
- Frozen dataclass operations with a discriminated kind: Literal[...] so runtime
dispatch and static narrowing share one source of truth.
- Operation[ResultT] generic linking each kind to its result subclass; the result type
is carried as metadata, not regenerated per call.
- Targets as a closed sum (ids, Special literal tokens from tmux's target tables, and
SlotRef), so an illegal target is a type error, not a runtime surprise.
- Effects and safety carried as typed flags, so MCP annotations and safety tiers derive
from the operation rather than a hand-maintained table downstream.
- The registry as the single source of truth: one entry per kind with scope,
chainability, result type, min_version, flag_version_map, effects, safety, and a
primitive-vs-composite marker (an operation that wraps one tmux command vs one composed
from others, e.g. a synthesized has-server check).
- A typed status: Literal['complete', 'failed', 'skipped', 'unknown'] on the result, so
cross-engine equality and serialization assert on an enum, not ad-hoc returncode reads.
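A registry entry with version-gated rendering could look roughly like this. The version numbers are made up for illustration and are not verified tmux release facts; OpSpec, render(), the two exception names, and the policy of silently dropping a too-new flag are likewise assumptions for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class OpSpec:
    kind: str
    scope: str
    min_version: tuple[int, int]
    flag_version_map: Mapping[str, tuple[int, int]] = field(default_factory=dict)
    safety: str = "readonly"


# Illustrative gates only; the numbers are NOT verified against tmux releases.
REGISTRY: dict[str, OpSpec] = {
    "split-window": OpSpec(
        kind="split-window",
        scope="pane",
        min_version=(1, 8),
        flag_version_map={"-Z": (3, 1)},
        safety="mutating",
    ),
}


class UnknownOperation(LookupError):
    """The registry fails closed: unregistered kinds are rejected."""


class UnsupportedVersion(RuntimeError):
    """The server is older than the operation's min_version."""


def render(
    kind: str,
    flags: tuple[str, ...],
    target: str,
    server_version: tuple[int, int],
) -> tuple[str, ...]:
    try:
        spec = REGISTRY[kind]
    except KeyError:
        raise UnknownOperation(kind) from None
    if server_version < spec.min_version:
        raise UnsupportedVersion(f"{kind} needs tmux >= {spec.min_version}")
    # One possible degrade policy: drop flags the server is too old for.
    kept = tuple(
        f for f in flags
        if server_version >= spec.flag_version_map.get(f, spec.min_version)
    )
    return (kind, *kept, "-t", target)
```

Keeping the gates in one table means an audit of flag/version coverage is a read of REGISTRY rather than a search through method bodies.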
Serialization
Operations and results serialize to/from plain dicts with no live objects, subprocess
handles, or event-loop references — stable kind/scope/target/args/status/payload only.
Round-trip tests guard the schema. An optional edge module exposes Pydantic/JSON-Schema/
MCP schemas behind an extra; the core never imports Pydantic.
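A round trip through plain dicts might be guarded like this. OperationRecord and its to_dict/from_dict methods are hypothetical names for the sketch, and the target is flattened to a string to keep the example short.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationRecord:
    kind: str
    scope: str
    target: str
    args: tuple[str, ...] = ()

    def to_dict(self) -> dict:
        # Only plain, JSON-safe values: no handles, loops, or live objects.
        return {
            "kind": self.kind,
            "scope": self.scope,
            "target": self.target,
            "args": list(self.args),
        }

    @classmethod
    def from_dict(cls, data: dict) -> OperationRecord:
        return cls(
            kind=data["kind"],
            scope=data["scope"],
            target=data["target"],
            args=tuple(data.get("args", ())),
        )
```

Passing the dict through json.dumps and json.loads in the round-trip test is a cheap way to prove that nothing non-serializable has leaked into the schema.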
Testing
A single cross-engine contract suite is the promotion gate. One operation spec is
parametrized over engines (a provider fixture, stable ids) and asserts: result equality
across engines, serialization round-trips, version-gated rendering, deferred refs across
scopes, and the full status vocabulary (complete/failed/skipped/unknown).
Environment-conditional engines (control, imsg) use pytest.param(marks=...), never
collection errors. Async variants use pytest-asyncio; the shared drive() core means the
resolution logic is tested once and reused by both sync and async drivers. A concrete,
no-tmux engine backs doctests so every example executes without +SKIP.
Documentation
Operations warrant a dedicated, autogenerated catalog rather than hand-written pages. The
plan is a custom Sphinx domain (tmuxop) — the same pattern the maintainer's gp-sphinx
already ships (an argparse domain plus a registry→autodoc package) — registering object
types operation/result/scope with cross-reference roles and an auto-generated index.
A catalog directive walks the operation registry and emits one entry per operation with
its scope, minimum tmux version and per-flag version gates, effects/safety tier,
primitive-vs-composite marker, and cross-linked result type. Because the registry is the
single source of truth for both runtime and docs, the drift seen today (e.g. formats.py
reference lists vs neo.Obj fields) cannot recur. Catalog examples are driven by the
concrete engine so they run under the project's no-+SKIP doctest rule. Performance is
described as capability ("control mode pipelines batches over one connection"), never as a
fragile metric.
Phased plan
Each phase is independently mergeable, keeps the suite green, and changes no existing API.
- Phase 0: Packages + engine contract. Create libtmux.experimental.{ops,engines};
port the TmuxEngine Protocol, CommandRequest/CommandResult, and EngineSpec +
registry from libtmux-protocol-engines; add Result.raise_for_status() and
exc.TmuxCommandError. Exit: suite + mypy + doctests + build-docs green; zero public
API change.
- Phase 1: Inert ops spine (pure, no tmux). The frozen Operation value, the typed
Result hierarchy, the operation registry, serialization, and version-gated render,
with seed operations (split-window, capture-pane, send-keys, select-layout).
Exit: 100% of this phase's tests run with no tmux server; serialization round-trips;
registry fails closed; mypy clean.
- Phase 2: Classic engine + classic facade slice. The classic subprocess engine and
a classic facade that runs one operation end-to-end (split-window) returning a live
Pane, identical in signature, return type, timing, and raise behavior to today.
Exit: parity verified; tmuxp + libtmux-mcp split flows green via a branch pin.
- Phase 3: Concrete engine + contract suite (the gate). A deterministic no-tmux
engine and the parametrized contract matrix that becomes the promotion gate. Exit:
contract suite green across classic + concrete for all Phase-1 operations.
- Phase 4: Control-mode parser then engine. Graduate the bytes ControlParser and
the persistent control engine + Subscription, gated by --engine=. Exit: the
fixture and contract suites pass under --engine=control_mode; results equal classic.
- Phase 5: asyncio engine + lazy/async-lazy facades. A real async engine
(create_subprocess_exec, cancellation; not to_thread) and the lazy/async-lazy
facades over the shared drive() core; AsyncLazyControlWindow and siblings
materialize here. Exit: contract suite parametrized across classic/concrete/control/
asyncio with result equality; cancellation leaks no subprocess.
- Phase 6: imsg easter-egg engine. The native binary client as a registered, opt-in,
separately test-gated engine, as proof that the operation/result contract is
transport-agnostic (same SplitWindowResult from subprocess, tmux -C, or the binary
peer protocol). Exit: opt-in tests green; framed as "native client, v8 only, no
attach."
Nothing leaves experimental until the contract suite and a tmuxp + libtmux-mcp
downstream green run pass.
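For the Phase 5 exit criterion that cancellation leaks no subprocess, the core of a real async engine might look like the sketch below. The run_argv() name, its return tuple, and the timeout parameter are assumptions for illustration; the asyncio calls themselves are standard library.

```python
from __future__ import annotations

import asyncio
import sys


async def run_argv(
    argv: list[str], *, timeout: float | None = None
) -> tuple[int, str, str]:
    """Run one command on the event loop and reap it on cancel or timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout)
    except (asyncio.CancelledError, asyncio.TimeoutError):
        # Kill and wait so no child process outlives the cancelled call.
        if proc.returncode is None:
            proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out.decode(), err.decode()
```

The example is exercised with sys.executable rather than tmux so it can run anywhere; an engine would pass the rendered tmux argv instead.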
Shortcomings in current libtmux (motivation)
Surfaced by the study; these are why the substrate is needed.
- Build is fused to dispatch. subprocess.Popen(...).communicate() runs inside
tmux_cmd.__init__ (src/libtmux/common.py), so no object exists between "argv built"
and "process run": nothing can introspect, validate, batch, serialize, or dry-run an
operation.
- No typed result, scattered failure policy. tmux_cmd exposes only
cmd/stdout/stderr/returncode and never raises on a tmux error; raising is bolted on by
~92 raise_if_stderr() call sites. There is no check_returncode()-style opt-in.
- Version gating scattered. ~49 inline has_gte_version(...) literals with no
capability table; neo.py independently re-gates format tokens. Flag/version drift is
unauditable.
- Dual dispatch paths. neo.fetch_objs and get_version construct tmux_cmd
directly, re-implementing socket-flag insertion, so any engine injected at Server.cmd
would silently miss all reads. (The engine-typed-facade model avoids this by routing
everything through the engine from the start.)
- Server.cmd() has no timeout, forcing libtmux-mcp to shell out in several modules.
- No serialization and no per-operation metadata
(effects/safety/scope/primitive-vs-composite), so libtmux-mcp hand-maintains a large
parallel layer (tags, annotation presets, an exception-to-result classifier).
- client scope is unmodeled in both the ORM operation surface and the chain
prototype's scope literal.
Sunset candidates
- The _experimental.chain package layout and its per-scope Bound*Commands namespaces
with a .raw() escape hatch: keep the tests as a porting checklist; replace the layout
with a single registry-driven typed-op builder so typing is the default path.
- The asyncio.to_thread async executors and the asyncio-2 branch (the latter deletes
large test suites): replace with a real async engine.
- The half-wired async-first codegen pipeline (common_async.py): replace with
hand-written async behind the engine Protocol.
- The control-mode branch's string-based _internal/engines prototype: strictly
inferior to the protocol-engines bytes parser; mine only for feature ideas.
- The CommandResultLike three-field passthrough as the result contract, replaced by the
typed Result base with raise_for_status().
Concepts to revisit
- The cmd() error contract and whether raise_for_status() trips on non-empty stderr
(the tmux divergence above).
- neo vs ops: whether ops absorbs reads (neo becomes the read half over the same
engine); this is the single biggest seam decision.
- has_minimum_version/has_gte_version scatter → a declarative capability table, while
preserving warn-and-degrade behavior and stacklevel at the user-facing boundary.
- Eager ORM semantics vs lazy/plan timing: lazy must stay strictly additive and opt-in.
- Engine exception taxonomy: a single umbrella with meaningful subclasses (timeout vs
connection-lost vs desync) plus a retryable/expected classification.
- Whether attach/interactive operations need a non-capturing engine mode or stay a
documented special case for v1.
Trunk debt (noted, not actioned)
The study surfaced pre-existing debt in trunk, recorded here for visibility only — no
action is proposed under this issue per the repository's trunk-cleanup policy: a dead
Pane.split branch using substring rather than list-membership matching; 43 inert
DeprecatedError shells; formats.py reference lists drifted from neo.Obj fields; and
EnvironmentMixin duplicate logic with a dead ternary.
Decisions
Resolved for this work:
- Code lives under libtmux.experimental.{ops,engines} now; promotion to
libtmux.{ops,engines} only after the contract suite + downstream green run.
- Mode is encoded in the facade type (the revision above), realized as thin engine-typed
facades over one shared spine, with no runtime code generation.
- Error policy is per-engine: classic reproduces today's behavior; newer engines return
typed results with opt-in raise_for_status().
- The core is stdlib-dataclass-only; Pydantic/MCP schemas live behind an optional extra.
- The registry/kind set is explicitly unstable while under experimental, frozen at
promotion.
- imsg ships as an opt-in, test-gated easter-egg engine.
Deferred (as in #688): the complete operation registry; final class names for every
operation/result; rollback/compensation semantics for multi-operation plans; control-mode
subscription/event-streaming public API; op-version coexistence (op:name@version) until
a real need appears.
Prior art and references
In-repo prototype branches: chainable-commands-experiment-00,
libtmux-protocol-engines, control-mode, asyncio, libtmux-async-first-codegen.
- tmux (b0588f4) control-mode framing, command queue, and target resolution:
control.c, cmd-queue.c, cmd-find.c.
- CPython (90bf681) result-object and async-execution precedents:
Lib/subprocess.py, Lib/asyncio/subprocess.py, Lib/asyncio/taskgroups.py.
- fastmcp (v3.4.2) registry, schema generation, and definition/execution separation:
https://github.com/jlowin/fastmcp/tree/v3.4.2.
- pytest (4904d2b) parametrized contract testing:
https://github.com/pytest-dev/pytest/blob/4904d2b/src/_pytest/python.py.
- pytest-asyncio (c23095a) async test integration:
https://github.com/pytest-dev/pytest-asyncio/blob/c23095a/pytest_asyncio/plugin.py.
- libtmux-mcp (188784b) downstream consumer wishlist (typed results, safety tiers,
serializable operations): https://github.com/tmux-python/libtmux-mcp/tree/188784b.
- tmuxp (fc87d86) downstream compatibility surface:
https://github.com/tmux-python/tmuxp/tree/fc87d86.
Refs above are pinned to git tags where the checkout sat on one, otherwise a 7-char
commit ref.