Background
Follow-up to #2192 (foundation) and PR #2193 (pytest + Zuul infrastructure). Part of Tier 5 (#2199). osism/tasks/__init__.py (489 LOC) is the shared foundation of all Celery workers: the Celery Config class, container-version lookup, the central run_ansible_in_environment / run_command subprocess runners (incl. host extraction from Ansible output via HOST_PATTERN), play-execution logging, and the CLI-side handle_task wait/revoke helper.
Scope
Add tests/unit/tasks/test_init.py covering the module-level helpers in osism/tasks/__init__.py. There is no existing coverage: osism.tasks.handle_task is only ever mocked in tests/unit/commands/test_manage_wiring.py, never tested directly. tests/unit/tasks/__init__.py already exists, so only the test module itself needs to be created. None of the targets are Celery tasks (plain functions/classes), so no broker is needed. If the run_ansible_in_environment matrix grows too large, the issue may be split during implementation (subprocess runners vs. pure helpers), as done for Tier 3.
Test targets
HOST_PATTERN — __init__.py:19
Pure regex, no mocking. Parametrize:
"ok: [node-1]", "changed: [host.example.com]", "failed: [x]", "skipping: [x]", "unreachable: [x]" → match, group(2) is the hostname
- Delegation lines:
"ok: [node-1 -> 192.168.0.5]" → [^\]]+ captures "node-1 -> 192.168.0.5" (pin current behavior)
"fatal: [node-1]: FAILED!" → no match (Ansible reports failures as fatal:, which the pattern does not list — document this gap with a test)
"TASK [Gathering Facts]", "ok:", empty string → no match
- Leading whitespace → no match against the raw line (the caller strips first; the pattern is anchored with
^)
Config — __init__.py:22
No mocking; plain attribute assertions:
broker_url == "redis://redis", result_backend == "redis://redis", task_default_queue == "default", task_create_missing_queues is True, broker_connection_retry_on_startup is True, enable_utc is True
task_routes maps all eight osism.tasks.<sub>.* patterns to their queues (osism-ansible, ceph-ansible, conductor, kolla-ansible, kubernetes, netbox, openstack, reconciler)
enable_ironic defaults to the string "True" (read from ENABLE_IRONIC at import time — only the already-imported value can be asserted; do not reload the module)
task_track_started == (True,) — note: this is a 1-tuple due to a trailing comma; pin current behavior and flag the likely typo in the PR description (fixing it is out of scope here)
get_container_version(worker) — __init__.py:43
Patch osism.tasks.Path (the file path /interface/versions/{worker}.yml is hardcoded) with a callable redirecting into tmp_path, then write real YAML files.
- Version file missing → returns
"unknown", debug logged
osism-ansible file with osism_ansible_version: "7.0.5a" → returns "7.0.5a" (key derived via worker.replace("-", "_") + "_version")
kolla-ansible / ceph-ansible / osism-kubernetes key derivation (parametrize)
- Version value
"" → returns "latest"
- YAML present but key missing → returns
"unknown"
- Empty YAML file (
yaml.safe_load → None) → None.get raises AttributeError, caught by the broad except → returns "unknown", warning logged
- Invalid YAML (
yaml.YAMLError) → returns "unknown", warning logged
log_play_execution(...) — __init__.py:95
Patch osism.tasks.Path (hardcoded /share/ansible-execution-history.json) to a tmp_path file and osism.tasks.get_container_version (covered above). fcntl.flock works on real temp files — no mock needed.
- Happy path: one JSON line appended; record contains
request_id, worker, worker_version (from the patched get_container_version), environment, role, hosts, arguments, result; timestamp ends with "Z"
hosts=None → [] in the record; arguments=None and arguments="" → "" (falsy check)
- Two calls append two lines (mode
"a")
- Parent directory does not exist → created via
mkdir(parents=True, exist_ok=True)
- Write failure (e.g. patched
Path pointing into a read-only dir, or builtins.open raising) → warning logged, no exception propagates
run_ansible_in_environment(...) — __init__.py:146
Patch:
osism.tasks.subprocess.Popen
osism.tasks.utils.redis — lazy attribute on osism.utils; patch with mocker.patch("osism.tasks.utils.redis", new=MagicMock(), create=True) before anything touches it, otherwise osism.utils.__getattr__ opens a real Redis connection
osism.tasks.utils.create_redlock, osism.tasks.utils.push_task_output, osism.tasks.utils.finish_task_output
osism.tasks.log_play_execution
osism.tasks.tempfile.mkdtemp (return a str(tmp_path / "ssh") dir) and optionally osism.tasks.shutil.rmtree to assert cleanup
Argument handling
arguments as list → joined with spaces; as str → passed through (type(...) == list is an exact-type check — a tuple is not joined; pin that)
worker="kolla-ansible" + "-e kolla_action=stop" in arguments → -e kolla_action_stop_ignore_missing=true appended; already present → not duplicated; non-kolla worker with stop action → not appended
Environment construction (assert via Popen.call_args.kwargs["env"])
ANSIBLE_FORCE_COLOR == "1", PY_COLORS == "1", PYTHONUNBUFFERED == "1"
ANSIBLE_SSH_CONTROL_PATH_DIR set to the mkdtemp result (prefix contains the request_id)
ANSIBLE_SSH_RETRIES defaults to "3" (str(ssh_retries)); pre-set via monkeypatch.setenv("ANSIBLE_SSH_RETRIES", "7") → not overridden
- Sub-environment
environment="manager.sub1" → env["SUB"] == "manager.sub1", env["ENVIRONMENT"] == "manager", default branch builds /run-manager.sh
utils.redis.get("ansible_vault_password") truthy → env["VAULT"] == "/ansible-vault.py"; None → key absent
Worker dispatch (assert the command string passed to Popen, incl. stdbuf -oL prefix and shell=True)
kolla-ansible → stdbuf -oL /run.sh deploy <role> <args>
kolla-ansible + role mariadb-backup (and alias mariadb_backup) → action backup, role rewritten to mariadb, and -e kolla_action=deploy (or any of the listed actions) regex-rewritten to -e kolla_action=backup
osism-kubernetes and ceph-ansible → stdbuf -oL /run.sh <role> <args>
- Any other worker →
stdbuf -oL /run-<environment>.sh <role> <args>
Output loop, host extraction, logging
- Fake process:
poll side_effect=[None, None, 0], stdout.readline returning e.g. b"ok: [node-2]\n", b"ok: [node-1]\n", wait → 0; returned result is the concatenated output; push_task_output called once per line; finish_task_output called with rc=0
log_play_execution called twice: first with result="started", hosts=None; then with result="success" and hosts=["node-1", "node-2"] (deduplicated via set, sorted)
wait returns non-zero → second call has result="failure"
- Duplicate host lines (
ok: + changed: for the same host) → host appears once
publish=False → neither push_task_output nor finish_task_output called
Locking & cleanup
locking=True → create_redlock(key="lock-ansible-<environment>-<role>", auto_release_time=...); lock.acquire() called; lock.release() called in the finally even when wait raises
lock.release() raising → warning logged, no exception propagates
locking=False → create_redlock never called
- SSH control dir:
shutil.rmtree called on the mkdtemp dir even when Popen raises; rmtree raising → warning logged, original return/exception flow unaffected
run_command(...) — __init__.py:371
Patch osism.tasks.subprocess.Popen, osism.tasks.utils.push_task_output, osism.tasks.utils.finish_task_output, osism.tasks.utils.create_redlock.
Popen invoked with the argv list [command, arg1, arg2] (no shell=True) and stderr=subprocess.STDOUT
ignore_env=True → env passed verbatim; ignore_env=False → os.environ copy updated with env (assert an injected key plus an inherited one via monkeypatch.setenv)
- Output loop: lines accumulated into the return value;
push_task_output per line; finish_task_output(request_id, rc=rc)
publish=False → no push/finish calls
locking=True → create_redlock(key="lock-<command>", ...) and lock.release() called — note: acquire() is never called in the current implementation; pin that behavior and flag it in the PR description
locking=False → create_redlock not called
handle_task(t, wait, format, timeout) — __init__.py:418
Patch osism.tasks.utils.fetch_task_output, osism.tasks.utils.revoke_task, and prompt_toolkit.prompt (function-local from prompt_toolkit import prompt → patch in the prompt_toolkit module). Use a MagicMock task: the code reads both t.id (for fetch_task_output) and t.task_id (for all log messages).
wait=True, fetch succeeds → returns the fetch result; called with (t.id, timeout=timeout)
fetch_task_output raises TimeoutError → returns 1, info logs mention the timeout and the osism wait follow-up command (use the loguru_logs fixture from tests/conftest.py)
KeyboardInterrupt + prompt answers "y" / "yes": revoke_task(t.task_id) called; returns True → "has been revoked" logged; returns False → error logged; function returns 1
KeyboardInterrupt + prompt answer "n" (or default) → no revoke, "continues running" logged, returns 1
KeyboardInterrupt + prompt itself raises KeyboardInterrupt → handled, returns 1
KeyboardInterrupt + prompt raises EOFError → handled, returns 1
wait=False, format="log" → returns 0, info logged, fetch_task_output never called
wait=False, format="script" → prints exactly t.task_id (assert via capsys), returns 0
Mocking hints
- All patch targets live at
osism.tasks.* — the module imports os, subprocess, tempfile, shutil, yaml and utils directly, so patch e.g. osism.tasks.subprocess.Popen, not subprocess.Popen.
utils.redis is created lazily by osism.utils.__getattr__ on first access and would open a real connection — always patch with create=True before calling run_ansible_in_environment.
- Build a small fake-process helper/fixture for both runners:
def make_process(lines, rc=0):
p = MagicMock()
p.poll.side_effect = [None] * len(lines) + [rc]
p.stdout.readline.side_effect = [line.encode() for line in lines]
p.wait.return_value = rc
return p
- For
get_container_version / log_play_execution the file paths are hardcoded absolute paths; patching osism.tasks.Path with lambda p: tmp_path / Path(p).name keeps everything inside tmp_path and lets the real YAML/JSON and fcntl code run.
log_play_execution is called from inside run_ansible_in_environment — patch it there so the runner tests don't touch the history file, and test it in isolation separately.
- Loguru output is invisible to
caplog; use the existing loguru_logs fixture from tests/conftest.py for all "warning/error logged" assertions.
- None of these functions are Celery tasks, so they can be called directly; the Celery
app lives in osism/tasks/ansible.py etc. and is not touched by this issue.
Definition of Done
Dependencies
Background
Follow-up to #2192 (foundation) and PR #2193 (pytest + Zuul infrastructure). Part of Tier 5 (#2199).
osism/tasks/__init__.py(489 LOC) is the shared foundation of all Celery workers: the CeleryConfigclass, container-version lookup, the centralrun_ansible_in_environment/run_commandsubprocess runners (incl. host extraction from Ansible output viaHOST_PATTERN), play-execution logging, and the CLI-sidehandle_taskwait/revoke helper.Scope
Add
tests/unit/tasks/test_init.pycovering the module-level helpers inosism/tasks/__init__.py. There is no existing coverage:osism.tasks.handle_taskis only ever mocked intests/unit/commands/test_manage_wiring.py, never tested directly.tests/unit/tasks/__init__.pyalready exists, so only the test module itself needs to be created. None of the targets are Celery tasks (plain functions/classes), so no broker is needed. If therun_ansible_in_environmentmatrix grows too large, the issue may be split during implementation (subprocess runners vs. pure helpers), as done for Tier 3.Test targets
HOST_PATTERN—__init__.py:19Pure regex, no mocking. Parametrize:
"ok: [node-1]","changed: [host.example.com]","failed: [x]","skipping: [x]","unreachable: [x]"→ match,group(2)is the hostname"ok: [node-1 -> 192.168.0.5]"→[^\]]+captures"node-1 -> 192.168.0.5"(pin current behavior)"fatal: [node-1]: FAILED!"→ no match (Ansible reports failures asfatal:, which the pattern does not list — document this gap with a test)"TASK [Gathering Facts]","ok:", empty string → no match^)Config—__init__.py:22No mocking; plain attribute assertions:
broker_url == "redis://redis",result_backend == "redis://redis",task_default_queue == "default",task_create_missing_queues is True,broker_connection_retry_on_startup is True,enable_utc is Truetask_routesmaps all eightosism.tasks.<sub>.*patterns to their queues (osism-ansible,ceph-ansible,conductor,kolla-ansible,kubernetes,netbox,openstack,reconciler)enable_ironicdefaults to the string"True"(read fromENABLE_IRONICat import time — only the already-imported value can be asserted; do not reload the module)task_track_started == (True,)— note: this is a 1-tuple due to a trailing comma; pin current behavior and flag the likely typo in the PR description (fixing it is out of scope here)get_container_version(worker)—__init__.py:43Patch
osism.tasks.Path(the file path/interface/versions/{worker}.ymlis hardcoded) with a callable redirecting intotmp_path, then write real YAML files."unknown", debug loggedosism-ansiblefile withosism_ansible_version: "7.0.5a"→ returns"7.0.5a"(key derived viaworker.replace("-", "_") + "_version")kolla-ansible/ceph-ansible/osism-kuberneteskey derivation (parametrize)""→ returns"latest""unknown"yaml.safe_load→None) →None.getraisesAttributeError, caught by the broadexcept→ returns"unknown", warning loggedyaml.YAMLError) → returns"unknown", warning loggedlog_play_execution(...)—__init__.py:95Patch
osism.tasks.Path(hardcoded/share/ansible-execution-history.json) to atmp_pathfile andosism.tasks.get_container_version(covered above).fcntl.flockworks on real temp files — no mock needed.request_id,worker,worker_version(from the patchedget_container_version),environment,role,hosts,arguments,result;timestampends with"Z"hosts=None→[]in the record;arguments=Noneandarguments=""→""(falsy check)"a")mkdir(parents=True, exist_ok=True)Pathpointing into a read-only dir, orbuiltins.openraising) → warning logged, no exception propagatesrun_ansible_in_environment(...)—__init__.py:146Patch:
osism.tasks.subprocess.Popenosism.tasks.utils.redis— lazy attribute onosism.utils; patch withmocker.patch("osism.tasks.utils.redis", new=MagicMock(), create=True)before anything touches it, otherwiseosism.utils.__getattr__opens a real Redis connectionosism.tasks.utils.create_redlock,osism.tasks.utils.push_task_output,osism.tasks.utils.finish_task_outputosism.tasks.log_play_executionosism.tasks.tempfile.mkdtemp(return astr(tmp_path / "ssh")dir) and optionallyosism.tasks.shutil.rmtreeto assert cleanupArgument handling
argumentsaslist→ joined with spaces; asstr→ passed through (type(...) == listis an exact-type check — a tuple is not joined; pin that)worker="kolla-ansible"+"-e kolla_action=stop"in arguments →-e kolla_action_stop_ignore_missing=trueappended; already present → not duplicated; non-kolla worker with stop action → not appendedEnvironment construction (assert via
Popen.call_args.kwargs["env"])ANSIBLE_FORCE_COLOR == "1",PY_COLORS == "1",PYTHONUNBUFFERED == "1"ANSIBLE_SSH_CONTROL_PATH_DIRset to themkdtempresult (prefix contains therequest_id)ANSIBLE_SSH_RETRIESdefaults to"3"(str(ssh_retries)); pre-set viamonkeypatch.setenv("ANSIBLE_SSH_RETRIES", "7")→ not overriddenenvironment="manager.sub1"→env["SUB"] == "manager.sub1",env["ENVIRONMENT"] == "manager", default branch builds/run-manager.shutils.redis.get("ansible_vault_password")truthy →env["VAULT"] == "/ansible-vault.py";None→ key absentWorker dispatch (assert the
commandstring passed toPopen, incl.stdbuf -oLprefix andshell=True)kolla-ansible→stdbuf -oL /run.sh deploy <role> <args>kolla-ansible+ rolemariadb-backup(and aliasmariadb_backup) → actionbackup, role rewritten tomariadb, and-e kolla_action=deploy(or any of the listed actions) regex-rewritten to-e kolla_action=backuposism-kubernetesandceph-ansible→stdbuf -oL /run.sh <role> <args>stdbuf -oL /run-<environment>.sh <role> <args>Output loop, host extraction, logging
pollside_effect=[None, None, 0],stdout.readlinereturning e.g.b"ok: [node-2]\n",b"ok: [node-1]\n",wait→0; returnedresultis the concatenated output;push_task_outputcalled once per line;finish_task_outputcalled withrc=0log_play_executioncalled twice: first withresult="started",hosts=None; then withresult="success"andhosts=["node-1", "node-2"](deduplicated via set, sorted)waitreturns non-zero → second call hasresult="failure"ok:+changed:for the same host) → host appears oncepublish=False→ neitherpush_task_outputnorfinish_task_outputcalledLocking & cleanup
locking=True→create_redlock(key="lock-ansible-<environment>-<role>", auto_release_time=...);lock.acquire()called;lock.release()called in thefinallyeven whenwaitraiseslock.release()raising → warning logged, no exception propagateslocking=False→create_redlocknever calledshutil.rmtreecalled on themkdtempdir even whenPopenraises;rmtreeraising → warning logged, original return/exception flow unaffectedrun_command(...)—__init__.py:371Patch
osism.tasks.subprocess.Popen,osism.tasks.utils.push_task_output,osism.tasks.utils.finish_task_output,osism.tasks.utils.create_redlock.Popeninvoked with the argv list[command, arg1, arg2](noshell=True) andstderr=subprocess.STDOUTignore_env=True→envpassed verbatim;ignore_env=False→os.environcopy updated withenv(assert an injected key plus an inherited one viamonkeypatch.setenv)push_task_outputper line;finish_task_output(request_id, rc=rc)publish=False→ no push/finish callslocking=True→create_redlock(key="lock-<command>", ...)andlock.release()called — note:acquire()is never called in the current implementation; pin that behavior and flag it in the PR descriptionlocking=False→create_redlocknot calledhandle_task(t, wait, format, timeout)—__init__.py:418Patch
osism.tasks.utils.fetch_task_output,osism.tasks.utils.revoke_task, andprompt_toolkit.prompt(function-localfrom prompt_toolkit import prompt→ patch in theprompt_toolkitmodule). Use aMagicMocktask: the code reads botht.id(forfetch_task_output) andt.task_id(for all log messages).wait=True, fetch succeeds → returns the fetch result; called with(t.id, timeout=timeout)fetch_task_outputraisesTimeoutError→ returns1, info logs mention the timeout and theosism waitfollow-up command (use theloguru_logsfixture fromtests/conftest.py)KeyboardInterrupt+ prompt answers"y"/"yes":revoke_task(t.task_id)called; returnsTrue→ "has been revoked" logged; returnsFalse→ error logged; function returns1KeyboardInterrupt+ prompt answer"n"(or default) → no revoke, "continues running" logged, returns1KeyboardInterrupt+ prompt itself raisesKeyboardInterrupt→ handled, returns1KeyboardInterrupt+ prompt raisesEOFError→ handled, returns1wait=False,format="log"→ returns0, info logged,fetch_task_outputnever calledwait=False,format="script"→ prints exactlyt.task_id(assert viacapsys), returns0Mocking hints
osism.tasks.*— the module importsos,subprocess,tempfile,shutil,yamlandutilsdirectly, so patch e.g.osism.tasks.subprocess.Popen, notsubprocess.Popen.utils.redisis created lazily byosism.utils.__getattr__on first access and would open a real connection — always patch withcreate=Truebefore callingrun_ansible_in_environment.get_container_version/log_play_executionthe file paths are hardcoded absolute paths; patchingosism.tasks.Pathwithlambda p: tmp_path / Path(p).namekeeps everything insidetmp_pathand lets the real YAML/JSON andfcntlcode run.log_play_executionis called from insiderun_ansible_in_environment— patch it there so the runner tests don't touch the history file, and test it in isolation separately.caplog; use the existingloguru_logsfixture fromtests/conftest.pyfor all "warning/error logged" assertions.applives inosism/tasks/ansible.pyetc. and is not touched by this issue.Definition of Done
tests/unit/tasks/test_init.pycreatedpytest --cov=osism.tasksshows ≥ 90 % forosism/tasks/__init__.pypipenv run pytest tests/unit/tasks/test_init.pypasses locallyflake8,mypy,python-blackremain greenpython-osism-unit-testspassesDependencies