test: broaden coverage of analysis, OTel merge, and rendering by dbroeglin · Pull Request #24 · dbroeglin/github-copilot-lab

dbroeglin · 2026-06-27T17:08:03Z

Follow-up to #23 (merged): a thorough test-coverage review of the
session-analysis, session-reading, and OTel-merge code paths. Tests only — no
production code changes.

What's added (43 offline tests)

- analyze_trajectory (ATIF): driven by synthetic, hand-controlled inputs. The real trajectory.json is produced by this package's own Pier converter, so it can't serve as authoritative golden data; these tests exercise the parsing/aggregation logic directly, covering metadata, tool histogram + failure detection ("code":"failure", permission-denied markers), final_metrics → economics (aiu/reasoning/peak/compactions), summed-output fallbacks, and malformed-input guards.
- _apply_otel_records merge: shutdown stays authoritative while turns are still enriched; OTel-only reconstruction without a shutdown; unmatched turn_id and no-chat-span paths.
- OTel decoders: _otel_attrs (dict + OTLP list), _otel_value typed scalars + array/kvlist, _otel_attr_value flat + nested dotted lookup, _parse_otel_time across ISO / [sec,nanos] / epoch s·ms·ns, and llm_calls_from_otel over OTLP list-format attributes.
- sessionlog: load_events blank/invalid skipping, copy_events roundtrip + missing-source, compaction_start peak tracking without a shutdown.
- render: render_session_analysis overview incl. the multi-model "Per model" table and the no-economics path; remaining LiveEventFormatter branches.
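
For readers unfamiliar with the input shapes named in the OTel decoder bullet, here is a minimal sketch of what "dict + OTLP list" attributes and "ISO / [sec,nanos] / epoch s·ms·ns" timestamps can look like. The helpers decode_value, decode_attrs, and parse_time are hypothetical stand-ins written for this illustration; they are not the package's _otel_value, _otel_attrs, or _parse_otel_time, whose signatures are not shown in this PR. The magnitude thresholds used to tell epoch seconds, milliseconds, and nanoseconds apart are likewise an assumption.

```python
from datetime import datetime, timezone


def decode_value(value):
    """Decode one OTLP JSON AnyValue wrapper into a plain Python value (hypothetical helper)."""
    if not isinstance(value, dict):
        return value  # already a plain scalar, e.g. from a flat dict
    if "stringValue" in value:
        return value["stringValue"]
    if "intValue" in value:
        return int(value["intValue"])  # OTLP JSON carries int64 as a string
    if "doubleValue" in value:
        return float(value["doubleValue"])
    if "boolValue" in value:
        return bool(value["boolValue"])
    if "arrayValue" in value:
        return [decode_value(v) for v in value["arrayValue"].get("values", [])]
    if "kvlistValue" in value:
        return decode_attrs(value["kvlistValue"].get("values", []))
    return None


def decode_attrs(attrs):
    """Accept either a flat dict or an OTLP list of {"key", "value"} items (hypothetical helper)."""
    if isinstance(attrs, dict):
        return dict(attrs)
    if isinstance(attrs, list):
        return {
            item["key"]: decode_value(item.get("value"))
            for item in attrs
            if isinstance(item, dict) and "key" in item
        }
    return {}


def parse_time(raw):
    """Parse ISO strings, [sec, nanos] pairs, or epoch numbers into UTC datetimes (hypothetical helper)."""
    if isinstance(raw, bool):
        return None  # bool is an int subclass; do not treat it as a timestamp
    if isinstance(raw, str):
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if isinstance(raw, (list, tuple)) and len(raw) == 2:
        sec, nanos = raw
        return datetime.fromtimestamp(sec + nanos / 1e9, tz=timezone.utc)
    if isinstance(raw, (int, float)):
        value = float(raw)
        if value > 1e17:  # assumed threshold: nanoseconds since epoch
            value /= 1e9
        elif value > 1e11:  # assumed threshold: milliseconds since epoch
            value /= 1e3
        return datetime.fromtimestamp(value, tz=timezone.utc)
    return None


# Synthetic, hand-controlled inputs in the spirit of the tests described above.
otlp_list = [
    {"key": "model", "value": {"stringValue": "example-model"}},
    {"key": "tokens", "value": {"intValue": "42"}},
    {"key": "tags", "value": {"arrayValue": {"values": [{"stringValue": "x"}, {"boolValue": True}]}}},
    {"key": "nested", "value": {"kvlistValue": {"values": [{"key": "k", "value": {"doubleValue": 1.5}}]}}},
]
decoded = decode_attrs(otlp_list)
expected_time = datetime(2023, 11, 14, 22, 13, 20, tzinfo=timezone.utc)
```

Every timestamp form in this sketch should resolve to the same instant, which is the kind of property a table-driven test can assert across all accepted formats.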

Coverage

Module	Before	After
analysis.py	77%	90%
render.py	87%	93%
sessionlog.py	90%	93%

Full suite: 203 passed. Ruff clean + formatted.

Add offline unit tests targeting the session-analysis internals that were thinly covered:

- analyze_trajectory (ATIF) driven by synthetic, hand-controlled inputs (metadata, tool histogram + failure detection, final_metrics economics, summed-output fallbacks, malformed-input guards). Real trajectory.json is produced by this package's own converter, so it can't serve as authoritative golden data; these tests exercise the parsing logic directly.
- _apply_otel_records merge: shutdown stays authoritative while turns are still enriched; OTel-only reconstruction without a shutdown; unmatched turn_id and no-chat-span paths.
- OTel decoders: _otel_attrs (dict + OTLP list), _otel_value typed scalars and array/kvlist, _otel_attr_value flat + nested dotted lookup, _parse_otel_time across ISO / [sec,nanos] / epoch s|ms|ns, and llm_calls_from_otel over OTLP list-format attributes.
- sessionlog: load_events blank/invalid skipping, copy_events roundtrip and missing-source, compaction_start peak tracking without a shutdown.
- render: render_session_analysis overview incl. the multi-model "Per model" table and the no-economics path; remaining LiveEventFormatter branches.

Coverage: analysis.py 77%->90%, render.py 87%->93%, sessionlog.py 90%->93%.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dbroeglin merged commit 0273a29 into main Jun 27, 2026
7 checks passed

dbroeglin mentioned this pull request Jun 27, 2026

test: harden session analysis against real-world edge cases #25

Merged

dbroeglin merged 1 commit into main from dbroeglin/analysis-test-coverage
