Context
Perf testing today is manual. Editor/Dialogs/StressTestDialog.cs spawns N copies with live FPS/draw-call stats, and the --stress/--benchmark flags launch GameHost in a separate process. There is no automated CI benchmark suite and no results tracking, and EngineTest (EngineTest/Main.cpp, TestECS.h) is a compile-only proof-of-concept with no assertions or CI integration (verified). The v4 epic promises "10x"; without regression gates, every render-graph/ECS/culling refactor in this milestone can silently eat the wins. This is the guardrail that makes the number provable.
Goal
Build a headless benchmark harness with four canonical scenes — instancing storm, indoor occlusion, skinned crowd, particle storm — that runs in the GitHub Actions pipeline (build-release.yml infrastructure), emits frame-time JSON, fails PRs on regression beyond a threshold, and tracks history for trend charts.
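As a rough illustration of what "emits frame-time JSON" could mean, the sketch below reduces one scene's raw frame times to a small summary record. The function name, the field names (median_ms, p99_ms, max_ms) and the scene identifier are all hypothetical; the issue does not define a schema, so treat this as one possible shape rather than the intended format.

```python
import json
import math
import statistics


def summarize_frame_times(scene, frame_times_ms, warmup_frames):
    """Drop warmup frames and reduce the rest to summary statistics.

    All field names here are assumptions, not an agreed schema.
    """
    measured = sorted(frame_times_ms[warmup_frames:])
    if not measured:
        raise ValueError("no frames left after discarding warmup")
    # Nearest-rank 99th percentile over the measured frames.
    p99_index = max(0, math.ceil(0.99 * len(measured)) - 1)
    return {
        "scene": scene,
        "frames": len(measured),
        "median_ms": statistics.median(measured),
        "p99_ms": measured[p99_index],
        "max_ms": measured[-1],
    }


if __name__ == "__main__":
    # Two warmup frames followed by four measured frames (made-up numbers).
    record = summarize_frame_times(
        "instancing_storm", [50.0, 40.0, 10.0, 12.0, 11.0, 30.0], warmup_frames=2
    )
    print(json.dumps(record, indent=2))
```

Reporting the median alongside a tail percentile is a design choice: the median is robust to one-off hitches, while the tail catches stutter that a median would hide.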
Acceptance Criteria
Headless benchmark mode (extend the existing --benchmark GameHost path): fixed camera paths, warmup frames, fixed frame count, deterministic scene content
Four benchmark scenes committed as engine test content, each stressing a distinct axis (instance count, occlusion, skinning, transparency/particles)
CI job runs the suite per PR and fails on >X% regression vs the stored baseline (X configurable, default 5%)
Baseline update mechanism gated on explicit approval (label or manual dispatch)
History persisted (repo branch or artifact) with a rendered trend chart
Documented caveat handling for shared-runner variance (relative thresholds, repeat-run median, or self-hosted runner option)
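The regression gate and the repeat-run median from the criteria above might be combined roughly as follows. This is a minimal sketch under stated assumptions: the baseline is a mapping from scene name to a baseline frame time in milliseconds, each scene is run several times, and the median of those runs is compared against the baseline. The function names and data shapes are hypothetical; only the 5% default comes from the issue.

```python
import statistics


def check_regression(baseline_ms, run_medians_ms, threshold_pct=5.0):
    """Compare the median of repeated runs against the stored baseline.

    Returns (passed, change_pct). A positive change_pct means slower.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    current_ms = statistics.median(run_medians_ms)
    change_pct = (current_ms - baseline_ms) / baseline_ms * 100.0
    # The gate fails only when the slowdown exceeds the threshold.
    return change_pct <= threshold_pct, change_pct


def gate(baseline, results, threshold_pct=5.0):
    """Return {scene: change_pct} for every scene that regressed.

    `baseline` maps scene -> baseline ms; `results` maps scene -> list of
    per-run frame times in ms. Both shapes are assumptions.
    """
    failures = {}
    for scene, baseline_ms in baseline.items():
        passed, change_pct = check_regression(
            baseline_ms, results[scene], threshold_pct
        )
        if not passed:
            failures[scene] = round(change_pct, 2)
    return failures
```

A CI step could exit non-zero when `gate(...)` returns a non-empty mapping. Taking the median across repeated runs is one way to damp shared-runner noise; it does not remove it, which is why the criteria also list a self-hosted runner option.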
Technical Notes
GitHub-hosted runners lack a real GPU — plan for WARP-based determinism checks plus a self-hosted RTX runner (the dev machine) for true perf numbers via workflow_dispatch/nightly.
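One possible reading of "determinism checks" on a GPU-less runner is to compare integer per-frame counters between two runs of the same scene and ignore timing entirely, since frame times under a software rasterizer say little about real performance. The sketch below hashes such counters into a digest. The counter names (draw_calls, visible_instances) are hypothetical, and nothing in the issue specifies how the harness would export them.

```python
import hashlib
import json


def frame_stats_digest(per_frame_stats):
    """Hash a run's per-frame counters into a single hex digest.

    `per_frame_stats` is assumed to be a list of dicts of integer counters.
    """
    # sort_keys and fixed separators keep the serialization stable, so equal
    # counters always produce equal bytes regardless of dict key order.
    payload = json.dumps(per_frame_stats, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def runs_are_deterministic(run_a, run_b):
    """True when two runs produced identical per-frame counters."""
    return frame_stats_digest(run_a) == frame_stats_digest(run_b)
```

Storing only the digest keeps the artifact small; keeping the raw counters as well would make a mismatch easier to diagnose.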
Dependencies
None (land this FIRST — before the render graph refactor starts)