Context
Only aggregate FPS and draw-call counters exist today (Real/Shown FPS readout, instances_tested/drawn). The v4.0 milestone rebuilds the renderer around a render graph, GPU culling, bindless and async compute — optimizing all of that blind, with end-to-end frame time as the only signal, is guesswork. The CI benchmark suite measures outcomes; this issue provides the instrumentation that explains them. It should land before the other v4 tickets.
Goal
Per-pass GPU timings via D3D12 timestamp queries, hierarchical CPU zone timers, and a live profiler window (per-pass bars, history graph, CSV/JSON export) usable in editor and standalone builds.
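One possible shape for the hierarchical CPU zone timers is an RAII scope object that records its name, nesting depth and elapsed time. This is a minimal sketch, not the engine's API: `ZoneProfiler`, `ScopedZone` and `ZoneRecord` are hypothetical names, and a real version would likely need per-thread storage and a per-frame reset.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one CPU zone; depth encodes the hierarchy.
struct ZoneRecord {
    std::string name;
    int depth;
    double milliseconds;
};

// Hypothetical single-threaded collector. Zones are stored in begin order,
// so a parent always precedes its children in the record list.
class ZoneProfiler {
public:
    std::size_t Begin(const std::string& name) {
        records_.push_back({name, depth_, 0.0});
        ++depth_;
        return records_.size() - 1;
    }
    void End(std::size_t index, double ms) {
        assert(index < records_.size());
        records_[index].milliseconds = ms;
        --depth_;
    }
    const std::vector<ZoneRecord>& Records() const { return records_; }
    void Reset() { records_.clear(); depth_ = 0; }  // call once per frame

private:
    std::vector<ZoneRecord> records_;
    int depth_ = 0;
};

// RAII zone: opens on construction, closes and records elapsed time on destruction.
class ScopedZone {
public:
    ScopedZone(ZoneProfiler& profiler, const std::string& name)
        : profiler_(profiler),
          index_(profiler.Begin(name)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedZone() {
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - start_;
        profiler_.End(index_, elapsed.count());
    }
    ScopedZone(const ScopedZone&) = delete;
    ScopedZone& operator=(const ScopedZone&) = delete;

private:
    ZoneProfiler& profiler_;
    std::size_t index_;
    std::chrono::steady_clock::time_point start_;
};
```

Storing depth alongside a flat, begin-ordered list keeps the data easy to draw as indented bars and easy to dump to CSV/JSON without walking a tree.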
Acceptance Criteria
Technical Notes
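Create an ID3D12QueryHeap of type D3D12_QUERY_HEAP_TYPE_TIMESTAMP, write a timestamp with EndQuery before and after each pass, resolve the results into a readback buffer with ResolveQueryData, and convert ticks to milliseconds using ID3D12CommandQueue::GetTimestampFrequency. Design the pass-marker API so the future render graph emits markers automatically.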
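The CPU-side decode step could look roughly like the sketch below. It assumes each pass owns two consecutive query slots (begin, end) and that the resolved 64-bit ticks have already been copied out of the mapped readback buffer. `PassTiming`, `TicksToMs` and `DecodePassTimings` are hypothetical names; only the tick frequency corresponds to a real D3D12 value (the one returned by GetTimestampFrequency).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-pass result handed to the profiler window or exporter.
struct PassTiming {
    std::string name;
    double milliseconds;
};

// Converts a begin/end tick pair to milliseconds.
// ticksPerSecond is the value reported by GetTimestampFrequency.
inline double TicksToMs(std::uint64_t begin, std::uint64_t end,
                        std::uint64_t ticksPerSecond) {
    assert(ticksPerSecond > 0);
    if (end < begin) {
        return 0.0;  // guard against unwritten or out-of-order slots
    }
    return static_cast<double>(end - begin) * 1000.0 /
           static_cast<double>(ticksPerSecond);
}

// Assumed layout: pass i wrote its begin tick to slot 2*i and its end tick
// to slot 2*i + 1. `resolved` is a copy of the readback buffer contents.
inline std::vector<PassTiming> DecodePassTimings(
    const std::vector<std::string>& passNames,
    const std::vector<std::uint64_t>& resolved,
    std::uint64_t ticksPerSecond) {
    std::vector<PassTiming> out;
    out.reserve(passNames.size());
    for (std::size_t i = 0;
         i < passNames.size() && 2 * i + 1 < resolved.size(); ++i) {
        out.push_back({passNames[i],
                       TicksToMs(resolved[2 * i], resolved[2 * i + 1],
                                 ticksPerSecond)});
    }
    return out;
}
```

Keeping the decode independent of D3D12 types means the same function can feed the live window and the CSV/JSON export, and it can be unit-tested without a GPU.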
Dependencies
- build: benchmark suite in CI — perf regression gates (shared export format)