Problem
The mobile player currently:
- Downloads all stems as MP3 files and decodes them entirely into RAM before playback starts
- Holds up to ~420 MB for a 5-minute 4-stem track
- Crashes or shows an error for tracks longer than ~14 minutes (600 MB guard)
- Makes users wait for the full download + decode before hearing anything
Proposed solution
A chunked audio engine that fetches WAV stems in 10-second windows via HTTP Range requests and chains AudioBufferSourceNodes back-to-back on the AudioContext clock.
Key properties:
- Audio starts after the first chunk downloads (~7 MB for 4 stems on WiFi) instead of the full file
- Peak RAM: ~28 MB (2 chunks x 4 stems) vs. up to 420 MB
- No track length limit (removes the 14-minute cap)
- Same glitch-free behavior as the current engine (no streaming elements, no HTTP/1.1 connection-cap underruns)
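The download and RAM figures above can be sanity-checked with a little arithmetic. This is a rough sketch that assumes the 16-bit, 44100 Hz, stereo format on the wire and assumes decoded chunks are held as 32-bit float samples (the format an AudioBuffer exposes); the constant names are illustrative only.

```javascript
// Back-of-envelope check of the chunk figures. Constant names are illustrative.
const SAMPLE_RATE = 44100;   // Hz
const CHANNELS = 2;          // stereo
const CHUNK_SECONDS = 10;
const STEMS = 4;

const framesPerChunk = SAMPLE_RATE * CHUNK_SECONDS;          // 441000 frames

// On the wire: 16-bit PCM, 2 bytes per sample.
const wireBytesPerStemChunk = framesPerChunk * CHANNELS * 2; // 1764000
const firstChunkDownload = wireBytesPerStemChunk * STEMS;    // 7056000, ~7 MB

// In memory: assumes decoded samples are stored as Float32 (4 bytes each).
const ramBytesPerStemChunk = framesPerChunk * CHANNELS * 4;  // 3528000
const peakRam = ramBytesPerStemChunk * STEMS * 2;            // 28224000, ~28 MB

console.log((firstChunkDownload / 1e6).toFixed(1), "MB first download");
console.log((peakRam / 1e6).toFixed(1), "MB peak decoded audio");
```

The "2 chunks" factor corresponds to the chunk currently playing plus the one being pre-fetched.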
Why WAV range requests work here:
- Backend: Starlette's FileResponse already handles Range headers natively - zero backend changes needed
- WAV PCM is uncompressed, so byte offsets map directly to audio positions (no frame boundary issues like MP3)
- Confirmed format: 16-bit PCM, 44100 Hz, stereo (pcm_s16le) - consistent output from the Demucs pipeline
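Because the PCM data is uncompressed, a chunk index translates to a byte range with plain multiplication. The sketch below shows one way to build the Range header value; chunkRange and its parameter names are hypothetical, and dataOffset / dataBytes are assumed to come from parsing the WAV header first.

```javascript
// Maps a chunk index to an HTTP Range header value for a PCM WAV file.
// dataOffset and dataBytes would come from parsing the WAV header;
// all names here are illustrative, not the engine's real API.
function chunkRange(chunkIndex, { dataOffset, dataBytes, sampleRate, channels, bytesPerSample, chunkSeconds }) {
  const blockAlign = channels * bytesPerSample;          // bytes per frame
  const chunkBytes = sampleRate * chunkSeconds * blockAlign;
  const start = dataOffset + chunkIndex * chunkBytes;
  const dataEnd = dataOffset + dataBytes;                // exclusive
  if (start >= dataEnd) return null;                     // past end of audio
  const end = Math.min(start + chunkBytes, dataEnd) - 1; // Range end is inclusive
  return `bytes=${start}-${end}`;
}

const fmt = { dataOffset: 44, dataBytes: 4000000, sampleRate: 44100, channels: 2, bytesPerSample: 2, chunkSeconds: 10 };
console.log(chunkRange(0, fmt)); // bytes=44-1764043
console.log(chunkRange(2, fmt)); // bytes=3528044-4000043 (short final chunk)
console.log(chunkRange(3, fmt)); // null
```

In the browser the value would be passed as fetch(url, { headers: { Range: value } }), and a server that honors it responds with 206 Partial Content. Every range starts on a frame boundary because chunkBytes is a multiple of blockAlign.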
Implementation scope
Two files, no backend changes:
- New static/js/chunkedAudioEngine.js - same interface as createAudioEngine so the rest of the app needs no changes
  - WAV header parser (Range fetch of first 1024 bytes)
  - PCM deinterleave to AudioBuffer (typed array fast path for stereo 16-bit)
  - Fetch + decode chunks in parallel across all stems
  - RAF-driven lookahead scheduler (pre-fetches next chunk while current plays)
  - Clean seek: flush scheduled nodes, fetch from new position
- static/mobile/app.js - three-line change in openTrack():
  - Import createChunkedAudioEngine instead of createAudioEngine
  - Use WAV URLs (remove the .replace(.wav -> .mp3) call)
  - Remove the 600 MB guard block
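For the header parser and the deinterleave step, a minimal sketch might look like the following. It assumes a standard little-endian RIFF/WAVE layout and a little-endian host (Int16Array uses platform byte order); parseWavHeader and deinterleaveStereo16 are hypothetical names, not existing functions in the codebase. Walking the chunk list rather than hard-coding offset 44 allows for extra chunks before the data chunk, which is presumably why the proposal fetches 1024 bytes.

```javascript
// Sketch: locate the "fmt " and "data" chunks in the first bytes of a WAV file.
function parseWavHeader(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const tag = (o) => String.fromCharCode(
    view.getUint8(o), view.getUint8(o + 1), view.getUint8(o + 2), view.getUint8(o + 3));
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE") throw new Error("not a WAV file");
  let fmt = null;
  let offset = 12;
  while (offset + 8 <= view.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      if (body + 16 > view.byteLength) break;
      fmt = {
        channels: view.getUint16(body + 2, true),
        sampleRate: view.getUint32(body + 4, true),
        bitsPerSample: view.getUint16(body + 14, true),
      };
    } else if (id === "data") {
      if (!fmt) throw new Error("data chunk before fmt chunk");
      return { ...fmt, dataOffset: body, dataBytes: size };
    }
    offset = body + size + (size % 2); // RIFF chunks are padded to even length
  }
  throw new Error("data chunk not found in fetched header bytes");
}

// Sketch: split interleaved 16-bit stereo PCM into two Float32Arrays.
function deinterleaveStereo16(arrayBuffer) {
  const frames = Math.floor(arrayBuffer.byteLength / 4); // 4 bytes per stereo frame
  const pcm = new Int16Array(arrayBuffer, 0, frames * 2);
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0, j = 0; i < frames; i++, j += 2) {
    left[i] = pcm[j] / 32768;      // scale to [-1, 1)
    right[i] = pcm[j + 1] / 32768;
  }
  return { left, right, frames };
}

// Demo on a synthetic 44-byte header followed by two stereo frames.
const buf = new ArrayBuffer(44 + 8);
const v = new DataView(buf);
const put = (o, s) => [...s].forEach((c, i) => v.setUint8(o + i, c.charCodeAt(0)));
put(0, "RIFF"); v.setUint32(4, 36 + 8, true); put(8, "WAVE");
put(12, "fmt "); v.setUint32(16, 16, true);
v.setUint16(20, 1, true); v.setUint16(22, 2, true);        // PCM, stereo
v.setUint32(24, 44100, true); v.setUint32(28, 176400, true);
v.setUint16(32, 4, true); v.setUint16(34, 16, true);       // blockAlign, bits
put(36, "data"); v.setUint32(40, 8, true);
[16384, -16384, 32767, -32768].forEach((s, i) => v.setInt16(44 + i * 2, s, true));

const header = parseWavHeader(buf);
const decoded = deinterleaveStereo16(
  buf.slice(header.dataOffset, header.dataOffset + header.dataBytes));
console.log(header, decoded.left, decoded.right);
```

In the browser, the two arrays would then be written into an AudioBuffer with copyToChannel(left, 0) and copyToChannel(right, 1).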
Complexity estimate
3-5 days for a solid implementation. The main sources of complexity are the RAF-driven (requestAnimationFrame) scheduler, multi-stem coordination (every stem's chunk N must be ready before any stem starts playing it), and seeking to a position whose chunk is not yet cached.
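The three hard parts named above mostly reduce to timing arithmetic on the AudioContext clock. The sketch below is one possible shape for that logic, written as pure functions so it can be tested outside a browser. All names (locate, chunkSchedule, nextAction, the state fields) are hypothetical, and the real engine may be structured quite differently.

```javascript
// Sketch of the timing math behind a lookahead scheduler. Names are illustrative.
const CHUNK_SECONDS = 10;

// Where a seek lands: which chunk to fetch and how far into it playback begins.
function locate(positionSeconds) {
  const chunkIndex = Math.floor(positionSeconds / CHUNK_SECONDS);
  return { chunkIndex, offsetInChunk: positionSeconds - chunkIndex * CHUNK_SECONDS };
}

// Arguments for AudioBufferSourceNode.start(when, offset) so that chunks butt
// up against each other. anchorTime is the context time at which playback
// (re)started from track position anchorPosition.
function chunkSchedule(chunkIndex, anchorTime, anchorPosition) {
  const ideal = anchorTime + chunkIndex * CHUNK_SECONDS - anchorPosition;
  return ideal >= anchorTime
    ? { when: ideal, offset: 0 }
    : { when: anchorTime, offset: anchorTime - ideal }; // first chunk after a seek
}

// Called once per animation frame. A chunk is only scheduled when every stem
// has it decoded, and only once its start time is within the lookahead window.
function nextAction(state, now) {
  const { nextChunk, anchorTime, anchorPosition, readyByStem, lookahead } = state;
  const allReady = readyByStem.every((ready) => ready.has(nextChunk));
  if (!allReady) return { type: "wait-for-stems", chunk: nextChunk };
  const { when, offset } = chunkSchedule(nextChunk, anchorTime, anchorPosition);
  if (when - now > lookahead) return { type: "idle" };
  return { type: "schedule", chunk: nextChunk, when, offset };
}

// Example: user seeks to 25 s while the context clock reads 100 s.
console.log(locate(25));                 // { chunkIndex: 2, offsetInChunk: 5 }
console.log(chunkSchedule(2, 100, 25));  // { when: 100, offset: 5 }
console.log(chunkSchedule(3, 100, 25));  // { when: 105, offset: 0 }
```

Keeping every start time derived from a single anchor, rather than from the previous node's end time, avoids accumulating drift across chunks. A seek then amounts to stopping the scheduled nodes and resetting anchorTime and anchorPosition.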
Related
Discussed in PR #235 (mobile load OOM fixes). This is the follow-up improvement once the immediate OOM crash fix is merged.