Fenix enrollment integration test is flaky (empty log-state read + transient SDK download corruption) #15845

@jaredlockhart

Description

The Fenix Enrollment Integration Test workflow (.github/workflows/fenix-integration-test.yml, test experimenter/tests/integration/nimbus/android/test_fenix_enrollment.py) fails intermittently, mostly on the beta channel. There are two independent flakes:

1. AssertionError: No log-state row found ... after 60s

The test enrolls (nimbus-cli enroll --reset-app, launch 1), sleeps 15s, then runs nimbus-cli log-state once (launch 2) and polls adb logcat -d for 60s. On a loaded CI emulator, the enroll launch's applyPendingExperiments may not persist within 15s, in which case the single log-state launch reads an empty DB. The poll loop only re-dumps the logcat ring buffer and never re-issues log-state, so once that one read comes back empty nothing can recover it: the app won't re-print state without being re-launched.

Proposed fix: move the log-state invocation inside the poll loop — clear logcat, relaunch log-state (which re-runs applyPendingExperiments on startup), wait, dump, match — and raise the timeout to match observed apply latency. This mirrors the iOS sibling test, which polls a persistent log for 180s.
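A rough sketch of the reworked poll loop, to make the proposal concrete. The helper name poll_log_state, the settle delay, the 180s default, and matching on the experiment slug are all assumptions for illustration; the real test's helpers and the exact nimbus-cli arguments may differ, so the log-state command is passed in rather than spelled out here.

```python
import re
import subprocess
import time
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cmd(args: Sequence[str]) -> str:
    """Run a command and return its stdout (hypothetical helper, not the test's own)."""
    return subprocess.run(args, capture_output=True, text=True, check=False).stdout


def poll_log_state(
    slug: str,
    log_state_cmd: Sequence[str],  # assumed: the nimbus-cli log-state argv used by the test
    run: Runner = run_cmd,
    timeout: float = 180.0,        # assumed value, borrowed from the iOS sibling test
    settle: float = 5.0,           # assumed wait for the app to print its state
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Re-issue log-state on every iteration until a row mentioning slug appears."""
    pattern = re.compile(re.escape(slug))
    deadline = clock() + timeout
    attempts = 0
    while clock() < deadline:
        attempts += 1
        run(["adb", "logcat", "-c"])         # clear the ring buffer so stale rows can't match
        run(list(log_state_cmd))             # relaunch: the app only prints state on launch
        sleep(settle)                        # give the app time to apply and log
        dump = run(["adb", "logcat", "-d"])  # dump whatever this launch printed
        for line in dump.splitlines():
            if pattern.search(line):
                return line
    raise AssertionError(
        f"No log-state row found for {slug!r} after {timeout:.0f}s ({attempts} attempts)"
    )
```

The runner, sleep, and clock are injectable only so the loop can be exercised without an emulator; in the test itself the defaults would be used.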

2. could not connect to TCP port 5554: Connection refused

Misleading message. The real failure is a corrupted SDK package download inside the reactivecircus/android-emulator-runner "Install Android SDK" step (Error reading Zip content from a SeekableByteChannel on the Android Emulator package / the google_apis;x86_64 system image). The emulator is never installed → never boots → the cleanup's adb emu kill reports connection-refused. The action does not retry the SDK download.

Proposed fix: add a step before the emulator-runner that pre-installs emulator + system-images;android-34;google_apis;x86_64 with a bash retry loop. Once the packages are validly installed, the action's own sdkmanager install no-ops and never re-downloads.
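The retry logic itself is small. Below it is sketched in Python purely for illustration (the actual workflow step would be the bash loop described above); install_with_retry, the attempt count, and the linear backoff are assumptions, not anything the action or sdkmanager provides.

```python
import subprocess
import time
from typing import Callable, Sequence

# Packages named in this issue; passed to sdkmanager by the caller.
SDK_PACKAGES = ["emulator", "system-images;android-34;google_apis;x86_64"]


def install_with_retry(
    cmd: Sequence[str],
    attempts: int = 3,    # assumed retry budget
    delay: float = 10.0,  # assumed base delay in seconds, grows linearly per attempt
    run: Callable[[Sequence[str]], int] = lambda c: subprocess.run(c, check=False).returncode,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cmd until it exits 0; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        if run(cmd) == 0:
            return attempt
        if attempt < attempts:
            sleep(delay * attempt)  # back off before retrying a failed download
    raise RuntimeError(f"{cmd[0]} failed after {attempts} attempts")
```

It would be called as install_with_retry(["sdkmanager", "--install", *SDK_PACKAGES]). Whether a failed attempt leaves a partial download that needs cleaning up before the retry is not covered here and would need checking on the runner.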

Acceptance criteria

  • log-state is re-issued on each poll iteration so a late-persisted enrollment still surfaces.
  • Android emulator SDK packages are pre-downloaded with retry so transient zip-corruption no longer fails the job.
  • No change to enrollment authority — Nimbus still owns assignment; this is read-reliability + CI-infra only.
