Add bounded retry and conservative failover chain #7

Description

@xodapi

Why

When one free model is temporarily unavailable, OpenCode/Droid sessions should not fail immediately if another configured model can safely handle the request.

Scope

  • Retry transient failures (429, 5xx, timeouts) with bounded exponential backoff.
  • Fail over to the next eligible model only when safe.
  • Treat streaming and non-idempotent request shapes carefully.
  • Record original and final model in privacy-safe metrics.
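
A minimal sketch of how the scope items above could fit together, not the project's actual implementation. All names here (`TransientError`, `AllModelsFailed`, `Outcome`, `backoff_delay`, `call_with_failover`, the `send` callable, and the `safe_to_failover` flag) are hypothetical, and the default delays and retry limits are placeholder assumptions. The caller is assumed to decide whether a request is safe to fail over (for example, a stream that has not yet emitted output).

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence


class TransientError(Exception):
    """Hypothetical marker for retryable failures (429, 5xx, timeout)."""


class AllModelsFailed(Exception):
    """Raised when every eligible model has exhausted its retries."""


@dataclass
class Outcome:
    original_model: str  # first model in the chain
    final_model: str     # model that actually served the request
    retries: int         # number of re-sends across the whole chain
    result: str


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff bounded by `cap` seconds (assumed defaults)."""
    return min(cap, base * (2 ** attempt))


def call_with_failover(
    models: Sequence[str],
    send: Callable[[str], str],
    *,
    max_retries: int = 2,
    safe_to_failover: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> Outcome:
    if not models:
        raise ValueError("at least one model is required")
    # When failover is unsafe, only the original model is ever tried.
    candidates = list(models) if safe_to_failover else list(models[:1])
    retries = 0
    # Each model is visited at most once, so the chain cannot loop.
    for model in candidates:
        for attempt in range(max_retries + 1):
            try:
                result = send(model)
                return Outcome(models[0], model, retries, result)
            except TransientError:
                if attempt == max_retries:
                    break  # retries exhausted; move on to the next model
                retries += 1
                sleep(backoff_delay(attempt))
    raise AllModelsFailed(f"tried {len(candidates)} model(s)")
```

In this sketch, any exception other than `TransientError` propagates immediately, which corresponds to the no-retry case. Adding random jitter to `backoff_delay` would be a reasonable refinement but is left out to keep the example deterministic.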

Acceptance criteria

  • Retry count and final model are visible in metrics without storing prompts/responses.
  • Failover does not loop indefinitely.
  • Tests cover success-after-retry, all-models-fail, and no-retry cases.
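
One way the metrics criterion could be met is to make the metric record itself incapable of carrying prompt or response text. The sketch below is an assumption about shape only; `RequestMetric`, `MetricsLog`, and the field names are hypothetical and not taken from the codebase.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RequestMetric:
    # Only routing facts are stored; there is deliberately no field
    # for prompt or response content.
    original_model: str
    final_model: str
    retry_count: int
    succeeded: bool


class MetricsLog:
    """In-memory collector (hypothetical) for per-request routing metrics."""

    def __init__(self) -> None:
        self._events: List[RequestMetric] = []

    def record(self, metric: RequestMetric) -> None:
        self._events.append(metric)

    def summary(self) -> Dict[str, object]:
        # A failover is any request served by a model other than the first.
        failovers = sum(
            1 for m in self._events if m.final_model != m.original_model
        )
        return {
            "requests": len(self._events),
            "total_retries": sum(m.retry_count for m in self._events),
            "failovers": failovers,
            "by_final_model": dict(Counter(m.final_model for m in self._events)),
        }
```

A test for the success-after-retry case could then assert on `summary()` rather than inspecting logs, which keeps prompts and responses out of the test fixtures as well.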

Metadata

Assignees

No one assigned

    Labels

      • area: reliability (Runtime reliability, retries, failover, watchdogs)
      • enhancement (New feature or request)
      • priority: p1 (High priority)
