Migration runner leaks idle-in-transaction sessions on postgres (v3.13.9)

## Summary

Tina4 v3.13's migration runner opens a transaction for the migration-tracking lookup queries and never closes/commits it. With `psycopg2` (autocommit off by default) the connection is then left in `idle in transaction` state indefinitely. Over time these leak across pod restarts and eventually exhaust the postgres connection pool — at which point new boots fail with:

```
FATAL: remaining connection slots are reserved for roles with the SUPERUSER attribute
```

Tina4's autodiscovery opens a fresh connection per `src/app/*.py` it loads, so when the pool is saturated every module load fails and `src/routes/*` never register. The HTTP server still answers `/health-check` (probe passes → pod marked Ready), but every actual app route returns 404 — a silent "ready but broken" state.

## Versions

- `tina4_python` **3.13.9**
- `psycopg2` (postgres driver, transactions=on by default)
- Python 3.13 / Alpine
- PostgreSQL 14 / 16 (reproduced on both)

## Reproduction

1. Boot a Tina4 v3.13 app against a postgres DB
2. Look at `pg_stat_activity` shortly after startup:

```sql
SELECT pid, state, age(now(), xact_start) AS tx_age, substr(query, 1, 80)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND application_name = ''  -- the app
ORDER BY backend_start;
```

You'll see at least one session like:

```
pid       state                last_query                                                  tx_age
2371466   idle in transaction  SELECT MAX(batch) as max_batch FROM tina4_migration         8m
```

The transaction stays open for the lifetime of the connection (we've observed sessions sitting in this state for **2+ days**, eventually being killed by `idle_in_transaction_session_timeout`).

## Smoking gun in source

Two unwrapped reads in [`tina4_python/migration/runner.py`](tina4_python/migration/runner.py):

- Line 297 — `_get_next_batch`:
  ```python
  row = db.fetch_one("SELECT MAX(batch) as max_batch FROM tina4_migration")
  ```
- Line 491:
  ```python
  row = db.fetch_one("SELECT MAX(batch) as max_batch FROM tina4_migration WHERE passed = 1")
  ```

Neither path commits or rolls back, and the `db.fetch_one` helper appears to start an implicit transaction (psycopg2 default). The transaction therefore sits open holding row-level locks on `tina4_migration` and a connection slot.

## Impact

- Each fresh pod start leaks (at minimum) one `idle in transaction` connection.
- On a busy cluster with frequent restarts, the pool fills in days. We hit `FATAL: remaining connection slots are reserved` after ~89 restarts over 38 hours on a postgres with `max_connections=100`.
- Symptoms are confusing because the pod looks healthy: liveness/readiness pass (just `/health-check`), but actual app routes 404 because the module autodiscovery silently failed during boot when connections were saturated.

## Workaround we're using

Set a per-role timeout so postgres auto-kills the zombies:

```sql
ALTER ROLE <app_role> SET idle_in_transaction_session_timeout = '300s';
```

This stops the leak from saturating the pool, but doesn't fix the underlying transaction-handling bug.

## Suggested fix

Wrap the read-only lookups in explicit `with db.transaction(): ...` (or commit immediately), or run them in autocommit mode since they're plain SELECTs that don't need transactional isolation. Same treatment likely needed in the analogous spots in `tina4_python/database/` where helper-style `db.fetch_one`/`db.fetch_all` start implicit transactions.

Happy to send a PR if the team confirms the intended fix shape.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migration runner leaks idle-in-transaction sessions on postgres (v3.13.9) #51

Summary

Versions

Reproduction

Smoking gun in source

Impact

Workaround we're using

Suggested fix

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Migration runner leaks idle-in-transaction sessions on postgres (v3.13.9) #51

Description

Summary

Versions

Reproduction

Smoking gun in source

Impact

Workaround we're using

Suggested fix

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions