get_simulations drops LIMIT when max_results is None and interpolates into SQL #3451

Description

@MaxGhenis

Summary

get_simulations builds its ORDER BY ... LIMIT clause with f-string interpolation rather than query parameters. When max_results is None, the entire DESC LIMIT clause is dropped and the handler returns the whole reform_impact table.

Location

policyengine_api/endpoints/simulation.py:29-33

What goes wrong

def get_simulations(
    max_results: int = 100,
):
    desc_limit = f"DESC LIMIT {max_results}" if max_results is not None else ""

    result = local_database.query(
        f"SELECT * FROM reform_impact ORDER BY start_time {desc_limit}",
    ).fetchall()

Two problems:

  1. If any caller passes max_results=None, the query degrades to SELECT * FROM reform_impact ORDER BY start_time, which loads the entire table into memory. Because DESC sits in the same interpolated fragment, it is dropped too, so the rows also come back oldest-first instead of newest-first. The reform_impact table grows without bound, so this is a latent DoS.
  2. max_results is interpolated directly into the SQL string. The signature annotates it as int, but type hints are not enforced at runtime. If this function ever gains a call path that forwards an HTTP query parameter, it becomes an injection point.
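Both problems can be reproduced outside the API. The sketch below is illustrative only: it uses an in-memory sqlite3 database as a stand-in for local_database, a two-column reform_impact table invented for the demo, and a hypothetical build_query helper that mirrors the f-string construction quoted above.

```python
import sqlite3


def build_query(max_results):
    # Hypothetical helper that mirrors the f-string logic from the issue.
    desc_limit = f"DESC LIMIT {max_results}" if max_results is not None else ""
    return f"SELECT * FROM reform_impact ORDER BY start_time {desc_limit}"


# Stand-in for local_database: an in-memory table with 500 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reform_impact (id INTEGER, start_time INTEGER)")
conn.executemany(
    "INSERT INTO reform_impact VALUES (?, ?)",
    [(i, i) for i in range(500)],
)

bounded = conn.execute(build_query(100)).fetchall()
unbounded = conn.execute(build_query(None)).fetchall()

# Problem 1: with None, every row is returned and the sort order flips.
print(len(bounded), bounded[0])      # 100 rows, newest first
print(len(unbounded), unbounded[0])  # all 500 rows, oldest first

# Problem 2: a non-int value ends up verbatim in the SQL text.
# The string is only built here, never executed.
injected = build_query("1; DROP TABLE reform_impact")
print(injected)
```

Whether the injected string would actually execute depends on the database driver, which is why the sketch only prints it.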

Suggested fix

Always apply a bounded LIMIT and pass it as a parameter:

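def get_simulations(max_results: int | None = 100) -> dict:
    # Use the default only for None, then clamp to the range 1..1000.
    # Some engines (e.g. SQLite) treat a negative LIMIT as "no limit",
    # so the lower bound matters as much as the upper one.
    limit = 100 if max_results is None else max(1, min(int(max_results), 1000))
    result = local_database.query(
        "SELECT * FROM reform_impact ORDER BY start_time DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return {"result": [dict(r) for r in result]}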

Severity

Medium.

Labels: bug, security