SQLAlchemy reflection (autoload_with) does not support S3 Tables three-part identifiers #728

Description

@laughingman7743

Summary

Follow-up from #710 / #727. PR #727 adds S3 Tables support for DDL emission (three-part catalog.namespace.table identifiers, and omitting LOCATION for managed storage). It does not cover reflection: loading an S3 Tables table through SQLAlchemy with autoload_with and MetaData(schema="s3tablescatalog/<bucket>.<namespace>") does not work.
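A minimal reproduction sketch of the reflection call in question. The helper names (s3tables_schema, reflect_s3tables_table) and the example bucket, namespace, and table values are hypothetical; the engine is assumed to be a SQLAlchemy engine already configured for the PyAthena dialect.

```python
def s3tables_schema(bucket: str, namespace: str) -> str:
    """Build the schema string used for S3 Tables in this issue."""
    return f"s3tablescatalog/{bucket}.{namespace}"


def reflect_s3tables_table(engine, bucket: str, namespace: str, table_name: str):
    """Attempt to reflect an S3 Tables table through SQLAlchemy.

    `engine` is assumed to be an Engine created for the PyAthena dialect.
    """
    # Imported lazily so this sketch can be loaded without SQLAlchemy installed.
    from sqlalchemy import MetaData, Table

    metadata = MetaData(schema=s3tables_schema(bucket, namespace))
    # autoload_with triggers the dialect's reflection path (get_columns etc.).
    return Table(table_name, metadata, autoload_with=engine)


# Hypothetical values, for illustration only.
print(s3tables_schema("my-bucket", "analytics"))
# s3tablescatalog/my-bucket.analytics
```

Calling reflect_s3tables_table(engine, "my-bucket", "analytics", "events") against a real S3 Tables table is the case reported here as not working.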

Root cause

The dialect's introspection path passes the table's schema straight through as the Athena database name and never splits the catalog from the namespace. In a three-part S3 Tables identifier, the catalog (s3tablescatalog/<bucket>) and the namespace are separate components, but get_columns / _get_table call cursor.get_table_metadata(table_name, schema_name=schema) with the whole dotted string, while the catalog remains the connection default (AwsDataCatalog). The lookup therefore targets the wrong catalog and database, and the table is not found.
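The mismatch can be illustrated without touching Athena. The sketch below uses a hypothetical RecordingCursor stand-in (not PyAthena code) that records which catalog, database, and table a lookup would target; lookup_as_described only mirrors the call shape quoted above, and the bucket, namespace, and table names are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class RecordingCursor:
    """Hypothetical stand-in for a cursor: records lookups, never calls Athena."""

    default_catalog: str = "AwsDataCatalog"
    calls: List[Tuple[str, str, str]] = field(default_factory=list)

    def get_table_metadata(
        self, table_name: str, schema_name: Optional[str] = None
    ) -> None:
        # The caller supplies no catalog, so the connection default applies.
        self.calls.append((self.default_catalog, schema_name or "", table_name))


def lookup_as_described(cursor: RecordingCursor, table_name: str, schema: str) -> None:
    # Mirrors the call shape quoted in this issue: the schema is forwarded unsplit.
    cursor.get_table_metadata(table_name, schema_name=schema)


cursor = RecordingCursor()
lookup_as_described(cursor, "events", "s3tablescatalog/my-bucket.analytics")
catalog, database, table = cursor.calls[0]
print(catalog)   # AwsDataCatalog
print(database)  # s3tablescatalog/my-bucket.analytics
```

The recorded lookup pairs the default catalog with a database name that still contains the S3 Tables catalog prefix, which is the combination described above as not resolving.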

Relevant code: pyathena/sqlalchemy/base.py (_get_table, get_columns, get_table_names, has_table, ...).

Scope to investigate

  1. Decide how a reflected S3 Tables table should be addressed — split the dotted schema into (catalog, database) and pass the catalog through to get_table_metadata / list_table_metadata / list_databases.
  2. Make get_columns, _get_table, has_table, get_table_names, and get_view_names catalog-aware for s3tablescatalog/<bucket>.<namespace> schemas.
  3. Add reflection round-trip coverage to the gated S3 Tables E2E tests in tests/pyathena/sqlalchemy/test_base.py (the current test_create_s3tables_iceberg_table verifies creation via a raw SELECT because reflection is unsupported).
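One possible shape for the split in item 1, as a sketch rather than a proposed patch. split_s3tables_schema and S3TABLES_PREFIX are hypothetical names, and the sketch assumes that neither the bucket nor the namespace contains a dot, so a single split is unambiguous.

```python
from typing import Optional, Tuple

# Prefix taken from the schema form used in this issue.
S3TABLES_PREFIX = "s3tablescatalog/"


def split_s3tables_schema(
    schema: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Split 's3tablescatalog/<bucket>.<namespace>' into (catalog, database).

    Any other schema is returned as (None, schema), so the connection's
    default catalog would keep applying.
    """
    if not schema or not schema.startswith(S3TABLES_PREFIX):
        return None, schema
    # Assumption: bucket and namespace contain no dots, so split on the last one.
    catalog, sep, database = schema.rpartition(".")
    if not sep or not catalog or not database:
        # No namespace part present: leave the schema untouched.
        return None, schema
    return catalog, database


print(split_s3tables_schema("s3tablescatalog/my-bucket.analytics"))
# ('s3tablescatalog/my-bucket', 'analytics')
print(split_s3tables_schema("default"))
# (None, 'default')
```

The reflection methods in item 2 could then forward the first element as the catalog and the second as the database. Whether get_table_metadata, list_table_metadata, and list_databases all accept a catalog argument in the form needed should be confirmed against the cursor API before relying on this.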
