Summary
Follow-up from #710 / #727. PR #727 adds S3 Tables support for DDL emission (three-part catalog.namespace.table identifiers and omitting LOCATION for managed storage). It does not cover reflection: reflecting an S3 Tables table via SQLAlchemy autoload_with / MetaData(schema="s3tablescatalog/<bucket>.<namespace>") does not work.
Root cause
The dialect's introspection path passes the table's schema straight through as the Athena database name and never splits the catalog from the namespace. For a three-part S3 Tables identifier the catalog (s3tablescatalog/<bucket>) and namespace are different things, but get_columns / _get_table call cursor.get_table_metadata(table_name, schema_name=schema) with the whole dotted string, while the catalog stays the connection default (AwsDataCatalog). So the lookup targets the wrong catalog/database and the table is not found.
Relevant code: pyathena/sqlalchemy/base.py (_get_table, get_columns, get_table_names, has_table, ...).
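To make the failure mode concrete, here is a minimal sketch that does not touch AWS or PyAthena. `FAKE_METADATA`, `fake_get_table_metadata`, the bucket `my-bucket`, the namespace `my_namespace`, and the table `events` are all hypothetical stand-ins; only the shape of the lookup (table name, schema name, and a catalog that defaults to `AwsDataCatalog`) follows the description above.

```python
# Hypothetical stand-in for the Athena metadata API, for illustration only.
# Tables are keyed by (catalog, database, table).
FAKE_METADATA = {
    ("s3tablescatalog/my-bucket", "my_namespace", "events"): ["id", "ts"],
}

DEFAULT_CATALOG = "AwsDataCatalog"


def fake_get_table_metadata(table_name, schema_name, catalog_name=DEFAULT_CATALOG):
    """Return the column names for a table; raise KeyError if the address is wrong."""
    return FAKE_METADATA[(catalog_name, schema_name, table_name)]


schema = "s3tablescatalog/my-bucket.my_namespace"

# Behavior described above: the whole dotted string is used as the database
# and the catalog stays at the connection default, so the lookup misses.
try:
    fake_get_table_metadata("events", schema_name=schema)
    found_with_unsplit_schema = True
except KeyError:
    found_with_unsplit_schema = False

# Addressing the same table with catalog and namespace passed separately succeeds.
columns = fake_get_table_metadata(
    "events",
    schema_name="my_namespace",
    catalog_name="s3tablescatalog/my-bucket",
)
```

In this toy model the unsplit lookup fails only because the key does not match; the real error surfaced by Athena may look different.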
Scope to investigate
- Decide how a reflected S3 Tables table should be addressed: split the dotted schema into (catalog, database) and pass the catalog through to get_table_metadata / list_table_metadata / list_databases.
- Make get_columns, _get_table, has_table, get_table_names, and get_view_names catalog-aware for s3tablescatalog/<bucket>.<namespace> schemas.
- Add reflection round-trip coverage to the gated S3 Tables E2E tests in tests/pyathena/sqlalchemy/test_base.py (the current test_create_s3tables_iceberg_table verifies creation via a raw SELECT because reflection is unsupported).
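One possible shape for the split in the first item, as a sketch rather than a proposed final implementation. `split_schema` and `S3TABLES_PREFIX` are hypothetical names that do not exist in PyAthena. The sketch assumes the namespace is everything after the last dot and that only schemas starting with `s3tablescatalog/` should be split, so ordinary database names keep using the connection's default catalog.

```python
from typing import Optional, Tuple

# Assumed prefix that marks an S3 Tables catalog in the schema string.
S3TABLES_PREFIX = "s3tablescatalog/"


def split_schema(schema: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (catalog, database) for a schema string.

    Hypothetical helper. A catalog of None means "use the connection default".
    """
    if schema and schema.startswith(S3TABLES_PREFIX) and "." in schema:
        # Assumption: the namespace follows the last dot, the catalog precedes it.
        catalog, database = schema.rsplit(".", 1)
        return catalog, database
    # Ordinary schemas (or None) pass through unchanged.
    return None, schema
```

The introspection methods could then call this helper once and forward the catalog to the cursor's metadata calls, assuming those calls accept a catalog argument as the first item suggests. Whether splitting on the last dot is always safe depends on the naming rules for buckets and namespaces, which should be confirmed before relying on it.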
References