Skip to content

[GOBBLIN-XXXX] Avoid orphaned JDBC staging tables by checking DROP privilege up front#4199

Draft
DaisyModi wants to merge 3 commits into
apache:masterfrom
DaisyModi:dmodi/jdbc-staging-droppable-check
Draft

[GOBBLIN-XXXX] Avoid orphaned JDBC staging tables by checking DROP privilege up front#4199
DaisyModi wants to merge 3 commits into
apache:masterfrom
DaisyModi:dmodi/jdbc-staging-droppable-check

Conversation

@DaisyModi

Copy link
Copy Markdown
Contributor

Summary

  • JdbcWriterInitializer previously verified staging-table droppability by creating a table, dropping it, and recreating it as a probe, inside a 10-attempt loop. On an account with CREATE but not DROP, each attempt created a stage_<nanoTime> table it couldn't drop — leaving up to 10 orphaned staging tables per run.
  • Replace the probe with an up-front check: new JdbcWriterCommands.hasDropPrivilege(database) (default true; MySQL overrides via SHOW GRANTS FOR CURRENT_USER(), failing open on error/inconclusive parse). If DROP is positively absent, fail fast before creating anything.
  • Keep create-failure retries, but clean up per attempt: on a create failure, best-effort drop that attempt's table before retrying, so a transient failure never leaves an orphan behind.
  • Scope: fixes the missing-DROP-privilege path. Orphans from runs killed mid-write (timeout/SIGKILL) bypass cleanup and need a separate out-of-band sweep.

Testing Done

  • MySqlWriterCommandsTest covers the SHOW GRANTS parse (ALL PRIVILEGES, db.* / *.*, no-DROP, USAGE, different-db, DROP-in-table-name, null).
  • JdbcWriterInitializerTest: no-DROP account fails fast and creates nothing; create-once happy path; retry-then-succeed drops the failed attempt's table (no orphan).
  • Not run locally (Gradle/JDK toolchain in my environment); relying on CI. The grant-string parse is heuristic and does not cover DROP granted via inactive roles — hence fail-open.

Draft pending CI verification and a real GOBBLIN issue id.

Daisy Modi and others added 3 commits June 11, 2026 21:29
…ivilege up front

JdbcWriterInitializer tested whether staging tables were droppable by creating a
table and dropping it as a probe, inside a 10-iteration retry loop. On an account
with CREATE but not DROP, each attempt created a stage_<nanoTime> table that could
not be dropped, leaving up to NAMING_STAGING_TABLE_TRIAL (10) orphaned tables per
run.

Instead, check DROP privilege up front via a new JdbcWriterCommands.hasDropPrivilege()
and fail fast before creating anything when it is positively absent. The MySQL impl
inspects SHOW GRANTS FOR CURRENT_USER() and fails open (returns true) on any error or
inconclusive parse, so a check failure never blocks a legitimately-permitted run.
Other dialects inherit a default that returns true.

createStagingTable now also creates the staging table exactly once (no drop/recreate
probe) and only retries on a genuine name collision; a real create failure propagates
instead of spinning and leaking tables.

Note: this addresses staging-table orphans from the missing-DROP-privilege path only.
Orphans from runs killed mid-write (e.g. connection/timeout, SIGKILL) bypass cleanup
entirely and require a separate out-of-band sweep.

Tests: add MySqlWriterCommandsTest for the SHOW GRANTS parse, and a
JdbcWriterInitializer test asserting a no-DROP account fails fast and creates nothing;
update the existing staging-table test for the create-once behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ailing outright

Restore the create-failure retry that the previous commit dropped, but make each
failed attempt clean up after itself: on a SQLException from createTableStructure,
best-effort drop that attempt's table before retrying, so a transient failure never
leaves an orphan behind. DROP privilege is verified up front, so the cleanup drop
should succeed; if the table was never created the drop is a harmless no-op.

Add a test asserting that when the first create attempt fails and the second
succeeds, two create attempts are made and the failed attempt's table is dropped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cleanup

Make MySqlWriterCommands.drop issue DROP TABLE IF EXISTS so dropping a table that
does not exist is a no-op rather than an error. This lets the staging-table create
retry path clean up its attempt with a plain drop() (no best-effort try/catch needed):
if a failed create never actually created the table, the cleanup drop simply does
nothing. All MySQL drops (including close() cleanup) become idempotent as a result.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant