Spark: Fail fast when metadata file location is null in BaseSparkAction #16846

Open
bobmerevel wants to merge 2 commits into apache:main from bobmerevel:spark-fail-fast-null-metadata-location

@bobmerevel

What changes are proposed in this pull request?

This PR adds a fail-fast validation in BaseSparkAction.newStaticTable() to ensure that TableMetadata.metadataFileLocation() is not null before constructing a BaseTable.

Currently, if the metadata file location is null, the Spark action proceeds further into execution and eventually fails with a NullPointerException, making the actual root cause difficult to diagnose.

The new validation makes this precondition explicit and produces a more meaningful error message.
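As a rough illustration of the fail-fast idea described above, here is a minimal, self-contained sketch. It does not use Iceberg classes: `MetadataLike` and `requireMetadataLocation` are hypothetical stand-ins, and the exception type and message are assumptions, since the exact `Preconditions` call used in the PR is not shown here.

```java
class FailFastSketch {
  // Hypothetical stand-in for the one accessor of TableMetadata that matters here.
  interface MetadataLike {
    String metadataFileLocation();
  }

  // Checks the precondition up front and returns the location when it is present.
  static String requireMetadataLocation(MetadataLike metadata) {
    String location = metadata.metadataFileLocation();
    if (location == null) {
      // Assumption: exception type and message are illustrative only.
      throw new IllegalArgumentException(
          "Cannot create static table: metadata file location is null");
    }
    return location;
  }

  public static void main(String[] args) {
    // Valid metadata passes through unchanged.
    System.out.println(requireMetadataLocation(() -> "file:/tmp/v1.metadata.json"));

    // Invalid metadata is rejected immediately, before any further work starts.
    try {
      requireMetadataLocation(() -> null);
    } catch (IllegalArgumentException e) {
      System.out.println("Failed fast: " + e.getMessage());
    }
  }
}
```

The point of the pattern is that the failure names the violated precondition at the place where it is first observable, rather than surfacing as an unexplained NullPointerException deeper in the job.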

Why are the changes needed?

This improves diagnostics for invalid TableMetadata instances and prevents failures from surfacing much later during Spark job execution.

The intent is not to support invalid catalog implementations, but to fail earlier with a clearer message when the required precondition is violated.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a unit test covering the case where metadataFileLocation() is null and verified that newStaticTable() fails with an explicit exception.

AI assistance disclosure:

I used AI assistance while drafting the PR description and exploring possible test implementations. The final implementation and testing were reviewed and verified manually.

Closes #16838

- Adds a `Preconditions` check in `BaseSparkAction.newStaticTable()` that requires `metadataFileLocation()` to return a non-null value before the static table is created.
- Adds a test class, `TestBaseSparkAction`, with a single test method, `testNewStaticTableFailsWhenMetadataLocationIsNull`, which verifies that `newStaticTable()` fails with an explicit exception when the metadata file location is null.
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

public class TestBaseSparkAction extends TestBase {

Member

Could you follow https://iceberg.apache.org/contribute/#conventions-and-recommendations ?

  • Omit the public modifier for test classes, test methods, and lifecycle methods for newly added tests.
  • Omit the test prefix for newly added test methods.

Development

Successfully merging this pull request may close these issues.

ExpireSnapshotsSparkAction fails with NullPointerException when TableMetadata.metadataFileLocation() is null

2 participants