[SPARK-57525][CONNECT] Declarative Pipelines should not throw NoSuchElementException when a run fails without an attached cause #56594

Open
LuciferYang wants to merge 1 commit into apache:master from LuciferYang:sdp-run-failure-no-cause
@LuciferYang

Contributor

What changes were proposed in this pull request?

PipelinesHandler.startRun rethrows a failed pipeline run to the Spark Connect client via runFailureEvent.foreach { event => throw event.error.get }. But event.error is None for run termination reasons that carry no cause (UnexpectedRunFailure and FailureStoppingFlow both have cause = None), so event.error.get raises a NoSuchElementException, which crashes the handler and hides the real failure from the client.
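The failure mode can be reproduced outside Spark with plain Scala. The sketch below is illustrative only: `FailureEvent` and `rethrowUnchecked` are hypothetical stand-ins, not the real Spark types, and only mirror the `foreach { event => throw event.error.get }` pattern described above.

```scala
// Hypothetical stand-in for the run-failure event; not the real Spark class.
final case class FailureEvent(message: String, error: Option[Throwable])

object NoCauseRepro {
  // Mirrors the original pattern: unconditionally unwrap the optional cause.
  // When `error` is None, Option.get itself throws NoSuchElementException,
  // so the termination message in `message` never reaches the caller.
  def rethrowUnchecked(event: Option[FailureEvent]): Unit =
    event.foreach { e => throw e.error.get }
}
```

Calling `NoCauseRepro.rethrowUnchecked(Some(FailureEvent("Run failed unexpectedly.", None)))` fails with a `NoSuchElementException` whose message is `None.get`, and the termination message is lost.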

This PR extracts the rethrow into throwRunFailure: when the failure has a cause it is rethrown unchanged; when it does not, a SparkException with a new PIPELINE_RUN_FAILED error condition is thrown, carrying the run's termination message. PIPELINE_RUN_FAILED (rather than INTERNAL_ERROR) is used so that operational outcomes such as FailureStoppingFlow are not mislabeled as Spark bugs.
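A minimal sketch of the two branches, assuming simplified stand-in types: `RunFailureEvent` and `PipelineRunFailedException` below are hypothetical and only approximate the real event class and a SparkException carrying the PIPELINE_RUN_FAILED condition. The actual signature of throwRunFailure in the patch may differ.

```scala
// Hypothetical stand-in for the run-failure event; not the real Spark class.
final case class RunFailureEvent(message: String, error: Option[Throwable])

// Hypothetical stand-in for a SparkException tagged with an error condition.
final class PipelineRunFailedException(val condition: String, message: String)
  extends RuntimeException(s"[$condition] $message")

object RunFailure {
  // Rethrow the original cause unchanged when one is attached; otherwise
  // raise a dedicated error carrying the run's termination message, rather
  // than calling Option.get on None.
  def throwRunFailure(event: RunFailureEvent): Nothing = event.error match {
    case Some(cause) => throw cause
    case None =>
      throw new PipelineRunFailedException("PIPELINE_RUN_FAILED", event.message)
  }
}
```

Pattern matching on the `Option` makes both outcomes explicit, so a cause-less failure cannot fall through to `None.get`.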

Why are the changes needed?

A run that fails without an attached cause (e.g. UnexpectedRunFailure, or a flow that fails to stop) currently surfaces to the Connect client as an opaque NoSuchElementException ("None.get") instead of the actual run-failure message. That masks the real problem and looks like an internal error. These reasons reach this code via the asynchronous onCompletion path, where PipelineExecution.runPipeline's own catch never fires.

Does this PR introduce any user-facing change?

Yes. When a pipeline run fails without an attached cause, the Spark Connect client now receives a PIPELINE_RUN_FAILED error carrying the run's termination message (e.g. "Run failed unexpectedly.") instead of a NoSuchElementException.

How was this patch tested?

New PipelinesHandlerSuite unit-tests throwRunFailure for both cases: the cause-present case rethrows the original cause, and the no-cause case throws a PIPELINE_RUN_FAILED SparkException carrying the termination message (verified with checkError, using the real UnexpectedRunFailure and FailureStoppingFlow messages). The cause-less termination reasons cannot be triggered deterministically through the end-to-end run path, so the rethrow is unit-tested directly. SparkThrowableSuite validates the new error condition.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

…lementException when a run fails without an attached cause

When a pipeline run fails, PipelinesHandler.startRun rethrows the failure to the Spark Connect client via runFailureEvent.foreach { event => throw event.error.get }. But event.error is None for some run termination reasons - UnexpectedRunFailure and FailureStoppingFlow both have cause = None - so event.error.get raised a NoSuchElementException, crashing the handler with a meaningless internal error and hiding the real failure (e.g. "Run failed unexpectedly.") from the client. These reasons reach this code via the asynchronous onCompletion path, where PipelineExecution.runPipeline's own catch never fires.

Extract the rethrow into throwRunFailure: when a cause is present it is rethrown unchanged; when absent, throw a SparkException with a new PIPELINE_RUN_FAILED error condition carrying the run's termination message. PIPELINE_RUN_FAILED (rather than INTERNAL_ERROR) is used so operational outcomes such as FailureStoppingFlow are not mislabeled as Spark bugs.

Added PipelinesHandlerSuite. The cause-less termination reasons cannot be triggered deterministically through the end-to-end run path, so the rethrow is unit-tested directly using the real UnexpectedRunFailure and FailureStoppingFlow messages.
@LuciferYang LuciferYang force-pushed the sdp-run-failure-no-cause branch from 5d686df to c8e4062 Compare June 18, 2026 12:16