[SPARK-57531][SQL] Reuse OrcTail in non-vectorized ORC reader path by cxzl25 · Pull Request #56591 · apache/spark

cxzl25 · 2026-06-18T10:14:12Z

What changes were proposed in this pull request?

This PR mirrors what buildColumnarReader has done since SPARK-44556: it captures the readerOptions returned by createORCReader and passes readerOptions.getOrcTail when constructing the per-split record reader, so the footer is not re-read.

Why are the changes needed?

OrcPartitionReaderFactory.buildReader (the non-vectorized / row-based read path) previously called OrcInputFormat.createRecordReader(fileSplit, taskAttemptContext), which internally calls OrcFile.createReader without an OrcTail and therefore re-parses the file footer from storage on every split.
Because the footer has already been read once by createORCReader, the non-vectorized path pays that cost a second time when it opens the data reader for each split, whereas the vectorized path has avoided the extra read since SPARK-44556.
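
To make the saving concrete, here is a minimal toy model of the before/after behavior. It does not use Spark or ORC classes: `Storage`, `Tail`, `ReaderOptions`, `FileReader`, and `TailReuse` are hypothetical stand-ins, and the counter simply records how many times the footer is parsed per split.

```scala
// Toy model only: every name below is a hypothetical stand-in,
// not a Spark or ORC class.
final case class Tail(schema: String)

final class Storage {
  // Counts how many times a footer is parsed from "storage".
  var footerReads: Int = 0

  def readFooter(path: String): Tail = {
    footerReads += 1
    Tail(s"schema-of-$path")
  }
}

// Options that may carry an already-parsed tail.
final case class ReaderOptions(tail: Option[Tail] = None)

final class FileReader(storage: Storage, path: String, options: ReaderOptions) {
  // Parse the footer only when the caller did not supply a cached tail.
  val tail: Tail = options.tail.getOrElse(storage.readFooter(path))
}

object TailReuse {
  // Before: the metadata reader and the record reader each parse the footer.
  def openWithoutReuse(storage: Storage, path: String): FileReader = {
    val metaReader = new FileReader(storage, path, ReaderOptions())
    // (schema resolution would use metaReader.tail here)
    new FileReader(storage, path, ReaderOptions())
  }

  // After: the tail captured by the first reader is handed to the second.
  def openWithReuse(storage: Storage, path: String): FileReader = {
    val metaReader = new FileReader(storage, path, ReaderOptions())
    new FileReader(storage, path, ReaderOptions(Some(metaReader.tail)))
  }
}
```

In this sketch, `openWithoutReuse` triggers two footer reads per split and `openWithReuse` triggers one, which is the shape of the saving described above. The real change passes `readerOptions.getOrcTail` rather than a hand-rolled option.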

Does this PR introduce any user-facing change?

No

How was this patch tested?

GHA

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

dongjoon-hyun

Thank you. Could you update the DataSource v1 path, OrcFileFormat.scala, as well?

SPARK-57531

301b766

dongjoon-hyun requested changes Jun 18, 2026

cxzl25 wants to merge 1 commit into apache:master from cxzl25:SPARK-57531
