Migrate file-preview Parquet reader off @dsnp/parquetjs (thrift CVE chain) #9094

Description

@craxal

Problem

The file-preview component reads Parquet files using @dsnp/parquetjs, which transitively pulls in thrift@0.21.0.

@dsnp/parquetjs is updated infrequently (its latest release was about a year ago) and pins thrift to exactly 0.21.0, so normal version-range resolution cannot pull in a newer thrift. The thrift npm package itself is largely unmaintained at the JS layer. This violates our policy of consuming only dependencies that receive updates roughly every 6 months, and the outdated thrift package is a recurring source of dependency-hygiene noise that we cannot address without changing the Parquet reader.

Fix

Migrate file-preview's Parquet reading code from @dsnp/parquetjs to hyparquet:

  • Pure JavaScript, no native dependencies
  • No thrift dependency
  • Actively maintained (multiple releases per month as of 2026)
  • MIT-licensed

The API differs from @dsnp/parquetjs, so the migration is a code change rather than a drop-in swap. The affected file is src/components/file-preview/src/parseParquetBuffer.ts (and its tests). Verify behavior parity for all existing Parquet preview test fixtures, including decimal handling and timestamp formatting, and check how the change affects related open feature work (#7506 delta encoding, #7675 ZSTD support).
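
One possible way to check parity is to run each fixture through both readers and diff the decoded rows after normalizing value types that the two libraries may represent differently (for example bigint vs. string, or Date vs. ISO string). The sketch below is only an illustration: `normalizeCell` and `diffRows` are hypothetical helpers, not existing code in file-preview, and the specific normalizations are assumptions about where the readers might diverge.

```typescript
// Hypothetical parity helpers; none of these names exist in the repository.
type Cell = unknown;
type Row = Record<string, Cell>;

/** Convert reader-specific value types into a plain, comparable form. */
function normalizeCell(value: Cell): Cell {
  if (typeof value === "bigint") return value.toString();
  if (value instanceof Date) {
    return Number.isNaN(value.getTime()) ? "Invalid Date" : value.toISOString();
  }
  if (value instanceof Uint8Array) {
    // Binary values (also Node Buffers) are compared as lowercase hex.
    return Array.from(value, (b) => (b < 16 ? "0" : "") + b.toString(16)).join("");
  }
  if (Array.isArray(value)) return value.map(normalizeCell);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, Cell>;
    const out: Record<string, Cell> = {};
    // Sort keys so that property order does not affect the comparison.
    for (const key of Object.keys(source).sort()) {
      out[key] = normalizeCell(source[key]);
    }
    return out;
  }
  // Assumption: a missing column and an explicit null count as equal.
  return value ?? null;
}

/** Return one human-readable message per mismatching cell (empty = parity). */
function diffRows(expected: Row[], actual: Row[]): string[] {
  const diffs: string[] = [];
  if (expected.length !== actual.length) {
    diffs.push(`row count: expected ${expected.length}, got ${actual.length}`);
  }
  const rowCount = Math.min(expected.length, actual.length);
  for (let i = 0; i < rowCount; i++) {
    const columns = Array.from(
      new Set(Object.keys(expected[i]).concat(Object.keys(actual[i])))
    );
    for (const column of columns) {
      const want = JSON.stringify(normalizeCell(expected[i][column]));
      const got = JSON.stringify(normalizeCell(actual[i][column]));
      if (want !== got) {
        diffs.push(`row ${i}, column "${column}": expected ${want}, got ${got}`);
      }
    }
  }
  return diffs;
}
```

In a fixture test, `expected` would be the rows captured from the current @dsnp/parquetjs path and `actual` the rows from the hyparquet path; an empty result means the two agree under these normalizations. Because the normalization deliberately hides type differences, any difference that matters to the preview UI (such as decimal scale or timestamp precision) would still need its own explicit assertion.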

Consider whether this migration interacts with #8447 (DuckDB-based tabular previewing). If DuckDB is the long-term direction for tabular file previews, this hyparquet migration is a tactical fix; if DuckDB lands in the same milestone, the migration may be subsumed by it.

Side benefits

  • Drops a slow-moving dependency from the supply chain.
  • Pure-JS implementation simplifies cross-platform builds.
