Implement VariantVisitor (parquet) to support MERGE INTO operations #14707

@enriquh

Feature Request / Improvement

Feature description

Implement the Parquet schema visitor for the variant data type. The code does not currently implement it.
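To illustrate the kind of gap being reported, here is a minimal Python sketch of a visitor whose base `variant` hook raises and a subclass that overrides it. This is a toy model only: `BaseVisitor`, `VariantAwareVisitor`, `visit`, and `SCHEMA` are hypothetical names made up for illustration, and the sketch assumes (based on the stack trace in the steps to reproduce) that the failure comes from a default visitor method that is never overridden. It is not Iceberg's Java code.

```python
class BaseVisitor:
    """Toy visitor (hypothetical): one hook per kind of column type."""

    def primitive(self, name):
        return name

    def variant(self, name):
        # Assumed behaviour, modelled on the reported error message:
        # the default hook rejects variant columns.
        raise NotImplementedError("Not implemented for variant")


class VariantAwareVisitor(BaseVisitor):
    """Toy subclass (hypothetical) that supplies the missing hook."""

    def variant(self, name):
        return f"variant({name})"


def visit(schema, visitor):
    """Dispatch each (name, kind) pair to the matching visitor hook."""
    results = []
    for name, kind in schema:
        if kind == "variant":
            results.append(visitor.variant(name))
        else:
            results.append(visitor.primitive(name))
    return results


# Same column layout as the table in the reproduction steps.
SCHEMA = [("id", "bigint"), ("variant_data", "variant")]
```

With this toy model, `visit(SCHEMA, BaseVisitor())` raises as soon as it reaches `variant_data`, while `visit(SCHEMA, VariantAwareVisitor())` completes. The request here is for the equivalent override on the Parquet side.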

Steps to reproduce

  1. Create a table with a variant field.
  2. Perform a MERGE INTO operation that updates records in the target table, using a variant property in the condition.
  3. The issue also occurs when no sub-variant extraction is included in the condition.

Result

UnsupportedOperationException: Not implemented for variant at org.apache.iceberg.parquet.TypeWithSchemaVisitor.variant(TypeWithSchemaVisitor.java:242)

# Create a format-version 3 Iceberg table with a VARIANT column.
spark.sql(f"""
CREATE TABLE {table_name} (
id BIGINT,
variant_data VARIANT
) USING iceberg
TBLPROPERTIES ('format-version' = '3')
""")

# merge_source is expected to exist already with matching (id, variant_data) columns.
merge_sql_2 = f"""
MERGE INTO {table_name} AS target
USING merge_source AS source
ON variant_get(target.variant_data, '$.name', 'string') = variant_get(source.variant_data, '$.name', 'string')
AND target.id = source.id
WHEN MATCHED THEN
UPDATE SET target.variant_data = source.variant_data
WHEN NOT MATCHED THEN
INSERT (id, variant_data) VALUES (source.id, source.variant_data)
"""

# Executing the MERGE is the step that fails.
spark.sql(merge_sql_2)

Expected results

MERGE operations should run on tables with variant fields, including when variant_get is used in the Spark merge condition to reduce the amount of data scanned.

Environment details

  • Spark 4.0
  • Iceberg 1.11 (built from main)

Query engine

Spark

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time

Labels

improvement