Feature Request / Improvement
Feature description
Implement the schema visitor for the variant data type in Parquet. This is not currently implemented in the code.
Steps to reproduce
- Create a table with a variant field
- Perform a MERGE INTO operation to update records on the target table using a variant property on the condition
Result
- The MERGE fails with: UnsupportedOperationException: Not implemented for variant at org.apache.iceberg.parquet.TypeWithSchemaVisitor.variant(TypeWithSchemaVisitor.java:242)
- The issue also happens when sub-variant extractions are not included in the condition.
spark.sql(f"""
    CREATE TABLE {table_name} (
        id BIGINT,
        variant_data VARIANT
    ) USING iceberg
    TBLPROPERTIES ('format-version' = '3')
""")
merge_sql_2 = f"""
    MERGE INTO {table_name} AS target
    USING merge_source AS source
    ON variant_get(target.variant_data, '$.name', 'string') = variant_get(source.variant_data, '$.name', 'string')
    AND target.id = source.id
    WHEN MATCHED THEN
        UPDATE SET target.variant_data = source.variant_data
    WHEN NOT MATCHED THEN
        INSERT (id, variant_data) VALUES (source.id, source.variant_data)
"""
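The snippet above does not show how merge_source is created or how the MERGE is executed. A minimal sketch of one way to complete the reproduction follows. The helper names build_merge_sql and run_repro, the sample JSON payload, and the use_variant_condition flag are assumptions for illustration and are not part of the original report. The spark argument is assumed to be an existing SparkSession with the Iceberg catalog already configured.

```python
def build_merge_sql(table_name, source_view="merge_source", path="$.name",
                    use_variant_condition=True):
    """Build the MERGE statement from the report (hypothetical helper).

    With use_variant_condition=False the variant_get predicate is dropped,
    matching the case where no sub-variant extraction is in the condition.
    """
    conditions = ["target.id = source.id"]
    if use_variant_condition:
        conditions.insert(
            0,
            f"variant_get(target.variant_data, '{path}', 'string') = "
            f"variant_get(source.variant_data, '{path}', 'string')",
        )
    on_clause = "\n  AND ".join(conditions)
    return (
        f"MERGE INTO {table_name} AS target\n"
        f"USING {source_view} AS source\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN\n"
        "  UPDATE SET target.variant_data = source.variant_data\n"
        "WHEN NOT MATCHED THEN\n"
        "  INSERT (id, variant_data) VALUES (source.id, source.variant_data)"
    )


def run_repro(spark, table_name, use_variant_condition=True):
    """Create a one-row source view, then run the MERGE against table_name."""
    # Assumed sample data: a single row whose variant has a 'name' property.
    spark.sql("""
        CREATE OR REPLACE TEMPORARY VIEW merge_source AS
        SELECT CAST(1 AS BIGINT) AS id,
               parse_json('{"name": "a"}') AS variant_data
    """)
    spark.sql(build_merge_sql(table_name,
                              use_variant_condition=use_variant_condition))
```

Calling run_repro(spark, table_name) after the CREATE TABLE statement exercises the MERGE with the variant_get predicate; passing use_variant_condition=False exercises the variant that joins on id only.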
Expected results
The ability to execute MERGE operations on tables with variant fields, including using variant_get in Spark to reduce the amount of data scanned.
Environment details
- Spark 4.0
- Iceberg 1.11 (built from main)
Query engine
Spark
Willingness to contribute