Describe the bug
When building a snapshot summary, a user-supplied snapshot property can overwrite a computed metric key (e.g. added-data-files, added-records), which is the opposite of iceberg-java's behavior and has two consequences:
-
Corrupted summary metrics. In SnapshotProducer::summary (crates/iceberg/src/transaction/snapshot.rs), the computed metrics are built first and then the user's snapshot_properties are merged on top:
let mut additional_properties = summary_collector.build();
additional_properties.extend(self.snapshot_properties.clone());
So a caller doing fast_append().set_snapshot_properties({"added-data-files": "9999"}) ends up writing added-data-files=9999 into the snapshot summary instead of the real count.
-
Panic on a non-integer value. Those user values then flow into update_totals (crates/iceberg/src/spec/snapshot_summary.rs), which parses added-*/removed-* with .expect("must be parsable as it was just serialized"). A non-integer value (e.g. added-data-files=bad) panics the commit. (A bad value can also legitimately arrive from a previous snapshot's summary.)
Expected behavior (matching iceberg-java)
iceberg-java does not let user properties shadow computed metrics, and tolerates unparseable totals instead of failing:
SnapshotSummary.Builder.build() adds custom properties first (builder.putAll(properties)), then metrics.addTo(builder) — so computed metrics win on a key collision.
SnapshotProducer.updateTotal wraps the Long.parseLong calls in try { … } catch (NumberFormatException e) { // ignore and do not add total } — an unparseable value skips that total rather than throwing.
(Both verified against apache/iceberg main: core/src/main/java/org/apache/iceberg/SnapshotProducer.java and SnapshotSummary.java.)
So in iceberg-rust:
- Computed
added-* / total-* metrics should override any colliding user-supplied snapshot property.
update_totals should skip a total when an added/removed value is unparseable, not panic.
To Reproduce
let action = tx
.fast_append()
.set_snapshot_properties(HashMap::from([
("added-data-files".to_string(), "9999".to_string()),
]))
.add_data_files(vec![one_data_file]);
let commit = Arc::new(action).commit(&table).await.unwrap();
// snapshot summary reports added-data-files=9999 instead of 1
Setting added-data-files to a non-integer instead panics in update_totals.
Willingness to contribute
I can contribute a fix for this bug independently.
Describe the bug
When building a snapshot summary, a user-supplied snapshot property can overwrite a computed metric key (e.g.
added-data-files,added-records), which is the opposite of iceberg-java's behavior and has two consequences:Corrupted summary metrics. In
SnapshotProducer::summary(crates/iceberg/src/transaction/snapshot.rs), the computed metrics are built first and then the user'ssnapshot_propertiesare merged on top:So a caller doing
fast_append().set_snapshot_properties({"added-data-files": "9999"})ends up writingadded-data-files=9999into the snapshot summary instead of the real count.Panic on a non-integer value. Those user values then flow into
update_totals(crates/iceberg/src/spec/snapshot_summary.rs), which parsesadded-*/removed-*with.expect("must be parsable as it was just serialized"). A non-integer value (e.g.added-data-files=bad) panics the commit. (A bad value can also legitimately arrive from a previous snapshot's summary.)Expected behavior (matching iceberg-java)
iceberg-java does not let user properties shadow computed metrics, and tolerates unparseable totals instead of failing:
SnapshotSummary.Builder.build()adds custom properties first (builder.putAll(properties)), thenmetrics.addTo(builder)— so computed metrics win on a key collision.SnapshotProducer.updateTotalwraps theLong.parseLongcalls intry { … } catch (NumberFormatException e) { // ignore and do not add total }— an unparseable value skips that total rather than throwing.(Both verified against
apache/icebergmain:core/src/main/java/org/apache/iceberg/SnapshotProducer.javaandSnapshotSummary.java.)So in iceberg-rust:
added-*/total-*metrics should override any colliding user-supplied snapshot property.update_totalsshould skip a total when an added/removed value is unparseable, not panic.To Reproduce
Setting
added-data-filesto a non-integer instead panics inupdate_totals.Willingness to contribute
I can contribute a fix for this bug independently.