I created a simple table using a build from upstream PR #1896:
┌─statement──────────────────────────────────────────────────────────────────────────────┐
1. │ CREATE TABLE default.`ns1.table1` ↴│
│↳( ↴│
│↳ `col1` Nullable(Int32) ↴│
│↳) ↴│
│↳ENGINE = Iceberg('http://minio:9000/warehouse/data2/', 'admin', '[HIDDEN]', 'Parquet')↴│
│↳PARTITION BY col1 ↴│
│↳ORDER BY col1 │
└────────────────────────────────────────────────────────────────────────────────────────┘
1 row in set. Elapsed: 0.001 sec.
Then I inserted three values with a single INSERT and read them back:
SELECT *
FROM `ns1.table1`
ORDER BY col1 ASC
Query id: 5f00f445-4e94-4c1c-99b6-6f4867c7ab17
┌─col1─┐
1. │ 1 │
2. │ 2 │
3. │ 3 │
└──────┘
Issues with metadata:
- Missing current-snapshot-id
"refs" : {
"main" : {
"snapshot-id" : 2164490262916510684,
"type" : "branch"
}
},
The metadata has refs.main.snapshot-id but no corresponding top-level field:
"current-snapshot-id": 2164490262916510684
The spec describes current-snapshot-id as the table’s current snapshot ID and says it must match the current ID of the main branch in refs.
- parent-snapshot-id: -1 should be omitted. The spec says parent-snapshot-id is optional and “omitted for any snapshot with no parent.”
- metadata-log probably should not point to the current metadata file
"metadata-log": [
{
"metadata-file": "/data2/metadata/v2.metadata.json",
"timestamp-ms": 1781007525135
}
]
Iceberg’s metadata-log is meant to track previous metadata files, not the current one. If this JSON file is itself v2.metadata.json, then the log entry should usually point to v1.metadata.json, or be empty/omitted for a first metadata file. The spec says each new metadata file adds the previous metadata file location to the log.
- location may be questionable
This may work in a local filesystem setup, but it depends on the catalog/engine. In many Iceberg setups this would be something like:
"location": "file:/data2"
or
"location": "s3://bucket/path/table"
The spec treats location as the table’s base location; for newer format descriptions, it notes that when present it must be an absolute path.
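The four checks above could be automated roughly as follows. This is a minimal sketch, not part of any Iceberg library: the function name `check_iceberg_metadata` is hypothetical, and the sample dict is partly assumed (the `location` value `/data2` and the shape of the `snapshots` entry are guesses; only the `refs` and `metadata-log` fragments come from the metadata quoted above). The location check is a heuristic that only looks for a URI scheme.

```python
from urllib.parse import urlparse


def check_iceberg_metadata(meta: dict, metadata_file: str) -> list[str]:
    """Return the problems found in a parsed table metadata dict.

    metadata_file is the path of the file the dict was loaded from.
    """
    problems = []

    # 1. current-snapshot-id must exist and match refs.main.snapshot-id.
    main_ref = meta.get("refs", {}).get("main")
    if main_ref is not None:
        current = meta.get("current-snapshot-id")
        if current is None:
            problems.append("current-snapshot-id is missing")
        elif current != main_ref.get("snapshot-id"):
            problems.append("current-snapshot-id does not match refs.main")

    # 2. A snapshot with no parent should omit parent-snapshot-id, not use -1.
    for snap in meta.get("snapshots", []):
        if snap.get("parent-snapshot-id") == -1:
            problems.append(
                f"snapshot {snap.get('snapshot-id')}: parent-snapshot-id is -1"
            )

    # 3. metadata-log lists previous metadata files, never the current one.
    for entry in meta.get("metadata-log", []):
        if entry.get("metadata-file") == metadata_file:
            problems.append("metadata-log points to the current metadata file")

    # 4. Heuristic: flag a location that has no URI scheme (file:, s3://, ...).
    location = meta.get("location", "")
    if not urlparse(location).scheme:
        problems.append(f"location {location!r} has no URI scheme")

    return problems


# Sample built from the fragments quoted above; "location" and the
# "snapshots" entry are assumed, not copied from the real file.
sample = {
    "location": "/data2",
    "refs": {"main": {"snapshot-id": 2164490262916510684, "type": "branch"}},
    "snapshots": [
        {"snapshot-id": 2164490262916510684, "parent-snapshot-id": -1}
    ],
    "metadata-log": [
        {
            "metadata-file": "/data2/metadata/v2.metadata.json",
            "timestamp-ms": 1781007525135,
        }
    ],
}

for problem in check_iceberg_metadata(sample, "/data2/metadata/v2.metadata.json"):
    print("-", problem)
```

Whether a scheme-less location is actually an error depends on the catalog and engine, so the fourth check is better treated as a warning than a hard failure.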