-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Catalog: return Iceberg snapshot log based on Nessie commit history #10033
base: main
Are you sure you want to change the base?
Catalog: return Iceberg snapshot log based on Nessie commit history #10033
Conversation
e0cac8c
to
614f88c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 A couple of minor comments below
var contentHistory = | ||
treeService(apiContext) | ||
.getContentHistory( | ||
key, reference.name(), reference.hashWithRelativeSpec(), requestMeta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it makes sense to limit the history to 2
by default? I suppose most client-side Iceberg operations need at most one parent snapshot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, we have to return all the previous snapshots, because we cannot know which ones are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will it work with Nessie GC? We never remove snapshots from iceberg metadata, IIRC 🤔
...og/service/impl/src/main/java/org/projectnessie/catalog/service/impl/CatalogServiceImpl.java
Outdated
Show resolved
Hide resolved
The current behavior of Nessie's Iceberg REST is to return only the most recent Iceberg snapshot. However, this seems to conflict with some Iceberg operations, which are not only maintenance operations, but rather related to "merge on read" / (equality) deletes. This change changes Nessie's behavior by returning older snapshots from a load-table and update-table operations. Register-table operation however do not change, because only the latest snapshot is actually imported. The behavior does change by returning an error if the table to be registered has more than 1 snapshots. Fixes projectnessie#10013 Fixes projectnessie#9969
Ouch. So, a single Iceberg transaction can actually add multiple snapshots - but we only have one commit in Nessie - but need to be able to access both snapshots... |
614f88c
to
134a13c
Compare
8673633
to
e9d0abd
Compare
So, I hope that all tests are happy again. The changes became more intrusive, but it should all work now. Had to let the register-table endpoint not accept Nessie catalog URIs anymore, because that would lead to two Nessie tables pointing to the same snapshot (NessieTableSnapshot) - and it's actually not good in general to have different tables work in the same (storage) location. OTOH externally registered tables now get all snapshots imported via register-table. The check is a bit racy, but I suspect, it's okay-ish / better-than-nothing ™️ . |
var implicitIcebergSnapshot = | ||
snapFuture != null | ||
? implicitIcebergSnapshot( | ||
(NessieTableSnapshot) snapFuture.get(0, TimeUnit.SECONDS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snapFuture.join()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It's already finished
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. It's just a nicer method for this case, does not have any timeout parameters and does not have checked exceptions.
URI icebergSnapshot( | ||
Reference effectiveReference, ContentKey key, NessieEntitySnapshot<?> snapshot); | ||
} | ||
boolean checkIcebergSnapshotPresent(String metadataLocation, long versionId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
versionId
-> snapshotId
(for clarity)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The property in IcebergContent
is versionId
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the method checks for snapshots and it's not apparent that the versionId
parameter here should be the snapshot ID.
if (snapId == iceberg.currentSnapshotId()) { | ||
continue; | ||
} | ||
if (previous == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is previous
ever non-null? IJ shows all non-test call paths leading to null
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those calls come from importIcebergTable
... So, essentially if someone commits via Nessie API, each NessieTableSnapshot
(after importing) will have duplicate "implicit" snapshot lists. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - unfortunate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why keep previous
if it is always null
in production call paths?
@@ -1599,6 +1626,23 @@ public static void addSnapshot(AddSnapshot u, IcebergTableMetadataUpdateState st | |||
.icebergManifestListLocation(icebergSnapshot.manifestList()) | |||
.icebergManifestFileLocations(icebergSnapshot.manifests()); | |||
|
|||
// If another Iceberg snapshot was add in this same update-transaction, memoize the previous |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add
-> added
- Nessie now returns the snapshot history in the `snapshots` and `snapshot-log` attributes of an Iceberg | ||
table-metadata retrieved via Iceberg REST for table changes that have been committed via Iceberg REST | ||
to Nessie 0.101.0 or newer. Commits made using older Nessie versions will not return older snapshots. | ||
- Generally, not only for Nessie, It is recommended to keep the number of snapshots maintained in an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Generally, not only for Nessie, It is recommended to keep the number of snapshots maintained in an | |
- Generally, not only for Nessie, it is recommended to keep the number of snapshots maintained in an |
|
||
/** | ||
* Contains Iceberg snapshots for which no explicit Nessie commit exists. We need to memoize those | ||
* snapshots here in case a single Iceberg transaction add multiple snapshots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* snapshots here in case a single Iceberg transaction add multiple snapshots. | |
* snapshots here in case a single Iceberg transaction adds multiple snapshots. |
// iterator | ||
tuple(keyRenamed, "rename 1", hashRename1.getCommitHash(), contentRename1), | ||
tuple(key, "update 2", hashUpdate2.getCommitHash(), contentUpdate2), | ||
tuple(key, "create-other", hashCreateOther.getCommitHash(), contentUpdate1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this commit returned, since it does not change key
? Imho it should return the "update 1"
commit message here since that is the commit that introduced contentUpdate1
.
The current behavior of Nessie's Iceberg REST is to return only the most recent Iceberg snapshot. However, this seems to conflict with some Iceberg operations, which are not only maintenance operations, but rather related to "merge on read" / (equality) deletes.
This change changes Nessie's behavior by returning older snapshots from a load-table and update-table operations.
Register-table operation however do not change, because only the latest snapshot is actually imported. The behavior does change by returning an error if the table to be registered has more than 1 snapshots.
Fixes #10013
Fixes #9969