Drop table should not clean the folder for Nessie catalog #22392
Conversation
@@ -237,7 +237,8 @@ public void dropTable(ConnectorSession session, SchemaTableName schemaTableName)
        BaseTable table = (BaseTable) loadTable(session, schemaTableName);
        validateTableCanBeDropped(table);
        nessieClient.dropTable(toIdentifier(schemaTableName), true);
-       deleteTableDirectory(fileSystemFactory.create(session), schemaTableName, table.location());
The initial PR likely copied code from other catalogs and forgot to handle this.
The behaviour now matches the Spark integration (expected).
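For illustration, here is a minimal sketch of the patched drop flow. All names are hypothetical stand-ins, not the real Trino/Nessie types: the catalog entry is removed, but the table directory is deliberately left on storage for Nessie GC to reclaim later.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the Nessie catalog drop path after this patch.
class NessieDropSketch
{
    // table name -> storage location (simulates catalog entries)
    static final Map<String, String> catalogEntries = new HashMap<>();
    // simulates directories present on the file system
    static final Set<String> storageDirectories = new HashSet<>();

    static void createTable(String name, String location)
    {
        catalogEntries.put(name, location);
        storageDirectories.add(location);
    }

    static void dropTable(String name)
    {
        // remove the catalog entry only; deliberately no equivalent of
        // deleteTableDirectory(...), because other branches or tags may
        // still reference the files at this location
        catalogEntries.remove(name);
    }
}
```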
@@ -859,7 +859,7 @@ private ZonedDateTime getSnapshotTime(String tableName, long snapshotId)
                .getOnlyColumnAsSet());
    }

-   private String getTableLocation(String tableName)
+   protected String getTableLocation(String tableName)
Needed so the Nessie tests can override it.
@@ -140,8 +163,8 @@ protected void dropTableFromMetastore(String tableName)
    @Override
    protected String getMetadataLocation(String tableName)
    {
-       // used when registering a table, which is not supported by the Nessie catalog
-       throw new UnsupportedOperationException("metadata location for register_table is not supported");
+       BaseTable table = (BaseTable) catalog.loadTable(TableIdentifier.of("tpch", tableName));
`getMetadataLocation` is required functionality; it is used for more than just register_table.
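A hedged sketch of what the helper now does, using illustrative stand-in types rather than the real Iceberg `BaseTable`/`TableOperations`: load the table from the catalog and return the location of its current metadata file instead of throwing.

```java
// Illustrative stand-ins; the real code goes through Iceberg's BaseTable
// and TableOperations via catalog.loadTable(...). All names and the
// example location are made up.
class MetadataLocationSketch
{
    record TableOps(String currentMetadataLocation) {}
    record LoadedTable(TableOps operations) {}

    // pretend catalog lookup returning a table whose current metadata
    // file lives under <table location>/metadata/
    static LoadedTable loadTable(String namespace, String tableName)
    {
        String location = "s3://bucket/" + namespace + "/" + tableName;
        return new LoadedTable(new TableOps(location + "/metadata/00000-abc.metadata.json"));
    }

    static String getMetadataLocation(String tableName)
    {
        return loadTable("tpch", tableName).operations().currentMetadataLocation();
    }
}
```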
@@ -197,7 +220,7 @@ public void testRegisterTableWithDifferentTableName()
    public void testRegisterTableWithMetadataFile()
    {
        assertThatThrownBy(super::testRegisterTableWithMetadataFile)
-               .hasMessageContaining("metadata location for register_table is not supported");
+               .hasMessageContaining("register_table procedure is disabled");
Because `getMetadataLocation` is now implemented, the test fails with the disabled-procedure message instead.
assertThat(getQueryRunner().tableExists(getSession(), tableName)).isFalse();
assertThat(fileSystem.listFiles(tableLocation).hasNext())
        .describedAs("Table location should exist")
        .isTrue();
The base tests assume the table location does NOT exist after a drop, so this had to be overridden to match Nessie behaviour.
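The override pattern itself can be sketched generically (hypothetical class names, not the real test classes): the base test encodes the common expectation, and the Nessie variant overrides only the assertion that differs.

```java
// Hypothetical sketch of overriding a base test expectation.
class BaseDropTableTest
{
    // base behaviour: most catalogs delete the table directory on drop
    protected boolean expectLocationExistsAfterDrop()
    {
        return false;
    }
}

class NessieDropTableTest
        extends BaseDropTableTest
{
    @Override
    protected boolean expectLocationExistsAfterDrop()
    {
        // Nessie leaves the files in place for Nessie GC to clean up
        return true;
    }
}
```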
cc: @dimas-b
Nice catch! Thanks, @ajantha-bhat !
Ping for review.
@ajantha-bhat: Will this problem also exist when Nessie is used via its Iceberg REST Catalog API?
The server handles the delete implementation for the REST catalog as of now, so this problem does not exist for the REST catalog.
The same table might still be live in other branches or tags. Therefore, dropping a table should not clean up its files, since the drop is not reference-aware. Use the Nessie GC tool to clean up expired files. This behaviour is consistent with the Spark Nessie integration.
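Why reference-awareness matters can be shown with a small sketch (all types hypothetical): the same table can be live on several Nessie branches, so deleting its files when dropping it on one branch would corrupt every other branch that still references them.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: each branch has its own view of the catalog,
// while data files live in shared storage.
class BranchedCatalogSketch
{
    static final Map<String, Set<String>> branches = new HashMap<>();
    static final Set<String> sharedStorage = new HashSet<>();

    static void createTable(String branch, String table, String location)
    {
        branches.computeIfAbsent(branch, b -> new HashSet<>()).add(table);
        sharedStorage.add(location);
    }

    static void branchFrom(String source, String target)
    {
        // a new branch starts with the same table references
        branches.put(target, new HashSet<>(branches.get(source)));
    }

    // drop only affects the given branch's view; files stay in storage
    static void dropTable(String branch, String table)
    {
        branches.get(branch).remove(table);
    }
}
```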
@findepi or @findinpath: Can you please take a look?
We've been running into issues with Trino and Nessie branching because of this.
@olivier-derom: Thanks for using Nessie and Trino. We are working with the community to get this merged.
At first glance this looks good to me. Maybe @cwsteinbach can help next.
One question/remark I have is around consistency across different metastores/catalogs. It seems we are ending up in a situation where dropping a table has different results in terms of metadata and file deletion depending on which catalog is used (Nessie vs REST vs HMS vs ...). But I assume we can't really avoid that, given the expectation of consistent behaviour across query engines. Under that assumption, I think the implemented approach and the reliance on Nessie GC are fine, unless of course there is development in the Iceberg spec around what is supposed to happen on dropping tables.
@mosabua, the Iceberg table format spec doesn't cover this, and I don't expect that it ever will. It would be really nice to have consistent DDL behavior across engine/catalog combinations, but I don't think Iceberg can dictate DDL semantics for third-party systems, and even if it could, it would create a lot of pain for users in the form of backward-incompatible changes. That said, the Iceberg REST Catalog API does make a distinction between only dropping the table record and dropping both the table record and the underlying data, and defaults to the former rather than the latter:
Thanks @mosabua and @cwsteinbach for the inputs. Yeah, since Nessie does catalog-level versioning, we can't drop a table's files without knowing everywhere it is referenced. Hence we block it from the usual path and recommend using Nessie GC. This is one of the complexities introduced by catalog-level versioning. With REST API v2 we are planning to formally introduce a catalog-level versioning capability in the REST catalog spec, to make this clearer for users. Lastly, let me know if anything is needed for this PR to get merged. Thanks.
Thanks @ajantha-bhat
Thanks for the review and merge.
I've just updated Nessie to the latest version as well @ajantha-bhat
Description
The same table might still be live in other branches or tags. Therefore, dropping a table should not clean up its files, since the drop is not reference-aware. Use the Nessie GC tool to clean up expired files. This behaviour is consistent with the Spark Nessie integration.
Additional context and related issues
Spark integration:
https://github.com/apache/iceberg/blob/main/nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java#L557-L562
Docs: https://iceberg.apache.org/docs/1.5.1/nessie/#further-use-cases
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: