
Drop table should not clean the folder for Nessie catalog #22392

Merged
merged 1 commit into from
Jul 4, 2024

Conversation

ajantha-bhat
Member

Description

The same table might still be live in other branches or tags. Therefore, dropping the table should not clean up the files as it is not reference-aware. Use the Nessie GC tool to clean up expired files. This behaviour is consistent with the Spark Nessie integration.

Additional context and related issues

Spark integration:
https://github.com/apache/iceberg/blob/main/nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java#L557-L562
Docs: https://iceberg.apache.org/docs/1.5.1/nessie/#further-use-cases
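
The reasoning above can be sketched as a toy model. The following is a minimal, self-contained illustration, not the Trino or Nessie API; the class and method names are invented. It models why a reference-aware catalog must not delete files on drop: the same table location may still be referenced from other branches, and only a GC pass over all references can delete safely.

```java
import java.util.*;

// Toy model (invented for illustration): each branch maps table names to a
// file location; "files" stands in for the object store.
class ToyNessie {
    final Map<String, Map<String, String>> branches = new HashMap<>();
    final Set<String> files = new HashSet<>();

    void createTable(String branch, String table, String location) {
        branches.computeIfAbsent(branch, b -> new HashMap<>()).put(table, location);
        files.add(location);
    }

    void branchFrom(String src, String dst) {
        branches.put(dst, new HashMap<>(branches.get(src)));
    }

    // Drop removes only this branch's catalog entry. It never deletes files,
    // because other branches or tags may still reference the same location.
    void dropTable(String branch, String table) {
        branches.get(branch).remove(table);
    }

    // GC (the role Nessie GC plays): delete files that no branch references.
    void gc() {
        Set<String> live = new HashSet<>();
        branches.values().forEach(tables -> live.addAll(tables.values()));
        files.retainAll(live);
    }
}
```

With a table created on `main` and a `dev` branch forked from it, dropping the table on `main` leaves the files intact; only after the last reference is gone does a GC pass remove them.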

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jun 14, 2024
@github-actions github-actions bot added the iceberg Iceberg connector label Jun 14, 2024
@@ -237,7 +237,8 @@ public void dropTable(ConnectorSession session, SchemaTableName schemaTableName)
     BaseTable table = (BaseTable) loadTable(session, schemaTableName);
     validateTableCanBeDropped(table);
     nessieClient.dropTable(toIdentifier(schemaTableName), true);
-    deleteTableDirectory(fileSystemFactory.create(session), schemaTableName, table.location());
Member Author

The initial PR probably copied this code from other catalogs and forgot to handle it.

The behaviour now matches the Spark integration, as expected.

@@ -859,7 +859,7 @@ private ZonedDateTime getSnapshotTime(String tableName, long snapshotId)
.getOnlyColumnAsSet());
}

-    private String getTableLocation(String tableName)
+    protected String getTableLocation(String tableName)
Member Author

Needed so that the Nessie tests can override this method.

@@ -140,8 +163,8 @@ protected void dropTableFromMetastore(String tableName)
@Override
protected String getMetadataLocation(String tableName)
{
-        // used when registering a table, which is not supported by the Nessie catalog
-        throw new UnsupportedOperationException("metadata location for register_table is not supported");
+        BaseTable table = (BaseTable) catalog.loadTable(TableIdentifier.of("tpch", tableName));
Member Author

getMetadataLocation is required functionality and is used for more than just register_table.

@@ -197,7 +220,7 @@ public void testRegisterTableWithDifferentTableName()
public void testRegisterTableWithMetadataFile()
{
assertThatThrownBy(super::testRegisterTableWithMetadataFile)
-                .hasMessageContaining("metadata location for register_table is not supported");
+                .hasMessageContaining("register_table procedure is disabled");
Member Author

Because getMetadataLocation is now implemented, the failure comes from the disabled register_table procedure instead.

assertThat(getQueryRunner().tableExists(getSession(), tableName)).isFalse();
assertThat(fileSystem.listFiles(tableLocation).hasNext())
.describedAs("Table location should exist")
.isTrue();
Member Author

@ajantha-bhat ajantha-bhat Jun 14, 2024

The base tests assume the table location does NOT exist after a drop, so this had to be overridden to match Nessie behaviour.

@ajantha-bhat
Member Author

cc: @dimas-b

@dimas-b dimas-b left a comment

Nice catch! Thanks, @ajantha-bhat !

@wendigo wendigo requested a review from findinpath June 14, 2024 13:20
@ajantha-bhat
Member Author

ping for review

@dimas-b

dimas-b commented Jun 21, 2024

@ajantha-bhat : Will this problem also exist when Nessie is used via its Iceberg REST Catalog API?

@ajantha-bhat
Member Author

@ajantha-bhat : Will this problem also exist when Nessie is used via its Iceberg REST Catalog API?

The server handles the delete implementation for the REST catalog as of now, so that problem does not exist for the REST catalog.

https://github.com/apache/iceberg/blob/a47937c0c1fcafe57d7dc83551d8c9a3ce0ab1b9/core/src/main/java/org/apache/iceberg/rest/RESTClient.java#L38-L71

@ajantha-bhat ajantha-bhat changed the title Nessie: Drop table should not clean the folder Drop table should not clean the folder for Nessie catalog Jun 23, 2024
The same table might still be live in other branches or tags.
Therefore, dropping the table should not clean up the files as it
is not reference-aware. Use the Nessie GC tool to clean up expired
files. This behavior is consistent with the Spark Nessie integration.
@ajantha-bhat
Member Author

@findepi or @findinpath: Can you please take a look?

@ajantha-bhat ajantha-bhat requested a review from ebyhr June 25, 2024 13:14
@ebyhr ebyhr removed their request for review June 26, 2024 00:42
@olivier-derom

We've been running into issues with Trino and nessie branching because of this.
This PR would be a great improvement, thanks!

@ajantha-bhat
Member Author

@olivier-derom: Thanks for using Nessie and Trino. We are working with the community to get this merged.

@mosabua
Member

mosabua commented Jul 3, 2024

At first glance this looks good to me. Maybe @cwsteinbach can help next.

@mosabua
Member

mosabua commented Jul 3, 2024

One question/remark I have is around consistency across different metastores/catalogs. It seems we are ending up in a situation where dropping a table has different results, in terms of metadata and file deletion, depending on which catalog is used (Nessie vs REST vs HMS vs ...). But I assume we can't really avoid that, given the expectation of consistent behaviour across query engines.

Under that assumption I think the implemented approach and reliance on the Nessie GC is fine. Unless of course there is development in the Iceberg spec around what is supposed to happen on dropping tables.

@cwsteinbach

Unless of course there is development in the Iceberg spec around what is supposed to happen on dropping tables.

@mosabua, the Iceberg table format spec doesn't cover this, and I don't expect that it ever will. It would be really nice to have consistent DDL behavior across engine/catalog combinations, but I don't think Iceberg can dictate DDL semantics for third-party systems, and even if they could, it would create a lot of pain for users in the form of backward-incompatible changes.

That said, the Iceberg REST Catalog API does make a distinction between only dropping the table record and dropping both the table record and the underlying data, and defaults to the former rather than the latter:

https://github.com/apache/iceberg/blob/d255c87b00c8ca422a1d32a33b3a9cfe2f04cea2/open-api/rest-catalog-open-api.yaml#L787
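For reference, that distinction surfaces as a query parameter on the spec's drop-table endpoint. The following is a schematic sketch of the two request forms (`{prefix}`, `{namespace}`, and `{table}` are placeholders from the spec, not concrete values):

```
# Drops only the catalog record (the default, purgeRequested=false):
DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}

# Drops the record and requests deletion of the table's data and metadata:
DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}?purgeRequested=true
```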

@ajantha-bhat
Member Author

ajantha-bhat commented Jul 4, 2024

Thanks @mosabua and @cwsteinbach for the inputs.

Yeah. Since Nessie provides catalog-level versioning, we can't delete a table's files without knowing everywhere it is referenced. Hence, we skip file cleanup on the usual drop path and recommend using Nessie GC. This is one of the complexities introduced by catalog-level versioning.

With REST API v2, we are planning to formally introduce catalog-level versioning capability in the REST catalog spec to make this clearer to users.

Lastly, let me know if anything needed for this PR to get merged. Thanks.

@wendigo wendigo merged commit 75a9a91 into trinodb:master Jul 4, 2024
43 checks passed
@wendigo
Contributor

wendigo commented Jul 4, 2024

Thanks @ajantha-bhat

@ajantha-bhat
Member Author

Thanks for the review and merge.

@github-actions github-actions bot added this to the 452 milestone Jul 4, 2024
@wendigo
Contributor

wendigo commented Jul 4, 2024

I've just updated nessie to the latest version as well @ajantha-bhat
