
Fix iceberg $files metadata table not showing delete files #16232

Open
0xffmeta wants to merge 1 commit into master from fix_files_system_table

Conversation

@0xffmeta (Member) commented Feb 23, 2023

Description

This PR fixes the $files metadata table not showing delete files for the Iceberg v2 format. Fixes #16233

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Fix `$files` table not showing delete files for Iceberg v2 format. ({issue}`16233`)

cla-bot (bot) commented Feb 23, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@0xffmeta (Member Author) commented:

FYI - I have already signed the CLA; I think it might still be pending processing at this moment.

@0xffmeta (Member Author) commented:

Thanks @ebyhr for the review. I have updated the PR according to the comment.
Please take a look when you have some time.

@ebyhr self-requested a review on February 28, 2023 at 07:22
@@ -255,6 +271,32 @@ private Object toIntegerVarcharMapBlock(Map<Integer, String> values)
blockBuilder.closeEntry();
return integerToVarcharMapType.getObject(blockBuilder, 0);
}

private Object getIntegerArrayBlock(List<Integer> values)
Contributor:

Since the type is changed in the 1st commit ("Change to use TypeManager to get ArrayType signature"), the type-related changes should be part of that commit.

Member Author:

Makes sense. Let me put those changes into the 1st commit.

@@ -140,6 +145,8 @@ public PlanFilesIterable(CloseableIterable<FileScanTask> planFiles, Map<Integer,
this.types = ImmutableList.copyOf(requireNonNull(types, "types is null"));
this.integerToBigintMapType = typeManager.getType(mapType(INTEGER.getTypeSignature(), BIGINT.getTypeSignature()));
this.integerToVarcharMapType = typeManager.getType(mapType(INTEGER.getTypeSignature(), VARCHAR.getTypeSignature()));
this.integerArrayType = typeManager.getType(arrayType(INTEGER.getTypeSignature()));
this.bigintArrayType = typeManager.getType(arrayType(INTEGER.getTypeSignature()));
Contributor:

BIGINT?

Contributor:

nit: I'd declare and assign bigintArrayType and then integerArrayType in the same order as they appear in tableMetadata.

Member Author:

Nice catch. Let me update the PR.
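
For reference, the corrected assignments would presumably read as follows (a sketch based on the two review comments above; only the element type signature and the declaration order change):

    // Corrected per the review: BIGINT element signature for bigintArrayType,
    // with the fields assigned in the same order as tableMetadata.
    this.bigintArrayType = typeManager.getType(arrayType(BIGINT.getTypeSignature()));
    this.integerArrayType = typeManager.getType(arrayType(INTEGER.getTypeSignature()));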


github-actions bot added the iceberg (Iceberg connector) label on Mar 2, 2023
@krvikash (Contributor) left a comment:

Overall LGTM.

return null;
}

BlockBuilder blockBuilder = bigintArrayType.createBlockBuilder(null, values.size());
Contributor:

The expected entry count here is 1, since this builds a single block of the list type.

Suggested change:
BlockBuilder blockBuilder = bigintArrayType.createBlockBuilder(null, values.size());
BlockBuilder blockBuilder = bigintArrayType.createBlockBuilder(null, 1);

Member Author:

Thanks for the review. Let me update this accordingly.

return null;
}

BlockBuilder blockBuilder = integerArrayType.createBlockBuilder(null, values.size());
Contributor:

The expected entry count here is 1, since this builds a single block of the list type.

Suggested change:
BlockBuilder blockBuilder = integerArrayType.createBlockBuilder(null, values.size());
BlockBuilder blockBuilder = integerArrayType.createBlockBuilder(null, 1);
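
A sketch of how the bigint helper might look after applying these suggestions, following the closeEntry()/getObject() pattern visible in toIntegerVarcharMapBlock above (beginBlockEntry(), the method name getBigintArrayBlock, and the omitted null check are assumptions, not taken from the PR):

    private Object getBigintArrayBlock(List<Long> values)
    {
        // A single top-level entry: the array value itself.
        BlockBuilder blockBuilder = bigintArrayType.createBlockBuilder(null, 1);
        BlockBuilder arrayBuilder = blockBuilder.beginBlockEntry();
        for (Long value : values) {
            BIGINT.writeLong(arrayBuilder, value);
        }
        blockBuilder.closeEntry();
        return bigintArrayType.getObject(blockBuilder, 0);
    }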

Comment on lines 301 to 302
.filter(path -> !path.contains("regionkey=1"))
.toArray(String[]::new));
Contributor:

Is this change needed? If yes, it should be in a separate commit.

Member Author:

Oh, this might be related to my code formatting. It's not needed; let me revert it in a new commit.


@0xffmeta (Member Author) commented Mar 3, 2023:

@krvikash Thanks for the review. I just updated the PR accordingly. Please take a look when you have some time.

@@ -25,7 +25,6 @@
import io.trino.spi.connector.SchemaTableName;
Contributor:

Commit message suggestion: rename "Fix split_offsets and equality_ids to use ArrayBlock for size calculation" to "Wrap collection values in array blocks".

Member Author:

Let me update the commit message.

@@ -171,16 +174,25 @@ public CloseableIterator<List<Object>> iterator()
addCloseable(planFilesIterator);

return new CloseableIterator<>() {
private Iterator<DeleteFile> deleteFileIterator = emptyIterator();
Contributor:

nit: deleteFileIterator -> deleteFilesIterator

@Test
public void testFilesTableWithDelete()
{
assertQuery("SELECT count(*) FROM test_schema.\"test_table_with_delete$files\" WHERE content = 0", "VALUES 4");
Contributor:

Use the FileContent.id() method instead of magic constants here.
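
For illustration, the assertion could presumably be rewritten along these lines (a sketch; it assumes org.apache.iceberg.FileContent is importable in this test, and keeps the expected count from the original assertion):

    // FileContent.DATA.id() == 0, so this is equivalent to the original magic constant.
    assertQuery(
            "SELECT count(*) FROM test_schema.\"test_table_with_delete$files\" WHERE content = " + FileContent.DATA.id(),
            "VALUES 4");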

}

@Override
public List<Object> next()
{
return getRecord(planFilesIterator.next().file());
if (deleteFileIterator.hasNext()) {
Contributor:

Use createMetadataTableInstance(icebergTable, FILES) in io.trino.plugin.iceberg.FilesTable#cursor to be able to get delete files from planFiles().

        TableScan tableScan = createMetadataTableInstance(icebergTable, FILES)
                .newScan()
                .includeColumnStats();

While your solution works as well, I find that it retrieves a bit too much data to provide the functionality required by the $files table.

Some time ago I did an experiment related to #11206 which you may find useful here:

master...findinpath:trino:findinpath/add-all-files

Member Author:

Thanks for the detailed info. Let me use the createMetadataTableInstance for the table scan.
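
A minimal sketch of that approach, assuming the surrounding cursor code supplies icebergTable and closes the iterables (static imports of MetadataTableUtils.createMetadataTableInstance and MetadataTableType.FILES assumed; variable names are illustrative):

    // Scan the FILES metadata table: every data and delete file becomes one row,
    // so delete files show up without walking FileScanTask.deletes() manually and
    // without duplicating equality delete files that apply to several data files.
    Table filesMetadataTable = createMetadataTableInstance(icebergTable, FILES);
    TableScan tableScan = filesMetadataTable.newScan().includeColumnStats();
    for (FileScanTask task : tableScan.planFiles()) {
        for (StructLike row : task.asDataTask().rows()) {
            // map the row's columns to the $files output columns here
        }
    }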

}
else {
FileScanTask planFileTask = planFilesIterator.next();
deleteFileIterator = planFileTask.deletes().iterator();
Member:

This has the potential to include duplicates. Equality deletes can belong to several files, so those will show up multiple times here.

Member Author:

Good point. Let me find a way to deduplicate the delete files and add a test case for this.
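
One possible de-duplication approach, sketched here purely for illustration (the PR ultimately switched to the FILES metadata table instead; the seenDeleteFiles set and the emit step are assumptions):

    // Track delete file paths already emitted so that an equality delete file
    // attached to several FileScanTasks is reported only once.
    Set<String> seenDeleteFiles = new HashSet<>();

    // When advancing to the next plan file inside the iterator:
    FileScanTask planFileTask = planFilesIterator.next();
    for (DeleteFile deleteFile : planFileTask.deletes()) {
        if (seenDeleteFiles.add(deleteFile.path().toString())) {
            // emit a $files record for this delete file
        }
    }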

Member Author:

I think using the metadata table scan suggested by @findinpath avoids duplicated equality delete files in the final results. I have added a test in TestIcebergV2.

Contributor:

@0xffmeta try the following in a product test environment

testing/bin/ptl env up  --environment 'singlenode-spark-iceberg'  --config 'config-default'

via spark-sql

create table default.test1 (x int) using iceberg tblproperties ('format-version' = '2','write.delete.mode' = 'merge-on-read', 'write.update.mode'='merge-on-read',  'write.merge.mode'='merge-on-read');

via trino

insert into t1 values 1,2,3,4,5,6;
insert into t1 values 1,2,3,4,5,6;

via spark-sql

delete from t1 where x=2; -- position deletes
delete from t1 where x > 2; -- position deletes

check via Trino whether $files returns duplicated equality delete files.

Equality deletes can be written via Flink:

delete from t1 where x=2;

The equality deletes may be duplicated (in case they apply to several data files) in the content of $files - in such a case we'll need to figure out a way to deduplicate such entries...

Member Author:

Great. Thanks for the detailed steps. Let me try to check this in the test env.

@0xffmeta (Member Author) commented Mar 15, 2023:

Re: "check via Trino whether $files returns duplicated equality delete files"

The Trino result: [screenshot omitted]

Member Author:

Regarding writing equality deletes via Flink: I'm not sure which environment should be used to create a flink-sql container... @findinpath do you know how to run Flink SQL in the product test env?

@ebyhr (Member) commented Mar 7, 2023:

@cla-bot check

github-actions bot removed the stale label on Feb 12, 2024
@hashhar (Member) commented Mar 1, 2024:

@alexjo2144 @krvikash This is important to fix.

One thing I've observed people trying to do is build a way to identify whether their table needs to be optimized, for example to remove delete files.

A simple heuristic is to check whether the size of the delete files, or the count of records in the delete files, is above some threshold. Unfortunately, Trino cannot be used for this today because $files doesn't show delete files.


This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions bot added the stale label on Mar 22, 2024
@mosabua (Member) commented Mar 22, 2024:

Maybe @electrum @findepi or @findinpath can help out here.

Also @0xffmeta could you rebase?

github-actions bot removed the stale label on Mar 25, 2024

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions bot added the stale label on Apr 15, 2024
@bitsondatadev (Member) commented:

@amoghmargoor would you mind giving this review a pass please?

@0xffmeta let me know if you're willing to pick this up. I do apologize that it's taken this long. If we don't hear from you after a couple of weeks, we'll pick up the work. I definitely think this is important to track.

@bitsondatadev added the stale-ignore label and removed the stale label on Apr 16, 2024
@0xffmeta (Member Author) commented:

As @hashhar mentioned, we also have a similar use case for viewing the status of delete files, and we ended up using Spark to view those files. But it's not efficient to switch between Trino SQL and Spark SQL. With more and more use cases onboarded to the v2 format, I think it will be useful to let users check the delete file status.

@bitsondatadev Sure, let me pick this up and rebase this PR for review.

@bitsondatadev (Member) commented:

Awesome, I'll do what I can to keep the momentum going. Don't hesitate to ping me.

@0xffmeta force-pushed the fix_files_system_table branch 9 times, most recently from c5c856d to 06bb2b2, on April 19, 2024
@0xffmeta force-pushed the fix_files_system_table branch 2 times, most recently from 9638685 to c31c13d, on April 23, 2024
@0xffmeta (Member Author) commented:

Hi @bitsondatadev, this PR is ready for the review. :)

@hashhar (Member) commented Apr 25, 2024:

@findinpath / @alexjo2144 Can you please help with a review here?

@ebyhr (Member) commented Jun 24, 2024:

Rebased on master to resolve conflicts and fixed a style issue.

Comment on lines +201 to +208
// the StructWithReadableMetrics is a private class in iceberg-core
// so we need to use reflection to access it
try {
Class<?> clazz = dataTaskFile.getClass();
Field field = clazz.getDeclaredField("struct");
field.setAccessible(true);
innerContentFile = (StructLike) field.get(dataTaskFile);
}
@ebyhr (Member) commented Jun 24, 2024:

Is it possible to avoid reflection if we manage a map (name -> position) like SnapshotsTable?
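
A minimal sketch of the map-based idea, assuming the metadata table's schema is available at construction time (the helper name FilesTableColumns and the field names are illustrative, not from the PR):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.StructLike;
    import org.apache.iceberg.types.Types;

    final class FilesTableColumns
    {
        private final Map<String, Integer> positions = new HashMap<>();

        FilesTableColumns(Schema filesTableSchema)
        {
            // Record each top-level column position once, keyed by name, so rows can be
            // read positionally via StructLike.get(pos, type) instead of via reflection.
            List<Types.NestedField> columns = filesTableSchema.columns();
            for (int i = 0; i < columns.size(); i++) {
                positions.put(columns.get(i).name(), i);
            }
        }

        <T> T get(StructLike row, String columnName, Class<T> javaClass)
        {
            return row.get(positions.get(columnName), javaClass);
        }
    }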

Labels: cla-signed, iceberg (Iceberg connector), stale-ignore