Add Alluxio filesystem #21603

Open
wants to merge 3 commits into
base: master

@JiamingMai JiamingMai commented Apr 18, 2024

Description

Add the trino-filesystem-alluxio module, which implements the new file system interfaces (TrinoFileSystem, TrinoInput, TrinoInputFile, TrinoOutputFile) and supports native Alluxio.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Apr 18, 2024
@JiamingMai JiamingMai changed the title from "Add alluxio file system" to "Add Alluxio filesystem" Apr 18, 2024

cla-bot bot commented Apr 22, 2024

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@cla-bot cla-bot bot removed the cla-signed label Apr 22, 2024
@mosabua mosabua requested a review from electrum April 22, 2024 20:43
@JiamingMai JiamingMai self-assigned this Apr 24, 2024
@cla-bot cla-bot bot added the cla-signed label Apr 24, 2024
@github-actions github-actions bot added the docs, hudi (Hudi connector), iceberg (Iceberg connector), delta-lake (Delta Lake connector), hive (Hive connector), bigquery (BigQuery connector), and mongodb (MongoDB connector) labels Apr 24, 2024
pom.xml Outdated
@@ -183,7 +184,7 @@
 <dep.accumulo-hadoop.version>2.7.7-1</dep.accumulo-hadoop.version>
 <dep.accumulo.version>3.0.0</dep.accumulo.version>
 <dep.airlift.version>248</dep.airlift.version>
-<dep.alluxio.version>312</dep.alluxio.version>
+<dep.alluxio.version>2.9.4</dep.alluxio.version>
Contributor

I'm confused by the Alluxio versioning scheme... Maven Central says 2.9.4 was released after 312, but 312 was released after 2.9.3.

Are the x.x.x and xxx versioning schemes completely separate? Should we use one or the other in particular?

(Trying to make sure this isn't a downgrade that could affect the Alluxio cache fs)

Member

@mosabua who can answer this?

Member

They switched from semver to plain integer version numbers, so we need to stick with 3xx. 2.9.4 is from the old, incompatible line, and I guess they continue to release backports of some changes there.

Member

So, as far as I know, the filesystem caching we have already implemented needs to stay on 312 (or a newer 3xx release).

For the new filesystem module, I think it would be best to also use that newer version, but it might not be possible (yet). If that is the case, we probably need to use two different properties.
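
A minimal sketch of what "two different properties" could look like in the root pom.xml. Only `dep.alluxio.version` and the values 312 and 2.9.4 come from the diff in this PR; the second property name, `dep.alluxio-fs.version`, is hypothetical and chosen purely for illustration.

```xml
<properties>
    <!-- Existing property from the diff: stays on the 3xx line for the Alluxio cache filesystem -->
    <dep.alluxio.version>312</dep.alluxio.version>
    <!-- Hypothetical property (name assumed, not from this PR) for the new trino-filesystem-alluxio module -->
    <dep.alluxio-fs.version>2.9.4</dep.alluxio-fs.version>
</properties>
```

Under this assumption, each module's Alluxio dependency would reference its own property, for example `${dep.alluxio-fs.version}` in the new module, so that changing one version does not silently change the other.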

Member

@jja725 jja725 Jun 4, 2024

2.9.4 is a more stable version, and we will keep backporting changes from 3xx to the 2.9.x line. With 2.9.4 we get the same local cache manager code as 3xx, plus compatibility with the stable Filesystem interface. That's why we chose 2.9.4 here.

Member

If a downgrade in version is really desirable, we should just do it as a separate PR first.

Member

Good idea. The two could really proceed in parallel in terms of timing, with the change still used in this PR as well. Then we could verify it separately, merge the new PR first, rebase this one, and ship it.

Member

SGTM as well. I will submit a separate PR to downgrade the version first.

Member

Hi @mosabua, I submitted PR #22350 to downgrade the Alluxio version as we discussed. In the meantime, we still have an issue with the Docker test. Could someone help us with this, or give us a hint about what might be going wrong? See #21603 (comment).

@jja725 jja725 force-pushed the add-alluxio-file-system branch 6 times, most recently from f789598 to 357b950 Compare June 25, 2024 23:37
@jja725 jja725 force-pushed the add-alluxio-file-system branch 6 times, most recently from 165d9b2 to cef0807 Compare June 29, 2024 06:28
@github-actions github-actions bot added the iceberg (Iceberg connector) label Jun 29, 2024
Member

jja725 commented Jul 1, 2024

@wendigo, because of grpc/grpc-java#11284, I think we should either upgrade grpc to 1.65 or revert the Netty version. Otherwise we will get errors, since we are not using the shaded Netty in grpc.

Contributor

wendigo commented Jul 1, 2024

In order to update grpc we need to wait for the next google cloud sdk release

Member

jja725 commented Jul 1, 2024

> In order to update grpc we need to wait for the next google cloud sdk release

OK. In that case we will probably just use the Netty within grpc, so that we do not run into this kind of compatibility issue later on. See Alluxio/alluxio#18642.

@jja725 jja725 force-pushed the add-alluxio-file-system branch 4 times, most recently from a09da7a to 91cc015 Compare July 8, 2024 23:35
Labels
cla-signed iceberg Iceberg connector
10 participants