Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-20496][state backends] RocksDB partitioned index/filters option. #14341

Closed
wants to merge 2 commits into from

Conversation

liuyufei9527
Copy link
Contributor

@liuyufei9527 liuyufei9527 commented Dec 8, 2020

What is the purpose of the change

Configure partitioned index and filters options according to 'https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters'.

Brief change log

  • Add state.backend.rocksdb.memory.partitioned-index-filters option represent enable this feature
  • Add state.backend.rocksdb.block.metadata-blocksize config can set partitioned index block size

Verifying this change

This change added tests and can be verified as follows:

  • Add test testGetColumnFamilyOptionsWithPartitionedIndex in RocksDBResourceContainerTest

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): ( no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): ( no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes )
  • If yes, how is the feature documented? (docs)

@flinkbot
Copy link
Collaborator

flinkbot commented Dec 8, 2020

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit b97a26e (Tue Dec 08 19:03:11 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Dec 8, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@liuyufei9527
Copy link
Contributor Author

@flinkbot run azure

Copy link
Member

@Myasuka Myasuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! I think we could separate this PR into two commits:

  1. one commit to introduce partitioned index/filter.
  2. one commit to introduce settings of metadata block size.

@liuyufei9527
Copy link
Contributor Author

Thanks for your contribution! I think we could separate this PR into two commits:

  1. one commit to introduce partitioned index/filter.
  2. one commit to introduce settings of metadata block size.

@Myasuka Should I separate into two commits and do a force push?

@Myasuka
Copy link
Member

Myasuka commented Dec 10, 2020

@liuyufei9527 I suggest to separate this PR into two commits.

@liuyufei9527
Copy link
Contributor Author

@Myasuka Thanks for reviewing, I had update a new version, please take another look.

@carp84 carp84 self-requested a review December 11, 2020 06:52
Copy link
Member

@Myasuka Myasuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late review after some internal busy time, please consider to refactor your PR according to the comments, and remember to update the web content.

@liuyufei9527
Copy link
Contributor Author

@flinkbot run azure

@Myasuka
Copy link
Member

Myasuka commented Jan 16, 2021

Your CI was broken due to the bad format, please resubmit your PR with correct format with maven spotless plugin.

Since you actually split two commits in one PR, please use git fixup to point out which newer commit is to fix previous one. You could also remain the PR as only two commits with force push.

@liuyufei9527 liuyufei9527 force-pushed the FLINK-20496 branch 2 times, most recently from dd0c578 to 4a0e89b Compare January 17, 2021 03:09
@liuyufei9527
Copy link
Contributor Author

@flinkbot run azure

@liuyufei9527
Copy link
Contributor Author

@Myasuka CI have passed.

Copy link
Member

@Myasuka Myasuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
@carp84 do you have more comments?

Copy link
Member

@carp84 carp84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liuyufei9527 Thanks for the contribution, and sorry for the late response, please see my inline comments.

What's more, somehow I cannot make more than one code suggestions in this review, so there're more coming.

Copy link
Member

@carp84 carp84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liuyufei9527 This is the 2nd-round review, please check and let me know your thoughts. Thanks.

@carp84
Copy link
Member

carp84 commented Feb 19, 2021

And it seems there are some conflicts to resolve, sorry for the inconvenience due to my late review.

@carp84
Copy link
Member

carp84 commented Feb 27, 2021

Thanks for the update @liuyufei9527 , please check and fix the check style issue, and make sure the pre-commit build here could pass. Thanks.

2021-02-24T12:59:19.9459125Z [ERROR] Failed to execute goal com.diffplug.spotless:spotless-maven-plugin:2.4.2:check (spotless-check) on project flink-statebackend-rocksdb_2.11: The following files had format violations:
2021-02-24T12:59:19.9459962Z [ERROR] src/main/java/org/apache/flink/contrib/streaming/state/RocksDBResourceContainer.java
2021-02-24T12:59:19.9473145Z [ERROR] Run 'mvn spotless:apply' to fix these violations.

@liuyufei9527 liuyufei9527 force-pushed the FLINK-20496 branch 2 times, most recently from 14806f8 to 0c9ad2b Compare February 27, 2021 17:55
@liuyufei9527
Copy link
Contributor Author

@flinkbot run azure

Copy link
Member

@carp84 carp84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest changes LGTM, will merge it in 24 hours if no more comments/concerns from other committers.

cc @StephanEwen @tillrohrmann

@carp84 carp84 closed this in f743974 Mar 4, 2021
autophagy pushed a commit to autophagy/flink that referenced this pull request Mar 16, 2021
@StephanEwen
Copy link
Contributor

Thanks, @liuyufei9527 and @carp84 for taking care of this one.
Looks like a good addition 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants