Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(inverted_index): integrate puffin manager with sst indexer #4285

Merged
merged 14 commits into from
Jul 5, 2024

Conversation

zhongzc
Copy link
Contributor

@zhongzc zhongzc commented Jul 4, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

#4246

What's changed and what's your intention?

...to make room for the addition of full-text indexer

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

Summary by CodeRabbit

  • New Features

    • Added a new configuration parameter to control the compression of inverted index data.
  • Deprecations

    • Deprecated intermediate_path setting in favor of region_engine.mito.index.aux_path.
  • Refactor

    • Refactored index creation process, introducing new types and methods to streamline operations.
  • Bug Fixes

    • Updated expected values in tests to align with modified region statistics behavior.
  • Documentation

    • Updated configuration examples to reflect new and deprecated settings.

@zhongzc zhongzc requested review from evenyag, v0y4g3r, waynexia and a team as code owners July 4, 2024 10:40
Copy link
Contributor

coderabbitai bot commented Jul 4, 2024

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The update introduces a new configuration parameter region_engine.mito.inverted_index.compress to control the compression of inverted index data. It also refactors the handling and management of inverted index creation, including restructuring indexer logic and updating various modules. Additionally, deprecated settings are replaced, and new default values are defined. Error handling and configuration are streamlined, enhancing the robustness and maintainability of the system.

Changes

File(s) Change Summary
config/config.md, .../datanode.example.toml, .../standalone.example.toml Added new configuration compress for inverted index data. Deprecated intermediate_path.
src/mito2/src/access_layer.rs Introduced OperationType enum. Refactored SstWriteRequest to use InvertedIndexConfig.
src/mito2/src/cache/write_cache.rs Added PuffinManagerFactory. Updated WriteCache methods.
src/mito2/src/compaction/compactor.rs Removed old index configuration. Added inverted_index_config. Refined index determination during compaction.
src/mito2/src/config.rs Added compress field to InvertedIndexConfig struct.
src/mito2/src/engine/basic_test.rs Updated assertion value in test_region_usage function.
src/mito2/src/error.rs Removed PuffinFinish variant from Error enum.
src/mito2/src/flush.rs Refactored SstWriteRequest and RegionFlushTask logic.
src/mito2/src/read/scan_region.rs Updated import paths for SstIndexApplierBuilder and SstIndexApplierRef.
src/mito2/src/sst/index.rs, src/mito2/src/sst/index/indexer.rs Introduced new indexer module structure. Updated index creation logic.
src/mito2/src/sst/index/indexer/abort.rs, src/mito2/src/sst/index/indexer/finish.rs Added methods for aborting and finishing index processes.
src/mito2/src/sst/index/indexer/update.rs Introduced methods for updating inverted index within Indexer struct.
.../sst/index/inverted_index.rs, .../sst/index/inverted_index/applier.rs Updated module structure for managing inverted index.
.../sst/index/inverted_index/applier/builder.rs Updated import paths for SstIndexApplierBuilder.
.../sst/index/inverted_index/applier/builder/{between.rs, comparison.rs, eq_list.rs, in_list.rs} Updated import paths for SstIndexApplierBuilder in respective modules.
.../sst/index/inverted_index/applier/builder/regex_match.rs Updated import paths for SstIndexApplierBuilder and related tests.

Poem

In the code fields, changes bloom,
Inverted indexes do resume.
Compress or not, the choice's here,
Enhancements bring much cheer!
Refactored paths and fresher air,
Errors handled with much care.
The system stands robust, a new frontier! 🌸


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Jul 4, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 4ec247f and 708c275.

Files selected for processing (32)
  • config/config.md (2 hunks)
  • config/datanode.example.toml (1 hunks)
  • config/standalone.example.toml (1 hunks)
  • src/mito2/src/access_layer.rs (3 hunks)
  • src/mito2/src/cache/write_cache.rs (11 hunks)
  • src/mito2/src/compaction/compactor.rs (5 hunks)
  • src/mito2/src/config.rs (2 hunks)
  • src/mito2/src/engine/basic_test.rs (1 hunks)
  • src/mito2/src/error.rs (2 hunks)
  • src/mito2/src/flush.rs (3 hunks)
  • src/mito2/src/read/scan_region.rs (1 hunks)
  • src/mito2/src/sst/index.rs (5 hunks)
  • src/mito2/src/sst/index/indexer.rs (1 hunks)
  • src/mito2/src/sst/index/indexer/abort.rs (1 hunks)
  • src/mito2/src/sst/index/indexer/finish.rs (1 hunks)
  • src/mito2/src/sst/index/indexer/update.rs (1 hunks)
  • src/mito2/src/sst/index/inverted_index.rs (1 hunks)
  • src/mito2/src/sst/index/inverted_index/applier.rs (1 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder.rs (1 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder/between.rs (2 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder/comparison.rs (2 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder/eq_list.rs (2 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder/in_list.rs (2 hunks)
  • src/mito2/src/sst/index/inverted_index/applier/builder/regex_match.rs (2 hunks)
  • src/mito2/src/sst/index/inverted_index/creator.rs (11 hunks)
  • src/mito2/src/sst/index/puffin_manager.rs (2 hunks)
  • src/mito2/src/sst/parquet.rs (2 hunks)
  • src/mito2/src/sst/parquet/reader.rs (1 hunks)
  • src/mito2/src/sst/parquet/writer.rs (2 hunks)
  • src/mito2/src/test_util.rs (2 hunks)
  • src/mito2/src/worker.rs (4 hunks)
  • tests-integration/tests/http.rs (1 hunks)
Files skipped from review due to trivial changes (6)
  • src/mito2/src/error.rs
  • src/mito2/src/sst/index/indexer.rs
  • src/mito2/src/sst/index/inverted_index/applier/builder/between.rs
  • src/mito2/src/sst/index/inverted_index/applier/builder/comparison.rs
  • src/mito2/src/sst/index/inverted_index/applier/builder/in_list.rs
  • src/mito2/src/sst/index/inverted_index/applier/builder/regex_match.rs
Additional comments not posted (72)
src/mito2/src/sst/index/inverted_index.rs (2)

1-13: File header is well-defined.

The file header includes the copyright notice and license information, which is correctly formatted and necessary for compliance.


15-19: Modules and constants are appropriately defined.

The new modules (applier, creator, and codec) and the constant INDEX_BLOB_TYPE are clearly defined and appropriately named. Ensure that these modules are correctly implemented and integrated.

src/mito2/src/sst/index/indexer/update.rs (3)

1-13: File header is well-defined.

The file header includes the copyright notice and license information, which is correctly formatted and necessary for compliance.


15-29: Method do_update is well-structured.

The method do_update correctly handles empty batches and calls the appropriate update and abort methods. Ensure that the do_update_inverted_index and do_abort methods are correctly implemented.


31-54: Method do_update_inverted_index handles errors appropriately.

The method do_update_inverted_index correctly handles the absence of an inverted indexer and errors during the update process. The use of cfg!(any(test, feature = "test")) for conditional panic is a good practice for testing. Ensure that the creator.update method is correctly implemented.

src/mito2/src/sst/index/indexer/abort.rs (4)

1-13: File header is well-defined.

The file header includes the copyright notice and license information, which is correctly formatted and necessary for compliance.


15-24: Method do_abort is well-structured.

The method do_abort correctly calls the appropriate abort methods for the inverted index and puffin writer. Ensure that the do_abort_inverted_index and do_abort_puffin_writer methods are correctly implemented.


26-45: Method do_abort_inverted_index handles errors appropriately.

The method do_abort_inverted_index correctly handles the absence of an inverted indexer and errors during the abort process. The use of cfg!(any(test, feature = "test")) for conditional panic is a good practice for testing. Ensure that the indexer.abort method is correctly implemented.


47-68: Method do_abort_puffin_writer handles errors appropriately.

The method do_abort_puffin_writer correctly handles the absence of a puffin writer and errors during the abort process. The use of cfg!(any(test, feature = "test")) for conditional panic is a good practice for testing. Ensure that the puffin_writer.finish method is correctly implemented.

src/mito2/src/sst/index/indexer/finish.rs (5)

1-13: File header is well-defined.

The file header includes the copyright notice and license information, which is correctly formatted and necessary for compliance.


15-41: Method do_finish is well-structured.

The method do_finish correctly handles the finishing of operations, including taking the puffin writer, updating the inverted index, and handling errors. Ensure that the do_finish_inverted_index and do_finish_puffin_writer methods are correctly implemented.


43-62: Method do_finish_puffin_writer handles errors appropriately.

The method do_finish_puffin_writer correctly handles errors during the finish process. The use of cfg!(any(test, feature = "test")) for conditional panic is a good practice for testing. Ensure that the puffin_writer.finish method is correctly implemented.


64-100: Method do_finish_inverted_index handles errors appropriately.

The method do_finish_inverted_index correctly handles the absence of an inverted indexer and errors during the finish process. The use of cfg!(any(test, feature = "test")) for conditional panic is a good practice for testing. Ensure that the indexer.finish method is correctly implemented.


102-119: Method fill_inverted_index_output is well-structured.

The method fill_inverted_index_output correctly fills the output with the relevant information about the inverted index. Ensure that the column_ids method of the InvertedIndexer is correctly implemented.

src/mito2/src/sst/index/puffin_manager.rs (3)

25-25: Imports look good.

The imports for BlobGuard and PuffinManager are necessary for the Puffin Manager integration.


45-45: Type alias definition looks good.

The type alias SstPuffinWriter is correctly defined and necessary for the Puffin Manager integration.


Line range hint 147-157:
IndexerBuilder initialization looks good.

The initialization of IndexerBuilder includes all required fields for the integration of the Puffin Manager with the SST Indexer.

src/mito2/src/access_layer.rs (4)

25-25: Import looks good.

The import of InvertedIndexConfig is necessary for the changes made to the SstWriteRequest struct.


145-157: IndexerBuilder initialization looks good.

The initialization of IndexerBuilder includes all required fields for the integration of the Puffin Manager with the SST Indexer.


187-193: Enum definition looks good.

The OperationType enum is necessary to represent the origin of the SstWriteRequest.


196-206: Struct modification looks good.

The modifications to the SstWriteRequest struct, including op_type and inverted_index_config, are necessary for the integration of the Puffin Manager with the SST Indexer.

src/mito2/src/sst/index/inverted_index/applier.rs (2)

33-33: Import looks good.

The import of INDEX_BLOB_TYPE from inverted_index is necessary for the changes made to the SstIndexApplier struct.


33-33: Imports look good.

The imports of BlobReader and PuffinManagerFactory are necessary for the changes made to the SstIndexApplier struct.

src/mito2/src/sst/parquet/writer.rs (2)

137-137: Method modification looks good.

The modification to the write_all method, including the call to self.indexer.finish(), is necessary for the integration of the Puffin Manager with the SST Indexer.


166-166: Method modification looks good.

The modification to the write_all method, including index_metadata in the SstInfo struct, is necessary for the integration of the Puffin Manager with the SST Indexer.

src/mito2/src/sst/index/inverted_index/applier/builder/eq_list.rs (1)

23-23: Import Path Update

The import path for SstIndexApplierBuilder has been updated. Ensure that the new path is correct and consistent with the updated module structure.

src/mito2/src/sst/index/inverted_index/applier/builder.rs (2)

39-40: Import Path Update

The import paths for SstIndexApplier and IndexValueCodec have been updated. Ensure that the new paths are correct and consistent with the updated module structure.


Line range hint 237-237:
Import Path Update

The import path for PuffinManagerFactory has been added. Ensure that this addition is correct and consistent with the rest of the codebase.

src/mito2/src/sst/index.rs (14)

15-19: Module Imports Update

The module imports have been updated to include indexer, intermediate, inverted_index, puffin_manager, and statistics. Ensure that these updates are correct and consistent with the updated module structure.


25-26: Import Path Update

The import paths for PuffinManager, SstPuffinManager, and SstPuffinWriter have been added. Ensure that these additions are correct and consistent with the rest of the codebase.


38-38: Import Path Update

The import path for SstIndexCreator has been updated to inverted_index::creator::SstIndexCreator as InvertedIndexer. Ensure that this update is correct and consistent with the rest of the codebase.


40-52: New Struct Definitions

New struct definitions for IndexOutput and InvertedIndexOutput have been added. Ensure that these definitions are correct and consistent with the rest of the codebase.


61-62: New Fields in Indexer Struct

New fields inverted_indexer and puffin_writer have been added to the Indexer struct. Ensure that these additions are correct and consistent with the rest of the codebase.


66-94: New Methods in Indexer Struct

New methods update, finish, abort, and memory_usage have been added to the Indexer struct. Ensure that these methods are correct and consistent with the rest of the codebase.


99-107: New Fields in IndexerBuilder Struct

New fields op_type, file_id, file_path, metadata, row_group_size, puffin_manager, intermediate_manager, index_options, and inverted_index_config have been added to the IndexerBuilder struct. Ensure that these additions are correct and consistent with the rest of the codebase.


111-134: New Method in IndexerBuilder Struct

A new method build has been added to the IndexerBuilder struct. Ensure that this method is correct and consistent with the rest of the codebase.


Line range hint 136-204:
New Method in IndexerBuilder Struct

A new method build_inverted_indexer has been added to the IndexerBuilder struct. Ensure that this method is correct and consistent with the rest of the codebase.


206-224: New Method in IndexerBuilder Struct

A new method build_puffin_writer has been added to the IndexerBuilder struct. Ensure that this method is correct and consistent with the rest of the codebase.


304-325: New Test for IndexerBuilder

A new test test_build_indexer_basic has been added for the IndexerBuilder struct. Ensure that this test is correct and covers relevant scenarios.


328-352: New Test for IndexerBuilder

A new test test_build_indexer_disable_create has been added for the IndexerBuilder struct. Ensure that this test is correct and covers relevant scenarios.


355-376: New Test for IndexerBuilder

A new test test_build_indexer_no_tag has been added for the IndexerBuilder struct. Ensure that this test is correct and covers relevant scenarios.


379-400: New Test for IndexerBuilder

A new test test_build_indexer_zero_row_group has been added for the IndexerBuilder struct. Ensure that this test is correct and covers relevant scenarios.

src/mito2/src/config.rs (2)

357-359: New Field in InvertedIndexConfig Struct

A new field compress has been added to the InvertedIndexConfig struct. Ensure that this addition is correct and consistent with the rest of the codebase.


376-378: Default Value for Compress Field

The default value for the compress field in the InvertedIndexConfig struct has been set to true. Ensure that this default value is appropriate and consistent with the intended behavior.

src/mito2/src/cache/write_cache.rs (5)

48-49: Field Addition Approved

The addition of the puffin_manager_factory field to the WriteCache struct looks good.


Line range hint 64-73:
Method Update Approved

The integration of the puffin_manager_factory parameter in the new method looks good.


Line range hint 84-95:
Method Update Approved

The integration of the puffin_manager_factory parameter in the new_fs method looks good.


122-135: Method Update Approved

The usage of the puffin_manager_factory in the write_and_upload_sst method and the updates to the IndexerBuilder initialization look good.


262-262: Import Addition Approved

The addition of the OperationType import in the test module looks good.

config/datanode.example.toml (1)

434-436: Field Addition Approved

The addition of the compress field under the region_engine.mito.inverted_index section looks good.

config/standalone.example.toml (1)

457-459: Field Addition Approved

The addition of the compress field under the region_engine.mito.inverted_index section looks good.

src/mito2/src/compaction/compactor.rs (2)

27-27: Field Addition Approved

The addition of the inverted_index_config field to the SstWriteRequest struct looks good.


Line range hint 277-318:
Method Update Approved

The usage of the inverted_index_config in the merge_ssts method and the updates to the SstWriteRequest initialization look good.

src/mito2/src/sst/index/inverted_index/creator.rs (6)

404-404: Verify test coverage for compress field in SstIndexCreator.

Ensure that the tests in this module cover the new compress field in SstIndexCreator.

Verification successful

Verify test coverage for compress field in SstIndexCreator.

The search results indicate that the compress field is indeed covered in the test files.

  • Files with relevant test coverage:
    • tests-integration/tests/http.rs
    • tests-integration/src/cluster.rs
    • src/puffin/src/puffin_manager/tests.rs
    • src/mito2/src/manifest/tests/checkpoint.rs
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test coverage for `compress` field in `SstIndexCreator`.

# Test: Search for the usage of `compress` in tests. Expect: Proper test coverage for the new field.
rg --type rust --test $'compress'

Length of output: 463


Script:

#!/bin/bash
# Description: Verify test coverage for `compress` field in `SstIndexCreator`.

# Test: Search for the usage of `compress` in tests. Expect: Proper test coverage for the new field.
rg --type rust --text 'compress'

Length of output: 43672


249-251: Verify usage of compress field in SstIndexCreator::finish.

Ensure that the SstIndexCreator::finish method correctly uses the new compress field in PutOptions. Verify that it is properly documented and tested.

Verification successful

Verify usage of compress field in SstIndexCreator::finish.

The compress field in the SstIndexCreator::finish method is correctly used and referenced in the codebase. The PutOptions struct and compress field are also present in tests, ensuring proper testing.

  • src/mito2/src/sst/index/inverted_index/creator.rs: Definition and usage of compress in SstIndexCreator::finish
  • src/puffin/src/puffin_manager/tests.rs: Tests involving PutOptions and compress
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify usage of `compress` field in `SstIndexCreator::finish`.

# Test: Search for the usage of `SstIndexCreator::finish`. Expect: Proper usage of the new field.
rg --type rust $'SstIndexCreator::finish'

Length of output: 40


Script:

#!/bin/bash
# Description: Locate the definition of `SstIndexCreator` and its methods, including `finish`.

# Search for the definition of `SstIndexCreator`.
ast-grep --lang rust --pattern 'struct SstIndexCreator { $$$ }'

# Search for the `finish` method within `SstIndexCreator`.
ast-grep --lang rust --pattern 'impl SstIndexCreator { $$$ fn finish(&self) -> $_ { $$$ } $$$ }'

# Search for references to `PutOptions` to verify usage of `compress`.
rg --type rust 'PutOptions'

# Search for references to `compress` within the repository.
rg --type rust 'compress'

Length of output: 46020


77-77: Verify the integration and usage of compress in SstIndexCreator.

Ensure that the new field compress is correctly integrated and used throughout the codebase. Verify that it is properly documented and tested.


147-149: Verify handling of compress field in SstIndexCreator::update.

Ensure that the SstIndexCreator::update method correctly handles the new compress field. Verify that it is properly documented and tested.


92-92: Verify the integration and usage of compress parameter in SstIndexCreator::new.

Ensure that the new compress parameter is correctly integrated and used throughout the codebase. Verify that it is properly documented and tested.


123-124: Verify the integration and usage of compress field in SstIndexCreator initialization.

Ensure that the new compress field is correctly integrated and used throughout the codebase. Verify that it is properly documented and tested.

Verification successful

The compress field is correctly integrated and used in the SstIndexCreator initialization.

  • The compress field is included in the SstIndexCreator struct.
  • The compress parameter is correctly handled in the new method of SstIndexCreator.
  • The instantiation of SstIndexCreator correctly passes a boolean value for the compress field.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration and usage of `compress` field in `SstIndexCreator` initialization.

# Test: Search for the usage of `SstIndexCreator` initialization. Expect: Proper usage of the new field.
rg --type rust $'SstIndexCreator'

Length of output: 610


Script:

#!/bin/bash
# Description: Verify the integration and usage of `compress` field in `SstIndexCreator` initialization.

# Retrieve the definition of SstIndexCreator to confirm the presence of the compress field.
ast-grep --lang rust --pattern $'struct SstIndexCreator {
  compress: $_,
  $$$
}'

# Retrieve the implementation of the constructor to see how compress is initialized.
ast-grep --lang rust --pattern $'impl SstIndexCreator {
  pub fn new($$$) -> Self {
    $$$
  }
}' --debug

# Search for instances where SstIndexCreator is instantiated to verify the use of compress.
rg --type rust $'SstIndexCreator::new'

Length of output: 545


Script:

#!/bin/bash
# Description: Verify the integration and usage of `compress` field in `SstIndexCreator` initialization.

# Retrieve the definition of SstIndexCreator to confirm the presence of the compress field.
ast-grep --lang rust --pattern $'struct SstIndexCreator {
  compress: $_,
  $$$
}'

# Retrieve the implementation of the constructor to see how compress is initialized.
ast-grep --lang rust --pattern $'impl SstIndexCreator {
  pub fn new($$$) -> Self {
    $$$
  }
}'

# Retrieve the specific instance of SstIndexCreator::new in creator.rs to verify the use of compress.
rg --type rust --context 5 $'SstIndexCreator::new' src/mito2/src/sst/index/inverted_index/creator.rs

Length of output: 664


Script:

#!/bin/bash
# Description: Verify the integration and usage of `compress` field in `SstIndexCreator` initialization.

# Retrieve the full implementation of the new method to verify the initialization of the compress field.
ast-grep --lang rust --pattern $'impl SstIndexCreator {
  pub fn new($_, $_, $_, $_, $_) -> Self {
    $$$
  }
}'

# Search for other instances of SstIndexCreator instantiation to ensure the compress field is correctly handled.
rg --type rust $'SstIndexCreator::new'

Length of output: 250


Script:

#!/bin/bash
# Description: Verify the integration and usage of `compress` field in `SstIndexCreator` initialization.

# Manually search for the new method in creator.rs to inspect its implementation.
rg --type rust --context 10 $'pub fn new' src/mito2/src/sst/index/inverted_index/creator.rs

# Retrieve more context around the current instance of SstIndexCreator::new in creator.rs to verify the use of compress.
rg --type rust --context 10 $'SstIndexCreator::new' src/mito2/src/sst/index/inverted_index/creator.rs

Length of output: 1742

src/mito2/src/engine/basic_test.rs (1)

583-583: Verify the correctness of region_stat.sst_usage assertion value.

Ensure that the new assertion value 3026 is correct and reflects the expected behavior.

src/mito2/src/read/scan_region.rs (2)

48-49: New imports for SstIndexApplierBuilder and SstIndexApplierRef.

The added imports are necessary for the integration of the Puffin Manager with the SST indexer.


Line range hint 144-159:
Ensure correct integration of Puffin Manager with SST indexer.

The build_index_applier method correctly integrates the Puffin Manager using SstIndexApplierBuilder. It handles errors by logging a warning and returning None if the index applier cannot be built.

tests-integration/tests/http.rs (1)

Line range hint 1-1:
No changes to review.

This file does not contain any annotated changes.

src/mito2/src/flush.rs (2)

327-327: Ensure OperationType::Flush is appropriate.

The operation type is set to Flush. Verify if this is the correct operation type for this context.


334-334: LGTM!

The addition of inverted_index_config to the SstWriteRequest struct aligns with the refactor objective.

src/mito2/src/sst/parquet/reader.rs (1)

55-55: Verify the usage of SstIndexApplierRef.

The import statement use crate::sst::index::inverted_index::applier::SstIndexApplierRef; was added. Ensure that this import is necessary and used within the file.

Verification successful

Verified: The import statement is necessary and used within the file.

The import statement use crate::sst::index::inverted_index::applier::SstIndexApplierRef; is used within the file src/mito2/src/sst/parquet/reader.rs in both a struct field and a method signature.

  • index_applier: Option<SstIndexApplierRef>
  • pub(crate) fn index_applier(mut self, index_applier: Option<SstIndexApplierRef>) -> Self
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of `SstIndexApplierRef` in the file.

# Test: Search for the usage of `SstIndexApplierRef`. Expect: At least one occurrence.
rg --type rust 'SstIndexApplierRef' src/mito2/src/sst/parquet/reader.rs

Length of output: 281

src/mito2/src/worker.rs (3)

Line range hint 52-52:
Verify the usage of PuffinManagerFactory.

The import statement use crate::sst::index::puffin_manager::PuffinManagerFactory; was added. Ensure that this import is necessary and used within the file.

Verification successful

The usage of PuffinManagerFactory is verified.

The import statement use crate::sst::index::puffin_manager::PuffinManagerFactory; is necessary and correctly used within the file src/mito2/src/worker.rs.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of `PuffinManagerFactory` in the file.

# Test: Search for the usage of `PuffinManagerFactory`. Expect: At least one occurrence.
rg --type rust 'PuffinManagerFactory' src/mito2/src/worker.rs

Length of output: 400


152-152: Verify the integration of puffin_manager_factory in write_cache_from_config.

The puffin_manager_factory has been added to the write_cache_from_config call. Ensure that this integration is correct and necessary.

Verification successful

The integration of puffin_manager_factory in write_cache_from_config is correct.

The puffin_manager_factory is appropriately passed as an argument in the calls to write_cache_from_config.

  • src/mito2/src/worker.rs:
    • Line 152: puffin_manager_factory.clone()
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration of `puffin_manager_factory` in the `write_cache_from_config` call.

# Test: Search for the usage of `write_cache_from_config`. Expect: At least one occurrence with `puffin_manager_factory`.
rg --type rust 'write_cache_from_config' src/mito2/src/worker.rs

Length of output: 200


Script:

#!/bin/bash
# Description: Check the arguments passed to `write_cache_from_config` in `src/mito2/src/worker.rs`.

# Extract lines containing `write_cache_from_config` and the surrounding lines to examine the arguments.
rg --type rust --context 3 'write_cache_from_config' src/mito2/src/worker.rs

Length of output: 1035


284-284: Verify the integration of puffin_manager_factory and intermediate_manager in WorkerStarter.

The puffin_manager_factory and intermediate_manager have been added to the WorkerStarter struct. Ensure that this integration is correct and necessary.

Verification successful

Verification successful: Integration of puffin_manager_factory and intermediate_manager in WorkerStarter is correct and necessary.

The puffin_manager_factory and intermediate_manager have been correctly integrated and are actively used within the WorkerStarter struct.

  • Instances of puffin_manager_factory initialization and usage have been found.
  • Instances of intermediate_manager initialization and usage have been found.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the integration of `puffin_manager_factory` and `intermediate_manager` in the `WorkerStarter` struct.

# Test: Search for the usage of `WorkerStarter`. Expect: At least one occurrence with `puffin_manager_factory` and `intermediate_manager`.
rg --type rust 'WorkerStarter' src/mito2/src/worker.rs

Length of output: 181


Script:

#!/bin/bash
# Description: Verify the initialization and usage of `puffin_manager_factory` and `intermediate_manager` in `WorkerStarter`.

# Test: Search for `puffin_manager_factory` and `intermediate_manager` within `src/mito2/src/worker.rs`.
rg --type rust 'puffin_manager_factory' src/mito2/src/worker.rs
rg --type rust 'intermediate_manager' src/mito2/src/worker.rs

Length of output: 1376

src/mito2/src/test_util.rs (2)

72-72: Import looks good!

The import of PuffinManagerFactory is consistent with the PR objective of integrating the Puffin Manager with the SST Indexer.


608-626: Changes in create_write_cache look good!

The initialization of PuffinManagerFactory and IntermediateManager and their inclusion in WriteCache::new align with the PR objective of integrating the Puffin Manager with the SST Indexer. The code is correct and follows best practices.

config/config.md (1)

414-414: The new configuration parameter is well-documented.

The addition of region_engine.mito.inverted_index.compress is clear and consistent with the existing documentation.

src/mito2/src/flush.rs Outdated Show resolved Hide resolved
src/mito2/src/sst/parquet.rs Show resolved Hide resolved
@zhongzc
Copy link
Contributor Author

zhongzc commented Jul 4, 2024

@coderabbitai pause

Copy link
Contributor

coderabbitai bot commented Jul 4, 2024

Actions performed

Reviews paused.

@zhongzc zhongzc requested a review from fengjiachun July 4, 2024 11:01
Copy link

codecov bot commented Jul 4, 2024

Codecov Report

Attention: Patch coverage is 78.57143% with 78 lines in your changes missing coverage. Please review.

Project coverage is 85.02%. Comparing base (6e2c21d) to head (133d354).
Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4285      +/-   ##
==========================================
+ Coverage   84.96%   85.02%   +0.05%     
==========================================
  Files        1061     1066       +5     
  Lines      188124   188473     +349     
==========================================
+ Hits       159845   160251     +406     
+ Misses      28279    28222      -57     

config/datanode.example.toml Outdated Show resolved Hide resolved
src/mito2/src/sst/index.rs Outdated Show resolved Hide resolved
src/mito2/src/sst/index.rs Show resolved Hide resolved
src/mito2/src/sst/index/inverted_index/creator.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
Signed-off-by: Zhenchi <[email protected]>
Copy link
Contributor

@v0y4g3r v0y4g3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/mito2/src/sst/index/indexer/finish.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@fengjiachun fengjiachun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Zhenchi <[email protected]>
@zhongzc zhongzc enabled auto-merge July 5, 2024 06:57
@zhongzc zhongzc added this pull request to the merge queue Jul 5, 2024
Merged via the queue into GreptimeTeam:main with commit f71b7b9 Jul 5, 2024
51 checks passed
@zhongzc zhongzc deleted the zhongzc/fulltext-build-3 branch July 5, 2024 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants