Skip to content

Commit

Permalink
Expose and elaborate FilterBuildingContext (#6088)
Browse files Browse the repository at this point in the history
Summary:
This change enables custom implementations of FilterPolicy to
wrap a variety of NewBloomFilterPolicy and select among them based on
contextual information such as table level and compaction style.

* Moves FilterBuildingContext to public API and elaborates it with more
useful data. (It would be nice to put more general options-like data,
but at the time this object is constructed, we are using internal APIs
ImmutableCFOptions and MutableCFOptions and don't have easy access to
ColumnFamilyOptions that I can tell.)

* Renames BloomFilterPolicy::GetFilterBitsBuilderInternal to
GetBuilderWithContext, because it's now public.

* Plumbs through the table's "level_at_creation" for filter building
context.

* Simplified some tests by adding GetBuilder() to
MockBlockBasedTableTester.

* Adds test as DBBloomFilterTest.ContextCustomFilterPolicy, including
sample wrapper class LevelAndStyleCustomFilterPolicy.

* Fixes a cross-test bug in DBBloomFilterTest.OptimizeFiltersForHits
where it does not reset perf context.
Pull Request resolved: #6088

Test Plan: make check, valgrind on db_bloom_filter_test

Differential Revision: D18697817

Pulled By: pdillinger

fbshipit-source-id: 5f987a2d7b07cc7a33670bc08ca6b4ca698c1cf4
  • Loading branch information
pdillinger authored and facebook-github-bot committed Nov 27, 2019
1 parent 6d58ea9 commit ca3b6c2
Show file tree
Hide file tree
Showing 14 changed files with 300 additions and 96 deletions.
1 change: 1 addition & 0 deletions HISTORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
* TTL Compactions in Level compaction style now initiate successive cascading compactions on a key range so that it reaches the bottom level quickly on TTL expiry. `creation_time` table property for compaction output files is now set to the minimum of the creation times of all compaction inputs.
* With FIFO compaction style, options.periodic_compaction_seconds will have the same meaning as options.ttl. Whichever stricter will be used. With the default options.periodic_compaction_seconds value with options.ttl's default of 0, RocksDB will give a default of 30 days.
* Added an API GetCreationTimeOfOldestFile(uint64_t* creation_time) to get the file_creation_time of the oldest SST file in the DB.
* FilterPolicy now exposes additional API to make it possible to choose filter configurations based on context, such as table level and compaction style. See `LevelAndStyleCustomFilterPolicy` in db_bloom_filter_test.cc. While most existing custom implementations of FilterPolicy should continue to work as before, those wrapping the return of NewBloomFilterPolicy will require overriding new function `GetBuilderWithContext()`, because calling `GetFilterBitsBuilder()` on the FilterPolicy returned by NewBloomFilterPolicy is no longer supported.
* An unlikely usage of FilterPolicy is no longer supported. Calling GetFilterBitsBuilder() on the FilterPolicy returned by NewBloomFilterPolicy will now cause an assertion violation in debug builds, because RocksDB has internally migrated to a more elaborate interface that is expected to evolve further. Custom implementations of FilterPolicy should work as before, except those wrapping the return of NewBloomFilterPolicy, which will require a new override of a protected function in FilterPolicy.
* NewBloomFilterPolicy now takes bits_per_key as a double instead of an int. This permits finer control over the memory vs. accuracy trade-off in the new Bloom filter implementation and should not change source code compatibility.
* The option BackupableDBOptions::max_valid_backups_to_open is now only used when opening BackupEngineReadOnly. When opening a read/write BackupEngine, anything but the default value logs a warning and is treated as the default. This change ensures that backup deletion has proper accounting of shared files to ensure they are deleted when no longer referenced by a backup.
Expand Down
178 changes: 171 additions & 7 deletions db/db_bloom_filter_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -648,15 +648,17 @@ TEST_F(DBBloomFilterTest, BloomFilterReverseCompatibility) {
}

namespace {
// A wrapped bloom over default FilterPolicy
class WrappedBloom : public FilterPolicy {
// A wrapped bloom over block-based FilterPolicy
class TestingWrappedBlockBasedFilterPolicy : public FilterPolicy {
public:
explicit WrappedBloom(int bits_per_key)
explicit TestingWrappedBlockBasedFilterPolicy(int bits_per_key)
: filter_(NewBloomFilterPolicy(bits_per_key, true)), counter_(0) {}

~WrappedBloom() override { delete filter_; }
~TestingWrappedBlockBasedFilterPolicy() override { delete filter_; }

const char* Name() const override { return "WrappedRocksDbFilterPolicy"; }
const char* Name() const override {
return "TestingWrappedBlockBasedFilterPolicy";
}

void CreateFilter(const rocksdb::Slice* keys, int n,
std::string* dst) const override {
Expand All @@ -683,12 +685,13 @@ class WrappedBloom : public FilterPolicy {
};
} // namespace

TEST_F(DBBloomFilterTest, BloomFilterWrapper) {
TEST_F(DBBloomFilterTest, WrappedBlockBasedFilterPolicy) {
Options options = CurrentOptions();
options.statistics = rocksdb::CreateDBStatistics();

BlockBasedTableOptions table_options;
WrappedBloom* policy = new WrappedBloom(10);
TestingWrappedBlockBasedFilterPolicy* policy =
new TestingWrappedBlockBasedFilterPolicy(10);
table_options.filter_policy.reset(policy);
options.table_factory.reset(NewBlockBasedTableFactory(table_options));

Expand Down Expand Up @@ -718,6 +721,166 @@ TEST_F(DBBloomFilterTest, BloomFilterWrapper) {
ASSERT_EQ(2U * maxKey, policy->GetCounter());
}

namespace {
// NOTE: This class is referenced by HISTORY.md as a model for a wrapper
// FilterPolicy selecting among configurations based on context.
class LevelAndStyleCustomFilterPolicy : public FilterPolicy {
public:
explicit LevelAndStyleCustomFilterPolicy(int bpk_fifo, int bpk_l0_other,
int bpk_otherwise)
: policy_fifo_(NewBloomFilterPolicy(bpk_fifo)),
policy_l0_other_(NewBloomFilterPolicy(bpk_l0_other)),
policy_otherwise_(NewBloomFilterPolicy(bpk_otherwise)) {}

// OK to use built-in policy name because we are deferring to a
// built-in builder. We aren't changing the serialized format.
const char* Name() const override { return policy_fifo_->Name(); }

FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext& context) const override {
if (context.compaction_style == kCompactionStyleFIFO) {
return policy_fifo_->GetBuilderWithContext(context);
} else if (context.level_at_creation == 0) {
return policy_l0_other_->GetBuilderWithContext(context);
} else {
return policy_otherwise_->GetBuilderWithContext(context);
}
}

FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override {
// OK to defer to any of them; they all can parse built-in filters
// from any settings.
return policy_fifo_->GetFilterBitsReader(contents);
}

// Defer just in case configuration uses block-based filter
void CreateFilter(const Slice* keys, int n, std::string* dst) const override {
policy_otherwise_->CreateFilter(keys, n, dst);
}
bool KeyMayMatch(const Slice& key, const Slice& filter) const override {
return policy_otherwise_->KeyMayMatch(key, filter);
}

private:
const std::unique_ptr<const FilterPolicy> policy_fifo_;
const std::unique_ptr<const FilterPolicy> policy_l0_other_;
const std::unique_ptr<const FilterPolicy> policy_otherwise_;
};

class TestingContextCustomFilterPolicy
: public LevelAndStyleCustomFilterPolicy {
public:
explicit TestingContextCustomFilterPolicy(int bpk_fifo, int bpk_l0_other,
int bpk_otherwise)
: LevelAndStyleCustomFilterPolicy(bpk_fifo, bpk_l0_other, bpk_otherwise) {
}

FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext& context) const override {
test_report_ += "cf=";
test_report_ += context.column_family_name;
test_report_ += ",cs=";
test_report_ +=
OptionsHelper::compaction_style_to_string[context.compaction_style];
test_report_ += ",lv=";
test_report_ += std::to_string(context.level_at_creation);
test_report_ += "\n";

return LevelAndStyleCustomFilterPolicy::GetBuilderWithContext(context);
}

std::string DumpTestReport() {
std::string rv;
std::swap(rv, test_report_);
return rv;
}

private:
mutable std::string test_report_;
};
} // namespace

TEST_F(DBBloomFilterTest, ContextCustomFilterPolicy) {
for (bool fifo : {true, false}) {
Options options = CurrentOptions();
options.statistics = rocksdb::CreateDBStatistics();
options.compaction_style =
fifo ? kCompactionStyleFIFO : kCompactionStyleLevel;

BlockBasedTableOptions table_options;
auto policy = std::make_shared<TestingContextCustomFilterPolicy>(15, 8, 5);
table_options.filter_policy = policy;
table_options.format_version = 5;
options.table_factory.reset(NewBlockBasedTableFactory(table_options));

CreateAndReopenWithCF({fifo ? "abe" : "bob"}, options);

const int maxKey = 10000;
for (int i = 0; i < maxKey / 2; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
// Add a large key to make the file contain wide range
ASSERT_OK(Put(1, Key(maxKey + 55555), Key(maxKey + 55555)));
Flush(1);
EXPECT_EQ(policy->DumpTestReport(),
fifo ? "cf=abe,cs=kCompactionStyleFIFO,lv=0\n"
: "cf=bob,cs=kCompactionStyleLevel,lv=0\n");

for (int i = maxKey / 2; i < maxKey; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
Flush(1);
EXPECT_EQ(policy->DumpTestReport(),
fifo ? "cf=abe,cs=kCompactionStyleFIFO,lv=0\n"
: "cf=bob,cs=kCompactionStyleLevel,lv=0\n");

// Check that they can be found
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ(Key(i), Get(1, Key(i)));
}
// Since we have two tables / two filters, we might have Bloom checks on
// our queries, but no more than one "useful" per query on a found key.
EXPECT_LE(TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL), maxKey);

// Check that we have two filters, each about
// fifo: 0.12% FP rate (15 bits per key)
// level: 2.3% FP rate (8 bits per key)
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ("NOT_FOUND", Get(1, Key(i + 33333)));
}
{
auto useful_count =
TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL);
EXPECT_GE(useful_count, maxKey * 2 * (fifo ? 0.9980 : 0.975));
EXPECT_LE(useful_count, maxKey * 2 * (fifo ? 0.9995 : 0.98));
}

if (!fifo) { // FIFO only has L0
// Full compaction
ASSERT_OK(db_->CompactRange(CompactRangeOptions(), handles_[1], nullptr,
nullptr));
EXPECT_EQ(policy->DumpTestReport(),
"cf=bob,cs=kCompactionStyleLevel,lv=1\n");

// Check that we now have one filter, about 9.2% FP rate (5 bits per key)
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ("NOT_FOUND", Get(1, Key(i + 33333)));
}
{
auto useful_count =
TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL);
EXPECT_GE(useful_count, maxKey * 0.90);
EXPECT_LE(useful_count, maxKey * 0.91);
}
}

// Destroy
ASSERT_OK(dbfull()->DropColumnFamily(handles_[1]));
dbfull()->DestroyColumnFamilyHandle(handles_[1]);
handles_[1] = nullptr;
}
}

class SliceTransformLimitedDomain : public SliceTransform {
const char* Name() const override { return "SliceTransformLimitedDomain"; }

Expand Down Expand Up @@ -1159,6 +1322,7 @@ TEST_F(DBBloomFilterTest, OptimizeFiltersForHits) {
options.table_factory.reset(NewBlockBasedTableFactory(bbto));
options.optimize_filters_for_hits = true;
options.statistics = rocksdb::CreateDBStatistics();
get_perf_context()->Reset();
get_perf_context()->EnablePerLevelPerfContext();
CreateAndReopenWithCF({"mypikachu"}, options);

Expand Down
5 changes: 5 additions & 0 deletions db/db_test_util.h
Original file line number Diff line number Diff line change
Expand Up @@ -988,6 +988,11 @@ class DBTestBase : public testing::Test {
uint64_t TestGetTickerCount(const Options& options, Tickers ticker_type) {
return options.statistics->getTickerCount(ticker_type);
}

uint64_t TestGetAndResetTickerCount(const Options& options,
Tickers ticker_type) {
return options.statistics->getAndResetTickerCount(ticker_type);
}
};

} // namespace rocksdb
49 changes: 35 additions & 14 deletions include/rocksdb/filter_policy.h
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,12 @@
#include <string>
#include <vector>

#include "rocksdb/advanced_options.h"

namespace rocksdb {

class Slice;
struct BlockBasedTableOptions;

// A class that takes a bunch of keys, then generates filter
class FilterBitsBuilder {
Expand Down Expand Up @@ -80,8 +83,25 @@ class FilterBitsReader {
}
};

// Internal type required for FilterPolicy
struct FilterBuildingContext;
// Contextual information passed to BloomFilterPolicy at filter building time.
// Used in overriding FilterPolicy::GetBuilderWithContext().
struct FilterBuildingContext {
// This constructor is for internal use only and subject to change.
FilterBuildingContext(const BlockBasedTableOptions& table_options);

// Options for the table being built
const BlockBasedTableOptions& table_options;

// Name of the column family for the table (or empty string if unknown)
std::string column_family_name;

// The compactions style in effect for the table
CompactionStyle compaction_style = kCompactionStyleLevel;

// The table level at time of constructing the SST file, or -1 if unknown.
// (The table file could later be used at a different level.)
int level_at_creation = -1;
};

// We add a new format of filter block called full filter block
// This new interface gives you more space of customization
Expand Down Expand Up @@ -125,27 +145,28 @@ class FilterPolicy {

// Return a new FilterBitsBuilder for full or partitioned filter blocks, or
// nullptr if using block-based filter.
// NOTE: This function is only called by GetBuilderWithContext() below for
// custom FilterPolicy implementations. Thus, it is not necessary to
// override this function if overriding GetBuilderWithContext().
virtual FilterBitsBuilder* GetFilterBitsBuilder() const { return nullptr; }

// A newer variant of GetFilterBitsBuilder that allows a FilterPolicy
// to customize the builder for contextual constraints and hints.
// (Name changed to avoid triggering -Werror=overloaded-virtual.)
// If overriding GetFilterBitsBuilder() suffices, it is not necessary to
// override this function.
virtual FilterBitsBuilder* GetBuilderWithContext(
const FilterBuildingContext&) const {
return GetFilterBitsBuilder();
}

// Return a new FilterBitsReader for full or partitioned filter blocks, or
// nullptr if using block-based filter.
// As here, the input slice should NOT be deleted by FilterPolicy.
virtual FilterBitsReader* GetFilterBitsReader(
const Slice& /*contents*/) const {
return nullptr;
}

protected:
// An internal-use-only variant of GetFilterBitsBuilder that allows
// a built-in FilterPolicy to customize the builder for contextual
// constraints and hints. (Name changed to avoid triggering
// -Werror=overloaded-virtual.)
virtual FilterBitsBuilder* GetFilterBitsBuilderInternal(
const FilterBuildingContext&) const {
return GetFilterBitsBuilder();
}

friend FilterBuildingContext;
};

// Return a new filter policy that uses a bloom filter with approximately
Expand Down
Loading

0 comments on commit ca3b6c2

Please sign in to comment.