
Memtable scan performance regression while time series amount < 100K #3467

Open

evenyag opened this issue Mar 8, 2024 · 5 comments

Assignees: evenyag
Labels: A-storage (Involves code in storage engines), C-performance (Category Performance)

evenyag commented Mar 8, 2024

What type of bug is this?

Performance issue

What subsystems are affected?

Storage Engine

Minimal reproduce step

Use TSBS to generate data points for 4,000 hosts:

tsbs_generate_data --use-case="cpu-only" --seed=123 --scale=4000 \
     --timestamp-start="2023-06-11T00:00:00Z" \
     --timestamp-end="2023-06-14T00:00:00Z" \
     --log-interval="10s" --format="influx" \
     > ./influx-data.lp

Load it into the database:

tsbs_load_greptime \
    --urls=http://localhost:14000 \
    --file=./influx-data.lp \
    --batch-size=3000 \
    --gzip=false \
    --workers=6

Enable debug logging for the storage engine:

[logging]
level = "info,mito2=debug"

Select some data:

mysql -u root -h 127.0.0.1 -P 14002

use benchmark;

select count(*) from cpu;

select count(*) from cpu where hostname = 'host_999';

What did you expect to see?

The memtable scan time should be close to that of the old TimeSeriesMemtable. For reference, the old memtable reports:

2024-03-08T07:53:29.592218Z DEBUG mito2::memtable::time_series: Iter 4398046511104(1024, 0) time series memtable, metrics: Metrics { total_series: 4000, num_pruned_series: 0, num_rows: 1626000, num_batches: 4000, scan_cost: 114.815602ms }

2024-03-08T08:46:53.654775Z DEBUG mito2::memtable::time_series: Iter 4398046511104(1024, 0) time series memtable, metrics: Metrics { total_series: 4000, num_pruned_series: 3999, num_rows: 406, num_batches: 1, scan_cost: 11.982725ms }

What did you see instead?

The memtable scan time of the new merge tree memtable is quite high:

  • full scan: 0.24s
  • scan with hostname = 'host_999': 0.22s

Note that the filtered scan fetched only 406 rows (1,626,000 rows over 4,000 series is about 406 rows per series), yet it took almost as long as the full scan.
2024-03-08T09:05:04.993351Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 1626000, batches fetched: 8091, scan elapsed: 0.241762785

2024-03-08T09:05:20.178469Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 406, batches fetched: 2, scan elapsed: 0.224319432

What operating system did you use?

NA

What version of GreptimeDB did you use?

0.7

Relevant log output and stack trace

2024-03-08T09:05:04.993351Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 1626000, batches fetched: 8091, scan elapsed: 0.241762785

2024-03-08T09:05:20.178458Z DEBUG mito2::memtable::merge_tree::partition: TreeIter pruning, before: 8090, after: 1, partition_read_source: 0.013266087s, partition_prune_pk: 0.008633765s, partition_data_batch_to_batch: 0.000018656s
2024-03-08T09:05:20.178469Z DEBUG mito2::memtable::merge_tree::tree: TreeIter partitions total: 1, partitions after prune: 0, rows fetched: 406, batches fetched: 2, scan elapsed: 0.224319432
evenyag added the C-performance (Category Performance) and A-storage (Involves code in storage engines) labels on Mar 8, 2024
evenyag self-assigned this on Mar 8, 2024

evenyag commented Mar 8, 2024

There are several issues related to the slow scan speed:

  • The default data freeze threshold is small because we don't implement compaction in the memtable
  • The shard never freezes a data buffer (fix: freeze data buffer in shard #3468)
  • last_yield_pk_index only stores matched keys, so we have to re-prune unmatched keys (see the sketch after this list)
  • Data buffer and data parts always scan all pk indices
  • projection push down?
  • merge small parts?
  • explicit point get?
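
To illustrate the re-pruning point, here is a minimal sketch in Rust. This is not the actual mito2 code; the PruneCache name and its shape are assumptions. The idea is to remember the prune verdict for every pk_index already evaluated, matched or not, so later batches can skip the repeated predicate evaluation:

use std::collections::HashMap;

// Hypothetical cache: maps a pk_index to whether its key matches the predicate.
struct PruneCache {
    verdicts: HashMap<u32, bool>,
}

impl PruneCache {
    fn new() -> Self {
        Self { verdicts: HashMap::new() }
    }

    // Returns the cached verdict, evaluating the predicate only on a cache miss.
    fn matches(&mut self, pk_index: u32, eval: impl Fn(u32) -> bool) -> bool {
        *self.verdicts.entry(pk_index).or_insert_with(|| eval(pk_index))
    }
}

With a cache like this, the 3,999 unmatched series in the logs above would be pruned once rather than on every fetched batch.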

tisonkun commented

Is this issue the cause of https://github.com/orgs/GreptimeTeam/discussions/3461?

killme2008 commented

> Is this issue the cause of https://github.com/orgs/GreptimeTeam/discussions/3461?

I think so.


evenyag commented Mar 14, 2024

The old TimeSeriesMemtable always outperforms the new memtable when the number of time series is small. We still need to keep the old memtable and maybe use it as the default memtable. The new memtable is mainly optimized for the metric engine.

I think we can support a per-table memtable option and enable the new memtable in the metric engine; a sketch of what the selection could look like follows.
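
A minimal sketch, assuming a per-table option keyed as memtable.type with values time_series and merge_tree (the option name and values are illustrative, not the actual GreptimeDB API):

use std::collections::HashMap;

// The two memtable implementations discussed in this issue.
enum MemtableKind {
    TimeSeries, // old memtable, faster when the number of time series is small
    MergeTree,  // new memtable, optimized for the metric engine
}

// Hypothetical selection: honor an explicit per-table option when present,
// otherwise default to the new memtable only for metric engine tables.
fn memtable_kind(table_options: &HashMap<String, String>, is_metric_engine: bool) -> MemtableKind {
    match table_options.get("memtable.type").map(String::as_str) {
        Some("time_series") => MemtableKind::TimeSeries,
        Some("merge_tree") => MemtableKind::MergeTree,
        _ if is_metric_engine => MemtableKind::MergeTree,
        _ => MemtableKind::TimeSeries,
    }
}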

killme2008 commented

I think we can close this issue for now. @evenyag
