perf: optimize RecordBatch to HttpOutput conversion #4178

Merged on Jun 20, 2024 (8 commits)

@waynexia (Member) commented Jun 20, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

Optimize the HttpRecordsOutput::try_new() method. Converting 409600 rows now takes ~120ms instead of ~400ms, which saves ~70% of the CPU time.

This patch also adds a microbenchmark for that method.
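
For readers who want to reproduce this kind of measurement, here is a minimal timing sketch that uses only the standard library. It is not the benchmark added by this patch; the harness the patch uses is not shown here. The names `mean_time` and `convert` are hypothetical, and `convert` is a stand-in workload rather than the real conversion.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `iters` times and returns the mean wall-clock time per call.
/// `black_box` keeps the optimizer from discarding the unused result.
fn mean_time<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed() / iters
}

/// Hypothetical workload standing in for the conversion under test:
/// formats `rows` integers as strings.
fn convert(rows: usize) -> Vec<String> {
    (0..rows).map(|i| i.to_string()).collect()
}
```

Calling `mean_time(10, || convert(409_600))` gives a rough per-call duration. A dedicated benchmark harness would add warm-up and statistical analysis on top of this.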

Key optimizations are:

  • Change the algorithm used for the column-to-row conversion
  • Change the underlying type of Value::String from Bytes to String
  • Avoid all unnecessary clones
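
The description does not spell out why the Bytes to String change helps. One plausible reason, stated here as an assumption, is that a byte buffer must be UTF-8 validated and copied before it can be used as an owned string, while an owned String can simply be moved. The sketch below illustrates that difference with hypothetical types (`BytesBackedStr`, `OwnedStr`) and uses `Arc<[u8]>` as a standard-library stand-in for a shared byte buffer.

```rust
use std::sync::Arc;

/// Hypothetical string value backed by a shared byte buffer.
struct BytesBackedStr(Arc<[u8]>);

/// Hypothetical string value backed by an owned `String`.
struct OwnedStr(String);

/// Producing a `String` from bytes costs a UTF-8 validation pass plus a copy.
fn bytes_to_owned(v: &BytesBackedStr) -> Result<String, std::str::Utf8Error> {
    std::str::from_utf8(&v.0).map(str::to_owned)
}

/// Producing a `String` from an owned string is a move: no validation, no copy.
fn string_into_owned(v: OwnedStr) -> String {
    v.0
}
```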

The entire conversion method now performs only one clone, which takes each value out of the data vector, and one write, which puts the serialized content into the result vector.
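
As a rough illustration of the one-clone, one-write shape, the following sketch transposes column-major data into rows. It is not the code from this patch: `Cell` and `columns_to_rows` are hypothetical, simplified stand-ins for the real value and output types. Each row is allocated once at its final capacity, and every cell is cloned exactly once from its column and pushed exactly once into its row.

```rust
/// Hypothetical, simplified cell type; not GreptimeDB's actual value type.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    Null,
    Int(i64),
    Text(String),
}

/// Converts column-major data into row-major data.
/// Walks the input column by column, so each column is read sequentially.
fn columns_to_rows(columns: &[Vec<Cell>]) -> Vec<Vec<Cell>> {
    let num_cols = columns.len();
    let num_rows = columns.first().map_or(0, Vec::len);

    // Allocate every row up front with room for one cell per column.
    let mut rows: Vec<Vec<Cell>> = (0..num_rows)
        .map(|_| Vec::with_capacity(num_cols))
        .collect();

    for column in columns {
        assert_eq!(column.len(), num_rows, "all columns must have the same length");
        for (row, cell) in rows.iter_mut().zip(column) {
            // The only clone (out of the column) and the only write (into the row).
            row.push(cell.clone());
        }
    }
    rows
}
```

Because the rows are pre-sized, the pushes never reallocate, and no intermediate per-row or per-column buffers are created.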

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

Signed-off-by: Ruihang Xia <[email protected]>
@waynexia added the C-performance (Category Performance) and A-query (Involves code in query path) labels Jun 20, 2024
@waynexia requested a review from a team as a code owner June 20, 2024 10:01
@github-actions bot added the docs-not-required (This change does not impact docs.) label Jun 20, 2024

codecov bot commented Jun 20, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 22 lines in your changes missing coverage. Please review.

Project coverage is 84.79%. Comparing base (cc2f7ef) to head (4923089).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4178      +/-   ##
==========================================
- Coverage   85.12%   84.79%   -0.33%     
==========================================
  Files        1020     1022       +2     
  Lines      179635   179887     +252     
==========================================
- Hits       152920   152543     -377     
- Misses      26715    27344     +629     

@fengjiachun (Collaborator) left a comment

LGTM

@waynexia enabled auto-merge June 20, 2024 12:33
@waynexia added this pull request to the merge queue Jun 20, 2024
Merged via the queue into GreptimeTeam:main with commit 21c89f3 Jun 20, 2024
49 checks passed
@waynexia deleted the optimize-http-output branch June 20, 2024 12:54
zyy17 pushed a commit to zyy17/greptimedb that referenced this pull request Jun 22, 2024
* add benchmark

Signed-off-by: Ruihang Xia <[email protected]>

* save 70ms

Signed-off-by: Ruihang Xia <[email protected]>

* add profiler

Signed-off-by: Ruihang Xia <[email protected]>

* save 50ms

Signed-off-by: Ruihang Xia <[email protected]>

* save 160ms

Signed-off-by: Ruihang Xia <[email protected]>

* format toml file

Signed-off-by: Ruihang Xia <[email protected]>

* fix license header

Signed-off-by: Ruihang Xia <[email protected]>

* fix windows build

Signed-off-by: Ruihang Xia <[email protected]>

---------

Signed-off-by: Ruihang Xia <[email protected]>
Labels: A-query (Involves code in query path), C-performance (Category Performance), docs-not-required (This change does not impact docs.)
3 participants