Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace khash with maps in ddoc_cache_lru #4987

Merged
merged 1 commit into from
Feb 19, 2024
Merged

Conversation

nickva
Copy link
Contributor

@nickva nickva commented Feb 16, 2024

This is a companion PR to the previous one for couch_event_server [1].

Implementation notes:

  • Some functionality was moved to helper functions (do_refresh, get_entries)

  • The main concern was to make sure to return the updated map object, when before we just returned ok.

  • ddoc_cache_lru already had an almost 100% test coverage so opted to rely those tests.

  • Performance with maps seems to at least as good or better than with the khash nif [2].

[1] #4977
[2] https://gist.github.com/nickva/06f1511b7f9d0bbc2d9a0dfc2d36779e

Performance benchmark
(MacOS, Intel, Erlang/OTP 24)

With maps:

> ddoc_cache_bench:go().

Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       202.5        11.1
1000   5      5       192.1        21.9
10000  1      1       176.8        65.6
1      10000  1       180.3        30.7
1      1      10000   185.9        32.1

GC Runs: 4733375

With the khash NIF:

> ddoc_cache_bench:go().

Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       202.2        12.6
1000   5      5       193.2        26.0
10000  1      1       183.1        75.0
1      10000  1       182.8        40.0
1      1      10000   182.8        38.5

GC Runs: 4799318

end,
lists:foreach(Fun, [no_ddocid | DDocIdList]).

remove_entry({Key, Pid}, #st{pids = Pids, dbs = Dbs} = St) ->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: remove_entry/2 args were changed so that it can be directly used in a lists:foldl callback.

This is a companion PR to the previous one for couch_event_server [1].

Implementation notes:

 * Some functionality was moved to helper functions (do_refresh, get_entries)

 * The main concern was to make sure to return the updated map object, when
   before we just returned `ok`.

 * ddoc_cache_lru already had an almost 100% test coverage so opted to rely
   those tests.

 * Performance with maps seems to at least as good or better than with the
   khash nif [2].

[1] #4977
[2] https://gist.github.com/nickva/06f1511b7f9d0bbc2d9a0dfc2d36779e
Copy link
Member

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so_long_partner

Goodbye, khash! You were such a good nif all these years!

@nickva
Copy link
Contributor Author

nickva commented Feb 16, 2024

@davisp I sorry. It was a good hashing data structure long before maps existed!

@nickva
Copy link
Contributor Author

nickva commented Feb 17, 2024

I did a run on Linux (Ubuntu) with Erlang 24:

With maps

ddoc_cache_bench:go().

Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       182.0        8.8    
1000   5      5       178.6        17.6   
10000  1      1       159.3        54.9   
1      10000  1       160.6        28.6   
1      1      10000   166.1        28.4   

GC Runs: 4777049

With khash

> ddoc_cache_bench:go().

Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       185.5        11.7   
1000   5      5       180.3        22.7   
10000  1      1       167.5        67.9   
1      10000  1       164.8        33.7   
1      1      10000   167.8        31.3   

GC Runs: 4836877

@jaydoane
Copy link
Contributor

I ran some benchmarks with OTP 26 on an M1 Max.
With khash:

([email protected])1> ddoc_cache_bench:go().

Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       111.0        5.7
1000   5      5       105.0        12.7
10000  1      1       99.0         43.4
1      10000  1       98.8         20.5
1      1      10000   98.7         17.9

GC Runs: 6386402

With maps:


Dbs    DocIds Revs    Open(Dt/Key) Close(Dt/Key)
100    100    5       86.7         4.4
1000   5      5       81.8         10.7
10000  1      1       77.3         43.1
1      10000  1       76.4         14.4
1      1      10000   78.0         15.1

GC Runs: 6332692

Looks like nice across the board improvement!

@jaydoane
Copy link
Contributor

Have you considered adding ddoc_cache_bench.erl to the repo?

@nickva
Copy link
Contributor Author

nickva commented Feb 19, 2024

Hmm, that could work, but since ddoc_cache application restarts as part of test to get a clean slate, it would blow away the ddoc cache, so it might be a bit too aggressive.

Copy link
Contributor

@jaydoane jaydoane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice improvement!

@nickva nickva merged commit 2064f21 into main Feb 19, 2024
14 checks passed
@nickva nickva deleted the replace-khash-in-ddoc-cache branch February 19, 2024 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants