
Ghost queue #306

Draft · wants to merge 245 commits into base: main

Conversation

@xiaguan (Contributor) commented Apr 7, 2024

What's changed and what's your intention?

Still thinking; I'm a bit confused.
Pingora's tinyufo is the state of the art on this zipf bench.

Checklist

  • I have written the necessary rustdoc comments
  • I have added the necessary unit tests and integration tests
  • I have passed make all (or make fast instead if the old tests are not modified) in my local environment.

Related issues or PRs (optional)

MrCroxx added 30 commits May 15, 2023 15:14

  • feat: introduce intrusive doubly linked list
  • chore: set up CI
  • feat: introduce TinyLfu eviction policy
  • feat: add foyer bench, reorg workspace
  • ci: add asan test
  • chore: add license checker
  • feat: introduce intrusive indexers and collections
  • feat: introduce FTL-like storage engine
  • feat: enable direct i/o on linux target; refine flusher and reclaimer
  • chore: remove unused old storage engine and other components
  • feat: impl storage recovery
  • feat: add segment fifo eviction policy
  • feat: export mods

MrCroxx and others added 6 commits March 31, 2024 09:26

  • chore: update license (fix license checker, update license header)
@xiaguan xiaguan marked this pull request as draft April 7, 2024 08:31
@xiaguan xiaguan mentioned this pull request Apr 7, 2024
@PsiACE commented Apr 23, 2024

Ghost is an important design; it is the key to achieving a high hit rate in s3fifo. I also noticed some other work, but unfortunately, they did not explain why the ghost is important.

Although tinyufo claims that its tinylfu functions as a ghost, the mechanisms of the two are different, which may be why it performs poorly in real workloads.
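For context, here is a toy sketch of the role the ghost plays in s3fifo: a FIFO that remembers only the keys of objects recently evicted from the small queue, so that a returning object is admitted straight into the main queue. This is illustrative only, not foyer's or the paper's exact implementation:

```python
from collections import OrderedDict

class ToyS3Fifo:
    """Toy s3fifo sketch (illustrative, not foyer's implementation).

    Three queues: a small FIFO (~10% of capacity by default), a main
    FIFO, and a ghost FIFO holding only keys of recent small-queue
    evictions.
    """

    def __init__(self, capacity, small_ratio=0.1):
        self.small_cap = max(1, int(capacity * small_ratio))
        self.main_cap = capacity - self.small_cap
        self.small = OrderedDict()  # key -> value, FIFO order
        self.main = OrderedDict()
        self.ghost = OrderedDict()  # keys only, no values
        self.freq = {}              # tiny access counter, capped at 3

    def get(self, key):
        for queue in (self.small, self.main):
            if key in queue:
                self.freq[key] = min(self.freq.get(key, 0) + 1, 3)
                return queue[key]
        return None

    def put(self, key, value):
        if key in self.small or key in self.main:
            return
        if key in self.ghost:
            # Ghost hit: the object came back after being evicted too
            # early, so it is admitted directly into the main queue.
            del self.ghost[key]
            self._evict_main()
            self.main[key] = value
        else:
            self._evict_small()
            self.small[key] = value
        self.freq[key] = 0

    def _evict_small(self):
        while len(self.small) >= self.small_cap:
            k, v = self.small.popitem(last=False)
            if self.freq.pop(k, 0) > 0:
                self._evict_main()
                self.main[k] = v       # reused at least once: promote
                self.freq[k] = 0
            else:
                self.ghost[k] = None   # remember the key only
                if len(self.ghost) > self.main_cap:
                    self.ghost.popitem(last=False)

    def _evict_main(self):
        while len(self.main) >= self.main_cap:
            k, v = self.main.popitem(last=False)
            if self.freq.get(k, 0) > 0:
                self.freq[k] -= 1
                self.main[k] = v       # second chance, decay counter
            else:
                self.freq.pop(k, None)
```

Without the ghost, a workload heavy in one-hit wonders would keep pushing genuinely popular objects out of the tiny small queue before their second access can ever be observed.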

@MrCroxx (Owner) commented Apr 23, 2024

It is already supported via #400

@PsiACE commented Apr 23, 2024

It is already supported via #400

Great, I'm glad to see it implemented and the benchmark is as expected. Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

@xiaguan (Contributor, Author) commented Apr 23, 2024

Ghost is an important design, it is the key to achieving high hit rate in s3fifo. I also noticed some other work, but unfortunately, they did not indicate why ghost is important.

Although tinyufo claims that its tinylfu functions as a ghost, the mechanisms of the two are different, which may be why it performs poorly in real workloads.

I did a real-workload bench for tinyufo (cloudflare/pingora#162).
I also think tinyufo doesn't really perform well on real-world workloads. Actual workloads don't strictly follow the Zipf distribution, and different caching algorithms can perform very differently depending on the cache size.

So, given a workload, there are a bunch of caching algorithms, cache sizes, and throughput options to pick from; and if it's a hybrid cache, you've also got to consider the mix of memory and SSD. What I'm getting at is: given a workload, how do we figure out the best setup for it? I think there was a paper at FAST '24 that touched on something like this?

As for s3-fifo, I think the small queue having a small proportion is key to its high hit rate. The s3-fifo paper points out that at massive capacities, the small queue ratio needs to shrink even further, maybe even to 0.1%.
I'm testing adding a mid_queue between the small queue and the main queue, trading a slight hit in hit rate for keeping hotter objects in memory.

@xiaguan (Contributor, Author) commented Apr 23, 2024

Great, I'm glad to see it implemented and the benchmark is as expected. Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

Maybe you could try reaching out to Juncheng, the author of s3-fifo. I've hit him up on Twitter before with some questions, and he was really helpful in his responses. I'm pretty sure he's an absolute guru in this area.

@xiaguan (Contributor, Author) commented Apr 23, 2024

I think there was a paper at FAST '24 that touched on something like this?

Kosmo: Efficient Online Miss Ratio Curve Generation for Eviction Policy Evaluation

@PsiACE commented Apr 23, 2024

Maybe you could try reaching out to Juncheng, the author of s3-fifo. I've hit him up on Twitter before with some questions, and he was really helpful in his responses. I'm pretty sure he's an absolute guru in this area.

I had a video call with him at the beginning of the month to ask for advice on some issues, and it was really helpful. Right now, my focus is mainly on admission policies as I think there might be more new opportunities here.

@xiaguan (Contributor, Author) commented Apr 26, 2024

Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

What I want to point out is that the algorithms that perform really well (state of the art?) on zipf usually have lower robustness, meaning they're a bit more limited in where they can be applied.

@PsiACE commented Apr 26, 2024

I wanna point out is that the algorithms that perform really well (state of art?) on zipf usually have lower robustness, meaning they're a bit more limited in where they can be applied.

I am making some improvements to make it compete with s3fifo. At least it has shown potential in some real workloads at present.

@xiaguan (Contributor, Author) commented Apr 26, 2024

I am making some improvements to make it compete with s3fifo. At least it has shown potential in some real workloads at present.

Looking forward to seeing your thoughts published. Keep fighting 😍

@PsiACE commented Apr 26, 2024

Looking forward to seeing your thoughts published. Keep fighting 😍

Trace s1 in mokabench: s3fifo powered by foyer vs. s3uno, based on tinyufo and my new idea.

S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381
S3FIFO, 800000, 2, 1979686, 3995316, 50.450, 1.787
S3FIFO, 800000, 4, 1979597, 3995316, 50.452, 2.580
S3FIFO, 800000, 8, 1979488, 3995316, 50.455, 5.262
S3FIFO, 800000, 16, 1979362, 3995316, 50.458, 7.335
S3UNO, 800000, 1, 1872306, 3995316, 53.137, 2.863
S3UNO, 800000, 2, 1931857, 3995316, 51.647, 2.164
S3UNO, 800000, 4, 1950299, 3995316, 51.185, 1.230
S3UNO, 800000, 8, 1964574, 3995316, 50.828, 1.012
S3UNO, 800000, 16, 1973314, 3995316, 50.609, 0.893

Trace s2:

S3FIFO, 800000, 1, 7070341, 17253074, 59.020, 7.025
S3FIFO, 800000, 2, 7070293, 17253074, 59.020, 9.724
S3FIFO, 800000, 4, 7070163, 17253074, 59.021, 12.668
S3FIFO, 800000, 8, 7070442, 17253074, 59.019, 25.313
S3FIFO, 800000, 16, 7070639, 17253074, 59.018, 33.102
S3UNO, 800000, 1, 5805572, 17253074, 66.351, 13.772
S3UNO, 800000, 2, 5999752, 17253074, 65.225, 11.430
S3UNO, 800000, 4, 6065727, 17253074, 64.843, 5.739
S3UNO, 800000, 8, 6107475, 17253074, 64.601, 4.894
S3UNO, 800000, 16, 6136147, 17253074, 64.434, 4.362
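As an aside, the hit-ratio column in these rows is consistent with 1 - inserts/reads, and a small parser makes them easier to work with. The column meanings below are inferred from the numbers themselves, not taken from mokabench's documentation:

```python
def parse_row(line):
    """Parse one benchmark result line.

    Inferred columns (the sixth equals 1 - inserts/reads, so inserts
    look like miss-driven insertions): policy, capacity, clients,
    inserts, reads, hit ratio (%), elapsed seconds.
    """
    policy, cap, clients, inserts, reads, hit, secs = (
        field.strip() for field in line.split(",")
    )
    return {
        "policy": policy,
        "capacity": int(cap),
        "clients": int(clients),
        "inserts": int(inserts),
        "reads": int(reads),
        "hit_ratio": float(hit),
        "seconds": float(secs),
    }

# Cross-check the hit-ratio column against inserts/reads.
row = parse_row("S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381")
derived = (1 - row["inserts"] / row["reads"]) * 100
assert abs(derived - row["hit_ratio"]) < 0.01
```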

@xiaguan (Contributor, Author) commented Apr 26, 2024

s1 in mokabench, s3fifo powered by foyer, s3uno based on tinyufo and my new idea.

S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381
S3FIFO, 800000, 2, 1979686, 3995316, 50.450, 1.787
S3FIFO, 800000, 4, 1979597, 3995316, 50.452, 2.580
S3FIFO, 800000, 8, 1979488, 3995316, 50.455, 5.262
S3FIFO, 800000, 16, 1979362, 3995316, 50.458, 7.335
S3UNO, 800000, 1, 1872306, 3995316, 53.137, 2.863
S3UNO, 800000, 2, 1931857, 3995316, 51.647, 2.164
S3UNO, 800000, 4, 1950299, 3995316, 51.185, 1.230
S3UNO, 800000, 8, 1964574, 3995316, 50.828, 1.012
S3UNO, 800000, 16, 1973314, 3995316, 50.609, 0.893

Seems great! But foyer's s3fifo impl has a bug; see #432

@PsiACE commented Apr 26, 2024

Seems great ! But foyer's s3fifo impl has a bug, see #432

I will test the latest version. Currently, I believe my focus should be on improving the hit rate as there is limited scope for optimizing s3uno's performance.

@PsiACE commented Apr 26, 2024

Seems great ! But foyer's s3fifo impl has a bug, see #432

Yes, #432 fixed some issues. s3fifo appears to be working fine at low-capacity hit rates and shows some improvement at high capacity. However, s3uno is still comparable to (or even better than) s3fifo as the capacity increases. I think I also need to check for any issues with low capacity in s3uno.

@ben-manes commented

s3fifo does poorly in db, search, analytics, etc., and is unable to cope with varying workloads. It's very good for blob key-value stores like a cdn, but there it's on par with many other algorithms. You should try a variety of workloads at different sizes, and chain different patterns to see how your new policy responds.

Caffeine’s w-tinylfu uses hill climbing to adapt to the workload and jitter to correct for a mistakenly hot victim (e.g. hash flooding). If you run traces you’ll observe that it reliably has a high hit rate, equal or better than s3fifo in their workloads.
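The hill-climbing adaptation mentioned here can be sketched in a few lines: keep moving a tunable (e.g. the window fraction) in the same direction while the hit rate improves, and reverse direction when it degrades. This is a simplified illustration, not Caffeine's actual climber, which also decays step sizes, restarts, and adds jitter:

```python
class HillClimber:
    """Toy hill climber for a cache tunable (simplified sketch).

    Adjusts the fraction of capacity devoted to the admission window
    based on the hit rate observed over each sampling period.
    """

    def __init__(self, step=0.05):
        self.window = 0.5        # fraction of capacity for the window
        self.step = step         # current step; sign encodes direction
        self.prev_hit_rate = 0.0

    def observe(self, hit_rate):
        # Keep climbing while the hit rate improves; reverse direction
        # when the last move made things worse.
        if hit_rate < self.prev_hit_rate:
            self.step = -self.step
        self.window = min(0.8, max(0.0, self.window + self.step))
        self.prev_hit_rate = hit_rate
        return self.window
```

Fed with per-period hit rates, the climber oscillates around whichever window size the current workload rewards, which is how a single policy can track both recency-biased and frequency-biased phases.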

The question is if you want a general algorithm that is broadly strong or a specialized one that is very good in a targeted scenario.

@MrCroxx (Owner) commented Apr 29, 2024

Hi @ben-manes , nice to hear the expert opinion from you. 🙌

You should try a variety of workloads at different sizes, and chain different pattern to see how your new policy responds.

Just as you advise, I'm working on it with mokabench (FYI, the PR) as a quick preview. And I'm planning to create a new bench tool that supports more features and better automation.

And THANK YOU A LOT for creating Caffeine. When designing and implementing foyer, it inspired me a lot!! 🥰

And, pardon me, may I ask where you heard about foyer? I'm really curious, because foyer is currently not mature in many aspects and is not widely used. 😃

@ben-manes commented

oh, I periodically do a github-wide search on keywords and look at recent activity. That was originally to try to get api feedback by seeing what problems users ran into, then it became helping users who seemed stuck, and finally just motivation to participate in fun discussions. I think I saw your moka pr and some previous threads, and am curious to see @PsiACE's new algorithm when unveiled.

When implementing a policy, I'd recommend running the author's simulator (if available) on a range of workload patterns as a validation check. It's really easy to make a mistake in an algorithm where nothing breaks and just the hit rates silently don't match. A small subtle difference can have a big impact, which is frustrating but enlightening to understand. I'll rewrite traces into the author's format, run a variety of patterns, try to get a perfect match on a shared policy (usually lru), and then on their own algorithm. That really helps too because then you have a correct, simple version of the algorithm to apply ideas on top of or port into your cache and validate against. Your cache will likely deviate in small ways for pragmatic or security reasons, but keeping it within a margin of error helps avoid confusion. I say that but I also mostly eyeball against Caffeine and sometimes forget to backport hit rate improvements to the idealized w-tinylfu version.

And fwiw, Caffeine was a lot of fun to write and learn from but took about a decade to figure out the design (~2008 onward). I'm really happy with how it turned out, but I had no clue what I was doing when deciding to take a hard problem as a weekend project as an excuse to learn from (e.g. by reading papers, playing with concurrent algorithms). I think having no expectations or time pressures helped a lot since it is just a hobby. So its really fun to see what ideas others come up with as they play around in this topic.

@PsiACE commented Apr 29, 2024

Thanks @ben-manes . Caffeine is so cool.

When implementing a policy...

If I had seen it earlier, I might have avoided many mistakes. That's what I've been doing lately.

...and am curious to see @PsiACE new algorithm when unveiled.

Currently, my algorithm shows better adaptability than s3fifo and tinyufo. I am now exploring how to make it compete with moka across more workloads, which is quite challenging. If it does well, I think I'll show it.

@ben-manes commented Apr 29, 2024

Not sure if this idea is helpful, but here's another technique that I was playing with a few years ago when exploring latency-aware caching. In those scenarios the hit / miss penalty varies across entries so overall user perceived performance is governed by responsiveness instead of the hit rate alone. For example analytical dashboards where charts have different load times and maximizing the hit rate might incorrectly favor retaining many fast loading charts over the slow lumbering ones. I was unable to gather enough trace data to experiment with so shelved it, but the general techniques might inspire you for other purposes.

My idea was to compute the running average miss penalty (via an exponentially weighted moving average or exponential smoothing), making it an O(1) cost to maintain. An entry's miss penalty would be normalized as a stepwise magnitude around the mean to be scored as cheap or expensive. This way if there was a network hiccup then that temporary skew would not pollute the cache by misidentifying expensive items by them being relative to the current baseline. Tinylfu admission would multiply the estimated frequency by the latency factor so a (popular x cheap) entry could be compared against an (unpopular x expensive) one on admission, with admission jitter able to bounce a mistakenly retained victim. The hill climber might instead monitor the total penalty instead of the hit rate over its observation period. This was a thought experiment - I have no idea if it would actually work!
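The thought experiment above can be sketched directly. All names, thresholds, and step boundaries here are illustrative choices, not anything from Caffeine:

```python
class LatencyAwareAdmission:
    """Sketch of latency-aware tinylfu admission (thought experiment).

    Maintains an exponentially smoothed average miss penalty in O(1),
    normalizes each entry's penalty as a stepwise factor around that
    mean, and decides admission by comparing frequency x latency
    factor between candidate and victim.
    """

    def __init__(self, alpha=0.1):
        self.alpha = alpha       # smoothing factor for the EWMA
        self.avg_penalty = 0.0   # running average miss penalty

    def record_miss(self, penalty):
        # Exponential smoothing: O(1) per miss.
        self.avg_penalty += self.alpha * (penalty - self.avg_penalty)

    def factor(self, penalty):
        # Stepwise magnitude around the mean, so a temporary network
        # hiccup can't relabel everything as "expensive".
        if self.avg_penalty == 0:
            return 1
        ratio = penalty / self.avg_penalty
        if ratio < 0.5:
            return 1   # cheap
        if ratio < 2.0:
            return 2   # around the mean
        return 4       # expensive

    def admit(self, cand_freq, cand_penalty, victim_freq, victim_penalty):
        # Admit when (frequency x latency factor) favors the candidate.
        return (cand_freq * self.factor(cand_penalty)
                > victim_freq * self.factor(victim_penalty))
```

With this scoring, an unpopular-but-expensive entry can win admission over a popular-but-cheap victim, which is exactly the trade-off plain hit-rate maximization misses.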

Maybe that's more to say look outside of caching papers for ideas. Most papers make trivial changes to existing algorithms, using the same worn out techniques and merely adjust whether it is biased towards LRU or MRU. The major impacts come from adapting proven techniques that are being overlooked. There might be an approach used elsewhere (e.g. a PID controller) that makes for a surprisingly great result.
