
Ghost queue #306

Draft · wants to merge 245 commits into base: main

Conversation

@xiaguan (Contributor) commented Apr 7, 2024

What's changed and what's your intention?

Still thinking; I'm a bit confused.
Pingora's tinyufo is the state of the art on this zipf bench.

Checklist

  • I have written the necessary rustdoc comments
  • I have added the necessary unit tests and integration tests
  • I have passed make all (or make fast instead if the old tests are not modified) in my local environment.

Related issues or PRs (optional)

MrCroxx added 30 commits May 15, 2023 15:14

  • feat: introduce intrusive doubly linked list
  • chore: set up CI
  • feat: introduce TinyLfu eviction policy
  • feat: add foyer bench, reorg workspace
  • ci: add asan test
  • chore: add license checker
  • feat: introduce intrusive indexers and collections
  • feat: introduce FTL-like storage engine
  • feat: enable direct i/o on linux target; refine flusher and reclaimer
  • chore: remove unused old storage engine and other components
  • feat: impl storage recovery
  • feat: add segment fifo eviction policy
  • feat: export mods

MrCroxx and others added 6 commits March 31, 2024 09:26

  • chore: update license (fix license checker, update license header)
@xiaguan xiaguan marked this pull request as draft April 7, 2024 08:31
@xiaguan xiaguan mentioned this pull request Apr 7, 2024
@PsiACE commented Apr 23, 2024

Ghost is an important design; it is the key to achieving a high hit rate in s3fifo. I also noticed some other work, but unfortunately, they did not explain why the ghost is important.

Although tinyufo claims that its tinylfu functions as a ghost, the mechanisms of the two are different, which may be why it performs poorly in real workloads.
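For context, here is a toy sketch of the role the ghost plays in s3fifo: a FIFO that remembers only the keys of objects recently evicted from the small queue, so that a returning object is admitted straight into the main queue. This is illustrative only, not foyer's or the paper's exact implementation:

```python
from collections import OrderedDict

class ToyS3Fifo:
    """Toy s3fifo sketch (illustrative, not foyer's implementation).

    Three queues: a small FIFO (~10% of capacity by default), a main
    FIFO, and a ghost FIFO holding only keys of recent small-queue
    evictions.
    """

    def __init__(self, capacity, small_ratio=0.1):
        self.small_cap = max(1, int(capacity * small_ratio))
        self.main_cap = capacity - self.small_cap
        self.small = OrderedDict()  # key -> value, FIFO order
        self.main = OrderedDict()
        self.ghost = OrderedDict()  # keys only, no values
        self.freq = {}              # tiny access counter, capped at 3

    def get(self, key):
        for queue in (self.small, self.main):
            if key in queue:
                self.freq[key] = min(self.freq.get(key, 0) + 1, 3)
                return queue[key]
        return None

    def put(self, key, value):
        if key in self.small or key in self.main:
            return
        if key in self.ghost:
            # Ghost hit: the object came back after being evicted too
            # early, so it is admitted directly into the main queue.
            del self.ghost[key]
            self._evict_main()
            self.main[key] = value
        else:
            self._evict_small()
            self.small[key] = value
        self.freq[key] = 0

    def _evict_small(self):
        while len(self.small) >= self.small_cap:
            k, v = self.small.popitem(last=False)
            if self.freq.pop(k, 0) > 0:
                self._evict_main()
                self.main[k] = v       # reused at least once: promote
                self.freq[k] = 0
            else:
                self.ghost[k] = None   # remember the key only
                if len(self.ghost) > self.main_cap:
                    self.ghost.popitem(last=False)

    def _evict_main(self):
        while len(self.main) >= self.main_cap:
            k, v = self.main.popitem(last=False)
            if self.freq.get(k, 0) > 0:
                self.freq[k] -= 1
                self.main[k] = v       # second chance, decay counter
            else:
                self.freq.pop(k, None)
```

Without the ghost, a workload heavy in one-hit wonders would keep pushing genuinely popular objects out of the tiny small queue before their second access can ever be observed.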

@MrCroxx (Owner) commented Apr 23, 2024

It is already supported via #400

@PsiACE commented Apr 23, 2024

It is already supported via #400

Great, I'm glad to see it implemented and the benchmark is as expected. Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

@xiaguan (Contributor, Author) commented Apr 23, 2024

Ghost is an important design, it is the key to achieving high hit rate in s3fifo. I also noticed some other work, but unfortunately, they did not indicate why ghost is important.

Although tinyufo claims that its tinylfu functions as a ghost, the mechanisms of the two are different, which may be why it performs poorly in real workloads.

I did a real-workload bench for tinyufo (cloudflare/pingora#162).
I also think tinyufo doesn't really perform well on real-world workloads. Actual workloads don't strictly follow the Zipf distribution, and different caching algorithms can perform very differently depending on the cache size.

So, given a workload, there are a bunch of caching algorithms, cache sizes, and throughput options to pick from; and if it's a hybrid cache, you've also got to consider the mix of memory and SSD. What I'm getting at is: given a workload, how do we figure out the best setup for it? I think there was a paper at FAST '24 that touched on something like this?

As for s3-fifo, I think the small queue having a small proportion is key to its high hit rate. The s3-fifo paper points out that at massive capacities, the small queue ratio needs to shrink even further, maybe even to 0.1%.
I'm testing adding a mid_queue between the small queue and the main queue, trading a slight hit in hit rate for keeping hotter objects in memory.

@xiaguan (Contributor, Author) commented Apr 23, 2024

Great, I'm glad to see it implemented and the benchmark is as expected. Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

Maybe you could try reaching out to Juncheng, the author of s3-fifo. I've hit him up on Twitter before with some questions, and he was really helpful in his responses. I'm pretty sure he's an absolute guru in this area.

@xiaguan (Contributor, Author) commented Apr 23, 2024

I think there was a paper at FAST '24 that touched on something like this?

Kosmo: Efficient Online Miss Ratio Curve Generation for Eviction Policy Evaluation

@PsiACE commented Apr 23, 2024

Maybe you could try reaching out to Juncheng, the author of s3-fifo. I've hit him up on Twitter before with some questions, and he was really helpful in his responses. I'm pretty sure he's an absolute guru in this area.

I had a video call with him at the beginning of the month to ask for advice on some issues, and it was really helpful. Right now, my focus is mainly on admission policies as I think there might be more new opportunities here.

@xiaguan (Contributor, Author) commented Apr 26, 2024

Recently, I designed another method that achieves state of the art in zipf, but it doesn't perform well with real workloads, so that's why I'm commenting here.

What I want to point out is that the algorithms that perform really well (state of the art?) on zipf usually have lower robustness, meaning they're a bit more limited in where they can be applied.

@PsiACE commented Apr 26, 2024

I wanna point out is that the algorithms that perform really well (state of art?) on zipf usually have lower robustness, meaning they're a bit more limited in where they can be applied.

I am making some improvements to make it compete with s3fifo. At least it has shown potential in some real workloads at present.

@xiaguan (Contributor, Author) commented Apr 26, 2024

I am making some improvements to make it compete with s3fifo. At least it has shown potential in some real workloads at present.

Looking forward to seeing your thoughts published. Keep fighting 😍

@PsiACE commented Apr 26, 2024

Looking forward to seeing your thoughts published. Keep fighting 😍

Trace s1 in mokabench: s3fifo powered by foyer vs. s3uno, based on tinyufo and my new idea.

S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381
S3FIFO, 800000, 2, 1979686, 3995316, 50.450, 1.787
S3FIFO, 800000, 4, 1979597, 3995316, 50.452, 2.580
S3FIFO, 800000, 8, 1979488, 3995316, 50.455, 5.262
S3FIFO, 800000, 16, 1979362, 3995316, 50.458, 7.335
S3UNO, 800000, 1, 1872306, 3995316, 53.137, 2.863
S3UNO, 800000, 2, 1931857, 3995316, 51.647, 2.164
S3UNO, 800000, 4, 1950299, 3995316, 51.185, 1.230
S3UNO, 800000, 8, 1964574, 3995316, 50.828, 1.012
S3UNO, 800000, 16, 1973314, 3995316, 50.609, 0.893

Trace s2:

S3FIFO, 800000, 1, 7070341, 17253074, 59.020, 7.025
S3FIFO, 800000, 2, 7070293, 17253074, 59.020, 9.724
S3FIFO, 800000, 4, 7070163, 17253074, 59.021, 12.668
S3FIFO, 800000, 8, 7070442, 17253074, 59.019, 25.313
S3FIFO, 800000, 16, 7070639, 17253074, 59.018, 33.102
S3UNO, 800000, 1, 5805572, 17253074, 66.351, 13.772
S3UNO, 800000, 2, 5999752, 17253074, 65.225, 11.430
S3UNO, 800000, 4, 6065727, 17253074, 64.843, 5.739
S3UNO, 800000, 8, 6107475, 17253074, 64.601, 4.894
S3UNO, 800000, 16, 6136147, 17253074, 64.434, 4.362
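As an aside, the hit-ratio column in these rows is consistent with 1 - inserts/reads, and a small parser makes them easier to work with. The column meanings below are inferred from the numbers themselves, not taken from mokabench's documentation:

```python
def parse_row(line):
    """Parse one benchmark result line.

    Inferred columns (the sixth equals 1 - inserts/reads, so inserts
    look like miss-driven insertions): policy, capacity, clients,
    inserts, reads, hit ratio (%), elapsed seconds.
    """
    policy, cap, clients, inserts, reads, hit, secs = (
        field.strip() for field in line.split(",")
    )
    return {
        "policy": policy,
        "capacity": int(cap),
        "clients": int(clients),
        "inserts": int(inserts),
        "reads": int(reads),
        "hit_ratio": float(hit),
        "seconds": float(secs),
    }

# Cross-check the hit-ratio column against inserts/reads.
row = parse_row("S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381")
derived = (1 - row["inserts"] / row["reads"]) * 100
assert abs(derived - row["hit_ratio"]) < 0.01
```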

@xiaguan (Contributor, Author) commented Apr 26, 2024

s1 in mokabench, s3fifo powered by foyer, s3uno based on tinyufo and my new idea.

S3FIFO, 800000, 1, 1979736, 3995316, 50.449, 1.381
S3FIFO, 800000, 2, 1979686, 3995316, 50.450, 1.787
S3FIFO, 800000, 4, 1979597, 3995316, 50.452, 2.580
S3FIFO, 800000, 8, 1979488, 3995316, 50.455, 5.262
S3FIFO, 800000, 16, 1979362, 3995316, 50.458, 7.335
S3UNO, 800000, 1, 1872306, 3995316, 53.137, 2.863
S3UNO, 800000, 2, 1931857, 3995316, 51.647, 2.164
S3UNO, 800000, 4, 1950299, 3995316, 51.185, 1.230
S3UNO, 800000, 8, 1964574, 3995316, 50.828, 1.012
S3UNO, 800000, 16, 1973314, 3995316, 50.609, 0.893

Seems great! But foyer's s3fifo impl has a bug; see #432

@PsiACE commented Apr 26, 2024

Seems great ! But foyer's s3fifo impl has a bug, see #432

I will test the latest version. Currently, I believe my focus should be on improving the hit rate as there is limited scope for optimizing s3uno's performance.

@PsiACE commented Apr 26, 2024

Seems great ! But foyer's s3fifo impl has a bug, see #432

Yes, #432 fixed some issues. s3fifo appears to be working fine at low-capacity hit rates and shows some improvement at high capacity. However, s3uno is still comparable to (or even better than) s3fifo as the capacity increases. I think I also need to check for any issues with low capacity in s3uno.

@ben-manes commented

s3fifo does poorly in db, search, analytics, etc., and is unable to cope with varying workloads. It's very good for blob key-value stores like a cdn, but there it's on par with many other algorithms. You should try a variety of workloads at different sizes, and chain different patterns to see how your new policy responds.

Caffeine’s w-tinylfu uses hill climbing to adapt to the workload and jitter to correct for a mistakenly hot victim (e.g. hash flooding). If you run traces you’ll observe that it reliably has a high hit rate, equal or better than s3fifo in their workloads.
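The hill-climbing adaptation mentioned here can be sketched in a few lines: keep moving a tunable (e.g. the window fraction) in the same direction while the hit rate improves, and reverse direction when it degrades. This is a simplified illustration, not Caffeine's actual climber, which also decays step sizes, restarts, and adds jitter:

```python
class HillClimber:
    """Toy hill climber for a cache tunable (simplified sketch).

    Adjusts the fraction of capacity devoted to the admission window
    based on the hit rate observed over each sampling period.
    """

    def __init__(self, step=0.05):
        self.window = 0.5        # fraction of capacity for the window
        self.step = step         # current step; sign encodes direction
        self.prev_hit_rate = 0.0

    def observe(self, hit_rate):
        # Keep climbing while the hit rate improves; reverse direction
        # when the last move made things worse.
        if hit_rate < self.prev_hit_rate:
            self.step = -self.step
        self.window = min(0.8, max(0.0, self.window + self.step))
        self.prev_hit_rate = hit_rate
        return self.window
```

Fed with per-period hit rates, the climber oscillates around whichever window size the current workload rewards, which is how a single policy can track both recency-biased and frequency-biased phases.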

The question is if you want a general algorithm that is broadly strong or a specialized one that is very good in a targeted scenario.

@MrCroxx (Owner) commented Apr 29, 2024

Hi @ben-manes , nice to hear the expert opinion from you. 🙌

You should try a variety of workloads at different sizes, and chain different pattern to see how your new policy responds.

Just as you advise, I'm working on it with mokabench (FYI, the PR) as a quick preview. And I'm planning to create a new bench tool that supports more features and better automation.

And THANK YOU A LOT for creating Caffeine. When designing and implementing foyer, it inspired me a lot!! 🥰

And, pardon me, may I ask where you heard about foyer? I'm really curious, because foyer is currently not mature in many aspects and is not widely used. 😃

@ben-manes commented

oh, I periodically do a github-wide search on keywords and look at recent activity. That was originally to try to get api feedback by seeing what problems users ran into, then it became helping users who seemed stuck, and finally just motivation to participate in fun discussions. I think I saw your moka pr and some previous threads, and am curious to see @PsiACE's new algorithm when unveiled.

When implementing a policy, I'd recommend running the author's simulator (if available) on a range of workload patterns as a validation check. It's really easy to make a mistake in an algorithm where nothing breaks and just the hit rates silently don't match. A small subtle difference can have a big impact, which is frustrating but enlightening to understand. I'll rewrite traces into the author's format, run a variety of patterns, try to get a perfect match on a shared policy (usually lru), and then on their own algorithm. That really helps too because then you have a correct, simple version of the algorithm to apply ideas on top of or port into your cache and validate against. Your cache will likely deviate in small ways for pragmatic or security reasons, but keeping it within a margin of error helps avoid confusion. I say that but I also mostly eyeball against Caffeine and sometimes forget to backport hit rate improvements to the idealized w-tinylfu version.

And fwiw, Caffeine was a lot of fun to write and learn from but took about a decade to figure out the design (~2008 onward). I'm really happy with how it turned out, but I had no clue what I was doing when deciding to take a hard problem as a weekend project as an excuse to learn from (e.g. by reading papers, playing with concurrent algorithms). I think having no expectations or time pressures helped a lot since it is just a hobby. So its really fun to see what ideas others come up with as they play around in this topic.

@PsiACE commented Apr 29, 2024

Thanks @ben-manes . Caffeine is so cool.

When implementing a policy...

If I had seen it earlier, I might have avoided many mistakes. That's what I've been doing lately.

...and am curious to see @PsiACE new algorithm when unveiled.

Currently, my algorithm shows better adaptability than s3fifo and tinyufo. I am now exploring how to make it compete with moka across more workloads, which is quite challenging. If it does well, I think I'll show it.

@ben-manes commented Apr 29, 2024

Not sure if this idea is helpful, but here's another technique that I was playing with a few years ago when exploring latency-aware caching. In those scenarios the hit / miss penalty varies across entries so overall user perceived performance is governed by responsiveness instead of the hit rate alone. For example analytical dashboards where charts have different load times and maximizing the hit rate might incorrectly favor retaining many fast loading charts over the slow lumbering ones. I was unable to gather enough trace data to experiment with so shelved it, but the general techniques might inspire you for other purposes.

My idea was to compute the running average miss penalty (via an exponentially weighted moving average or exponential smoothing), making it an O(1) cost to maintain. An entry's miss penalty would be normalized as a stepwise magnitude around the mean to be scored as cheap or expensive. This way if there was a network hiccup then that temporary skew would not pollute the cache by misidentifying expensive items by them being relative to the current baseline. Tinylfu admission would multiply the estimated frequency by the latency factor so a (popular x cheap) entry could be compared against an (unpopular x expensive) one on admission, with admission jitter able to bounce a mistakenly retained victim. The hill climber might instead monitor the total penalty instead of the hit rate over its observation period. This was a thought experiment - I have no idea if it would actually work!
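The thought experiment above can be sketched directly. All names, thresholds, and step boundaries here are illustrative choices, not anything from Caffeine:

```python
class LatencyAwareAdmission:
    """Sketch of latency-aware tinylfu admission (thought experiment).

    Maintains an exponentially smoothed average miss penalty in O(1),
    normalizes each entry's penalty as a stepwise factor around that
    mean, and decides admission by comparing frequency x latency
    factor between candidate and victim.
    """

    def __init__(self, alpha=0.1):
        self.alpha = alpha       # smoothing factor for the EWMA
        self.avg_penalty = 0.0   # running average miss penalty

    def record_miss(self, penalty):
        # Exponential smoothing: O(1) per miss.
        self.avg_penalty += self.alpha * (penalty - self.avg_penalty)

    def factor(self, penalty):
        # Stepwise magnitude around the mean, so a temporary network
        # hiccup can't relabel everything as "expensive".
        if self.avg_penalty == 0:
            return 1
        ratio = penalty / self.avg_penalty
        if ratio < 0.5:
            return 1   # cheap
        if ratio < 2.0:
            return 2   # around the mean
        return 4       # expensive

    def admit(self, cand_freq, cand_penalty, victim_freq, victim_penalty):
        # Admit when (frequency x latency factor) favors the candidate.
        return (cand_freq * self.factor(cand_penalty)
                > victim_freq * self.factor(victim_penalty))
```

With this scoring, an unpopular-but-expensive entry can win admission over a popular-but-cheap victim, which is exactly the trade-off plain hit-rate maximization misses.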

Maybe that's more to say look outside of caching papers for ideas. Most papers make trivial changes to existing algorithms, using the same worn out techniques and merely adjust whether it is biased towards LRU or MRU. The major impacts come from adapting proven techniques that are being overlooked. There might be an approach used elsewhere (e.g. a PID controller) that makes for a surprisingly great result.
