
Convert LFU to LRU cache for blob storage #107130

Open · jdconrad wants to merge 15 commits into main

Conversation

@jdconrad (Contributor) commented Apr 4, 2024

This changes our approach to caching for blob storage: it converts the least frequently used (LFU) cache to a much simpler least recently used (LRU) cache. The new cache has two lists, called front and middle. New entries are inserted into the middle list to avoid thrashing; any entry accessed a second time is moved to the head of the front list. When the cache becomes full, only the last entry of the middle list is checked for eviction. This avoids the degenerate case of walking through the entirety of the lists, since it is very unlikely that anything currently being written to will make it to the end of the middle list.
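A minimal sketch of the two-list structure described above (names are illustrative only, not the actual SharedBlobCacheService code):

// Hypothetical sketch of the two-list LRU described in this PR;
// names are illustrative, not the real implementation.
class TwoListLru {
    static final class Entry {
        Entry prev, next;
        boolean inFront; // false while the entry is still in the middle list
    }

    private Entry frontHead, frontTail;   // entries that were accessed again
    private Entry middleHead, middleTail; // new entries land here first

    // New entries go to the head of the middle list so one-shot reads
    // (or writes in progress) cannot thrash the front list.
    void insert(Entry e) {
        e.inFront = false;
        e.next = middleHead;
        if (middleHead != null) middleHead.prev = e;
        middleHead = e;
        if (middleTail == null) middleTail = e;
    }

    // A repeat access promotes the entry to the head of the front list.
    void touch(Entry e) {
        unlink(e);
        e.inFront = true;
        e.next = frontHead;
        if (frontHead != null) frontHead.prev = e;
        frontHead = e;
        if (frontTail == null) frontTail = e;
    }

    // Only the middle tail is ever considered, so eviction never has to
    // walk either list.
    Entry evictionCandidate() {
        return middleTail;
    }

    private void unlink(Entry e) {
        if (e.prev != null) e.prev.next = e.next;
        if (e.next != null) e.next.prev = e.prev;
        if (e.inFront) {
            if (frontHead == e) frontHead = e.next;
            if (frontTail == e) frontTail = e.prev;
        } else {
            if (middleHead == e) middleHead = e.next;
            if (middleTail == e) middleTail = e.prev;
        }
        e.prev = e.next = null;
    }
}

Tracking which list an entry belongs to (here via an inFront flag) lets unlink() fix up the correct head/tail pointers without searching either list.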

@jdconrad jdconrad added :Search/Search Search-related issues that do not fall into other categories >refactoring v8.14.0 labels Apr 4, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Apr 4, 2024
@jdconrad (Contributor, Author) commented Apr 4, 2024

@elasticmachine run elasticsearch-ci/part-3

@JVerwolf (Contributor) left a comment


Nice work! At a high level, this looks good - though I'm quite interested in what others say. I'm not qualified to approve, so my comments should not block - feel free to take them as ideas/suggestions :)

A few ideas came to mind that are well into the realm of over-optimization. I'd be interested to see how this performs in the micro-benchmarks to see whether further optimization is even warranted.

On that topic, it seems like many of the operations are conceptually concurrent. For instance, we should be able to force-evict swaths of entries concurrently while also adding to and removing from the queues. I'm not sure the coordination costs are worth it, though (not to mention the code complexity).

I also wonder if there are things that could be done to improve reading/writing back and forth between main memory and the CPU caches, either by storing the parts of the data structure that change most often in such a way that they can stay cached, or by bulk updating (waiting for 100 writes to the "front" list before moving 100 entries to the back list all at once with a single set of pointer updates, etc.); a rough sketch of the bulk-update idea follows below.

But like I said, without benchmarks to show that any of this needs improving, I think it looks great.
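For concreteness, one hypothetical shape of the bulk-update idea (BatchedPromotions, recordAccess, and the threshold are made-up names; this is not part of the PR): reads record the touched entry in a lock-free queue, and the list pointers are only rewritten when the buffer is drained in a single critical section.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch of the batched-promotion idea above; not part of
// this PR. Readers enqueue the touched entry cheaply and lock-free; the
// linked-list pointers are only rewritten when the buffer is drained.
class BatchedPromotions<E> {
    private static final int DRAIN_THRESHOLD = 100;
    private final Queue<E> buffer = new ConcurrentLinkedQueue<>();
    private final AtomicInteger buffered = new AtomicInteger();
    private final Consumer<E> pushToFront; // e.g. the cache's pushEntryToFront

    BatchedPromotions(Consumer<E> pushToFront) {
        this.pushToFront = pushToFront;
    }

    // Called on every cache hit; cheap and lock-free.
    void recordAccess(E entry) {
        buffer.add(entry);
        if (buffered.incrementAndGet() >= DRAIN_THRESHOLD) {
            drain();
        }
    }

    // One lock acquisition amortized over up to DRAIN_THRESHOLD updates.
    private synchronized void drain() {
        E e;
        int drained = 0;
        while ((e = buffer.poll()) != null) {
            pushToFront.accept(e);
            drained++;
        }
        buffered.addAndGet(-drained);
    }
}

This resembles the read-buffer technique used by caches such as Caffeine to keep hits cheap while amortizing lock traffic.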

}
}
}

-    private void unlink(final LFUCacheEntry entry) {
+    private void unlink(final LRUCacheEntry entry) {
Contributor:

WDYT about testing this explicitly under the following conditions (a sketch follows below)?

  • entry.list has one entry, which we unlink()
  • entry.list has 2 entries, and we unlink() the first
  • entry.list has 2 entries, and we unlink() the last
  • entry.list has 3 entries, and we unlink() the middle

Maybe this is already done via the other tests in a transitive manner, as I didn't read through all the tests in detail. However, I noticed that neither this function nor any of the methods that call it is directly under test. (I'm a bit paranoid, but trust me, I've earned it...)
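For illustration, the three-entry case could look roughly like this (newList() and addEntry() are hypothetical helpers, and prev/next are assumed field names):

// Hypothetical test sketch; helper and field names are illustrative.
public void testUnlinkMiddleOfThree() {
    var list = newList();
    var a = addEntry(list);
    var b = addEntry(list);
    var c = addEntry(list); // new entries at the head, so: c <-> b <-> a
    list.unlink(b);
    assertSame(a, c.next);  // neighbours are re-linked around the removed entry
    assertSame(c, a.prev);
    assertNull(b.prev);     // the removed entry is fully detached
    assertNull(b.next);
}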

Contributor Author:

Replied below.

assert invariant(entry, true);
}

private void pushEntryToFront(final LRUCacheEntry entry) {
Contributor:

Along with pushEntryToMiddle/Back(), this seems like the kind of thing that would be good to explicitly test. (Maybe this is already satisfactorily tested through transitive method calls; I confess I didn't read the entirety of the test files.)

I was thinking of something along the lines of starting with an empty list, adding up to two entries, and asserting for each added entry that the pointers are correct (see the sketch below).

We'd also want a test that sets the size to a low value and checks that the entries are pushed to the middle list.

WDYT? Is that overkill? Is this behaviour already tested?
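A sketch of the empty-list and two-entry cases, with the same hypothetical helpers and field names as the unlink sketch above:

// Hypothetical test sketch; names are illustrative.
public void testPushEntryToFrontOnEmptyList() {
    var list = newList();
    var a = newEntry();
    list.pushEntryToFront(a);
    assertSame(a, list.frontHead());
    assertSame(a, list.frontTail());
    assertNull(a.prev);
    assertNull(a.next);
}

public void testPushEntryToFrontPrepends() {
    var list = newList();
    var a = newEntry();
    var b = newEntry();
    list.pushEntryToFront(a);
    list.pushEntryToFront(b); // list is now b <-> a
    assertSame(b, list.frontHead());
    assertSame(a, list.frontTail());
    assertSame(a, b.next);
    assertSame(b, a.prev);
    assertNull(b.prev);
    assertNull(a.next);
}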

Contributor Author:

AFAIK this should be tested transitively. I think it's preferable to keep the LRUCache classes private if possible. They also happen to be non-static so are tied to an instance of SharedBlobCacheService.

@jdconrad (Contributor, Author) commented

@original-brownbear I added back the LFU cache. It's possible to switch between LFU and LRU via a new setting; LRU is the default. I also separated the tests for LFU and LRU into separate files, which accounts for the dramatic change in line count.

@original-brownbear (Member) left a comment


This looks quite good, thanks Jack! I'm wondering if we shouldn't go a little further here, though, and move to a more lock-free implementation if possible.
If any promotion to the head of the queue requires acquiring a global lock, that's not ideal. Maybe we could just make the moving of an item to the head of the queue lock-free for now, since that's what really matters for read performance, and leave the rest as is for maybe looking into later? WDYT?

assert invariant(entry, true);
}

private void pushEntryToFront(final LRUCacheEntry entry) {
Member:

I think there should be a way for us to not need this kind of lock any longer. Can't we make the insert lock-free?
It's basically: point the new node at the current head, then set the head to the new node, but only if the current head is still what we just pointed at (CAS). No need for locking, is there?
And then basically precede that whole operation by evicting the last element in the cache in case we are inserting a new page rather than just cycling a page to the front?
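A sketch of that CAS loop, assuming (hypothetically) that the head were held in an AtomicReference; note this covers only the head pointer, not the prev links or the eviction step:

import java.util.concurrent.atomic.AtomicReference;

// Sketch of the lock-free head insertion described above; assumes a
// hypothetical AtomicReference head, which is not how the PR stores it.
final AtomicReference<LRUCacheEntry> frontHead = new AtomicReference<>();

void pushEntryToFrontLockFree(LRUCacheEntry entry) {
    LRUCacheEntry current;
    do {
        current = frontHead.get();
        entry.next = current; // point the new node at the current head...
        // ...then set the head to the new node, but only if the head is
        // still the node we just pointed at:
    } while (frontHead.compareAndSet(current, entry) == false);
}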

Contributor Author:

Let's chat about this again, and whether it's still relevant, once we start talking about using time.

Contributor:

Thinking back to past discussions, it seems lock-freedom was a key motivation for going to an LRU, so it would be good to prove out that this works as part of this (POC?).

@jdconrad (Contributor, Author) commented

@original-brownbear @JVerwolf I have removed the middle list from this PR.

@javanna javanna added :Search Foundations/Search Catch all for Search Foundations and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 17, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 17, 2024
Labels
>refactoring :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v9.0.0
7 participants