The query cache is a nice trick to speed up frequently run filters. But it's also pretty expensive, as we allocate 10% of a node's heap to it. Using 10% of the heap to improve query efficiency makes sense to me for a Search use-case, but less so for Security and Observability use-cases that optimize for density.
First, you can't cache many filters with 10% of a density-optimized node's heap. 30GB of heap gives 3GB of query cache. Consider a node that holds 10TB of data where every document takes about 1kB of space: that's about 10B documents, so a single cached filter would take about 1.16GB assuming 1 bit per document, meaning you can store at most 2 filters in the cache. It's not that terrible in practice, because older indices get queried less frequently than recent indices, but there's still very little you can put in the cache, so it's unlikely to improve query efficiency much.
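Spelling out the back-of-envelope arithmetic (my notation; the figures are the ones above, with the per-filter size in GiB):

```math
\begin{aligned}
\text{cache size} &= 10\% \times 30\,\text{GB} = 3\,\text{GB}\\
\text{documents} &= \frac{10\,\text{TB}}{1\,\text{kB/doc}} = 10^{10}\\
\text{one cached filter} &= \frac{10^{10}\,\text{bits}}{8 \times 2^{30}\,\text{bytes/GiB}} \approx 1.16\,\text{GiB}\\
\text{filters that fit} &= \left\lfloor 3 / 1.16 \right\rfloor = 2
\end{aligned}
```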
Second, query efficiency is not the primary concern for Security and Observability use-cases, and this 10% of the heap would arguably be better spent on running heavy aggregations, for instance.
I haven't spent much time thinking about it, but I think we have a few ways to implement this:
Index level: automatically disable the index cache (`index.queries.cache.enabled`) on hidden indices created via data streams, or possibly add an ILM transition if we think we should keep the cache enabled on the Hot tier (see the first sketch after this list).
Node level: automatically disable the node cache (`indices.queries.cache.size` set to 0) on warm nodes and beyond, which requires formalizing node tiers (see the second sketch after this list).
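To make the index-level option concrete, here is a rough sketch of what opting out per index already looks like through a composable index template; the template name and index pattern are hypothetical, but `index.queries.cache.enabled` is the existing static index setting:

```
PUT _index_template/my-density-template
{
  "index_patterns": ["metrics-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.queries.cache.enabled": false
    }
  }
}
```

And a sketch of what the node-level option looks like if done by hand in `elasticsearch.yml` on a warm node; the `data_warm` role assumes formalized tier naming, and the point of the proposal is that the cache setting would be derived from the tier automatically rather than configured per node:

```yaml
# Warm-tier data node (assumes formalized tier roles).
node.roles: [ data_warm ]

# Give the node-level query cache no heap at all.
indices.queries.cache.size: 0%
```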