Support query result caching #20854

Open
osscm opened this issue Feb 28, 2024 · 2 comments

Comments

@osscm
Contributor

osscm commented Feb 28, 2024

Query result caching temporarily stores the results of previously executed queries to speed up later requests for the same data. When a query is executed against a dataset, the result set is cached in memory or on disk. Subsequent identical or similar queries can then be served directly from the cache instead of being re-executed against the underlying data source.
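
To make the mechanism concrete, here is a minimal Python sketch of an in-memory result cache with a time-to-live. It is only an illustration under stated assumptions, not a proposed Trino design: `QueryResultCache`, `cache_key`, and the `execute` callable are hypothetical names, and `execute` stands in for whatever function actually runs SQL against the engine. The sketch matches only identical statements (ignoring surrounding whitespace and a trailing semicolon); serving "similar" queries, and invalidating entries when the underlying data changes, are out of scope here.

```python
import hashlib
import time
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Tuple[Any, ...]]


def cache_key(sql: str) -> str:
    """Build a key from the statement text (hypothetical helper).

    Only surrounding whitespace and trailing semicolons are dropped, so
    string literals inside the statement are never altered.
    """
    normalized = sql.strip().rstrip(";").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class QueryResultCache:
    """Illustrative in-memory cache; not part of any Trino API."""

    def __init__(
        self,
        execute: Callable[[str], Rows],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._execute = execute      # assumed: runs SQL and returns rows
        self._ttl = ttl_seconds      # how long a cached result stays valid
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Rows]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> Rows:
        key = cache_key(sql)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return list(entry[1])    # copy so callers cannot mutate the cache
        # Miss or expired entry: run the query and store a fresh result.
        self.misses += 1
        rows = self._execute(sql)
        self._entries[key] = (now, list(rows))
        return rows

    def invalidate(self) -> None:
        """Drop every cached result, e.g. after the source data changes."""
        self._entries.clear()
```

An application would wrap its existing query function, for example `cache = QueryResultCache(run_sql)` where `run_sql` is the application's own function, and then call `cache.query(sql)` in place of `run_sql(sql)`. A fixed TTL is the simplest freshness policy; it trades possible staleness for not having to track changes in the underlying tables.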

Serving results from the cache avoids repeated computation, which helps most with frequently run or computationally expensive queries. Query result caching is a common optimization in big data analytics platforms for reducing query response times, improving scalability, and making better use of resources.

Trino users who need this functionality today often implement it in their application layer. Several commercial products offer variations of the feature:

BigQuery: https://cloud.google.com/bigquery/docs/cached-results
Snowflake: https://docs.snowflake.com/en/user-guide/querying-persisted-results
Athena: https://aws.amazon.com/about-aws/whats-new/2022/11/amazon-athena-query-result-reuse-accelerate-queries/
Starburst: https://docs.starburst.io/latest/admin/cache-service.html

@osscm changed the title from "Support Query Result caching" to "Support query result caching [main issue]" on Feb 28, 2024
@osscm
Contributor Author

osscm commented Feb 28, 2024

I think we can collaborate to define the scope and implement it incrementally, taking into account the various use cases and the time and resources teams are willing to commit. There are likely several implementation approaches, each with its own limitations, so different options may suit different use cases.

We can add more subtasks to scope the individual tasks.

@osscm changed the title from "Support query result caching [main issue]" to "Support query result caching" on Feb 28, 2024
@YuriyGavrilov

Talk about query caching https://youtu.be/t7ndqYUhKSA?si=1wg1TWIVG7FQZzBR at Microsoft
