Accelerating Queries with SQL Cache
Principle
For detailed implementation principles of SQL Cache, please refer to SQL Cache.
Use Case
The following case illustrates how to enable and use SQL Cache in Doris:
Ensure that
cache_enable_sql_mode
infe.conf
is set totrue
(default istrue
):vim fe/conf/fe.conf
cache_enable_sql_mode=true
Set the variable in the MySQL command line:
MySQL [(none)]> set global enable_sql_cache=true;
Note:
GLOBAL
indicates a global variable, not just for the current session.In Doris versions 2.1.3 and above, you can control the number of cache key entries and the cleanup time with the following commands:
MySQL [(none)]> ADMIN SET FRONTEND CONFIG ('sql_cache_manage_num' = '100');
MySQL [(none)]> ADMIN SET FRONTEND CONFIG ('expire_sql_cache_in_fe_second' = '300');
Execute the query
Suppose we have a table named "sales" containing information on date, product, and sales amount. We need to query the total sales for each product over the past 30 days:
SELECT product, SUM(amount) as total_sales
FROM sales
WHERE date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
GROUP BY product
ORDER BY total_sales DESC
LIMIT 10;When this query is executed for the first time, Doris retrieves the results from the BE and stores them in the cache. Subsequent executions of the same query will retrieve the results directly from the cache if the data has not been updated, significantly improving query speed.
Cache Conditions
After the initial query, the query results will be cached if the following three conditions are met:
(Current time - last update time of the query partition) is greater than
cache_last_version_interval_second
infe.conf
.The number of query result rows is less than
cache_result_max_row_count
infe.conf
.The number of bytes in the query results is less than
cache_result_max_data_size
infe.conf
.
Summary
SQL Cache is a query optimization mechanism provided by Doris that can significantly improve query performance. When using it, please note:
SQL Cache is not suitable for queries containing functions that generate random values (such as
random()
), as this can cause the query results to lose randomness.Currently, it does not support using cached results of some metrics to satisfy queries for more metrics. For example, cached results for two metrics cannot be used for queries involving three metrics.
By reasonably using SQL Cache, you can significantly improve Doris's query performance, especially in scenarios with low data update frequency. In practical applications, cache parameters need to be adjusted based on specific data characteristics and query patterns to achieve optimal performance improvements.