File Cache

In a decoupled architecture, data is stored in remote storage. The Doris database accelerates data access by utilizing a cache on local disks and employs an advanced multi-queue LRU (Least Recently Used) strategy to efficiently manage cache space. This strategy particularly optimizes the access paths for indexes and metadata, aiming to maximize the caching of frequently accessed user data. For multi-compute group (Compute Group) scenarios, Doris also provides a cache warming feature to quickly load specific data (such as tables or partitions) into the cache when a new compute group is established, thereby enhancing query performance.

Multi-Queue LRU

LRU

LRU manages the cache by maintaining a data access queue. When data is accessed, it is moved to the front of the queue. Newly added data to the cache is also placed at the front of the queue to prevent it from being evicted too early. When the cache space reaches its limit, data at the tail of the queue will be removed first.

TTL (Time-To-Live)

The TTL strategy ensures that newly imported data remains in the cache for a certain period without being evicted. During this time, the data has the highest priority, and all TTL data are treated equally. When cache space is insufficient, the system will prioritize evicting data from other queues to ensure that TTL data can be written to the cache.
Application scenarios: The TTL strategy is particularly suitable for small-scale data tables that require local persistence. For resident tables, a longer TTL value can be set to protect their data; for dynamically partitioned tables, the TTL value can be set according to the active time of Hot Partitions.
Note: Currently, the system does not support directly viewing the proportion of TTL data in the cache.

Multi-Queue

Doris adopts a multi-queue strategy based on LRU, categorizing data into four types based on TTL attributes and data properties, and placing them in TTL queues, Index queues, NormalData queues, and Disposable queues. Data with a TTL attribute is placed in the TTL queue, index data without a TTL attribute is placed in the Index queue, normal data without a TTL attribute is placed in the NormalData queue, and temporary data is placed in the Disposable queue.
During data read and write processes, Doris selects the queues to fill and read from to maximize cache utilization. The specific mechanism is as follows:

Operation	Queue Filled on Miss	Queue Filled on Write
Import	TTL / Index / NormalData	TTL / Index / NormalData
Query	TTL / Index / NormalData	N/A
Schema Change	Disposable	TTL / Index / NormalData
Compaction	Disposable	TTL / Index / NormalData
Warm-up	N/A	TTL / Index / NormalData

Eviction

To maximize the use of cache space for data access acceleration, Doris will utilize available cache space as much as possible. Cache eviction can be triggered in two scenarios: deleting remote storage data or writing when cache space is insufficient. When remote storage data is deleted, the corresponding data in the cache is directly removed. When writing when cache space is insufficient, eviction occurs in the order of Disposable, Normal Data, Index, and TTL. Data in the TTL queue will be moved to the Normal Data queue when it expires.

Cache Warming

In a decoupled mode, Doris supports multi-compute group deployment, where compute groups share data but do not share cache. When a new compute group is created, its cache is empty, which may affect query performance. Therefore, Doris provides a cache warming feature that allows users to actively pull data from remote storage to local cache. This feature supports the following three modes:

Inter-Compute Group Warming: Warm the cache data of Compute Group A to Compute Group B. Doris periodically collects hotspot information of tables/partitions accessed in each compute group over a period and selectively warms certain tables/partitions based on this information.
Table Data Warming: Specify to warm the data of Table A to the new compute group.
Partition Data Warming: Specify to warm the data of partition p1 of Table A to the new compute group.

Compute Group Scaling

When scaling Compute Groups, to avoid cache fluctuations, Doris will first remap the affected tablets and warm the data.

Cache Observation

Hotspot Information

Doris collects cache hotspot information from each compute group every 10 minutes into an internal system table. You can view hotspot information using the SHOW CACHE HOTSPOT '/' command.

Cache Space and Hit Rate

Doris BE nodes can obtain cache statistics by using curl {be_ip}:{brpc_port}/vars (where brpc_port defaults to 8060), and the names of the metrics start with the disk path.

In the above example, the metric prefix for File Cache is the path, for example, the prefix "mnt_disk1_gavinchou_debug_doris_cloud_be0_storage_file_cache" indicates "/mnt/disk1/gavinchou/debug/doris-cloud/be0_storage_file_cache/" The part after the prefix is the statistical metric, for example, "file_cache_cache_size" indicates that the current size of the File Cache at this path is 26111 bytes.

The following table lists the meanings of all metrics (all size units are in bytes):

Metric Name (excluding path prefix)	Meaning
file_cache_cache_size	Current total size of the File Cache
file_cache_disposable_queue_cache_size	Current size of the disposable queue
file_cache_disposable_queue_element_count	Current number of elements in the disposable queue
file_cache_disposable_queue_evict_size	Total amount of data evicted from the disposable queue since startup
file_cache_index_queue_cache_size	Current size of the index queue
file_cache_index_queue_element_count	Current number of elements in the index queue
file_cache_index_queue_evict_size	Total amount of data evicted from the index queue since startup
file_cache_normal_queue_cache_size	Current size of the normal queue
file_cache_normal_queue_element_count	Current number of elements in the normal queue
file_cache_normal_queue_evict_size	Total amount of data evicted from the normal queue since startup
file_cache_total_evict_size	Total amount of data evicted from the entire File Cache since startup
file_cache_ttl_cache_evict_size	Total amount of data evicted from the TTL queue since startup
file_cache_ttl_cache_lru_queue_element_count	Current number of elements in the TTL queue
file_cache_ttl_cache_size	Current size of the TTL queue

SQL Profile

Cache-related metrics in the SQL profile are found under SegmentIterator, including:

Metric Name	Meaning
BytesScannedFromCache	Amount of data read from the File Cache
BytesScannedFromRemote	Amount of data read from remote storage
BytesWriteIntoCache	Amount of data written into the File Cache
LocalIOUseTimer	Time taken to read from the File Cache
NumLocalIOTotal	Number of times the File Cache was read
NumRemoteIOTotal	Number of times remote storage was read
NumSkipCacheIOTotal	Number of times data read from remote storage did not enter the File Cache
RemoteIOUseTimer	Time taken to read from remote storage
WriteCacheIOUseTimer	Time taken to write to the File Cache

You can view query performance analysis through Query Performance Analysis.

Usage

Setting TTL Strategy

When creating a table, set the corresponding PROPERTY to use the TTL strategy for caching that table's data.

file_cache_ttl_seconds: The expected time for newly imported data to remain in the cache, in seconds.

CREATE TABLE IF NOT EXISTS customer (
  C_CUSTKEY     INTEGER NOT NULL,
  C_NAME        VARCHAR(25) NOT NULL,
  C_ADDRESS     VARCHAR(40) NOT NULL,
  C_NATIONKEY   INTEGER NOT NULL,
  C_PHONE       CHAR(15) NOT NULL,
  C_ACCTBAL     DECIMAL(15,2)   NOT NULL,
  C_MKTSEGMENT  CHAR(10) NOT NULL,
  C_COMMENT     VARCHAR(117) NOT NULL
)
DUPLICATE KEY(C_CUSTKEY, C_NAME)
DISTRIBUTED BY HASH(C_CUSTKEY) BUCKETS 32
PROPERTIES(
    "file_cache_ttl_seconds"="300"
)

In the above table, all newly imported data will be retained in the cache for 300 seconds. The system currently supports modifying the TTL time of the table, and users can extend or shorten the TTL time based on actual needs.

ALTER TABLE customer set ("file_cache_ttl_seconds"="3000");

Note

The modified TTL value will not take effect immediately and will have a certain delay.

If no TTL is set when creating the table, users can also modify the table's TTL attribute by executing the ALTER statement.

Cache Warming

Currently, three cache warming modes are supported:

Warm the cache data of compute_group_name0 to compute_group_name1.

When executing the following SQL, the compute_group_name1 compute group will obtain the access information of the compute_group_name0 compute group to restore the cache as closely as possible to that of compute_group_name0.

WARM UP COMPUTE GROUP compute_group_name1 WITH COMPUTE GROUP compute_group_name0

View the most frequently accessed tables in all current compute groups.

SHOW CACHE HOTSPOT '/';

View the most frequently accessed partitions among all tables under compute_group_name0.

SHOW CACHE HOTSPOT '/compute_group_name0';

View the access information of the table regression_test_cloud_load_copy_into_tpch_sf1_p1.customer under compute_group_name0.

SHOW CACHE HOTSPOT '/compute_group_name0/regression_test_cloud_load_copy_into_tpch_sf1_p1.customer';

Warm the data of the table customer to compute_group_name1. Executing the following SQL will pull all data of that table from remote storage to local.

WARM UP COMPUTE GROUP compute_group_name1 WITH TABLE customer

Warm the data of partition p1 of the table customer to compute_group_name1. Executing the following SQL will pull all data of that partition from remote storage to local.

WARM UP COMPUTE GROUP compute_group_name1 with TABLE customer PARTITION p1

The above three cache warming SQL statements will return a JobID result. For example:

WARM UP COMPUTE GROUP cloud_warm_up WITH TABLE test_warm_up;

You can then check the cache warming progress with the following SQL.

SHOW WARM UP JOB WHERE ID = 13418; 

You can determine the current task progress based on FinishBatch and AllBatch, with each Batch's data size being approximately 10GB. Currently, only one warming job is supported to run at a time within a compute group. Users can stop an ongoing warming job.

CANCEL WARM UP JOB WHERE id = 13418;

Practical Case

A user has a series of data tables with a total data volume exceeding 3TB, while the available cache capacity is only 1.2TB. Among them, there are two tables with high access frequency: one is a dimension table of size 200MB (dimension_table), and the other is a fact table of size 100GB (fact_table), which has new data imported daily and requires T+1 query operations. Additionally, other large tables have low access frequency.

Under the LRU caching strategy, if large table data is queried, it may replace the small table data that needs to remain in the cache, causing performance fluctuations. To solve this problem, the user adopts a TTL caching strategy, setting the TTL times for the two tables to 1 year and 1 day, respectively.

ALTER TABLE dimension_table set ("file_cache_ttl_seconds"="31536000");

ALTER TABLE fact_table set ("file_cache_ttl_seconds"="86400");

For the dimension table, due to its smaller size and less variability, the user sets a TTL time of 1 year to ensure that its data can be accessed quickly within a year; for the fact table, the user needs to perform a table backup daily and then conduct a full import, so the TTL time is set to 1 day.

Multi-Queue LRU​

LRU​

TTL (Time-To-Live)​

Multi-Queue​

Eviction​

Cache Warming​

Compute Group Scaling​

Cache Observation​

Hotspot Information​

Cache Space and Hit Rate​

SQL Profile​

Usage​

Setting TTL Strategy​

Cache Warming​

Practical Case​

Multi-Queue LRU

LRU

TTL (Time-To-Live)

Multi-Queue

Eviction

Cache Warming

Compute Group Scaling

Cache Observation

Hotspot Information

Cache Space and Hit Rate

SQL Profile

Usage

Setting TTL Strategy

Cache Warming

Practical Case