Cache Memory Analysis
The caches managed by Doris currently all use an LRU eviction policy, and each supports controlling its capacity and eviction time separately through parameters.
Doris Cache Types
- Page Cache
Used to speed up data scanning.
- DataPageCache: Cache data Page.
- IndexPageCache: Cache data Page index.
- PKIndexPageCache: Cache Page primary key index.
- Metadata Cache
Used to speed up metadata reading.
- SegmentCache: Cache open Segments, such as index information.
- SchemaCache: Cache Rowset Schema.
- TabletSchemaCache: Cache Tablet Schema.
- CreateTabletRRIdxCache: Cache Create Tablet index.
- MowTabletVersionCache: Cache Mow Tablet Version.
- MowDeleteBitmapAggCache: Cache Mow DeleteBitmap.
- Cloud Cache
A dedicated cache on the cloud.
- CloudTabletCache: Cache Tablet on the Cloud.
- CloudTxnDeleteBitmapCache: Cache DeleteBitmap on Cloud.
- Inverted Index Cache
Used to speed up inverted index queries.
- InvertedIndexSearcherCache
- InvertedIndexQueryCache
- Point Query Cache
Speed up point query execution, mainly used for log analysis.
- PointQueryRowCache
- PointQueryLookupConnectionCache
- Other Cache
- FileCache: File cache used by external table queries and Cloud.
- CommonObjLRUCache
- LastSuccessChannelCache
Doris Cache View Method
There are three ways to view Doris cache metrics.
- Doris BE Metrics
The web page http://{be_host}:{be_web_server_port}/metrics shows the BE process memory monitoring metrics, including each cache's capacity, usage, number of elements, lookup count, hit count, and other indicators.
- `doris_be_cache_capacity{name="TabletSchemaCache"} 102400`: Cache capacity, limited either by memory size or by number of elements.
- `doris_be_cache_usage{name="TabletSchemaCache"} 40838`: Cache usage, expressed as memory size or number of elements to match the way the cache capacity is limited.
- `doris_be_cache_usage_ratio{name="TabletSchemaCache"} 0.398809`: Cache usage ratio, equal to `(cache_usage / cache_capacity)`.
- `doris_be_cache_element_count{name="TabletSchemaCache"} 1628`: Number of cache elements, equal to Cache Usage when the cache capacity limits the number of elements.
- `doris_be_cache_lookup_count{name="TabletSchemaCache"} 63393`: Number of cache lookups.
- `doris_be_cache_hit_count{name="TabletSchemaCache"} 61765`: Number of hits when looking up the cache.
- `doris_be_cache_hit_ratio{name="TabletSchemaCache"} 0.974319`: Hit ratio, equal to `(hit_count / lookup_count)`.
- Doris BE Bvar
The web page http://{be_host}:{brpc_port}/vars/*cache* shows metrics that are unique to certain caches.
In the future, the indicators in Doris BE Metrics will be moved to Doris BE Bvar.
- Memory Tracker
To view the memory size occupied by each cache in real time, refer to Global Memory Analysis. When a memory error occurs, you can find the Memory Tracker Summary in the `be/log/be.INFO` log, which contains the cache memory size at that time.
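As a quick way to read these metrics programmatically, the following is a minimal Python sketch that parses `doris_be_cache_*` lines in the format shown above and recomputes the usage ratio and hit ratio from the raw counters. The function names `parse_cache_metrics` and `cache_ratios` are illustrative and not part of Doris, and the sample text simply reuses the values listed above; in practice the text would come from the BE metrics page.

```python
import re
from collections import defaultdict

# Matches lines such as: doris_be_cache_usage{name="TabletSchemaCache"} 40838
# Assumption: the only label on these lines is `name`, as in the examples above.
_METRIC_RE = re.compile(
    r'^doris_be_cache_(\w+)\{name="([^"]+)"\}\s+([0-9.eE+-]+)\s*$'
)


def parse_cache_metrics(text):
    """Return {cache_name: {metric_suffix: value}} for doris_be_cache_* lines."""
    caches = defaultdict(dict)
    for line in text.splitlines():
        match = _METRIC_RE.match(line.strip())
        if match:
            suffix, name, value = match.groups()
            caches[name][suffix] = float(value)
    return dict(caches)


def cache_ratios(metrics):
    """Recompute usage ratio and hit ratio; return 0.0 when the divisor is zero."""
    capacity = metrics.get("capacity", 0.0)
    lookups = metrics.get("lookup_count", 0.0)
    return {
        "usage_ratio": metrics.get("usage", 0.0) / capacity if capacity else 0.0,
        "hit_ratio": metrics.get("hit_count", 0.0) / lookups if lookups else 0.0,
    }


# Sample input copied from the metric examples in this document.
SAMPLE = """\
doris_be_cache_capacity{name="TabletSchemaCache"} 102400
doris_be_cache_usage{name="TabletSchemaCache"} 40838
doris_be_cache_element_count{name="TabletSchemaCache"} 1628
doris_be_cache_lookup_count{name="TabletSchemaCache"} 63393
doris_be_cache_hit_count{name="TabletSchemaCache"} 61765
"""

if __name__ == "__main__":
    for cache_name, cache_metrics in parse_cache_metrics(SAMPLE).items():
        print(cache_name, cache_ratios(cache_metrics))
```

A cache with a high usage ratio but a low hit ratio is a natural first candidate when deciding which cache limit to lower.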
Doris Cache Memory Analysis
There are various caches when Doris BE is running. Usually, there is no need to pay attention to the cache memory, because when the BE process has insufficient available memory, the memory GC will be triggered to clean up the cache first.
However, if the cache is too large, it increases the pressure on memory GC, increases the risk that queries or loads fail because of insufficient available memory, and increases the risk of a BE process OOM crash. So if memory is constantly tight, consider lowering the upper limit of the cache, disabling the cache, or reducing the survival time of cache entries. A smaller cache may reduce query performance in some scenarios, but this is usually tolerable in a production environment. After the adjustment, observe query and load performance for a period of time.
Before Doris 2.1, Memory GC was not perfect. When the memory was insufficient, the cache might not be released in time. If the memory was constantly tight, you often had to consider manually lowering the cache limit.
After Doris 2.1.6, if you want to manually clean up all caches while BE is running, execute `curl http://{be_host}:{be_web_server_port}/api/clear_cache/all`, and the size of the released memory will be returned.
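For scripted use, the same request can be issued from Python. This is a minimal sketch using only the standard library; the helper names `clear_cache_url` and `clear_cache` are hypothetical, the host and port in the example are placeholders, and the response is returned as raw text because its exact format is not described here.

```python
import urllib.request


def clear_cache_url(be_host, be_web_server_port, cache_name="all"):
    """Build the clear_cache URL for one cache, or for all caches by default."""
    return f"http://{be_host}:{be_web_server_port}/api/clear_cache/{cache_name}"


def clear_cache(be_host, be_web_server_port, cache_name="all", timeout=10):
    """Send the GET request and return the raw response body as text."""
    url = clear_cache_url(be_host, be_web_server_port, cache_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder host and port; replace with a real BE address before
    # calling clear_cache(). Only the URL is printed here, no request is sent.
    print(clear_cache_url("127.0.0.1", 8040, "DataPageCache"))
```

Passing a cache name such as `DataPageCache` instead of `all` targets a single cache, matching the per-cache URLs used in the sections below.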
The following sections analyze the cases in which individual caches use a lot of memory.
DataPageCache uses a lot of memory
After Doris 2.1.6, execute `curl http://{be_host}:{be_web_server_port}/api/clear_cache/DataPageCache` to manually clean it up while BE is running.
Execute `curl -X POST http://{be_host}:{be_web_server_port}/api/update_config?disable_storage_page_cache=true` to disable DataPageCache for the running BE; the cache is cleared after at most 10 minutes by default. However, this is a temporary method: DataPageCache will take effect again after BE restarts.
If you are sure that you want to reduce the memory usage of DataPageCache for a long time, refer to BE Configuration Items. Reduce `storage_page_cache_limit` in `conf/be.conf` to reduce the capacity of DataPageCache, or reduce `data_page_cache_stale_sweep_time_sec` to shorten the time that DataPageCache entries remain valid, or add `disable_storage_page_cache=true` to disable DataPageCache, and then restart the BE process.
SegmentCache uses a lot of memory
After Doris 2.1.6, execute `curl http://{be_host}:{be_web_server_port}/api/clear_cache/SegmentCache` to manually clean it up while BE is running.
Execute `curl -X POST http://{be_host}:{be_web_server_port}/api/update_config?disable_segment_cache=true` to disable SegmentCache for the running BE; the cache is cleared after at most 10 minutes by default. However, this is a temporary method: SegmentCache will take effect again after BE restarts.
If you are sure that you want to reduce the memory usage of SegmentCache for a long time, refer to BE Configuration Items. Adjust `segment_cache_capacity` or `segment_cache_memory_percentage` in `conf/be.conf` to reduce the capacity of SegmentCache, or reduce `tablet_rowset_stale_sweep_time_sec` to shorten the time that SegmentCache entries remain valid, or add `disable_segment_cache=true` in `conf/be.conf` to disable SegmentCache, and then restart the BE process.
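The runtime `update_config` calls used in this document for `disable_storage_page_cache` and `disable_segment_cache` can also be wrapped in a small Python helper. This is a sketch under stated assumptions: `update_config_request` and `disable_cache` are hypothetical names, the host and port are placeholders, and the response body is returned as raw text without interpreting it.

```python
import urllib.parse
import urllib.request


def update_config_request(be_host, be_web_server_port, key, value):
    """Build a POST request mirroring `curl -X POST .../api/update_config?key=value`."""
    query = urllib.parse.urlencode({key: value})
    url = f"http://{be_host}:{be_web_server_port}/api/update_config?{query}"
    return urllib.request.Request(url, method="POST")


def disable_cache(be_host, be_web_server_port, config_key, timeout=10):
    """Set a disable_* option to true on a running BE and return the raw response.

    As with the curl form, the change is temporary and is lost when BE restarts.
    """
    request = update_config_request(be_host, be_web_server_port, config_key, "true")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder host and port; only the request URL is printed here.
    req = update_config_request("127.0.0.1", 8040, "disable_segment_cache", "true")
    print(req.get_method(), req.full_url)
```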
PKIndexPageCache uses a lot of memory
After Doris 2.1.6, execute `curl http://{be_host}:{be_web_server_port}/api/clear_cache/PKIndexPageCache` to manually clean it up while BE is running.
Refer to BE Configuration Items. Reduce the capacity of PKIndexPageCache by lowering `pk_storage_page_cache_limit` in `conf/be.conf`, or shorten the time that PKIndexPageCache entries remain valid by lowering `pk_index_page_cache_stale_sweep_time_sec`, or add `disable_pk_storage_page_cache=true` in `conf/be.conf` to disable PKIndexPageCache, and then restart the BE process.