DefragThreshold
DefragThreshold is the fraction of the KV cache occupied by holes above which the engine triggers defragmentation. It is useful for long-running services where repeated cleanup fragments the cache.
Quick reference
| Property | Value |
|---|---|
| Type | float? |
| Default | null (disabled, same as a negative value) |
| Range | < 0 = disabled; 0.0 – 1.0 = enabled |
| Category | KV cache maintenance |
| Field on | ContextParameters.DefragThreshold |
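The range rule in the table can be read as a small predicate. The sketch below is illustrative only: IsDefragEnabled is a hypothetical helper, not part of the library, and it assumes nothing beyond the null/negative rule stated above.

```csharp
using System;

// Hypothetical helper (not a library API): mirrors the table's rule that
// null or a negative value disables defragmentation.
static bool IsDefragEnabled(float? threshold) =>
    threshold.HasValue && threshold.Value >= 0f;

Console.WriteLine(IsDefragEnabled(null));  // False
Console.WriteLine(IsDefragEnabled(-1f));   // False
Console.WriteLine(IsDefragEnabled(0.3f));  // True
```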
What it does
When messages are evicted from the KV cache (by CacheCleanupStrategy), their slots become holes. Over many cycles, the cache may hold scattered used slots interspersed with holes, wasting capacity.
If DefragThreshold is set, the engine monitors the hole fraction. When it crosses the threshold, the engine compacts the cache — moves live tokens together and frees the tail.
- null or negative: disabled. The cache is never compacted.
- 0.1–0.5: typical active values. Compaction runs when 10–50 % of the cache is holes.
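The behaviour described above can be modelled with a toy slot array. This is a minimal sketch, not the engine's implementation: the int?[] layout and the HoleFraction and Compact helpers are hypothetical, and it assumes that holes are counted only below the last live slot, so the free tail left by compaction does not count as fragmentation.

```csharp
using System;

// Illustrative model only: each slot holds a token id, or null for a hole.
// Assumption: only holes below the last live slot count as fragmentation.
static double HoleFraction(int?[] slots)
{
    int span = 0; // index of the last live slot, plus one
    for (int i = 0; i < slots.Length; i++)
        if (slots[i] is not null) span = i + 1;
    if (span == 0) return 0.0;

    int holes = 0;
    for (int i = 0; i < span; i++)
        if (slots[i] is null) holes++;
    return (double)holes / span;
}

// Moves live tokens to the front, preserving their order, and clears the tail.
// Returns the number of live tokens.
static int Compact(int?[] slots)
{
    int write = 0;
    for (int read = 0; read < slots.Length; read++)
    {
        if (slots[read] is null) continue;
        slots[write++] = slots[read];
    }
    for (int i = write; i < slots.Length; i++) slots[i] = null;
    return write;
}

float? threshold = 0.3f;
int?[] cache = { 10, null, 11, null, null, 12, 13, null, 14, 15 };

double fraction = HoleFraction(cache); // 4 holes in a span of 10 slots = 0.4
if (threshold.HasValue && threshold.Value >= 0f && fraction > threshold.Value)
{
    int liveCount = Compact(cache);
    Console.WriteLine($"Compacted: {liveCount} live tokens, {cache.Length - liveCount} free tail slots");
}
```

With these inputs the check fires because 0.4 exceeds 0.3, and the six live tokens end up contiguous at the front of the array.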
When to change it
| Scenario | Value |
|---|---|
| Default (short-lived or bounded sessions) | null |
| Long-running service with many evictions | 0.3 |
| Aggressive compaction | 0.1 |
Compaction incurs a cost each time it is triggered, so a lower threshold means more frequent compaction passes. For bursty workloads where cache usage oscillates, defragmentation helps sustain throughput.
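To get a feel for the values in the table, the following sketch counts how many holes must accumulate before a given threshold is exceeded. It is illustrative arithmetic only: HolesToTrigger is a hypothetical helper, the 100-slot span is an arbitrary example size, and the strict "greater than" comparison is an assumption about how the engine compares against the threshold.

```csharp
using System;

// Illustrative arithmetic only: the smallest number of holes in a span of
// `spanSlots` slots whose fraction strictly exceeds `threshold`.
static int HolesToTrigger(float threshold, int spanSlots)
{
    for (int holes = 0; holes <= spanSlots; holes++)
        if ((double)holes / spanSlots > threshold)
            return holes;
    return -1; // never exceeded, e.g. threshold >= 1.0
}

Console.WriteLine(HolesToTrigger(0.1f, 100)); // 11
Console.WriteLine(HolesToTrigger(0.3f, 100)); // 31
```

Under these assumptions, a threshold of 0.1 compacts after roughly a third as many evictions as 0.3, which is why it is listed as the aggressive setting.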
Example
var preset = new Qwen25Preset();
preset.ContextParameters.DefragThreshold = 0.3f;
// Compact when >30 % of KV slots are holes.
Interactions
- CacheCleanupStrategy: the policy that creates the holes that defrag compacts.
- ContextSize: larger caches benefit more from defrag.
What’s next
- Cache management — cleanup strategies and compaction together.
- Context parameters hub — all context knobs.