SwaFull
SwaFull controls whether the engine stores the full, uncompressed sliding-window attention (SWA) cache. It only affects models that have SWA layers.
Quick reference
| Property | Value |
|---|---|
| Type | bool? |
| Default | null (use native default) |
| Category | KV cache (SWA-specific) |
| Field on | ContextParameters.SwaFull |
What it does
Sliding-window attention (used by some Mistral and Gemma models, among other architectures) attends only to a bounded window of recent tokens. The engine can store the cache for this window in one of two ways:
- Compressed (SwaFull = false or null): smaller memory footprint, typical default.
- Full (SwaFull = true): uncompressed, larger memory footprint, may be faster in specific workloads.
For models without SWA, this field has no effect.
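To give a rough sense of the memory difference, the sketch below estimates key/value storage for the SWA layers under two sizing assumptions: one cell per token of the whole context, versus cells for the attention window only. This is generic KV-cache arithmetic, not the engine's actual allocation logic, and the assumption that the compressed form holds roughly one window of cells is ours. The helper name KvBytes and every model dimension are hypothetical.

```csharp
using System;

// All values below are hypothetical and chosen only for illustration.
const int swaLayers = 24;         // number of sliding-window layers
const int kvHeads = 8;            // key/value heads per layer
const int headDim = 128;          // dimension of each head
const int bytesPerElement = 2;    // 16-bit cache entries
const int contextLength = 32768;  // tokens in the full context
const int windowLength = 4096;    // tokens covered by the sliding window

// Assumption: "full" sizing keeps a cell for every context position,
// while window-only sizing keeps cells for one window.
long fullBytes = KvBytes(contextLength, swaLayers, kvHeads, headDim, bytesPerElement);
long windowBytes = KvBytes(windowLength, swaLayers, kvHeads, headDim, bytesPerElement);

Console.WriteLine($"Full-context sizing: {fullBytes / (1024.0 * 1024.0):F0} MiB");
Console.WriteLine($"Window-only sizing:  {windowBytes / (1024.0 * 1024.0):F0} MiB");

// Bytes needed to hold both keys and values for `cells` token positions.
static long KvBytes(int cells, int layers, int heads, int dim, int bytesPerElem)
    => 2L * cells * layers * heads * dim * bytesPerElem;
```

With these made-up dimensions the two estimates differ by the ratio of context length to window length (8x here). Real figures depend on the model and on how the engine lays out its cache.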
When to change it
| Scenario | Value |
|---|---|
| Default | null |
| Benchmarking SWA performance | true to test uncompressed path |
| Memory constrained on SWA model | null or false |
Few of the models on the built-in preset list currently make extensive use of SWA. If you are unsure, leave the value as null.
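The scenario table can be expressed as a small decision helper. The sketch below is illustrative only: ChooseSwaFull and its parameters are hypothetical names, not part of the library, and letting the memory constraint take priority over benchmarking is a design choice made for this example.

```csharp
using System;

// Hypothetical helper that mirrors the scenario table above.
Console.WriteLine(Describe(ChooseSwaFull(usesSwa: false, benchmarking: true, memoryConstrained: false)));
Console.WriteLine(Describe(ChooseSwaFull(usesSwa: true, benchmarking: true, memoryConstrained: false)));
Console.WriteLine(Describe(ChooseSwaFull(usesSwa: true, benchmarking: false, memoryConstrained: true)));

static bool? ChooseSwaFull(bool usesSwa, bool benchmarking, bool memoryConstrained)
{
    if (!usesSwa) return null;            // field has no effect, so keep the default
    if (memoryConstrained) return false;  // prefer the smaller footprint
    if (benchmarking) return true;        // exercise the uncompressed path
    return null;                          // otherwise defer to the native default
}

// Renders the tri-state value for display.
static string Describe(bool? value) => value?.ToString() ?? "null";
```

The returned value could then be assigned to ContextParameters.SwaFull on the preset being configured.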
Example
var preset = new Qwen25Preset(); // not SWA — SwaFull has no effect
preset.ContextParameters.SwaFull = null; // default
Interactions
What’s next
- Context parameters hub — all context knobs.
- Supported presets — check which presets use SWA.