TopNSigma
TopNSigma is an experimental filter from recent llama.cpp versions. It keeps only tokens whose logit is within N standard deviations of the top logit, discarding the rest.
Quick reference
| Property | Value |
|---|---|
| Type | float |
| Default | -1.0 (disabled) |
| Range | > 0 enables (typical 1.0 – 3.0); ≤ 0 disables |
| Category | Advanced / experimental filter |
| Field on | SamplerParameters.TopNSigma |
What it does
At each generation step, the filter computes the standard deviation (stddev) of the logits and takes the maximum logit (logit_max). It keeps only tokens whose logit is at least logit_max - N × stddev and discards the rest.
This filter adapts automatically to the shape of the distribution: on peaked distributions it keeps few tokens (most logits sit far below the top logit); on flat distributions it keeps many (most of the distribution lies within N standard deviations of the top logit).
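Written as a formula, the rule reads as follows. Here $\ell_i$ is the logit of token $i$, $V$ is the number of candidate tokens, and $\sigma$ is assumed to be the population standard deviation; the text above does not say whether the population or the sample form is used, so treat that detail as an assumption.

$$
\mu = \frac{1}{V}\sum_{i=1}^{V} \ell_i, \qquad
\sigma = \sqrt{\frac{1}{V}\sum_{i=1}^{V} (\ell_i - \mu)^2}, \qquad
\text{keep token } i \iff \ell_i \ge \ell_{\max} - N\,\sigma
$$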
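- TopNSigma = -1 (default): disabled.
- TopNSigma = 1.0: tight; keeps only tokens very close to the top.
- TopNSigma = 2.0: moderate; keeps tokens within two standard deviations of the top logit.
- TopNSigma = 3.0: wide; keeps tokens within three standard deviations of the top logit (the ~99.7 % rule for a normal distribution is measured around the mean, so it is only a loose guide here).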
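The filtering rule described in this section can be sketched in a few lines of C#. This is an illustration only, not the library's implementation: the class TopNSigmaSketch and the method KeptIndices are hypothetical names, and the sketch assumes the population standard deviation is taken over all logits.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of any library API.
public static class TopNSigmaSketch
{
    // Returns the indices of the tokens that survive the filter.
    // Assumption: sigma is the population standard deviation of all logits.
    public static int[] KeptIndices(float[] logits, float n)
    {
        if (logits == null) throw new ArgumentNullException(nameof(logits));

        // n <= 0 means "disabled": every token is kept.
        // Zero or one candidate leaves nothing to filter.
        if (n <= 0f || logits.Length <= 1)
            return Enumerable.Range(0, logits.Length).ToArray();

        // Mean and population variance of the logits.
        double mean = logits.Average(x => (double)x);
        double variance = logits.Average(x => ((double)x - mean) * ((double)x - mean));
        double sigma = Math.Sqrt(variance);

        // Keep tokens whose logit is at least logit_max - N * stddev.
        double threshold = logits.Max() - n * sigma;

        return Enumerable.Range(0, logits.Length)
                         .Where(i => logits[i] >= threshold)
                         .ToArray();
    }
}
```

For the logits { 10, 0, 0, 0, 0 } the standard deviation is 4, so N = 2 puts the threshold at 2 and only index 0 survives, while N = 3 lowers the threshold to -2 and all five tokens are kept.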
This is a newer filter, and its interactions with other knobs are less well-studied than those of classic TopP / TopK. Reserve it for experimentation.
When to change it
| Scenario | Value |
|---|---|
| Default (disabled) | -1.0 |
| Experimental usage | 1.5 – 2.5 |
Stick with TopP + TopK + MinP for production unless you have a specific reason to switch.
Example
```csharp
var preset = new Qwen25Preset();
preset.SamplerParameters.TopP = 1.0f;      // disable nucleus
preset.SamplerParameters.TopK = 0;         // disable top-K
preset.SamplerParameters.TopNSigma = 2.0f; // use sigma-based filter instead
using var api = AsposeLLMApi.Create(preset);
```
Interactions
- Temperature: applied before TopNSigma.
- TopP: can coexist; experimental combinations are not well-studied.
- TopK: can coexist.
- MinP: can coexist.
- MinKeep: floor applies.
- Mirostat: bypasses TopNSigma when active.
What’s next
- Sampler parameters hub — all sampler knobs at a glance.
- TypicalP — another experimental filter.
- TopP — the standard alternative.