TopNSigma

TopNSigma is an experimental filter from recent llama.cpp versions. It keeps only tokens whose logit is within N standard deviations of the top logit, discarding the rest.

Quick reference

Type        float
Default     -1.0 (disabled)
Range       > 0 enables (typical 1.0 to 3.0); ≤ 0 disables
Category    Advanced / experimental filter
Field on    SamplerParameters.TopNSigma

What it does

Compute the standard deviation of the logit distribution at a generation step. Take the maximum logit (logit_max). Keep only tokens whose logit is at least logit_max - N × stddev. Discard the rest.
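
As a rough illustration of the steps above, here is a minimal C# sketch of the filtering rule. It is not the llama.cpp or Aspose.LLM implementation: TopNSigmaKeep is a hypothetical helper written for this page, and the use of the population standard deviation (dividing by the token count) is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not an Aspose.LLM or llama.cpp API): returns the indices
// of the logits that survive a top-n-sigma cut.
static int[] TopNSigmaKeep(float[] logits, float sigmas)
{
    // A non-positive value means the filter is disabled, so every token is kept.
    if (sigmas <= 0f) return Enumerable.Range(0, logits.Length).ToArray();

    double max = logits.Max();
    double mean = logits.Average(x => (double)x);
    // Population standard deviation (assumption: divide by the count, not count - 1).
    double stddev = Math.Sqrt(logits.Average(x => (x - mean) * (x - mean)));
    double threshold = max - sigmas * stddev;

    // Keep every token whose logit is at least logit_max - N × stddev.
    return Enumerable.Range(0, logits.Length)
                     .Where(i => logits[i] >= threshold)
                     .ToArray();
}

float[] exampleLogits = { 5.0f, 4.5f, 2.0f, 0.5f, -1.0f, -3.0f };
int[] keptIndices = TopNSigmaKeep(exampleLogits, 1.0f);
Console.WriteLine(string.Join(", ", keptIndices)); // prints "0, 1"
```

With these six logits the standard deviation is about 2.85, so N = 1.0 puts the cutoff near 2.15 and only the first two tokens survive. Raising N to 2.0 moves the cutoff to about -0.71 and keeps the first four.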

This filter adapts automatically to the shape of the distribution. On a peaked distribution it keeps few tokens, because most logits lie more than N standard deviations below the top one. On a flat distribution it keeps many, because most or all logits fall within N standard deviations of the top.

  • TopNSigma = -1 (default) — disabled.
  • TopNSigma = 1.0 — tight; keeps only tokens very close to the top.
  • TopNSigma = 2.0 — moderate; keeps tokens within two standard deviations.
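  • TopNSigma = 3.0 — wide. The familiar ~99.7 % figure for a normal distribution refers to ±3σ around the mean, whereas this window extends 3σ below the top logit, so treat it as a loose analogy rather than a coverage guarantee.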
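
To make the effect of these values concrete, the sketch below counts the surviving tokens for a peaked and a flat set of logits at N = 1, 2 and 3. The logit values are made up for illustration, CountKept is a hypothetical helper rather than a library call, and the population standard deviation is assumed.

```csharp
using System;
using System.Linq;

// Hypothetical helper: how many logits pass the cut logit >= max - sigmas * stddev.
static int CountKept(float[] logits, float sigmas)
{
    double max = logits.Max();
    double mean = logits.Average(x => (double)x);
    // Population standard deviation (assumption).
    double stddev = Math.Sqrt(logits.Average(x => (x - mean) * (x - mean)));
    double threshold = max - sigmas * stddev;
    return logits.Count(x => x >= threshold);
}

// Made-up logits: one dominant token versus evenly spaced, close candidates.
float[] peaked = { 10.0f, 2.0f, 1.5f, 1.0f, 0.5f, 0.0f, -0.5f, -1.0f };
float[] flat   = { 2.0f, 1.9f, 1.8f, 1.7f, 1.6f, 1.5f, 1.4f, 1.3f };

foreach (float n in new[] { 1f, 2f, 3f })
{
    Console.WriteLine($"N={n}: peaked keeps {CountKept(peaked, n)}, flat keeps {CountKept(flat, n)}");
}
// N=1: peaked keeps 1, flat keeps 3
// N=2: peaked keeps 1, flat keeps 5
// N=3: peaked keeps 5, flat keeps 7
```

In the peaked case the dominant token is the only survivor until N reaches 3. In the flat case every step of N admits more candidates.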

This is a newer filter, and its interactions with other knobs are less well studied than those of the classic TopP / TopK pair. Reserve it for experimentation.

When to change it

Scenario              Value
Default (disabled)    -1.0
Experimental usage    1.5 to 2.5

Stick with TopP + TopK + MinP for production unless you have a specific reason.

Example

var preset = new Qwen25Preset();
preset.SamplerParameters.TopP = 1.0f;          // disable nucleus
preset.SamplerParameters.TopK = 0;             // disable top-K
preset.SamplerParameters.TopNSigma = 2.0f;     // use sigma-based filter instead

using var api = AsposeLLMApi.Create(preset);

Interactions

  • Temperature — applied before TopNSigma.
  • TopP — can coexist; experimental combinations are not well-studied.
  • TopK — can coexist.
  • MinP — can coexist.
  • MinKeep — floor applies.
  • Mirostat — bypasses TopNSigma when active.
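
One property of the rule in "What it does" is worth checking when reasoning about the Temperature interaction. Dividing every logit by a temperature T > 0 scales both the gap to the top logit and the standard deviation by the same factor, so the set of surviving tokens should not change. The sketch below verifies this numerically. It says nothing about the order in which a particular runtime applies the two samplers. KeepMask is a hypothetical helper, and the population standard deviation is assumed.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true for each logit that passes logit >= max - sigmas * stddev.
static bool[] KeepMask(float[] logits, float sigmas)
{
    double max = logits.Max();
    double mean = logits.Average(x => (double)x);
    // Population standard deviation (assumption).
    double stddev = Math.Sqrt(logits.Average(x => (x - mean) * (x - mean)));
    double threshold = max - sigmas * stddev;
    return logits.Select(x => x >= threshold).ToArray();
}

float[] rawLogits = { 5.0f, 4.5f, 2.0f, 0.5f, -1.0f, -3.0f };
bool[] baseline = KeepMask(rawLogits, 1.5f);

foreach (float temperature in new[] { 0.5f, 2.0f })
{
    // Temperature scaling divides every logit by T before any filtering.
    float[] scaled = rawLogits.Select(x => x / temperature).ToArray();
    bool sameSet = KeepMask(scaled, 1.5f).SequenceEqual(baseline);
    Console.WriteLine($"T={temperature}: same kept set = {sameSet}");
}
```

Temperature still changes the probabilities assigned to the tokens that survive; only the membership of the kept set is unaffected.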

What’s next