PenaltyContextSize

PenaltyContextSize sets the number of recent tokens considered when applying repetition, presence, and frequency penalties. Only the last PenaltyContextSize tokens influence penalty calculations.

Quick reference

Type: int
Default: -1 (use the model’s full context size)
Range: -1, or any positive integer
Category: Penalty window
Field: SamplerParameters.PenaltyContextSize

What it does

Before each sampling step, the three penalty knobs (RepetitionPenalty, PresencePenalty, FrequencyPenalty) need to know which prior tokens to examine. PenaltyContextSize defines that window.

  • PenaltyContextSize = -1 — use the full context (equivalent to ContextParameters.ContextSize). Maximum recall; penalties apply across the entire conversation.
  • PenaltyContextSize = 256 — only the last 256 tokens contribute. Penalties are local; the model can freely reuse words that appeared earlier than that.
  • PenaltyContextSize = 64 — very local window; penalties essentially prevent immediate repetition only.

Narrow windows make penalties local (avoid recent verbatim repeats); wide windows make them global (avoid any mention of a token anywhere in history).
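The windowing behavior above can be sketched in plain C#. This is a conceptual illustration only, not the library's internal implementation; `CountPenalizedTokens` is a hypothetical helper that shows which tokens could contribute to penalty calculations for a given window size.

```csharp
using System;
using System.Collections.Generic;

static class PenaltyWindowSketch
{
    // Count occurrences of each token id inside the penalty window.
    // Only tokens returned here can trigger repetition, presence,
    // or frequency penalties; everything older is ignored.
    public static Dictionary<int, int> CountPenalizedTokens(
        IReadOnlyList<int> history, int penaltyContextSize)
    {
        // -1 means "use the full history" (the model's entire context).
        int window = penaltyContextSize < 0
            ? history.Count
            : Math.Min(penaltyContextSize, history.Count);

        var counts = new Dictionary<int, int>();
        for (int i = history.Count - window; i < history.Count; i++)
        {
            counts[history[i]] = counts.TryGetValue(history[i], out var c) ? c + 1 : 1;
        }
        return counts; // tokens outside the window never appear here
    }
}
```

With `penaltyContextSize = 64`, a token that last appeared 200 positions ago is absent from the counts, so none of the three penalties can touch it.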

When to change it

Scenario Value
Default — penalize repetition across full context -1
Fresh-style writing that can revisit topics 256–512
Strict anti-repetition for short outputs 128
Very local penalty (only consecutive repeats) 64

Longer conversations may benefit from a smaller penalty window so the model isn’t punished for reusing common words across a long dialogue. For short-form answers, the default -1 is usually fine.
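For the long-conversation case, a narrower window might be configured like this (a sketch using the same preset API shown on this page; the specific values are illustrative, not tuned recommendations):

```csharp
// Long chat session: keep penalties local so common words
// from much earlier turns aren't punished.
var preset = new Qwen25Preset();
preset.SamplerParameters.PenaltyContextSize = 256; // examine only recent tokens
preset.SamplerParameters.FrequencyPenalty = 0.3f;  // illustrative penalty strength

using var api = AsposeLLMApi.Create(preset);
```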

Example

var preset = new Qwen25Preset();
preset.SamplerParameters.PenaltyContextSize = 256;
preset.SamplerParameters.RepetitionPenalty = 1.15f;
// Discourage repetition within the last 256 tokens; older history doesn't trigger the penalty.

using var api = AsposeLLMApi.Create(preset);

Interactions
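PenaltyContextSize only matters when at least one penalty is active: it scopes RepetitionPenalty, PresencePenalty, and FrequencyPenalty, all of which look at the same token window. A combined configuration might look like this (the penalty values are illustrative):

```csharp
var preset = new Qwen25Preset();
preset.SamplerParameters.PenaltyContextSize = 512;  // shared window for all three penalties
preset.SamplerParameters.RepetitionPenalty = 1.1f;  // multiplicative penalty on repeated tokens
preset.SamplerParameters.PresencePenalty = 0.1f;    // flat penalty once a token appears in the window
preset.SamplerParameters.FrequencyPenalty = 0.1f;   // scales with occurrence count in the window

using var api = AsposeLLMApi.Create(preset);
```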

What’s next