NSeqMax
NSeqMax is the maximum number of distinct sequences the engine handles in parallel, each with its own state. It matters for recurrent or state-tracking models. Standard transformer chat presets do not require tuning it.
Quick reference
| Type | uint? |
| Default | null (native default, typically 1 for transformer models) |
| Range | 1 and above; power-of-two values recommended for advanced scenarios |
| Category | Context size and batching |
| Field on | ContextParameters.NSeqMax |
What it does
For recurrent or state-space models (Mamba, RWKV, hybrid architectures), each independent sequence carries its own recurrent state. NSeqMax caps how many such states the engine maintains simultaneously.
- NSeqMax = 1 (default for standard transformers): no parallel state tracking needed.
- NSeqMax = 4+: enables parallel recurrent-model sequences.
Transformer models (Qwen, Llama, Gemma, Phi, etc.) do not maintain per-sequence hidden state in this sense. NSeqMax = 1 is correct for them.
When to change it
| Scenario | Value |
|---|---|
| Standard transformer chat presets | Leave null or 1 |
| Recurrent / state-space model | Set to the number of parallel sequences you serve |
If you are not building against a recurrent-model-specific preset, leave NSeqMax at the default.
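The decision in the table above can be sketched as a small helper. This is a minimal illustration only: `NSeqMaxChooser`, `Choose`, `isRecurrentModel`, and `parallelSequences` are hypothetical names introduced here and are not part of the library. Only the `uint?` type and the "null means native default" convention come from the quick reference.

```csharp
using System;

// Hypothetical helper, not part of the library API.
public static class NSeqMaxChooser
{
    // Returns the value to assign to ContextParameters.NSeqMax.
    public static uint? Choose(bool isRecurrentModel, uint parallelSequences)
    {
        // Standard transformer presets: leave the native default (null).
        if (!isRecurrentModel)
            return null;

        // Recurrent / state-space models: one state per served sequence.
        // The documented range starts at 1, so clamp zero up to 1.
        return Math.Max(1u, parallelSequences);
    }
}
```

Assuming a recurrent-model preset is in use, the result would be assigned to `preset.ContextParameters.NSeqMax` before the API object is created. For a transformer preset the helper returns null, which leaves the default in place.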
Example
```csharp
// Standard transformer use case: no change needed.
var preset = new Qwen25Preset();
// preset.ContextParameters.NSeqMax = null; // (default)
using var api = AsposeLLMApi.Create(preset);
```
Interactions
What’s next
- Context parameters hub — all context knobs.
- NBatch — batch size for prompts.