NUbatch
NUbatch is the physical maximum batch size: the largest chunk of tokens actually processed in a single kernel call. It is normally set equal to NBatch, or smaller.
Quick reference
| Type | uint? |
| Default | null (native default, typically equal to NBatch) |
| Range | ≤ NBatch |
| Category | Context size and batching |
| Field on | ContextParameters.NUbatch |
What it does
NBatch defines the logical batch — the largest number of tokens submitted at once. NUbatch defines the largest chunk the engine processes in a single kernel invocation. When NUbatch < NBatch, the engine splits one logical batch into multiple kernel calls.
- NUbatch = NBatch (simplest case): one logical batch = one kernel call.
- NUbatch < NBatch: one logical batch is dispatched as several smaller kernel invocations.
The split matters mainly in specific multi-sequence scenarios where sequential processing of sub-batches is required. For single-sequence chat, NUbatch = NBatch is typical.
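The splitting described above can be illustrated with a small standalone sketch. It does not call the Aspose.LLM API: `MicroBatchSketch` and `Split` are hypothetical names, and the code only models the arithmetic of dividing a logical batch of tokens into chunks of at most NUbatch tokens. How the native engine actually schedules those chunks is not covered on this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Aspose.LLM: models how a logical batch
// could be divided into chunks no larger than the physical batch size.
public static class MicroBatchSketch
{
    // Returns the size of each chunk, in submission order.
    public static List<int> Split(int tokenCount, int nUbatch)
    {
        if (tokenCount < 0) throw new ArgumentOutOfRangeException(nameof(tokenCount));
        if (nUbatch <= 0) throw new ArgumentOutOfRangeException(nameof(nUbatch));

        var chunks = new List<int>();
        for (int offset = 0; offset < tokenCount; offset += nUbatch)
        {
            // The final chunk may be smaller than nUbatch.
            chunks.Add(Math.Min(nUbatch, tokenCount - offset));
        }
        return chunks;
    }
}
```

Under this model, `MicroBatchSketch.Split(4096, 4096)` yields a single chunk, while `MicroBatchSketch.Split(4096, 1024)` yields four chunks of 1024 tokens each.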
When to change it
| Scenario | Value |
|---|---|
| Default (match NBatch) | Same as NBatch |
| Advanced multi-sequence workflows | Smaller than NBatch |
Most deployments set NUbatch = NBatch and never touch this field.
Example
```csharp
var preset = new Qwen25Preset();
preset.ContextParameters.NBatch = 4096;
preset.ContextParameters.NUbatch = 4096; // match the logical batch
using var api = AsposeLLMApi.Create(preset);
```
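For the less common case where NUbatch is smaller than NBatch, a configuration might look like the fragment below. It uses only the identifiers from the example above, with namespace imports likewise omitted. The value 1024 is an illustrative assumption, not a recommendation.

```csharp
var preset = new Qwen25Preset();
preset.ContextParameters.NBatch = 4096;
// Assumed value for illustration: a full 4096-token logical batch
// would be processed as four 1024-token kernel calls.
preset.ContextParameters.NUbatch = 1024;
using var api = AsposeLLMApi.Create(preset);
```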
Interactions
- NBatch: upper bound; NUbatch ≤ NBatch.
- NSeqMax: parallel sequence cap, related in multi-sequence scenarios.
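The NUbatch ≤ NBatch constraint can be checked in application code before a preset is used. The sketch below is a hypothetical helper, not part of Aspose.LLM: `BatchSettingsCheck` and `IsConsistent` are invented names, and treating NBatch as a plain `uint` is an assumption. This page does not state how the library itself reacts to an out-of-range value.

```csharp
// Hypothetical helper, not part of Aspose.LLM.
public static class BatchSettingsCheck
{
    // Returns true when the pair satisfies NUbatch <= NBatch.
    // A null nUbatch means "use the native default" and is accepted as-is.
    public static bool IsConsistent(uint nBatch, uint? nUbatch)
    {
        return nUbatch is null || nUbatch.Value <= nBatch;
    }
}
```

For example, `BatchSettingsCheck.IsConsistent(4096u, 1024u)` returns `true`, while `BatchSettingsCheck.IsConsistent(1024u, 4096u)` returns `false`.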
What’s next
- NBatch — logical batch cap.
- NSeqMax — parallel sequences.
- Context parameters hub — all context knobs.