NUbatch

NUbatch is the physical maximum batch size: the largest chunk actually processed in a single kernel call. It is normally left equal to NBatch and must not exceed it.

Quick reference

Type: uint?
Default: null (native default, typically equal to NBatch)
Range: ≤ NBatch
Category: Context size and batching
Field on: ContextParameters.NUbatch

What it does

NBatch defines the logical batch — the largest number of tokens submitted at once. NUbatch defines the largest chunk the engine processes in a single kernel invocation. When NUbatch < NBatch, the engine splits one logical batch into multiple kernel calls.

  • NUbatch = NBatch (simplest case) — one logical batch = one kernel call.
  • NUbatch < NBatch — one logical batch dispatched as several smaller kernel invocations.
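The splitting arithmetic can be sketched as follows. This is a minimal illustration, not the engine's actual scheduler: `ChunkSizes` is a hypothetical helper that is not part of the library, and it assumes the engine simply cuts the logical batch into consecutive chunks of at most NUbatch tokens. The real engine may group tokens differently, for example by sequence.

```csharp
using System;
using System.Collections.Generic;

public static class UbatchSplitSketch
{
    // Hypothetical helper (not a library API): returns the sizes of the
    // chunks that a logical batch of `tokenCount` tokens would be cut into
    // when each kernel call handles at most `nUbatch` tokens.
    public static List<int> ChunkSizes(int tokenCount, int nUbatch)
    {
        if (tokenCount < 0) throw new ArgumentOutOfRangeException(nameof(tokenCount));
        if (nUbatch <= 0) throw new ArgumentOutOfRangeException(nameof(nUbatch));

        var sizes = new List<int>();
        for (int offset = 0; offset < tokenCount; offset += nUbatch)
        {
            // The last chunk may be smaller than nUbatch.
            sizes.Add(Math.Min(nUbatch, tokenCount - offset));
        }
        return sizes;
    }
}
```

Under that assumption, `ChunkSizes(4096, 4096)` yields a single chunk of 4096 tokens, `ChunkSizes(4096, 1024)` yields four chunks of 1024, and `ChunkSizes(2500, 1024)` yields chunks of 1024, 1024, and 452. In general the number of kernel calls is the token count divided by NUbatch, rounded up.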

The split matters mainly in specific multi-sequence scenarios where sequential processing of sub-batches is required. For single-sequence chat, NUbatch = NBatch is typical.

When to change it

Scenario                             Value
Default (match NBatch)               Same as NBatch
Advanced multi-sequence workflows    Smaller than NBatch

Most deployments set NUbatch = NBatch and never touch this field.

Example

var preset = new Qwen25Preset();
preset.ContextParameters.NBatch = 4096;
preset.ContextParameters.NUbatch = 4096;  // match the logical batch

using var api = AsposeLLMApi.Create(preset);
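
For the less common case where NUbatch is smaller than NBatch, a variant of the same configuration might look like the sketch below. It reuses only the types and properties shown in the example above. The values 2048 and 512 are illustrative assumptions, not recommendations.

```csharp
var preset = new Qwen25Preset();
preset.ContextParameters.NBatch = 2048;   // logical batch (illustrative value)
preset.ContextParameters.NUbatch = 512;   // physical batch; must stay at or below NBatch

using var api = AsposeLLMApi.Create(preset);
```

With these values, a full logical batch of 2048 tokens would be dispatched as up to four kernel invocations of 512 tokens each.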

Interactions

  • NBatch — upper bound; NUbatch ≤ NBatch.
  • NSeqMax — parallel sequence cap, related in multi-sequence scenarios.

What’s next