PoolingType

PoolingType selects the strategy the engine uses to reduce per-token embeddings to a single vector for the full input. It is relevant only when Embeddings is true.

Quick reference

Type       PoolingType? enum
Default    null (use model default)
Values     Unspecified, None, Mean, Cls, Last, Rank
Category   Embeddings
Field on   ContextParameters.PoolingType

What it does

Value              Behavior
Unspecified (-1)   Use the model default.
None (0)           Return per-token embeddings without reduction.
Mean (1)           Average all token embeddings. Good default for sentence-level semantic similarity.
Cls (2)            Use the first (CLS) token’s embedding. Common for BERT-family models.
Last (3)           Use the last token’s embedding. Common for causal-LM embeddings.
Rank (4)           Rank-based pooling (experimental).

Pick the pooling strategy the model was trained with. Mismatched pooling produces embeddings of degraded quality.
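
To make the Mean, Cls, and Last reductions concrete, here is a minimal sketch of what each does to a matrix of per-token embeddings. The PoolingSketch class and its methods are hypothetical names introduced for illustration. They are not part of Aspose.LLM, and the engine's internal implementation may differ.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Aspose.LLM.
// Each row of `tokens` is one token's embedding; all rows share the same length.
public static class PoolingSketch
{
    // Mean: element-wise average over all token embeddings.
    public static float[] Mean(float[][] tokens)
    {
        RequireTokens(tokens);
        int dim = tokens[0].Length;
        var sum = new float[dim];
        foreach (var token in tokens)
        {
            for (int i = 0; i < dim; i++)
            {
                sum[i] += token[i];
            }
        }
        for (int i = 0; i < dim; i++)
        {
            sum[i] /= tokens.Length;
        }
        return sum;
    }

    // Cls: a copy of the first token's embedding.
    public static float[] Cls(float[][] tokens)
    {
        RequireTokens(tokens);
        return (float[])tokens[0].Clone();
    }

    // Last: a copy of the final token's embedding.
    public static float[] Last(float[][] tokens)
    {
        RequireTokens(tokens);
        return (float[])tokens[tokens.Length - 1].Clone();
    }

    private static void RequireTokens(float[][] tokens)
    {
        if (tokens == null || tokens.Length == 0)
        {
            throw new ArgumentException("At least one token embedding is required.", nameof(tokens));
        }
    }
}
```

For three two-dimensional token embeddings [1, 2], [3, 4], and [5, 6], Mean yields [3, 4], Cls yields [1, 2], and Last yields [5, 6]. The three results differ even on this tiny input, which is why the pooling value should match the one the model was trained with.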

When to change it

Scenario                          Value
Default chat (pooling not used)   null
Causal-LM embeddings              Last
BERT-style embedder               Cls
Sentence-transformer-style        Mean

Example

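using Aspose.LLM.Abstractions.Models;

var preset = new Qwen25Preset();
// PoolingType has no effect unless embeddings are enabled.
preset.ContextParameters.Embeddings = true;
// Average all token embeddings into a single vector.
preset.ContextParameters.PoolingType = PoolingType.Mean;
// Non-causal attention is the usual pairing for embedding-specific pooling.
preset.ContextParameters.AttentionType = AttentionType.NonCausal;

// Embedding-only configuration.
using var api = AsposeLLMApi.Create(preset);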

Interactions

  • Embeddings — must be true for PoolingType to take effect.
  • AttentionType — usually NonCausal with embedding-specific pooling.

What’s next