AttentionType

AttentionType selects between causal (autoregressive) and non-causal (bidirectional) attention. Standard chat models use causal attention; some embedding models use non-causal.

Quick reference


Type	`AttentionType?` enum
Default	`null` (use model default)
Values	`Unspecified`, `Causal`, `NonCausal`
Category	Attention
Field on	`ContextParameters.AttentionType`

What it does

Value	Behavior
`Unspecified` (`-1`)	Use the model’s metadata-declared type.
`Causal` (`0`)	Each token attends only to itself and earlier tokens. Standard for chat / autoregressive generation.
`NonCausal` (`1`)	Each token attends to all tokens. Used for some embedding models and masked-language workflows.

All built-in chat presets use `Causal` implicitly, because their model metadata declares it. Change to `NonCausal` only for embedding extraction with a model trained for bidirectional attention.
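The difference between the two modes comes down to which positions each token is allowed to see. The following is a minimal, self-contained sketch of that visibility rule, not the library's implementation; `AttentionMaskSketch`, `BuildMask`, and `Render` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text;

// Illustration only: these types are not part of Aspose.LLM.
public static class AttentionMaskSketch
{
    // Returns an n x n mask; mask[i, j] is true when token i may attend to token j.
    public static bool[,] BuildMask(int tokenCount, bool causal)
    {
        var mask = new bool[tokenCount, tokenCount];
        for (int i = 0; i < tokenCount; i++)
        {
            for (int j = 0; j < tokenCount; j++)
            {
                // Causal: only position i itself and earlier positions are visible.
                // Non-causal: every position is visible.
                mask[i, j] = !causal || j <= i;
            }
        }
        return mask;
    }

    // Renders the mask as rows of 'x' (visible) and '.' (hidden).
    public static string Render(bool[,] mask)
    {
        var sb = new StringBuilder();
        int n = mask.GetLength(0);
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                sb.Append(mask[i, j] ? 'x' : '.');
            }
            sb.AppendLine();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("Causal:");
        Console.WriteLine(Render(BuildMask(4, causal: true)));
        Console.WriteLine("NonCausal:");
        Console.WriteLine(Render(BuildMask(4, causal: false)));
    }
}
```

The causal mask is lower-triangular, which is what makes left-to-right generation possible. The non-causal mask is fully populated, so every token's representation can depend on the whole input; that suits embedding extraction but not generation.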

When to change it

Scenario	Value
Default (chat / text generation)	`null` or `Unspecified` (the model's metadata decides)
Bidirectional embedding extraction	`NonCausal`
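One way to read the table above is as a simple precedence rule: an explicit `Causal` or `NonCausal` overrides the model, while `null` and `Unspecified` defer to it. The sketch below encodes that reading. It is an assumption drawn from the wording on this page, not the library's actual resolution logic; `AttentionKind` is a local stand-in that mirrors the documented values, and `Resolve` is a hypothetical helper.

```csharp
using System;

// Local stand-in mirroring the documented enum values; not the library's type.
public enum AttentionKind
{
    Unspecified = -1,
    Causal = 0,
    NonCausal = 1
}

public static class AttentionResolutionSketch
{
    // Hypothetical helper: returns the attention type that would take effect,
    // assuming null and Unspecified both defer to the model's declared type.
    public static AttentionKind Resolve(AttentionKind? configured, AttentionKind modelDeclared)
    {
        if (configured is null || configured == AttentionKind.Unspecified)
        {
            return modelDeclared;
        }
        return configured.Value;
    }

    public static void Main()
    {
        // Chat model, nothing configured: the model's declared type is used.
        Console.WriteLine(Resolve(null, AttentionKind.Causal));

        // Explicit override for bidirectional embedding extraction.
        Console.WriteLine(Resolve(AttentionKind.NonCausal, AttentionKind.Causal));
    }
}
```

Under this reading, leaving the field unset is the safe choice for chat presets, and an explicit value is only needed when the model's declared type is not the one you want.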

Example

using Aspose.LLM.Abstractions.Models;

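var preset = new Qwen25Preset();

// Override the model's declared attention type with bidirectional attention.
// Only meaningful if the loaded model was trained for bidirectional attention.
preset.ContextParameters.AttentionType = AttentionType.NonCausal;

// Enable embedding extraction mode.
preset.ContextParameters.Embeddings = true;

// Pool the embeddings with mean pooling.
preset.ContextParameters.PoolingType = PoolingType.Mean;

// Embedding-only configuration. Chat generation is not meaningful here.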

Interactions

Embeddings — embedding extraction usually pairs with NonCausal.
PoolingType — how embeddings are pooled.

What’s next

Embeddings — extraction mode flag.
PoolingType — embedding pooling.
Context parameters hub — all context knobs.
