Embeddings

Embeddings is a nullable boolean flag (bool?) on ContextParameters. When true, the engine extracts embedding vectors alongside (or instead of) logits. Pair it with a PoolingType that matches the model’s training regime.

Quick reference

Type      bool?
Default   null (disabled)
Category  Embeddings
Field on  ContextParameters.Embeddings

What it does

  • null or false — standard generation mode. Only logits are produced; no embedding extraction.
  • true — the engine configures the pipeline to output embeddings per input.

Embeddings are typically used for semantic search, clustering, classification, or as retrieval keys in RAG systems. The SDK’s current chat API (SendMessageAsync) focuses on text generation; embedding workflows require reaching into the Engine and ChatSession APIs directly.
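
To illustrate the semantic search use mentioned above, here is a minimal sketch of cosine similarity, the comparison most often applied to embedding vectors. It is plain C# and independent of the SDK: CosineSimilarity is a hypothetical helper, and the short arrays are hand-written stand-ins for vectors a model would produce.

```csharp
using System;

// Toy vectors standing in for real embeddings (assumed values, not SDK output).
float[] query = { 1f, 0f, 1f };
float[] docA = { 2f, 0f, 2f };   // same direction as the query
float[] docB = { 0f, 3f, 0f };   // orthogonal to the query

Console.WriteLine(CosineSimilarity(query, docA)); // close to 1: strong match
Console.WriteLine(CosineSimilarity(query, docB)); // 0: unrelated

// Hypothetical helper, not part of Aspose.LLM.
// Returns the cosine of the angle between two equal-length vectors.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }

    // A zero vector has no direction; report 0 rather than dividing by zero.
    if (normA == 0 || normB == 0) return 0;
    return dot / Math.Sqrt(normA * normB);
}
```

In a retrieval setting, the stored vector with the highest similarity to the query vector would be returned first.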

When to change it

Scenario            Value
Default chat        null
Extract embeddings  true, paired with PoolingType and often AttentionType = NonCausal
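
To make the PoolingType pairing more concrete: mean pooling, in general, averages the per-token vectors of an input into a single vector. The sketch below shows that arithmetic in plain C#. MeanPool is a hypothetical helper written for illustration, and it is an assumption that PoolingType.Mean corresponds to this computation inside the engine.

```csharp
using System;

// Three token vectors of dimension 2, standing in for per-token model output.
float[][] tokenVectors =
{
    new[] { 1f, 2f },
    new[] { 3f, 4f },
    new[] { 5f, 6f },
};

float[] pooled = MeanPool(tokenVectors);
Console.WriteLine(string.Join(", ", pooled)); // 3, 4

// Hypothetical helper, not part of Aspose.LLM.
// Averages token vectors element by element into one vector.
static float[] MeanPool(float[][] tokens)
{
    if (tokens.Length == 0)
        throw new ArgumentException("At least one token vector is required.");

    int dim = tokens[0].Length;
    var sum = new float[dim];
    foreach (var token in tokens)
    {
        if (token.Length != dim)
            throw new ArgumentException("All token vectors must have the same length.");
        for (int i = 0; i < dim; i++)
            sum[i] += token[i];
    }

    // Divide the accumulated sums by the token count to get the mean.
    for (int i = 0; i < dim; i++)
        sum[i] /= tokens.Length;
    return sum;
}
```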

A dedicated embeddings use case is on the documentation roadmap but is not covered in this version.

Example

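using Aspose.LLM.Abstractions.Models;

// Start from a model preset and switch its context to embedding extraction.
var preset = new Qwen25Preset();
preset.ContextParameters.Embeddings = true;
// Choose the pooling type that matches how the model was trained.
preset.ContextParameters.PoolingType = PoolingType.Mean;
// Embedding extraction is often paired with non-causal attention.
preset.ContextParameters.AttentionType = AttentionType.NonCausal;

using var api = AsposeLLMApi.Create(preset);
// Direct chat methods do not surface embeddings; use Engine/ChatSession internals.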

Interactions

What’s next