Documentation – Chat parameters

Net: SystemPrompt

Fri, 24 Apr 2026 00:00:00 +0000

SystemPrompt is the default system prompt applied to every new session created from this preset. It sets the assistant’s role, tone, and constraints.

Quick reference


Type	`string`
Default	`""` (empty)
Category	Chat session
Field on	`ChatParameters.SystemPrompt`

What it does

When a chat session starts — either explicitly via StartNewChatAsync or implicitly on the first SendMessageAsync — the engine injects a system turn with this text at the top of the conversation. The model sees the system prompt before any user input and uses it to shape its behavior across the session.

"" (default) — no system turn. Some presets (certain Gemma variants) prefer this.
A short instruction — role and tone. For example, “You are a concise technical assistant.”
A longer instruction — include format constraints, forbidden topics, preferred output structure.

The system prompt is applied once per session. Changes to SystemPrompt after AsposeLLMApi.Create do not affect already-running sessions.

When to change it

Scenario	Value
Preset-specific default	Whatever the preset ships with
Role specialization	`"You are a ..."`
Format enforcement	Explicit format rules
Safety / content filtering	Instructions to refuse certain inputs

Keep system prompts concise, roughly 50-300 tokens. Every token in the system prompt counts against ContextParameters.ContextSize, which leaves less room for history, input, and output.
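
The budget note above can be turned into a rough pre-flight check. The following is a minimal sketch, not part of the Aspose.LLM API: `PromptBudget` and its methods are hypothetical names, and the estimate assumes about 4 characters per token, which is only a ballpark and not the engine's tokenizer.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Aspose.LLM API.
public static class PromptBudget
{
    // Assumption: roughly 4 characters per token. Real tokenizers vary by
    // model and language, so treat the result as a ballpark figure only.
    public static int EstimateTokens(string text) =>
        string.IsNullOrEmpty(text) ? 0 : (text.Length + 3) / 4;

    // Tokens left in the context window once the system prompt is counted.
    public static int RemainingContext(int contextSize, string systemPrompt) =>
        Math.Max(0, contextSize - EstimateTokens(systemPrompt));

    // True when the estimate exceeds the upper end of the 50-300 token guideline.
    public static bool ExceedsGuideline(string systemPrompt) =>
        EstimateTokens(systemPrompt) > 300;
}
```

A check like `PromptBudget.ExceedsGuideline(prompt)` could run before the preset is handed to AsposeLLMApi.Create. If an exact count matters, measure with the model's own tokenizer instead of this heuristic.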

Example

var preset = new Qwen25Preset();
preset.ChatParameters.SystemPrompt =
    "You are a precise technical assistant. Answer in at most two sentences. " +
    "Say 'I do not know' when you are unsure.";

using var api = AsposeLLMApi.Create(preset);

Interactions

History — seeded history is appended after the system prompt.
CacheCleanupStrategy — all five strategies preserve the system prompt; the cleanup policy anchors on it.
ContextSize — system prompt consumes tokens from the window.

What’s next

System prompt recipes — effective patterns.
CacheCleanupStrategy — how the system prompt interacts with cache trimming.
Chat parameters hub — all chat knobs.

Net: History

History is an optional list of ChatMessage objects used to pre-seed every new session. It is useful for few-shot priming, restoring a conversation from external storage, or warming the model with a specific example set.

Quick reference


Type	`List<ChatMessage>?`
Default	`null` (no pre-seeded history)
Category	Chat session
Field on	`ChatParameters.History`

What it does

When a new session is created, the engine appends each entry of History after the system prompt, before any user message in the current turn. The model sees these turns as if they had been exchanged earlier.

null (default) — fresh session with only the system prompt.
Explicit list — every new session starts with these turns already in the KV cache.

History is applied at session creation; changing the list after AsposeLLMApi.Create has no effect on already-running sessions.
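
The ordering described above (system prompt first, then the seeded history, then the live user message) can be modeled in a few lines. This is an illustrative sketch only: `Turn` and `SessionSeed` are hypothetical stand-ins, not the library's ChatMessage type or the engine's internal logic.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in for a chat turn; the real library uses ChatMessage.
public record Turn(string Role, string Text);

public static class SessionSeed
{
    // Builds the turn order a new session starts with: system prompt (if any),
    // then the pre-seeded history, then the first live user message.
    public static List<Turn> Build(
        string systemPrompt,
        IReadOnlyList<Turn>? history,
        string firstUserMessage)
    {
        var turns = new List<Turn>();

        // An empty system prompt means no system turn at all.
        if (!string.IsNullOrEmpty(systemPrompt))
            turns.Add(new Turn("system", systemPrompt));

        // A null history means a fresh session with nothing pre-seeded.
        if (history is not null)
            turns.AddRange(history);

        turns.Add(new Turn("user", firstUserMessage));
        return turns;
    }
}
```

With an empty system prompt and a null history, the model sees only the user message; with both set, the seeded turns sit between the system turn and the first live input.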

When to change it

Scenario	Value
Default — blank session	`null`
Few-shot priming for consistent output	2-4 example turns
Long-running personality — reinforce tone with examples	3-5 stylistic turns
Seed from stored transcript	Application-specific

Example

using System.Collections.Generic;
using Aspose.LLM.Abstractions.Models;

var preset = new Qwen25Preset();
preset.ChatParameters.History = new List<ChatMessage>
{
    ChatMessage.CreateUserMessage("Translate to French: The cat sleeps."),
    ChatMessage.CreateAssistantMessage("Le chat dort."),
    ChatMessage.CreateUserMessage("Translate to French: The dog barks."),
    ChatMessage.CreateAssistantMessage("Le chien aboie."),
};

using var api = AsposeLLMApi.Create(preset);
// Every new session now starts with these four priming turns.

Interactions

SystemPrompt — applied before History.
ContextSize — pre-seeded turns consume tokens from the window.
ChatMessage.CreateUserMessage / CreateAssistantMessage / CreateSystemMessage — factories for building entries.

What’s next

Chat history reference — ChatMessage structure in detail.
System prompt recipes — priming patterns.
Chat parameters hub — all chat knobs.

Net: MaxTokens

MaxTokens is the upper bound on tokens the engine generates for a single assistant response. The default 2048 fits most general-purpose tasks; raise it for reasoning models (Qwen3, DeepSeek-R1) that emit hidden <think> blocks before the answer.

Quick reference


Type	`int`
Default	`2048`
Range	`> 0`
Category	Chat session
Field on	`ChatParameters.MaxTokens`

What it does

Once the assistant turn begins generation, the engine counts produced tokens. When the counter reaches MaxTokens, generation stops — even mid-sentence. The rest of the response is never produced.

256 — short answers; classifications; yes/no responses.
512 – 1024 — conversational replies, brief explanations.
2048 (default) — general-purpose.
2048 – 4096 — reasoning-model output (Qwen3, DeepSeek-R1). Leaves room for <think> block plus the final answer.
4096+ — long-form writing, essays, code generation.

Reasoning model budget. Qwen3, DeepSeek-R1, and similar chain-of-thought models emit hidden reasoning tokens (<think>…</think>) that consume 300-500 tokens before the actual answer. Set MaxTokens to at least 1024, ideally 2048-4096, when using these models; otherwise the response can be cut off mid-reasoning and produce no visible answer.

MaxTokens is a cap, not an allocation. Raising it does not cost memory or compute upfront; the engine generates only as many tokens as the model actually produces up to the limit.
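
The difference between a cap and an allocation can be illustrated with a small model of the generation loop. This is a sketch under stated assumptions, not the engine's code: `CappedGeneration` is a hypothetical name, and the model output is represented as a plain token sequence that ends when the model stops on its own.

```csharp
using System.Collections.Generic;

// Hypothetical model of a capped generation loop; not the engine's code.
public static class CappedGeneration
{
    // Consumes tokens from the model until it stops on its own or the cap
    // is reached. Returns the produced tokens and whether the cap cut it off.
    public static (List<string> Tokens, bool Truncated) Run(
        IEnumerable<string> modelOutput, int maxTokens)
    {
        var produced = new List<string>();
        foreach (var token in modelOutput)
        {
            if (produced.Count == maxTokens)
                return (produced, true);   // cap hit with output still pending
            produced.Add(token);
        }
        return (produced, false);          // model finished within the cap
    }
}
```

A short reply under a large cap produces only the short reply, so a higher limit costs nothing by itself. With a reasoning model, the first few hundred tokens of the sequence would be the <think> block, which is why a low cap can end the loop before any visible answer appears.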

When to change it

Scenario	Value
Classifications, yes/no	`128` – `256`
Conversational chat	`512` – `1024`
General-purpose (default)	`2048`
Reasoning models	`1024` – `4096`
Essays, code, long-form	`4096` – `8192`

Example

var preset = new Qwen25Preset();
preset.ChatParameters.MaxTokens = 1024;
// Generous cap for conversational output.

using var api = AsposeLLMApi.Create(preset);

For DeepSeek-R1:

var preset = new DeepseekR1Qwen3Preset();
preset.ChatParameters.MaxTokens = 2048; // room for <think> + answer

using var api = AsposeLLMApi.Create(preset);

Interactions

ContextSize — input plus output plus history must fit; a high MaxTokens leaves less room for input.
CacheCleanupStrategy — trims history as output approaches the context cap.
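
The ContextSize interaction is simple arithmetic, sketched below for planning purposes. `ContextBudget` and its parameters are hypothetical names, and the token counts are assumed to be measured elsewhere. The sketch reserves room for a full-length reply, which is a worst-case view, since the engine only generates as many tokens as the model actually produces.

```csharp
using System;

// Hypothetical budgeting helper; names are illustrative only.
public static class ContextBudget
{
    // Tokens left for the incoming user input after reserving room for the
    // system prompt, the history so far, and a reply of up to maxTokens.
    public static int RoomForInput(
        int contextSize, int systemPromptTokens, int historyTokens, int maxTokens) =>
        Math.Max(0, contextSize - systemPromptTokens - historyTokens - maxTokens);

    // True when prompt, history, input, and a full-length reply all fit
    // in the window without any cache cleanup.
    public static bool Fits(
        int contextSize, int systemPromptTokens, int historyTokens,
        int inputTokens, int maxTokens) =>
        systemPromptTokens + historyTokens + inputTokens + maxTokens <= contextSize;
}
```

For a 4096-token window with a 200-token system prompt and 800 tokens of history, a MaxTokens of 2048 leaves 1048 tokens for input, while a MaxTokens of 4096 leaves none in the worst case.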

What’s next

Chat parameters hub — all chat knobs.
Garbled output troubleshooting — truncation symptoms.
Tune for speed vs quality — response length trade-offs.

Net: CacheCleanupStrategy

CacheCleanupStrategy is the policy the engine applies when a session’s KV cache would overflow ContextSize. There are five named strategies, each keeping a different subset of the history.

Quick reference


Type	`CacheCleanupStrategy` enum
Default	`RemoveOldestMessages`
Values	5 — see table below
Category	Chat session
Field on	`ChatParameters.CacheCleanupStrategy`

What it does

When the next generation step would exceed the context window, the engine trims the cache per the active strategy, freeing tokens before continuing. The policy is applied automatically during generation and can also be triggered explicitly via AsposeLLMApi.ForceCacheCleanup(strategy).

Strategy	Keeps	Typical use
`RemoveOldestMessages` (default)	System prompt + most recent turns	General-purpose; preserves recency.
`KeepSystemPromptOnly`	System prompt only	Hard reset of session context.
`KeepSystemPromptAndHalf`	System prompt + newer half of history	Balanced recall and room for new turns.
`KeepSystemPromptAndFirstUserMessage`	System prompt + first user turn	Recall-heavy tasks where the original ask matters.
`KeepSystemPromptAndLastUserMessage`	System prompt + most recent user turn	Focus on current question, drop middle.
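
The "Keeps" column of the table above can be modeled on a plain message list. This is an illustrative sketch, not the engine's implementation: `Msg`, `Strategy`, and `CleanupSketch` are hypothetical stand-ins, the `keepRecent` parameter is an assumed knob for how many recent turns RemoveOldestMessages retains, and the rounding used for the "newer half" is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration; not the library's types.
public record Msg(string Role, string Text);

public enum Strategy
{
    RemoveOldestMessages,
    KeepSystemPromptOnly,
    KeepSystemPromptAndHalf,
    KeepSystemPromptAndFirstUserMessage,
    KeepSystemPromptAndLastUserMessage
}

public static class CleanupSketch
{
    // Returns the messages that survive one cleanup pass.
    // keepRecent applies only to RemoveOldestMessages (assumed parameter).
    public static List<Msg> Trim(
        IReadOnlyList<Msg> cache, Strategy strategy, int keepRecent = 2)
    {
        var system = cache.Where(m => m.Role == "system").ToList();
        var rest = cache.Where(m => m.Role != "system").ToList();

        IEnumerable<Msg> kept = strategy switch
        {
            // Drop from the front, keeping the most recent turns.
            Strategy.RemoveOldestMessages =>
                rest.Skip(Math.Max(0, rest.Count - keepRecent)),
            Strategy.KeepSystemPromptOnly =>
                Enumerable.Empty<Msg>(),
            // Newer half; with an odd count the extra turn is kept (assumption).
            Strategy.KeepSystemPromptAndHalf =>
                rest.Skip(rest.Count / 2),
            Strategy.KeepSystemPromptAndFirstUserMessage =>
                rest.Where(m => m.Role == "user").Take(1),
            Strategy.KeepSystemPromptAndLastUserMessage =>
                rest.Where(m => m.Role == "user").TakeLast(1),
            _ => rest
        };

        // Every strategy keeps the system prompt at the top.
        return system.Concat(kept).ToList();
    }
}
```

Running each strategy over the same five-message list shows the trade-off directly: the system turn always survives, and the strategies differ only in which user and assistant turns remain alongside it.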

When to change it

Scenario	Value
Default conversational chat	`RemoveOldestMessages`
Anchored on a big original ask (debugging, iterative refinement)	`KeepSystemPromptAndFirstUserMessage`
Sequence of independent Q&A	`KeepSystemPromptAndLastUserMessage`
Hard reset when switching topics	`KeepSystemPromptOnly` via `ForceCacheCleanup`
Long dialogues with gradual trimming	`KeepSystemPromptAndHalf`

Example

using Aspose.LLM.Abstractions.Models;

var preset = new Qwen25Preset();
preset.ChatParameters.SystemPrompt =
    "You are a careful analyst. Always ground your answer in the user's original ask.";
preset.ChatParameters.CacheCleanupStrategy =
    CacheCleanupStrategy.KeepSystemPromptAndFirstUserMessage;

using var api = AsposeLLMApi.Create(preset);
// Even after 50 follow-ups, the model can still refer back to the first user turn.

Force a reset mid-session:

api.ForceCacheCleanup(CacheCleanupStrategy.KeepSystemPromptOnly);

Interactions

SystemPrompt — all strategies preserve it.
ContextSize — the ceiling this strategy serves.
DefragThreshold — compacts holes left behind by cleanup.
AsposeLLMApi.ForceCacheCleanup(strategy) — manual trigger with an override strategy.

What’s next

Cache management — full guide with practical patterns.
Multi-turn chat use case — cache management in practice.
Chat parameters hub — all chat knobs.