<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Documentation – Chat parameters</title>
    <link>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/</link>
    <description>Recent content in Chat parameters on Documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    <lastBuildDate>Thu, 23 Apr 2026 00:00:00 +0000</lastBuildDate>
    
	  <atom:link href="https://docs.aspose.com/llm/net/developer-reference/parameters/chat/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Net: SystemPrompt</title>
      <link>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/system-prompt/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      
      <guid>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/system-prompt/</guid>
      <description>
        
        
        &lt;p&gt;&lt;code&gt;SystemPrompt&lt;/code&gt; is the default system prompt applied to every new session created from this preset. It sets the assistant&amp;rsquo;s role, tone, and constraints.&lt;/p&gt;
&lt;h2 id=&#34;quick-reference&#34;&gt;Quick reference&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Default&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;quot;&amp;quot;&lt;/code&gt; (empty)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Category&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chat session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field on&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ChatParameters.SystemPrompt&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-it-does&#34;&gt;What it does&lt;/h2&gt;
&lt;p&gt;When a chat session starts — either explicitly via &lt;code&gt;StartNewChatAsync&lt;/code&gt; or implicitly on the first &lt;code&gt;SendMessageAsync&lt;/code&gt; — the engine injects a system turn with this text at the top of the conversation. The model sees the system prompt before any user input and uses it to shape its behavior across the session.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;&amp;quot;&amp;quot;&lt;/code&gt; (default) — no system turn. Some presets (certain Gemma variants) prefer this.&lt;/li&gt;
&lt;li&gt;A short instruction — role and tone. For example, &amp;ldquo;You are a concise technical assistant.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;A longer instruction — adds format constraints, forbidden topics, and a preferred output structure.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The system prompt is applied once per session. Changes to &lt;code&gt;SystemPrompt&lt;/code&gt; after &lt;code&gt;AsposeLLMApi.Create&lt;/code&gt; do not affect already-running sessions.&lt;/p&gt;
&lt;h2 id=&#34;when-to-change-it&#34;&gt;When to change it&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Preset-specific default&lt;/td&gt;
&lt;td&gt;Whatever the preset ships with&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Role specialization&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;quot;You are a ...&amp;quot;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Format enforcement&lt;/td&gt;
&lt;td&gt;Explicit format rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safety / content filtering&lt;/td&gt;
&lt;td&gt;Instructions to refuse certain inputs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Keep system prompts concise, roughly 50–300 tokens. Every token in the system prompt counts against &lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextParameters.ContextSize&lt;/code&gt;&lt;/a&gt;, leaving less room for history and output.&lt;/p&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;kt&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Qwen25Preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;();&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;SystemPrompt&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;
    &lt;span class=&#34;s&#34;&gt;&amp;#34;You are a precise technical assistant. Answer in at most two sentences. &amp;#34;&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;+&lt;/span&gt;
    &lt;span class=&#34;s&#34;&gt;&amp;#34;Say &amp;#39;I do not know&amp;#39; when you are unsure.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;

&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;api&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AsposeLLMApi&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;Create&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;interactions&#34;&gt;Interactions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/history/&#34;&gt;&lt;code&gt;History&lt;/code&gt;&lt;/a&gt; — seeded history is appended after the system prompt.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/cache-cleanup-strategy/&#34;&gt;&lt;code&gt;CacheCleanupStrategy&lt;/code&gt;&lt;/a&gt; — every strategy preserves the system prompt and trims only the history that follows it.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextSize&lt;/code&gt;&lt;/a&gt; — system prompt consumes tokens from the window.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/use-cases/system-prompt-recipes/&#34;&gt;System prompt recipes&lt;/a&gt; — effective patterns.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/cache-cleanup-strategy/&#34;&gt;CacheCleanupStrategy&lt;/a&gt; — how the system prompt interacts with cache trimming.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/&#34;&gt;Chat parameters hub&lt;/a&gt; — all chat knobs.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Net: History</title>
      <link>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/history/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      
      <guid>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/history/</guid>
      <description>
        
        
        &lt;p&gt;&lt;code&gt;History&lt;/code&gt; is an optional list of &lt;code&gt;ChatMessage&lt;/code&gt; objects used to pre-seed every new session. It is useful for few-shot priming, restoring a conversation from external storage, or warming the model with a specific example set.&lt;/p&gt;
&lt;h2 id=&#34;quick-reference&#34;&gt;Quick reference&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;List&amp;lt;ChatMessage&amp;gt;?&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Default&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;null&lt;/code&gt; (no pre-seeded history)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Category&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chat session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field on&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ChatParameters.History&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-it-does&#34;&gt;What it does&lt;/h2&gt;
&lt;p&gt;When a new session is created, the engine appends each entry of &lt;code&gt;History&lt;/code&gt; after the system prompt, before any user message in the current turn. The model sees these turns as if they had been exchanged earlier.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;null&lt;/code&gt; (default) — fresh session with only the system prompt.&lt;/li&gt;
&lt;li&gt;Explicit list — every new session starts with these turns already in the KV cache.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;History&lt;/code&gt; is applied at session creation; changing the list after &lt;code&gt;Create&lt;/code&gt; has no effect on already-running sessions.&lt;/p&gt;
&lt;h2 id=&#34;when-to-change-it&#34;&gt;When to change it&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default — blank session&lt;/td&gt;
&lt;td&gt;&lt;code&gt;null&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Few-shot priming for consistent output&lt;/td&gt;
&lt;td&gt;2-4 example turns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-running personality — reinforce tone with examples&lt;/td&gt;
&lt;td&gt;3-5 stylistic turns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Seed from stored transcript&lt;/td&gt;
&lt;td&gt;Application-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;Aspose.LLM.Abstractions.Models&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;

&lt;span class=&#34;kt&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Qwen25Preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;();&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;History&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;List&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;
    &lt;span class=&#34;n&#34;&gt;ChatMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CreateUserMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Translate to French: The cat sleeps.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
    &lt;span class=&#34;n&#34;&gt;ChatMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CreateAssistantMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Le chat dort.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
    &lt;span class=&#34;n&#34;&gt;ChatMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CreateUserMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Translate to French: The dog barks.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
    &lt;span class=&#34;n&#34;&gt;ChatMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CreateAssistantMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;s&#34;&gt;&amp;#34;Le chien aboie.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;),&lt;/span&gt;
&lt;span class=&#34;p&#34;&gt;};&lt;/span&gt;

&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;api&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AsposeLLMApi&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;Create&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt;
&lt;span class=&#34;c1&#34;&gt;// Every new session now starts with these four priming turns.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;interactions&#34;&gt;Interactions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/system-prompt/&#34;&gt;&lt;code&gt;SystemPrompt&lt;/code&gt;&lt;/a&gt; — applied before &lt;code&gt;History&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextSize&lt;/code&gt;&lt;/a&gt; — pre-seeded turns consume tokens from the window.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ChatMessage.CreateUserMessage&lt;/code&gt; / &lt;code&gt;CreateAssistantMessage&lt;/code&gt; / &lt;code&gt;CreateSystemMessage&lt;/code&gt; — factories for building entries.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/chat-sessions/chat-history/&#34;&gt;Chat history reference&lt;/a&gt; — &lt;code&gt;ChatMessage&lt;/code&gt; structure in detail.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/use-cases/system-prompt-recipes/&#34;&gt;System prompt recipes&lt;/a&gt; — priming patterns.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/&#34;&gt;Chat parameters hub&lt;/a&gt; — all chat knobs.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Net: MaxTokens</title>
      <link>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/max-tokens/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      
      <guid>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/max-tokens/</guid>
      <description>
        
        
        &lt;p&gt;&lt;code&gt;MaxTokens&lt;/code&gt; is the upper bound on tokens the engine generates for a single assistant response. The default &lt;code&gt;2048&lt;/code&gt; fits most general-purpose tasks; raise it for reasoning models (Qwen3, DeepSeek-R1) that emit hidden &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; blocks before the answer.&lt;/p&gt;
&lt;h2 id=&#34;quick-reference&#34;&gt;Quick reference&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;int&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Default&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2048&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Range&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;gt; 0&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Category&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chat session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field on&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ChatParameters.MaxTokens&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-it-does&#34;&gt;What it does&lt;/h2&gt;
&lt;p&gt;Once the assistant turn begins generation, the engine counts produced tokens. When the counter reaches &lt;code&gt;MaxTokens&lt;/code&gt;, generation stops — even mid-sentence. The rest of the response is never produced.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;256&lt;/code&gt; — short answers; classifications; yes/no responses.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;512&lt;/code&gt; – &lt;code&gt;1024&lt;/code&gt; — conversational replies, brief explanations.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;2048&lt;/code&gt; (default) — general-purpose.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;2048&lt;/code&gt; – &lt;code&gt;4096&lt;/code&gt; — reasoning-model output (Qwen3, DeepSeek-R1), with &lt;code&gt;1024&lt;/code&gt; as the practical minimum. Leaves room for the &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; block plus the final answer.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;4096+&lt;/code&gt; — long-form writing, essays, code generation.&lt;/li&gt;
&lt;/ul&gt;


&lt;div class=&#34;alert alert-primary&#34; role=&#34;alert&#34;&gt;

&lt;strong&gt;Reasoning model budget.&lt;/strong&gt; Qwen3, DeepSeek-R1, and similar chain-of-thought models emit hidden reasoning tokens (&lt;code&gt;&amp;lt;think&amp;gt;…&amp;lt;/think&amp;gt;&lt;/code&gt;) that consume 300–500 tokens before the actual answer. Set &lt;code&gt;MaxTokens&lt;/code&gt; to &lt;strong&gt;at least 1024&lt;/strong&gt; (ideally 2048–4096) when using these models; otherwise the response can be cut off mid-reasoning and produce no visible answer.
&lt;/div&gt;

&lt;p&gt;&lt;code&gt;MaxTokens&lt;/code&gt; is a cap, not an allocation. Raising it does not cost memory or compute upfront; the engine generates only as many tokens as the model actually produces up to the limit.&lt;/p&gt;
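&lt;p&gt;The reasoning budget can be sketched with plain integer arithmetic. This is an illustration only: &lt;code&gt;AnswerBudget&lt;/code&gt; is a hypothetical helper, not part of the library, and the 500-token reasoning figure is simply the upper end of the range quoted above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;using System;

// Hypothetical helper (not a library call): tokens left for the visible
// answer after the hidden reasoning block has used part of MaxTokens.
static int AnswerBudget(int maxTokens, int reasoningTokens)
{
    return Math.Max(0, maxTokens - reasoningTokens);
}

Console.WriteLine(AnswerBudget(512, 500));   // 12: almost nothing left for the answer
Console.WriteLine(AnswerBudget(2048, 500));  // 1548: comfortable headroom
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;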
&lt;h2 id=&#34;when-to-change-it&#34;&gt;When to change it&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Classifications, yes/no&lt;/td&gt;
&lt;td&gt;&lt;code&gt;128&lt;/code&gt; – &lt;code&gt;256&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversational chat&lt;/td&gt;
&lt;td&gt;&lt;code&gt;512&lt;/code&gt; – &lt;code&gt;1024&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;General-purpose (default)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2048&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reasoning models&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1024&lt;/code&gt; – &lt;code&gt;4096&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Essays, code, long-form&lt;/td&gt;
&lt;td&gt;&lt;code&gt;4096&lt;/code&gt; – &lt;code&gt;8192&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;kt&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Qwen25Preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;();&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;MaxTokens&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;1024&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;
&lt;span class=&#34;c1&#34;&gt;// Generous cap for conversational output.
&lt;/span&gt;&lt;span class=&#34;c1&#34;&gt;&lt;/span&gt;
&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;api&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AsposeLLMApi&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;Create&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For DeepSeek-R1:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;kt&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;DeepseekR1Qwen3Preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;();&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;MaxTokens&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;m&#34;&gt;2048&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt; &lt;span class=&#34;c1&#34;&gt;// room for &amp;lt;think&amp;gt; + answer
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;interactions&#34;&gt;Interactions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextSize&lt;/code&gt;&lt;/a&gt; — input plus output plus history must fit; a high &lt;code&gt;MaxTokens&lt;/code&gt; leaves less room for input.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/cache-cleanup-strategy/&#34;&gt;&lt;code&gt;CacheCleanupStrategy&lt;/code&gt;&lt;/a&gt; — trims history as output approaches the context cap.&lt;/li&gt;
&lt;/ul&gt;
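&lt;p&gt;The fit constraint in the &lt;code&gt;ContextSize&lt;/code&gt; item above can be checked with a rough calculation. Every figure below is an assumption chosen for illustration, including the 8192-token window; real counts depend on the model&amp;rsquo;s tokenizer, and the sketch reserves the full &lt;code&gt;MaxTokens&lt;/code&gt; cap as a worst case.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;using System;

// All values are illustrative assumptions, not library defaults,
// except maxTokens, which mirrors the documented default of 2048.
int contextSize = 8192;         // assumed ContextParameters.ContextSize
int systemPromptTokens = 150;   // assumed system prompt length
int historyTokens = 400;        // assumed pre-seeded or earlier turns
int maxTokens = 2048;           // ChatParameters.MaxTokens

// Whatever remains is the room available for the next user input.
int roomForInput = contextSize - systemPromptTokens - historyTokens - maxTokens;
Console.WriteLine(roomForInput); // 5594
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;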
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/&#34;&gt;Chat parameters hub&lt;/a&gt; — all chat knobs.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/troubleshooting/garbled-output/&#34;&gt;Garbled output troubleshooting&lt;/a&gt; — truncation symptoms.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/how-to/tune-for-speed-vs-quality/&#34;&gt;Tune for speed vs quality&lt;/a&gt; — response length trade-offs.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Net: CacheCleanupStrategy</title>
      <link>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/cache-cleanup-strategy/</link>
      <pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate>
      
      <guid>https://docs.aspose.com/llm/net/developer-reference/parameters/chat/cache-cleanup-strategy/</guid>
      <description>
        
        
        &lt;p&gt;&lt;code&gt;CacheCleanupStrategy&lt;/code&gt; is the policy the engine applies when a session&amp;rsquo;s KV cache would overflow &lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextSize&lt;/code&gt;&lt;/a&gt;. Five named strategies are available, each keeping a different subset of the history.&lt;/p&gt;
&lt;h2 id=&#34;quick-reference&#34;&gt;Quick reference&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CacheCleanupStrategy&lt;/code&gt; enum&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Default&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RemoveOldestMessages&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Values&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5 — see table below&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Category&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chat session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field on&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ChatParameters.CacheCleanupStrategy&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;what-it-does&#34;&gt;What it does&lt;/h2&gt;
&lt;p&gt;When the next generation step would exceed the context window, the engine trims the cache per the active strategy, freeing tokens before continuing. The policy is applied automatically during generation and can also be triggered explicitly via &lt;code&gt;AsposeLLMApi.ForceCacheCleanup(strategy)&lt;/code&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Keeps&lt;/th&gt;
&lt;th&gt;Typical use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RemoveOldestMessages&lt;/code&gt; (default)&lt;/td&gt;
&lt;td&gt;System prompt + most recent turns&lt;/td&gt;
&lt;td&gt;General-purpose; preserves recency.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptOnly&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System prompt only&lt;/td&gt;
&lt;td&gt;Hard reset of session context.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndHalf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System prompt + newer half of history&lt;/td&gt;
&lt;td&gt;Balanced recall and room for new turns.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndFirstUserMessage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System prompt + first user turn&lt;/td&gt;
&lt;td&gt;Recall-heavy tasks where the original ask matters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndLastUserMessage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System prompt + most recent user turn&lt;/td&gt;
&lt;td&gt;Focus on the current question; drop everything before it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
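&lt;p&gt;The table above can be read as a selection over the session history. The sketch below models that selection on a plain string array; it is not the library&amp;rsquo;s implementation. &lt;code&gt;Keep&lt;/code&gt; and the role-prefixed strings are hypothetical, the split point for the half strategy is an assumption, and &lt;code&gt;RemoveOldestMessages&lt;/code&gt; is left out because how many turns it drops depends on how many tokens must be freed.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;using System;
using System.Linq;

// Illustrative model only. The system prompt is not part of this array
// because every strategy in the table keeps it.
string[] history = { &amp;#34;user: Q1&amp;#34;, &amp;#34;assistant: A1&amp;#34;, &amp;#34;user: Q2&amp;#34;, &amp;#34;assistant: A2&amp;#34; };

// Hypothetical helper: returns the history entries that survive cleanup.
string[] Keep(string strategy)
{
    var userTurns = history.Where(m =&amp;gt; m.StartsWith(&amp;#34;user:&amp;#34;));
    switch (strategy)
    {
        case &amp;#34;KeepSystemPromptOnly&amp;#34;:
            return new string[0];
        case &amp;#34;KeepSystemPromptAndHalf&amp;#34;:
            // Assumption: the newer half starts at Length / 2.
            return history.Skip(history.Length / 2).ToArray();
        case &amp;#34;KeepSystemPromptAndFirstUserMessage&amp;#34;:
            return userTurns.Take(1).ToArray();
        case &amp;#34;KeepSystemPromptAndLastUserMessage&amp;#34;:
            return userTurns.TakeLast(1).ToArray();
        default:
            throw new ArgumentException(&amp;#34;Not modelled: &amp;#34; + strategy);
    }
}

Console.WriteLine(string.Join(&amp;#34;, &amp;#34;, Keep(&amp;#34;KeepSystemPromptAndHalf&amp;#34;)));             // user: Q2, assistant: A2
Console.WriteLine(string.Join(&amp;#34;, &amp;#34;, Keep(&amp;#34;KeepSystemPromptAndFirstUserMessage&amp;#34;))); // user: Q1
Console.WriteLine(string.Join(&amp;#34;, &amp;#34;, Keep(&amp;#34;KeepSystemPromptAndLastUserMessage&amp;#34;)));  // user: Q2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;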
&lt;h2 id=&#34;when-to-change-it&#34;&gt;When to change it&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default conversational chat&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RemoveOldestMessages&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anchored on a big original ask (debugging, iterative refinement)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndFirstUserMessage&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sequence of independent Q&amp;amp;A&lt;/td&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndLastUserMessage&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard reset when switching topics&lt;/td&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptOnly&lt;/code&gt; via &lt;code&gt;ForceCacheCleanup&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long dialogues with gradual trimming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;KeepSystemPromptAndHalf&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;Aspose.LLM.Abstractions.Models&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;

&lt;span class=&#34;kt&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;new&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Qwen25Preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;();&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;SystemPrompt&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;
    &lt;span class=&#34;s&#34;&gt;&amp;#34;You are a careful analyst. Always ground your answer in the user&amp;#39;s original ask.&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;
&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ChatParameters&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CacheCleanupStrategy&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt;
    &lt;span class=&#34;n&#34;&gt;CacheCleanupStrategy&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;KeepSystemPromptAndFirstUserMessage&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;;&lt;/span&gt;

&lt;span class=&#34;k&#34;&gt;using&lt;/span&gt; &lt;span class=&#34;nn&#34;&gt;var&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;api&lt;/span&gt; &lt;span class=&#34;p&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;AsposeLLMApi&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;Create&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;preset&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt;
&lt;span class=&#34;c1&#34;&gt;// Even after 50 follow-ups, the model can still refer back to the first user turn.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Force a reset mid-session:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-csharp&#34; data-lang=&#34;csharp&#34;&gt;&lt;span class=&#34;n&#34;&gt;api&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;ForceCacheCleanup&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;CacheCleanupStrategy&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;KeepSystemPromptOnly&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;interactions&#34;&gt;Interactions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/system-prompt/&#34;&gt;&lt;code&gt;SystemPrompt&lt;/code&gt;&lt;/a&gt; — all strategies preserve it.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/context-size/&#34;&gt;&lt;code&gt;ContextSize&lt;/code&gt;&lt;/a&gt; — the token ceiling whose overflow triggers cleanup.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/context/defrag-threshold/&#34;&gt;&lt;code&gt;DefragThreshold&lt;/code&gt;&lt;/a&gt; — compacts holes left behind by cleanup.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AsposeLLMApi.ForceCacheCleanup(strategy)&lt;/code&gt; — manual trigger with an override strategy.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/cache-management/&#34;&gt;Cache management&lt;/a&gt; — full guide with practical patterns.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/use-cases/multi-turn-chat/&#34;&gt;Multi-turn chat use case&lt;/a&gt; — cache management in practice.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aspose.com/llm/net/developer-reference/parameters/chat/&#34;&gt;Chat parameters hub&lt;/a&gt; — all chat knobs.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
  </channel>
</rss>
