Chat templates

Every vision model family has its own prompt format — a specific way to mark where images are inserted and how text turns are wrapped. Aspose.LLM for .NET ships eight templates and selects the right one automatically from the model’s GGUF metadata at load time.

You do not configure templates yourself in the current release. The selection is automatic and internal. This page exists so you can identify which templates are supported and recognize the model families behind them.

Supported templates

Template	Model family	Typical models
LLaVA	LLaVA-style vision fine-tunes	LLaVA-1.5, LLaVA-1.6, various community LLaVAs
Qwen2VL	Qwen 2 / 2.5 VL	`Qwen2-VL-*`, `Qwen2.5-VL-*`
Qwen3VL	Qwen 3 VL	`Qwen3-VL-*`
Pixtral	Mistral Pixtral	`Pixtral-12B` and derivatives
InternVL	InternVL series	`InternVL2-*`, `InternVL3-*`
Gemma3	Gemma 3 Vision	`gemma-3-*-vision` variants
Llama4	Llama 4 Vision	`Llama-4-*-Vision`
MiniCPMV	MiniCPM-V	`MiniCPM-V-*`

How selection works

When the engine loads a vision model, it:

1. Reads the model’s metadata (architecture name, base model name, and vision-specific keys) from the GGUF file.
2. Matches those values against the template dispatch table in `Aspose.LLM.Interop.Multimodal.VisualModelChatTemplates`.
3. Picks the template whose marker tokens and turn format match the detected model family.

If no template matches, the engine falls back to the default text template. Vision turns then emit a generic marker that the model may or may not recognize — output quality degrades.
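The lookup-with-fallback idea can be sketched in a few lines. This is an illustration only, not the engine's code: the dispatch entries, the `SelectTemplate` helper, and the `"Default"` name are all hypothetical, and the real table inside `VisualModelChatTemplates` is internal and may match on different keys or values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatch table: a substring expected in a metadata value,
// paired with a template name from the "Supported templates" table.
// The substrings are illustrative assumptions, not the engine's actual rules.
var dispatch = new (string Marker, string Template)[]
{
    ("qwen3vl", "Qwen3VL"),
    ("qwen2vl", "Qwen2VL"),
    ("pixtral", "Pixtral"),
    ("internvl", "InternVL"),
    ("gemma3", "Gemma3"),
    ("llama4", "Llama4"),
    ("minicpm", "MiniCPMV"),
    ("llava", "LLaVA"),
};

// Returns the first template whose marker occurs in any metadata value,
// or "Default" (standing in for the text template fallback) when nothing matches.
string SelectTemplate(IReadOnlyDictionary<string, string> metadata)
{
    foreach (var (marker, template) in dispatch)
    {
        foreach (var value in metadata.Values)
        {
            if (value.Contains(marker, StringComparison.OrdinalIgnoreCase))
                return template;
        }
    }
    return "Default";
}

// Example metadata; the values are made up for this sketch.
var qwen = new Dictionary<string, string>
{
    ["general.architecture"] = "qwen3vl",
    ["general.basename"] = "Qwen3-VL",
};

Console.WriteLine(SelectTemplate(qwen));
```

A model whose metadata matches none of the entries lands on the fallback branch, which corresponds to the degraded-output case described above.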

Built-in preset → template mapping

Each built-in vision preset is already tested against its matching template:

Preset	Template auto-selected
`Qwen25VL3BPreset`	Qwen2VL
`Qwen3VL2BPreset`	Qwen3VL
`Gemma3VisionPreset`	Gemma3
`Ministral3VisionPreset`	Pixtral (Mistral vision models share the Pixtral template)

If you extend one of these presets or use a custom GGUF from the same family, template detection still works.

Custom vision models

If you build a custom preset pointing at a non-Aspose GGUF from one of the eight supported families, template auto-selection should work out of the box — the engine relies on metadata keys that most upstream GGUF conversions preserve.

If selection fails, the response is garbled or contains literal marker tokens such as `<image>`. This indicates that the model’s metadata is missing the expected keys. Options:

- Pick a different GGUF export of the same model with richer metadata.
- File a support request via the Aspose Support Forum with the model’s Hugging Face URL so the team can add detection for that specific export.

The current SDK version does not expose a public override for forcing a specific template. A future release may add that surface.
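When testing a custom GGUF, a simple heuristic is to scan the model's responses for markers that should never reach the output. The sketch below assumes you already have the response as a string; `FindLeakedMarkers` is a hypothetical helper, not part of the SDK, and every marker other than `<image>` is an assumption to replace with the markers of the family you are testing.

```csharp
using System;
using System.Linq;

// "<image>" comes from the text above. The other entries are assumptions
// added for illustration; adjust them to the model family under test.
string[] knownMarkers = { "<image>", "<|vision_start|>", "<|image_pad|>" };

// Returns the markers that appear verbatim in the model's response.
string[] FindLeakedMarkers(string response) =>
    knownMarkers
        .Where(m => response.Contains(m, StringComparison.Ordinal))
        .ToArray();

var leaked = FindLeakedMarkers("The picture <image> shows a cat.");
Console.WriteLine(leaked.Length > 0
    ? "Possible template mismatch: " + string.Join(", ", leaked)
    : "No literal markers found.");
```

A hit is only a hint: a response can legitimately mention such a token, so treat this as a prompt to check the debug log rather than as proof of a mismatch.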

Checking which template was picked

Enable debug logging and look for the [MM] tags:

preset.EngineParameters.EnableDebugLogging = true;

Lines like `[MM] selected template: Qwen3VL` appear shortly after model load. See Debugging vision for the full log taxonomy.
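If you capture the debug output as text, the template name can be extracted automatically, for example in an integration test. The following is a minimal sketch that assumes the line shape shown above; `FindSelectedTemplate` is a hypothetical helper, and how you capture the log lines depends on your logging setup.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Line shape assumed from the example above: "[MM] selected template: <Name>".
var templateLine = new Regex(@"\[MM\]\s+selected template:\s*(\S+)");

// Returns the template name from the first matching line, or null if none matches.
string? FindSelectedTemplate(IEnumerable<string> logLines)
{
    foreach (var line in logLines)
    {
        var match = templateLine.Match(line);
        if (match.Success)
            return match.Groups[1].Value;
    }
    return null;
}

// Hypothetical captured output; only the shape of the second line is documented.
var captured = new[]
{
    "[MM] loading vision model",
    "[MM] selected template: Qwen3VL",
};

Console.WriteLine(FindSelectedTemplate(captured) ?? "no template line found");
```

A `null` result after model load suggests that either debug logging is off or no vision template was selected.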

What’s next

Vision presets — built-in presets with their matching templates.
Debugging vision — inspect the selected template in logs.
Attaching images — the sending side.
