Product overview
Aspose.LLM for .NET is a library for integrating large language models into your .NET applications. You run models on your own infrastructure — on-premise or in a controlled cloud environment — and interact with them through a managed API: create an instance from a preset, start chat sessions, send messages, and optionally save or load conversation state.
This section covers what the SDK does, how it is built, and what it supports.
Sections
- Architecture — four-layer design (facade, engine, P/Invoke, native), runtime flow on first `Create`, memory footprint, and lifecycle.
- Features — capabilities in detail, plus explicit scope limits (no streaming, no function calling, no fine-tuning, no audio).
- Supported presets — built-in text and vision presets with their Hugging Face model sources and default parameters.
- Supported acceleration — CUDA, HIP, Metal, Vulkan, CPU backends with platform × backend matrix and first-run download sizes.
At a glance
- Preset-based setup — built-in presets for Qwen 2.5 / Qwen 3, Gemma 3, Llama 3.2, Phi 4, DeepSeek, and gpt-oss-20b. Extend `PresetCoreBase` to bring your own GGUF model.
- Chat sessions — create sessions with `StartNewChatAsync`, send messages with `SendMessageAsync` or `SendMessageToSessionAsync`, and maintain multi-turn conversations per session.
- Session persistence — `SaveChatSession` and `LoadChatSession` serialize a session to disk and restore it later.
- Optional multimodal input — pass images (JPEG, PNG, BMP, GIF, WebP; up to 50 MB each) alongside prompts when using a vision preset.
- Hardware acceleration — CUDA, HIP, Metal, Vulkan, or CPU with AVX2 / AVX512. Native binaries download automatically on first use.
- Single instance per process — one `AsposeLLMApi` instance at a time. Create it once and reuse it for all sessions.
- Licensing — apply a commercial license via `License.SetLicense`; check status with `License.IsLicensed`. A free temporary license is available for evaluation and proof-of-concept work. Inference requires an applied license — the SDK does not run chat APIs in evaluation mode.
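The image limits listed for multimodal input can be checked on the caller's side before a file is handed to a vision preset. The following is a minimal sketch and not part of the SDK: `ImageInputCheck` and its members are hypothetical names, the format check looks at the file extension only, and "50 MB" is assumed to mean 50 × 1024 × 1024 bytes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of Aspose.LLM: pre-checks an image against
// the formats and size limit listed in this overview.
public static class ImageInputCheck
{
    // Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes.
    public const long MaxImageBytes = 50L * 1024 * 1024;

    // Assumption: the format is judged by file extension only (no content sniffing).
    private static readonly HashSet<string> AllowedExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"
        };

    // Returns true when the name has a listed extension and the size is
    // positive and within the limit.
    public static bool IsAcceptable(string fileName, long sizeInBytes)
    {
        if (string.IsNullOrEmpty(fileName) || sizeInBytes <= 0)
        {
            return false;
        }

        string extension = Path.GetExtension(fileName);
        return AllowedExtensions.Contains(extension) && sizeInBytes <= MaxImageBytes;
    }

    // Convenience overload for a file on disk.
    public static bool IsAcceptable(FileInfo file)
    {
        return file.Exists && IsAcceptable(file.Name, file.Length);
    }
}
```

A caller might run `ImageInputCheck.IsAcceptable(new FileInfo(path))` for each attachment and skip or report files that fail. The SDK may apply its own validation as well, so treat this as an early filter rather than a guarantee that an image will be accepted.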
What’s next
- Architecture — layered design and runtime flow.
- Features — full capability list and scope limits.
- Supported presets — pick a preset for your model and hardware.
- Supported acceleration — platform / backend matrix.
- Getting started — install, license, and run the first example.