Supported acceleration

Aspose.LLM for .NET ships native llama.cpp binaries for five acceleration backends. On the first call to AsposeLLMApi.Create, BinaryManager picks the variant that matches your platform and downloads it from GitHub. You can also force a specific backend via BinaryManagerParameters.PreferredAcceleration.

This page is an at-a-glance reference for planning deployments. For per-backend setup (drivers, GpuLayers, multi-GPU split), see the detail pages.

Platform × backend matrix

Backend	Windows	Linux	macOS	Typical use
CUDA	✅	✅	❌	NVIDIA GPUs. Highest throughput on modern NVIDIA cards.
HIP / ROCm	❌	✅	❌	AMD Instinct and RDNA 3 Radeon cards.
Metal	❌	❌	✅	Apple Silicon (M1/M2/M3/M4).
Vulkan	✅	✅	❌	Cross-vendor GPU fallback (NVIDIA, AMD, Intel). The GPU option for AMD cards on Windows.
CPU	✅	✅	✅	No GPU required. AVX512/AVX2/AVX/NoAVX variants.
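
As a planning aid, the matrix above can be transcribed into a small lookup table. This is a language-neutral sketch in Python, not SDK code: the `SUPPORT` dictionary and both helper names are made up for illustration, and the data is only what the table states.

```python
# Platform support per backend, transcribed from the matrix above.
# The dictionary and helper names are illustrative, not SDK identifiers.
SUPPORT = {
    "CUDA":   {"windows", "linux"},
    "HIP":    {"linux"},
    "Metal":  {"macos"},
    "Vulkan": {"windows", "linux"},
    "CPU":    {"windows", "linux", "macos"},
}


def backends_for(os_name: str) -> list[str]:
    """Return the backends the matrix lists for this OS, in table order."""
    key = os_name.lower()
    return [backend for backend, systems in SUPPORT.items() if key in systems]


def is_supported(backend: str, os_name: str) -> bool:
    """True if the matrix lists the backend for this OS."""
    return os_name.lower() in SUPPORT.get(backend, set())
```

A check like `is_supported("HIP", "Windows")` returning `False` is a quick way to catch a forced backend that has no binary for the target platform before deployment.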

Auto-detection

When PreferredAcceleration is null (the default), the SDK checks the backends in this order and picks the first one that matches:

CUDA — if an NVIDIA GPU with driver 525+ is present.
HIP — if a ROCm-capable AMD GPU is present (Linux only).
Metal — on Apple Silicon.
Vulkan — any Vulkan-capable GPU.
CPU — highest AVX level available (AVX512 > AVX2 > AVX > NoAVX).

Override by setting BinaryManagerParameters.PreferredAcceleration. See Binary manager parameters.
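
The detection order and the override described above can be sketched as a single function. This is a minimal Python illustration of the documented priority, not the SDK's implementation: the `Host` class, its fields, and the `pick_backend` and `pick_cpu_variant` helpers are hypothetical. It also assumes a forced backend is returned as-is and does not model how the SDK reacts if that backend is unavailable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    """Hypothetical description of a machine; not an SDK type."""
    os_name: str                          # "windows", "linux", or "macos"
    nvidia_driver: Optional[int] = None   # major driver version; None if no NVIDIA GPU
    rocm_gpu: bool = False                # ROCm-capable AMD GPU present
    apple_silicon: bool = False
    vulkan_gpu: bool = False              # any Vulkan-capable GPU present
    cpu_features: frozenset = frozenset() # e.g. {"avx", "avx2"}


def pick_cpu_variant(features) -> str:
    """Highest AVX level available: AVX512 > AVX2 > AVX > NoAVX."""
    for level in ("AVX512", "AVX2", "AVX"):
        if level.lower() in features:
            return level
    return "NoAVX"


def pick_backend(host: Host, preferred: Optional[str] = None) -> str:
    """Follow the documented order; an explicit preference skips detection."""
    if preferred is not None:
        return preferred  # assumption: a forced backend is used without checks
    if host.nvidia_driver is not None and host.nvidia_driver >= 525:
        return "CUDA"
    if host.rocm_gpu and host.os_name == "linux":
        return "HIP"
    if host.apple_silicon:
        return "Metal"
    if host.vulkan_gpu:
        return "Vulkan"
    return "CPU/" + pick_cpu_variant(host.cpu_features)
```

Under these assumptions, a Linux host with an NVIDIA driver older than 525 and a Vulkan-capable GPU falls through to Vulkan, and a Windows host with an AMD GPU also lands on Vulkan because HIP is Linux-only.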

First-run download sizes

Binaries are downloaded once for each preset ReleaseTag and cached locally, so later runs with the same ReleaseTag reuse the cached copy.

Backend	Typical download
CUDA	400-800 MB
HIP	300-500 MB
Vulkan	200-400 MB
Metal	100-200 MB
CPU (AVX2/AVX512)	80-150 MB
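
For deployment planning, the size ranges above translate into a rough first-run delay. The sketch below is plain arithmetic in Python. The `DOWNLOAD_MB` table copies the ranges from this page, while the function name and the link-speed parameter are illustrative assumptions. Real transfers will also be affected by protocol overhead and server throughput, which are ignored here.

```python
# Download size ranges in MB, copied from the table above.
DOWNLOAD_MB = {
    "CUDA": (400, 800),
    "HIP": (300, 500),
    "Vulkan": (200, 400),
    "Metal": (100, 200),
    "CPU": (80, 150),
}


def download_seconds(backend: str, link_mbit_per_s: float) -> tuple[float, float]:
    """Best- and worst-case transfer time in seconds for one backend."""
    low_mb, high_mb = DOWNLOAD_MB[backend]
    mbytes_per_s = link_mbit_per_s / 8  # megabits per second -> megabytes per second
    return (low_mb / mbytes_per_s, high_mb / mbytes_per_s)
```

On a 100 Mbit/s link, `download_seconds("CUDA", 100)` gives a range of 32 to 64 seconds, which is worth budgeting for in container start-up probes or CI jobs that start from a cold cache.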

Picking a backend

Scenario	Try
NVIDIA host (dev or server)	CUDA (best throughput)
AMD host on Linux with ROCm	HIP
AMD host on Windows	Vulkan
Apple Silicon Mac	Metal (auto-detected)
Intel iGPU	Vulkan
No GPU, or an unsupported GPU	CPU
Cross-vendor deployments from one codepath	Vulkan

What’s next

CUDA, HIP / ROCm, Metal, Vulkan, CPU — per-backend setup.
Binary manager parameters — how PreferredAcceleration and BinaryPath work.
System requirements — OS / driver / memory prerequisites.
GPU not detected — common pitfalls.
