Troubleshooting

When the SDK misbehaves, the failure usually falls into one of seven well-known categories. Each category has its own page, and every page follows the same structure: Symptom (what you see), Cause (what is happening), Resolution (how to fix it), and, optionally, Prevention.

Pick the page that matches your symptom from the topics list. If none matches, use the diagnostic flow below to narrow the problem, then check the closest page, then ask for help.

Pre-flight checks

Before diving into a specific page, confirm the basics. Most support tickets turn out to be caused by one of these:

  • License applied: confirm that Aspose.LLM.License.IsLicensed returns true before any chat method is called. The SDK does not run inference in evaluation mode; see License errors.
  • Debug logging on: set EngineParameters.EnableDebugLogging = true and pass an ILogger to AsposeLLMApi.Create(preset, logger). Native tagged lines ([MM], [CTX], [KV]) reveal where a failure happens. See Logging and diagnostics.
  • Known good preset: reproduce with a built-in preset like Qwen25Preset before suspecting the SDK. Custom presets or manual overrides are the most common source of garbled output.
  • Minimal repro: strip down to the smallest possible snippet that fails. If the minimal snippet passes, the problem is in your integration, not the SDK.
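
Once debug logging is on, the tagged native lines are often easier to read after a quick filter. The following is a minimal sketch that counts lines per tag and returns the last tagged line before a failure. It uses only the .NET base class library; the class name NativeLogTags and the assumption that a tag appears verbatim somewhere in the log line are illustrative and not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class NativeLogTags
{
    // Tags named in the pre-flight checklist; extend the array if your logs contain others.
    private static readonly string[] Tags = { "[MM]", "[CTX]", "[KV]" };

    // Counts how many log lines contain each known tag.
    public static Dictionary<string, int> CountByTag(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (string tag in Tags)
        {
            counts[tag] = 0;
        }

        foreach (string line in lines)
        {
            foreach (string tag in Tags)
            {
                if (line.Contains(tag, StringComparison.Ordinal))
                {
                    counts[tag]++;
                }
            }
        }

        return counts;
    }

    // Returns the last line that carries any known tag, or an empty string if there is none.
    public static string LastTaggedLine(IEnumerable<string> lines)
    {
        string last = string.Empty;
        foreach (string line in lines)
        {
            foreach (string tag in Tags)
            {
                if (line.Contains(tag, StringComparison.Ordinal))
                {
                    last = line;
                    break;
                }
            }
        }

        return last;
    }
}
```

Feeding it File.ReadLines on a captured log file is usually enough; the last tagged line is a reasonable first hint at which native stage was active when the failure occurred.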

Diagnostic flow

Walk this decision tree when the symptom is not obvious.

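  1. Does Create return at all?

    • HttpRequestException → see Binary download fails.
    • InvalidOperationException during model load → see Model not loading.
  2. Does the first chat call throw?

    • Not licensed for this method → see License errors.
    • Out-of-memory → see Out of memory.
    • Other → capture the full stack trace and open a support ticket.
  3. Does chat return, but slowly?

    • Inference runs at CPU speed despite a GPU present → see GPU not detected.
    • High first-token latency or low throughput → see Performance issues.
  4. Does chat return, but output is wrong?

    • Nonsense / literal marker tokens / repetition loops → see Garbled output.
    • Truncated mid-sentence → raise ChatParameters.MaxTokens; see Chat parameters.
  5. Does memory grow across long sessions?

    • KV cache growth → see Out of memory.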

Symptom → page shortcut

Symptom                                               Start here
HttpRequestException during Create                    Binary download fails
InvalidOperationException during model load           Model not loading
cudaErrorOutOfMemory, OutOfMemoryException            Out of memory
Inference runs at CPU speed despite a GPU present     GPU not detected
Replies contain <image>, <|im_start|>, etc. verbatim  Garbled output
Output loops or repeats                               Garbled output
Not licensed for this method                          License errors
Unexpected high first-token latency                   Performance issues
Throughput well below hardware expectation            Performance issues
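
The exception rows of this table can also be applied in code, for example in a top-level catch block that logs which page to read first. The sketch below is one possible mapping, not an SDK feature: the SdkPhase enum, the TroubleshootingTriage class, and the idea of matching on message text are assumptions made for illustration. Only the exception types and message fragments come from the table.

```csharp
using System;
using System.Net.Http;

// Hypothetical marker for where the exception was caught; not an SDK type.
public enum SdkPhase { Create, ModelLoad, Chat }

public static class TroubleshootingTriage
{
    // Returns the name of the troubleshooting page to start with.
    // Walks the inner-exception chain so wrapped failures are still matched.
    public static string SuggestPage(Exception ex, SdkPhase phase)
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            string message = e.Message;

            if (message.Contains("Not licensed for this method", StringComparison.OrdinalIgnoreCase))
            {
                return "License errors";
            }

            if (e is OutOfMemoryException ||
                message.Contains("cudaErrorOutOfMemory", StringComparison.OrdinalIgnoreCase))
            {
                return "Out of memory";
            }

            if (e is HttpRequestException && phase == SdkPhase.Create)
            {
                return "Binary download fails";
            }

            if (e is InvalidOperationException && phase == SdkPhase.ModelLoad)
            {
                return "Model not loading";
            }
        }

        // No row matched: capture the full stack trace and ask for help.
        return "Asking for help";
    }
}
```

The license check runs first because a licensing failure could surface as a general-purpose exception type; the message text is the more specific signal. Symptoms without an exception (slow inference, garbled replies) still need the table itself.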

Topics

  • Binary download fails: BinaryManager cannot reach GitHub, TLS interception, disk space.
  • Out of memory: GPU VRAM, system RAM, KV cache growth.
  • GPU not detected: driver, CUDA version, PreferredAcceleration, container flags.
  • Model not loading: corrupt GGUF, unsupported architecture, wrong file name.
  • Garbled output: template mismatch, repetition loops, truncation, vision misalignment.
  • License errors: missing SetLicense, expired temporary license, embedded resource mis-naming.
  • Performance issues: low throughput, latency spikes, thread contention, thermal throttling.

Asking for help

When none of the above match, or the fix does not stick, open a thread on the Aspose Support Forum with:

  • SDK version (Aspose.LLM NuGet version).
  • Host OS and architecture.
  • GPU model and driver version (if applicable).
  • Preset class name and any overrides you applied.
  • Full log from a reproducing run with EnableDebugLogging = true.
  • A minimal code sample that reproduces the issue.
  • The expected versus actual output.
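
Part of this checklist can be gathered programmatically. The sketch below builds the environment portion of a report with the .NET base class library only. The SupportReport class is hypothetical, and looking up a loaded assembly named "Aspose.LLM" is an assumption (the assembly name may differ from the NuGet package name, and the assembly version may differ from the package version). GPU model, driver version, logs, and the repro sample still have to be added by hand.

```csharp
using System;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;

public static class SupportReport
{
    // Builds a plain-text summary of the host environment for a support thread.
    // assemblyName is assumed to match the SDK assembly; adjust it if yours differs.
    public static string Build(string presetName, string assemblyName = "Aspose.LLM")
    {
        // Find the SDK assembly among those already loaded into the process, if any.
        var sdkAssembly = AppDomain.CurrentDomain.GetAssemblies()
            .FirstOrDefault(a => string.Equals(
                a.GetName().Name, assemblyName, StringComparison.OrdinalIgnoreCase));

        string sdkVersion = sdkAssembly?.GetName().Version?.ToString() ?? "not loaded";

        var report = new StringBuilder();
        report.AppendLine($"SDK assembly version: {sdkVersion}");
        report.AppendLine($"Host OS: {RuntimeInformation.OSDescription}");
        report.AppendLine($"OS architecture: {RuntimeInformation.OSArchitecture}");
        report.AppendLine($"Process architecture: {RuntimeInformation.ProcessArchitecture}");
        report.AppendLine($".NET runtime: {RuntimeInformation.FrameworkDescription}");
        report.AppendLine($"Preset: {presetName}");
        return report.ToString();
    }
}
```

Call SupportReport.Build("Qwen25Preset") after the SDK has been used at least once, so that its assembly is loaded, and paste the result at the top of the forum post.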

For paid support, use the Aspose Helpdesk.
