Dependency injection
Aspose.LLM for .NET provides an extension method for IServiceCollection that registers the engine and its dependencies in a Microsoft.Extensions.DependencyInjection container. This is the idiomatic entry point for ASP.NET Core, Worker Service, and any application that wires services through DI.
The AsposeLLMApi facade and the DI path are two ways to reach the same engine. Use whichever suits your application shape:
- Facade — simple console apps, CLI tools, scripts. One object to instantiate and dispose.
- DI — ASP.NET Core, hosted services, Workers. Engine and its dependencies become DI-resolvable singletons.
Method signature
```csharp
namespace Aspose.LLM.Core.DependencyInjection;

public static class Configuration
{
    public static IServiceCollection AddLlamaServices(
        this IServiceCollection services,
        PresetCoreBase preset,
        Action<PresetCoreBase>? configure = null);
}
```
Arguments:
- `services` — the DI container's service collection.
- `preset` — the preset that supplies engine, model, context, sampler, and binary settings.
- `configure` — optional callback to tweak the preset programmatically before registration.
What gets registered
AddLlamaServices registers these services as singletons:
| Service | Purpose |
|---|---|
| `PresetCoreBase` | The preset itself, resolvable by type or subtype. |
| `EngineParameters` | Extracted from the preset. |
| `BinaryManagerParameters` | Extracted from the preset. |
| `LocalFilesystemProvider` | Resolves local-path models. |
| `HuggingFaceProvider` | Downloads from Hugging Face. |
| `ModelManager` | Resolves and loads models. |
| `BinaryManager` | Deploys native binaries. |
| `Engine` | The core inference engine. |
| `NativeLoggerAdapter` | Bridges llama.cpp native logs to ILogger. |
| `ILoggerFactory` & friends | Console + file logging, level controlled by EnableDebugLogging. |
Engine is constructed with presetToLoad: preset, which means the model is loaded synchronously the first time Engine is resolved — same as the facade’s Create.
Logging is configured automatically:
- A console logger provider.
- A file logger provider that writes to `EngineParameters.LogDirectoryPath`.
- Level `Debug` when `EngineParameters.EnableDebugLogging = true`; otherwise `None`.
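As a minimal sketch of how these registrations behave, the container can be built and inspected directly, outside any host. The namespaces are taken from the examples on this page; note that the first `Engine` resolution is the expensive one:

```csharp
using Aspose.LLM.Abstractions.Parameters.Presets;
using Aspose.LLM.Core.DependencyInjection;
using Aspose.LLM.Core.Services;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddLlamaServices(new Qwen25Preset());

using var provider = services.BuildServiceProvider();

// Cheap: the preset and parameter objects are plain singletons.
var preset = provider.GetRequiredService<PresetCoreBase>();

// Expensive: the first Engine resolve loads the model synchronously.
var engine = provider.GetRequiredService<Engine>();
```

Because everything is registered as a singleton, repeated `GetRequiredService` calls return the same instances; only the very first `Engine` resolve pays the model-load cost.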
Minimal ASP.NET Core setup
```csharp
using Aspose.LLM;
using Aspose.LLM.Abstractions.Parameters.Presets;
using Aspose.LLM.Core.DependencyInjection;
using Aspose.LLM.Core.Services;

var builder = WebApplication.CreateBuilder(args);

// Apply license once during startup.
var license = new Aspose.LLM.License();
license.SetLicense("Aspose.LLM.lic");

// Register LLM services.
builder.Services.AddLlamaServices(new Qwen25Preset());

var app = builder.Build();

app.MapPost("/chat", async (string message, Engine engine, CancellationToken ct) =>
{
    string sessionId = await engine.InitiateNewSession(
        preset: app.Services.GetRequiredService<PresetCoreBase>());
    return await engine.GetChatSessionResponse(sessionId, message, null, ct);
});

app.Run();
```
Resolve Engine — or any registered service — via constructor injection or app.Services.GetRequiredService<T>().
Customizing the preset at registration
Pass a configure callback to adjust the preset before it is stored:
```csharp
builder.Services.AddLlamaServices(new Qwen25Preset(), preset =>
{
    preset.ContextParameters.ContextSize = 16384;
    preset.SamplerParameters.Temperature = 0.3f;
    preset.ChatParameters.SystemPrompt = "You are a concise enterprise assistant.";
    preset.BaseModelInferenceParameters.GpuLayers = 999;
});
```
The callback runs once, before the preset is registered as a singleton. Mutations inside the callback are visible to every resolved service.
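Because the callback runs before the preset is stored, it is also a natural place to pull values from host configuration. A hedged sketch, assuming an "Llm" section in appsettings.json; the key names are illustrative, not part of Aspose.LLM:

```csharp
// Sketch: drive preset values from host configuration.
// "Llm:ContextSize" and "Llm:Temperature" are hypothetical keys;
// the second argument to GetValue is the fallback default.
builder.Services.AddLlamaServices(new Qwen25Preset(), preset =>
{
    preset.ContextParameters.ContextSize =
        builder.Configuration.GetValue("Llm:ContextSize", 16384);
    preset.SamplerParameters.Temperature =
        builder.Configuration.GetValue("Llm:Temperature", 0.3f);
});
```

This keeps environment-specific tuning (context size, temperature) out of code while the preset itself stays a compile-time choice.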
Worker Service example
```csharp
using Aspose.LLM;
using Aspose.LLM.Abstractions.Parameters.Presets;
using Aspose.LLM.Core.DependencyInjection;
using Aspose.LLM.Core.Services;
using Microsoft.Extensions.Hosting;

await Host.CreateDefaultBuilder(args)
    .ConfigureServices((ctx, services) =>
    {
        var license = new Aspose.LLM.License();
        license.SetLicense("Aspose.LLM.lic");

        services.AddLlamaServices(new Qwen25Preset());
        services.AddHostedService<ChatWorker>();
    })
    .Build()
    .RunAsync();

public class ChatWorker : BackgroundService
{
    private readonly Engine _engine;
    private readonly PresetCoreBase _preset;

    public ChatWorker(Engine engine, PresetCoreBase preset)
    {
        _engine = engine;
        _preset = preset;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        string sessionId = await _engine.InitiateNewSession(preset: _preset);

        while (!stoppingToken.IsCancellationRequested)
        {
            string reply = await _engine.GetChatSessionResponse(
                sessionId,
                "Summarize the latest event queue.",
                null,
                stoppingToken);

            // ... do something with reply

            await Task.Delay(TimeSpan.FromMinutes(1), stoppingToken);
        }
    }
}
```
Constraints that still apply
- **Single instance per process.** The underlying `Engine` shares the single-instance guard with `AsposeLLMApi`. Registering `AddLlamaServices` in a web host and also creating an `AsposeLLMApi` on the side throws. Pick one entry point.
- **License is still required before chat methods.** Apply the license before any request handler calls `Engine` methods.
- **Model loads on first resolve.** The first request hitting `Engine` can take minutes on a cold machine. Consider eager resolution at startup:

  ```csharp
  app.Services.GetRequiredService<Engine>(); // trigger first-time model load
  ```

  Or warm up in a hosted service so the first user request is not slow.
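The hosted-service warm-up can be sketched as follows. The `ModelWarmupService` name is ours, not part of the library; the only Aspose.LLM assumption is that resolving `Engine` triggers the synchronous model load, as described above:

```csharp
using Aspose.LLM.Core.Services;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical warm-up service; hosted services start before the
// web server begins accepting requests, so the model load happens
// here rather than on the first user request.
public sealed class ModelWarmupService : IHostedService
{
    private readonly IServiceProvider _provider;

    public ModelWarmupService(IServiceProvider provider) => _provider = provider;

    public Task StartAsync(CancellationToken cancellationToken)
    {
        // First Engine resolve loads the model synchronously.
        _ = _provider.GetRequiredService<Engine>();
        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}

// Registration, after AddLlamaServices:
// builder.Services.AddHostedService<ModelWarmupService>();
```

Note that a slow `StartAsync` delays host startup; that is the intended trade-off here, exchanging startup time for a fast first request.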
What’s next
- AsposeLLMApi facade — the simpler non-DI entry point.
- Engine parameters — logging and threading defaults applied via `AddLlamaServices`.
- Licensing — license application in host startup.