UseMemoryLocking
UseMemoryLocking asks the OS to lock the model's memory pages so they cannot be swapped out. It requires elevated privileges or a raised memlock limit.
Quick reference
| Property | Value |
|---|---|
| Type | bool? |
| Default | null (native default, usually false) |
| Category | Model loading |
| Field on | ModelInferenceParameters.UseMemoryLocking |
What it does
- true: the engine calls mlock (Linux/macOS) or VirtualLock (Windows) on the model memory. The OS will not page it out.
- false or null: no locking. The OS may page model memory out under pressure.
Paging model memory out is catastrophic for inference performance: generation can stall for seconds while the kernel pages weights back in from disk. Setting UseMemoryLocking = true prevents that.
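Cost: locking is subject to OS limits. On Linux, the user must have a sufficient RLIMIT_MEMLOCK (raise it with ulimit -l or in /etc/security/limits.conf). On Windows, VirtualLock is bounded by the process's working-set size, so locking a large model requires a correspondingly large minimum working set; the “Lock Pages in Memory” user right applies to large-page and AWE allocations, not to VirtualLock.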
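Before enabling the option on Linux, it can help to check whether the current limit is likely to be large enough. The sketch below is one way to do that in bash. It assumes `ulimit -l` reports KiB (as bash does) and treats the model file's size as a rough lower bound on what will be locked. The helper name `memlock_covers` and the `MODEL_PATH` variable are hypothetical.

```shell
# Sketch (bash): does a memlock limit cover a model file of a given size?
# memlock_covers LIMIT_KIB SIZE_BYTES
#   LIMIT_KIB  - value printed by `ulimit -l` (a number in KiB, or "unlimited")
#   SIZE_BYTES - size of the model file in bytes
# Returns 0 if the limit is at least the file size, non-zero otherwise.
memlock_covers() {
    mc_limit_kib=$1
    mc_size_bytes=$2
    if [ "$mc_limit_kib" = "unlimited" ]; then
        return 0
    fi
    # Round the file size up to whole KiB before comparing.
    mc_need_kib=$(( (mc_size_bytes + 1023) / 1024 ))
    [ "$mc_limit_kib" -ge "$mc_need_kib" ]
}

# Usage (MODEL_PATH is a hypothetical path to the model file):
#   size=$(wc -c < "$MODEL_PATH")
#   if memlock_covers "$(ulimit -l)" "$size"; then
#       echo "memlock limit looks sufficient"
#   else
#       echo "memlock limit too low; leave UseMemoryLocking unset or raise the limit"
#   fi
```

The engine may lock more than the file size alone, so a passing check is a necessary condition rather than a guarantee.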
When to change it
| Scenario | Value |
|---|---|
| Default | null (disabled) |
| Shared host under memory pressure | true (requires privilege) |
| Container without memlock capability | null (do not attempt) |
| Dedicated inference machine with ample RAM | null (unnecessary) |
Example
var preset = new Qwen25Preset();
preset.BaseModelInferenceParameters.UseMemoryLocking = true;
// The process must already hold the OS-level privilege or limit that locking requires.
On Linux, raise the limit in the same shell before launching (this succeeds only if the hard limit allows it, or when run as root):
ulimit -l unlimited
dotnet run
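For an unprivileged user, a more conservative approach is to raise only the soft limit up to the existing hard limit, which needs no extra privilege. The following is a minimal sketch of that idea; the function name `raise_memlock_to_hard` is hypothetical, and whether the resulting limit is large enough still depends on the model.

```shell
# Sketch: raise the soft RLIMIT_MEMLOCK to the hard limit for this shell
# and the processes it starts. Prints the resulting soft limit
# (KiB in bash, or "unlimited").
raise_memlock_to_hard() {
    rm_hard=$(ulimit -H -l)
    # Raising the soft limit up to the hard limit needs no extra privilege.
    ulimit -S -l "$rm_hard" || return 1
    ulimit -S -l
}

# Usage: call it in the launching shell, then start the application.
#   raise_memlock_to_hard && dotnet run
```

If the printed value is still smaller than the model, the hard limit itself has to be raised by an administrator, for example through /etc/security/limits.conf.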
Interactions
- UseMemoryMapping: with mmap on, mlock locks the mapped pages as they fault in.
- System-level configuration: mlock availability depends on OS limits.
What’s next
- UseMemoryMapping — companion load-time knob.
- Model inference hub — all inference knobs.