membacking: add private anonymous memory backend for guest RAM #2849

Draft
jstarks wants to merge 6 commits into microsoft:main from jstarks:private_mem
@jstarks jstarks commented Feb 27, 2026

Summary

Add a --private-memory mode that allocates guest RAM using private anonymous
memory (VirtualAlloc on Windows, mmap MAP_ANONYMOUS on Linux) instead of
page-file-backed shared memory sections. This is the foundation for supporting
confidential VM scenarios where guest memory must not be aliased or backed by a
named file mapping.

Motivation

Today, OpenVMM allocates guest RAM as a shared memory section (file mapping)
that gets mapped into the VMM process. This section-backed model makes it
easy to share RAM between processes, but it's fundamentally incompatible with
hardware-enforced private memory (SEV-SNP, TDX, Hyper-V VSM): once a page is
assigned to a guest as private, the host can't map it. For those scenarios we
need a memory model where the VMM manages virtual address ranges directly
using commit/decommit, without a backing file object.

Approach

Layer 1: sparse_mmap commit() / decommit()

  • Windows: VirtualAlloc2(MEM_COMMIT) / VirtualFreeEx(MEM_DECOMMIT)
  • Linux: no-op commit (pages auto-commit on fault) / madvise(MADV_DONTNEED)

Layer 2: VaMapper — private-RAM mode

  • New private_ram flag on VaMapper; when set, page_fault() commits memory
    on demand (Windows) or returns Fail since Linux auto-faults.
  • alloc_range() eagerly commits a VA range; decommit() releases it.
  • new_private_mapper() on MappingManagerClient creates a private mapper using
    the existing MAPPER_CACHE for correct single-instancing.

Layer 3: GuestMemoryBuilder — conditional allocation

  • New .private_memory(true) builder option.
  • Skips alloc_shared_memory() for guest RAM; uses the private mapper instead.
  • Eagerly commits RAM regions via alloc_range() during build.
  • Skips add_mapping() for RAM ranges (no file-backed mapping to register).

Layer 4: CLI / config wiring

  • --private-memory flag in cli_args.rs, plumbed through MemoryConfig to
    the builder.

Layer 5: Petri boot test

  • boot_private_memory test in multiarch.rs validates end-to-end boot with
    private memory enabled.

Testing

  • 6 sparse_mmap unit tests + 5 membacking unit tests, all passing on
    both Linux and Windows (x86_64-pc-windows-msvc cross-compile)
  • Runtime-tested: boots a Linux guest on KVM and WHV with --private-memory
  • cargo xtask fmt --fix and cargo doc clean

Future work

  • Balloon integration: wire decommit() to the balloon driver so deflated
    pages release physical memory (Phase 4 from the design).
  • Save/restore: persist commit state across live migration.
  • aarch64 petri test: blocked on #1798 ("petri: support pipette on linux
    direct tests on aarch64"), i.e. linux_direct_aarch64 support.

Add commit() and decommit() methods to SparseMapping for managing the
commit state of pages within an existing VA reservation.

On Windows:
- decommit() calls VirtualFreeEx with MEM_DECOMMIT to release physical
  pages while keeping the VA range reserved
- commit() calls VirtualAlloc2 with MEM_COMMIT to make pages accessible,
  working idempotently on already-committed pages

On Linux:
- decommit() calls madvise(MADV_DONTNEED) to release pages back to the
  kernel (subsequent reads return zeroes)
- commit() is a no-op since the kernel handles page faults transparently
  for MAP_ANONYMOUS regions
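
The Linux behaviour described above can be sketched without any of the PR's types. This is a minimal illustration, not the SparseMapping implementation: `AnonRegion` and its methods are hypothetical names, the constants are the Linux x86_64/aarch64 values written out by hand so that no external crate is needed, and the `unsafe extern` block assumes a Rust toolchain of 1.82 or later.

```rust
use std::ffi::{c_int, c_void};
use std::io;
use std::ptr;

// Linux constant values (x86_64 and aarch64), declared by hand for this sketch.
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const MADV_DONTNEED: c_int = 4;

unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, offset: i64) -> *mut c_void;
    fn madvise(addr: *mut c_void, len: usize, advice: c_int) -> c_int;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Hypothetical stand-in for a private anonymous VA range (not the PR's type).
pub struct AnonRegion {
    base: *mut u8,
    len: usize,
}

impl AnonRegion {
    /// Maps `len` bytes of private anonymous memory.
    pub fn new(len: usize) -> io::Result<Self> {
        // SAFETY: a fresh anonymous mapping does not alias any existing memory.
        let p = unsafe {
            mmap(ptr::null_mut(), len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)
        };
        if p as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        Ok(Self { base: p.cast(), len })
    }

    /// No-op on Linux beyond a bounds check: pages are faulted in on first touch.
    pub fn commit(&self, offset: usize, len: usize) -> io::Result<()> {
        self.check(offset, len)
    }

    /// Releases the pages; the next access sees zero-filled memory.
    /// `offset` must be page-aligned.
    pub fn decommit(&self, offset: usize, len: usize) -> io::Result<()> {
        self.check(offset, len)?;
        // SAFETY: the range was checked to lie inside the mapping.
        let rc = unsafe { madvise(self.base.add(offset).cast(), len, MADV_DONTNEED) };
        if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
    }

    fn check(&self, offset: usize, len: usize) -> io::Result<()> {
        match offset.checked_add(len) {
            Some(end) if end <= self.len => Ok(()),
            _ => Err(io::Error::new(io::ErrorKind::InvalidInput, "range outside mapping")),
        }
    }

    pub fn write(&self, offset: usize, value: u8) {
        assert!(offset < self.len);
        // SAFETY: the offset is in bounds and the mapping is readable and writable.
        unsafe { self.base.add(offset).write_volatile(value) }
    }

    pub fn read(&self, offset: usize) -> u8 {
        assert!(offset < self.len);
        // SAFETY: the offset is in bounds and the mapping is readable and writable.
        unsafe { self.base.add(offset).read_volatile() }
    }
}

impl Drop for AnonRegion {
    fn drop(&mut self) {
        // SAFETY: `base` and `len` describe a mapping created by `new`.
        unsafe { munmap(self.base.cast(), self.len); }
    }
}
```

Writing a byte, calling `decommit` on the containing range, and reading the byte back should yield zero, which is the property test_decommit_zeros_pages is described as checking.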

These primitives enable a private memory mode where guest RAM uses
anonymous virtual memory instead of file-backed shared memory sections.

Add a private_ram flag to VaMapper that enables a mode where the backing
SparseMapping uses committed anonymous memory instead of file-backed
mappings for guest RAM.

In private-RAM mode:
- page_fault() on Windows commits 64KB-aligned chunks via
  SparseMapping::commit() and returns PageFaultAction::Retry, allowing
  the hypervisor to retry the faulting access
- page_fault() on Linux returns PageFaultAction::Fail since the kernel
  handles faults transparently for MAP_ANONYMOUS regions
- alloc_range() eagerly commits a range of pages
- decommit() releases pages back (for future balloon/free-page-reporting)
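
The fault-handling rule above can be illustrated with a small, platform-independent model. It is a sketch under assumptions rather than the VaMapper code: `FaultAction`, `chunk_for_fault`, `handle_fault`, and the `commit_on_fault` switch are hypothetical names, and the closure stands in for SparseMapping::commit().

```rust
/// Commit granularity assumed from the description above (64 KiB).
const COMMIT_CHUNK: u64 = 64 * 1024;

/// Hypothetical mirror of the Retry/Fail outcomes named in the description.
#[derive(Debug, PartialEq, Eq)]
pub enum FaultAction {
    Retry,
    Fail,
}

/// Returns (offset, len) of the 64 KiB-aligned chunk containing `fault_offset`,
/// clamped to the end of the mapping, or None if the fault is out of range.
pub fn chunk_for_fault(fault_offset: u64, mapping_len: u64) -> Option<(u64, u64)> {
    if fault_offset >= mapping_len {
        return None;
    }
    let start = fault_offset & !(COMMIT_CHUNK - 1);
    let end = start.saturating_add(COMMIT_CHUNK).min(mapping_len);
    Some((start, end - start))
}

/// `commit_on_fault` models the Windows path; when false (the Linux path) the
/// handler has nothing to do and reports Fail.
pub fn handle_fault(
    commit_on_fault: bool,
    fault_offset: u64,
    mapping_len: u64,
    mut commit: impl FnMut(u64, u64) -> bool,
) -> FaultAction {
    if !commit_on_fault {
        return FaultAction::Fail;
    }
    match chunk_for_fault(fault_offset, mapping_len) {
        // Commit succeeded: the faulting access can be retried.
        Some((offset, len)) if commit(offset, len) => FaultAction::Retry,
        _ => FaultAction::Fail,
    }
}
```

For example, a fault at offset 0x12345 in a 1 MiB mapping selects the chunk starting at 0x10000 with length 0x10000.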

Add new_private_mapper() to MappingManagerClient that creates a VaMapper
in private-RAM mode, bypassing the mapper cache since private mappers
cannot be shared across partitions.

Add a private_memory option to GuestMemoryBuilder that allocates guest
RAM from committed anonymous virtual memory instead of a shared memory
section.

When private_memory is enabled:
- No shared memory file/section is created (guest_ram is None)
- A private VaMapper is used via new_private_mapper()
- RAM ranges are eagerly committed with alloc_range()
- File-backed add_mapping() calls are skipped for RAM regions
- Memory prefetch is disabled (pages are already committed)

Validation rejects private_memory combined with:
- Legacy memory layout (x86_legacy_support) which requires shared memory
- Pre-existing memory backing (shared_memory_region_base)
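
The two rejected combinations can be written as a small check. This is only a sketch of the rule as stated above: `BuilderOptions`, `OptionError`, and `validate` are hypothetical names, and the pre-existing backing is reduced to a boolean for illustration.

```rust
/// Hypothetical subset of the builder options relevant to validation.
#[derive(Debug, Default, Clone)]
pub struct BuilderOptions {
    pub private_memory: bool,
    pub x86_legacy_support: bool,
    /// Stand-in for "a pre-existing memory backing was supplied".
    pub has_existing_backing: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum OptionError {
    PrivateMemoryWithLegacyLayout,
    PrivateMemoryWithExistingBacking,
}

/// Rejects option combinations that need a shared backing object, which
/// private memory mode does not produce.
pub fn validate(opts: &BuilderOptions) -> Result<(), OptionError> {
    if opts.private_memory {
        if opts.x86_legacy_support {
            return Err(OptionError::PrivateMemoryWithLegacyLayout);
        }
        if opts.has_existing_backing {
            return Err(OptionError::PrivateMemoryWithExistingBacking);
        }
    }
    Ok(())
}
```

Either option remains valid on its own when private_memory is off; only the combination is rejected.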

The guest_ram field is changed to Option<Mappable> to reflect that
private memory mode does not produce a shareable backing object.

Add the --private-memory CLI flag to OpenVMM and wire it through the
configuration pipeline:

- Add private_memory field to MemoryConfig in openvmm_defs
- Add --private-memory boolean flag to CLI argument parser
- Wire the flag from CLI args to MemoryConfig in openvmm_entry
- Pass private_memory to GuestMemoryBuilder in dispatch.rs
- Default private_memory to false in ttrpc and petri config construction

Add unit tests covering the new private memory functionality:

sparse_mmap (3 tests):
- test_decommit_zeros_pages: verify decommit returns pages to zero
- test_commit_after_decommit: verify commit restores access after decommit
- test_commit_idempotent: verify commit on already-committed pages is safe

membacking (4 tests):
- test_private_ram_alloc_write_read: basic alloc/write/read cycle
- test_private_ram_decommit_zeros: verify decommit zeros private pages
- test_private_ram_recommit_after_decommit: verify recommit after decommit
- test_private_ram_commit_idempotent: verify double-commit is safe

All tests use direct SparseMapping operations to avoid async infrastructure
complexity, exercising the actual platform commit/decommit code paths.

@github-actions bot added the unsafe label (Related to unsafe code) on Feb 27, 2026

@github-actions

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon
