
Support snapshotting UEFI and nested-virtualization VMs #3744

Open
bitranox wants to merge 1 commit into microsoft:main from bitranox:snapshot-save-restore

Conversation

@bitranox

Contributor

A snapshot save/restore of a KVM-backed VM failed in two cases. This fixes both and updates the docs.

UEFI restore. The UEFI device's logger and watchdog platform resolvers were registered only in the LoadMode::Uefi arm of load(). A restore sets the load mode to None (firmware is not reloaded from the snapshot) while still instantiating the UEFI device from the chipset config, so restoring a UEFI VM failed with no resolver for uefi_logger:platform. The resolvers are now registered whenever a UEFI device is present, independent of the load mode.
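The difference between the two registration rules can be sketched with a small model. This is illustrative only: `LoadMode`, `DeviceConfig`, `Resolver`, and `resolvers_to_register` are hypothetical stand-ins, not the real OpenVMM types, and the `by_device` flag exists only to compare the old and new rule side by side.

```rust
// Hypothetical stand-ins for illustration; not the real OpenVMM types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LoadMode {
    Uefi,
    None, // used on snapshot restore: firmware is not reloaded
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Resolver {
    UefiLogger,
    UefiWatchdog,
}

struct DeviceConfig {
    name: &'static str,
}

/// Returns the platform resolvers that would be registered.
/// `by_device == false` models the old rule (only in the LoadMode::Uefi arm);
/// `by_device == true` models keying on the presence of a UEFI device.
fn resolvers_to_register(
    load_mode: LoadMode,
    devices: &[DeviceConfig],
    by_device: bool,
) -> Vec<Resolver> {
    let uefi_needed = if by_device {
        devices.iter().any(|d| d.name == "uefi")
    } else {
        load_mode == LoadMode::Uefi
    };
    if uefi_needed {
        vec![Resolver::UefiLogger, Resolver::UefiWatchdog]
    } else {
        Vec::new()
    }
}

fn main() {
    let devices = [DeviceConfig { name: "uefi" }];
    // Fresh boot: both rules register the resolvers.
    println!("{:?}", resolvers_to_register(LoadMode::Uefi, &devices, false));
    // Restore: the old rule registers nothing, the new rule still registers both.
    println!("{:?}", resolvers_to_register(LoadMode::None, &devices, false));
    println!("{:?}", resolvers_to_register(LoadMode::None, &devices, true));
}
```

In this model the restore case (`LoadMode::None` with a UEFI device in the config) is the only one where the two rules disagree, which matches the failure described above.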

Nested virtualization. The KVM backend's nested_state()/set_nested_state() returned NotSupported, so saving or restoring a partition with nested virtualization enabled failed: the vCPU's nested VMX/SVM state (VMCS12 and the per-vCPU nested control fields) was never captured, and a nested guest running an L2 at save time could not be resumed. This adds the KVM_GET_NESTED_STATE/KVM_SET_NESTED_STATE ioctls to the kvm crate:

  • get_nested_state() sizes the buffer from KVM_CHECK_EXTENSION(KVM_CAP_NESTED_STATE), issued on the VM fd because the vCPU fd rejects that query with EINVAL, and returns an empty blob when the host lacks the capability.
  • set_nested_state() pushes the opaque blob back. An empty blob skips the ioctl and takes the reset path (the kernel clears nested state with the vCPU reset), so a guest-initiated reset is not aborted.
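The control flow of the two accessors can be sketched without touching real ioctls. The `NestedStateKernel` trait and `FakeKernel` below are hypothetical, introduced only to model the three kernel calls named above; the actual kvm crate code issues the ioctls through unsafe code and reads the blob size from the `kvm_nested_state` header rather than from a return value.

```rust
/// Hypothetical abstraction over the kernel calls named in the description.
trait NestedStateKernel {
    /// Models KVM_CHECK_EXTENSION(KVM_CAP_NESTED_STATE) on the VM fd:
    /// 0 when unsupported, otherwise the maximum blob size in bytes.
    fn nested_state_cap(&self) -> usize;
    /// Models KVM_GET_NESTED_STATE: fills `buf`, returns the bytes used
    /// (a simplification of the size field in the real header).
    fn get(&self, buf: &mut [u8]) -> Result<usize, String>;
    /// Models KVM_SET_NESTED_STATE.
    fn set(&mut self, blob: &[u8]) -> Result<(), String>;
}

fn get_nested_state<K: NestedStateKernel>(k: &K) -> Result<Vec<u8>, String> {
    let max = k.nested_state_cap();
    if max == 0 {
        // Host lacks the capability: report "no nested state".
        return Ok(Vec::new());
    }
    let mut buf = vec![0u8; max];
    let used = k.get(&mut buf)?;
    buf.truncate(used);
    Ok(buf)
}

fn set_nested_state<K: NestedStateKernel>(k: &mut K, blob: &[u8]) -> Result<(), String> {
    if blob.is_empty() {
        // Reset path: no ioctl is issued; the vCPU reset clears nested state.
        return Ok(());
    }
    k.set(blob)
}

/// In-memory fake used only to exercise the sketch.
struct FakeKernel {
    cap: usize,
    state: Vec<u8>,
    set_calls: usize,
}

impl NestedStateKernel for FakeKernel {
    fn nested_state_cap(&self) -> usize {
        self.cap
    }
    fn get(&self, buf: &mut [u8]) -> Result<usize, String> {
        if buf.len() < self.state.len() {
            return Err("buffer too small".to_string());
        }
        buf[..self.state.len()].copy_from_slice(&self.state);
        Ok(self.state.len())
    }
    fn set(&mut self, blob: &[u8]) -> Result<(), String> {
        self.set_calls += 1;
        self.state = blob.to_vec();
        Ok(())
    }
}

fn main() {
    let src = FakeKernel { cap: 16, state: vec![1, 2, 3], set_calls: 0 };
    let blob = get_nested_state(&src).expect("get");
    let mut dst = FakeKernel { cap: 16, state: Vec::new(), set_calls: 0 };
    set_nested_state(&mut dst, &blob).expect("set");
    println!("restored {} bytes in {} call(s)", dst.state.len(), dst.set_calls);
}
```

Treating the empty blob as "nothing to restore" keeps the save and restore sides symmetric: a host without the capability saves an empty blob, and restoring that blob is a no-op.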

NestedState is already ordered last in the per-VP state list, after the registers, MSRs, and CPUID that the kernel validates the blob against on restore.
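As a rough illustration of that ordering constraint, the sketch below uses a hypothetical four-entry subset of the per-VP state list (the real list has more entries and different names) and checks that nested state is restored exactly once and last.

```rust
// Hypothetical subset of the per-VP state kinds; names are for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum VpStateKind {
    Registers,
    Msrs,
    Cpuid,
    NestedState,
}

/// Assumed restore order: nested state goes last because the kernel
/// validates the blob against the state restored before it.
const RESTORE_ORDER: [VpStateKind; 4] = [
    VpStateKind::Registers,
    VpStateKind::Msrs,
    VpStateKind::Cpuid,
    VpStateKind::NestedState,
];

/// Returns true when NestedState appears exactly once and is the final entry.
fn nested_state_is_last(order: &[VpStateKind]) -> bool {
    order.last() == Some(&VpStateKind::NestedState)
        && order
            .iter()
            .filter(|k| **k == VpStateKind::NestedState)
            .count()
            == 1
}

fn main() {
    assert!(nested_state_is_last(&RESTORE_ORDER));
    for kind in RESTORE_ORDER.iter() {
        println!("restore {:?}", kind);
    }
}
```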

Docs. The snapshots guide notes that nested-virtualization state is captured on the KVM backend, and save-snapshot's interactive-console help now points at the file-backed-memory requirement.

Testing. The prepush gate passes (fmt, clippy --workspace --all-targets, build -p openvmm, doc --workspace, and the touched-crate tests). Manually verified on the KVM backend that a VM running a nested guest saves and restores (previously the save failed with NotSupported), and that a UEFI VM restores without the resolver error.

@bitranox bitranox requested a review from a team as a code owner June 15, 2026 10:36
Copilot AI review requested due to automatic review settings June 15, 2026 10:36
@github-actions github-actions Bot added the Guide and unsafe (Related to unsafe code) labels Jun 15, 2026
@github-actions


⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

Copilot AI left a comment

Contributor


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Enable capturing/restoring KVM nested-virtualization (VMX/SVM) state as part of x86_64 vCPU state, and document updated snapshot requirements/behavior.

Changes:

  • Implement KVM nested-state get/set plumbing and expose it through vCPU state accessors.
  • Make UEFI platform resolvers register based on device presence (including snapshot restore paths).
  • Update snapshot documentation and REPL help text for file-backed memory requirements.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vmm_core/virt_kvm/src/arch/x86_64/vp_state.rs | Wire nested-state into VP state access and skip empty restore ioctls. |
| vm/kvm/src/lib.rs | Add KVM nested-state ioctls, errors, and Processor get/set APIs. |
| openvmm/openvmm_entry/src/repl.rs | Clarify snapshot CLI prerequisite wording. |
| openvmm/openvmm_core/src/worker/dispatch.rs | Register UEFI platform resolvers whenever the UEFI device exists. |
| Guide/src/user_guide/openvmm/snapshots.md | Document that KVM snapshots include nested-virtualization state. |

Comment thread vm/kvm/src/lib.rs
Comment thread vm/kvm/src/lib.rs Outdated
Comment thread vm/kvm/src/lib.rs
Comment thread vmm_core/virt_kvm/src/arch/x86_64/vp_state.rs Outdated
@bitranox bitranox force-pushed the snapshot-save-restore branch from 36461c3 to aa5dbc1 Compare June 15, 2026 10:50
let mut deps_hyperv_firmware_pcat = None;
// The UEFI device's platform resolvers (logger, watchdog) are needed
// whenever the device is present, including on a snapshot restore where
// the load mode is None because firmware is not reloaded.

Contributor


@jstarks we should probably fix something in how we treat load_mode rather than inspecting the list of devices, right?

Contributor Author


The resolver registration used to live in the LoadMode::Uefi arm. On a snapshot restore the entry point sets load_mode = LoadMode::None (the restore branch in openvmm_entry), because the firmware isn't reloaded from a snapshot, so that arm was skipped and the UEFI device's logger and watchdog resolvers never got registered even though the device is still in the config. That's the missing-resolver failure this fixes.

I keyed registration off the UEFI device being present because those resolvers are a property of the device, not of the load action. The downside is that it depends on the string match d.name == "uefi".

If you'd rather drive it through load_mode, the gap is that LoadMode::None on restore doesn't carry the firmware kind, so we'd need to thread that through (a firmware-kind field, or a load mode that means "UEFI, don't reload"). If you prefer, I can rework it whichever way you and @jstarks land on.
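For illustration only, the load-mode alternative might look roughly like the sketch below. Everything here is hypothetical: `FirmwareKind`, the `Restore` variant, and `needs_uefi_resolvers` are invented names to show the shape of the idea, not existing OpenVMM definitions.

```rust
// Hypothetical firmware-kind field that a restore could carry.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FirmwareKind {
    Uefi,
    Pcat,
    None,
}

// Hypothetical reshaping of the load mode: a restore no longer collapses to
// `None`, but records which firmware the snapshotted VM was booted with.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LoadMode {
    Uefi,
    Restore { firmware: FirmwareKind },
    None,
}

/// Decides resolver registration from the load mode alone, with no
/// inspection of the device list.
fn needs_uefi_resolvers(mode: LoadMode) -> bool {
    matches!(
        mode,
        LoadMode::Uefi
            | LoadMode::Restore {
                firmware: FirmwareKind::Uefi
            }
    )
}

fn main() {
    let modes = [
        LoadMode::Uefi,
        LoadMode::Restore { firmware: FirmwareKind::Uefi },
        LoadMode::Restore { firmware: FirmwareKind::Pcat },
        LoadMode::Restore { firmware: FirmwareKind::None },
        LoadMode::None,
    ];
    for mode in modes.iter() {
        println!("{:?} -> {}", mode, needs_uefi_resolvers(*mode));
    }
}
```

The trade-off is that the snapshot (or the restore entry point) would then have to record the firmware kind, whereas the device-presence check needs no extra state.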

Comment thread Guide/src/user_guide/openvmm/snapshots.md Outdated
Copilot AI review requested due to automatic review settings June 15, 2026 20:37
@bitranox bitranox force-pushed the snapshot-save-restore branch from aa5dbc1 to b7de430 Compare June 15, 2026 20:37

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Comment thread vm/kvm/src/lib.rs Outdated
Comment thread vm/kvm/src/lib.rs Outdated
Comment thread vm/kvm/src/lib.rs Outdated
Comment thread vm/kvm/src/lib.rs
Two gaps prevented a successful snapshot save/restore of a KVM-backed VM.

UEFI restore: the UEFI device's logger and watchdog platform resolvers were
only registered in the LoadMode::Uefi arm. A restore sets the load mode to
None (firmware is not reloaded) while still instantiating the UEFI device
from the chipset config, so restoring a UEFI VM failed with "no resolver for
uefi_logger:platform". Register those resolvers whenever the UEFI device is
present, independent of the load mode.

Nested virtualization: the KVM backend's nested_state()/set_nested_state()
returned NotSupported, so saving or restoring a partition with nested
virtualization enabled failed: the vCPU's nested VMX/SVM state (VMCS12 and
the per-vCPU nested control fields) was never captured, and a nested guest
running an L2 at save time could not be resumed. Add the KVM_GET_NESTED_STATE
/ KVM_SET_NESTED_STATE ioctls to the kvm crate; get_nested_state() sizes the
buffer from KVM_CHECK_EXTENSION(KVM_CAP_NESTED_STATE) and returns an empty
blob when the host lacks the capability, and set_nested_state() pushes the
opaque blob back. NestedState save/restore is already ordered last in the
per-VP state list, after registers, MSRs, and CPUID, which the kernel
validates the blob against on restore. An empty blob takes the reset path,
so a guest-initiated reset is not aborted.

Note the nested-state support in the snapshots guide and point save-snapshot's
help at the file-backed-memory requirement.
@bitranox bitranox force-pushed the snapshot-save-restore branch from b7de430 to 0437a2e Compare June 15, 2026 20:52
@bitranox bitranox requested a review from smalis-msft June 15, 2026 21:09


3 participants