Add ARM CCA FVP (Fixed Virtual Platform) support to Flowey #2866

Open
weiding-msft wants to merge 18 commits into microsoft:main from weiding-msft:fvp-cca-enablement
@weiding-msft

This PR adds a new cca-fvp pipeline to Flowey that enables testing of ARM Confidential Compute Architecture (CCA) configurations using the Arm Fixed Virtual Platform with Shrinkwrap.

Overview
The new pipeline automates the setup and execution of CCA-enabled ARM64 virtual machines by:

  • Installing and configuring the ARM GNU toolchain
  • Building a custom Linux kernel with CCA, 9P, and Hyper-V support
  • Building TMK (test microkernel) components for plane0 support
  • Setting up Shrinkwrap for FVP orchestration
  • Injecting OpenVMM/OpenHCL components into the rootfs
  • Running the FVP with configurable platform and overlay files

New Flowey Jobs:

  • local_install_shrinkwrap: Installs dependencies, clones required repos (OHCL-Linux-Kernel, OpenVMM-TMK, Shrinkwrap, cca_config), builds kernel and TMK binaries
  • local_shrinkwrap_build: Builds Shrinkwrap artifacts from YAML configurations
  • local_shrinkwrap_run: Modifies rootfs with OpenVMM/TMK binaries and launches FVP

Pipeline Features:

  • Simple CLI: cargo xflowey cca-fvp with sensible defaults
  • Configurable paths for platform, overlay, and rootfs files
  • Support for environment variable (SHRINKWRAP_PACKAGE) to locate artifacts
  • Automatic path resolution (simple filenames → shrinkwrap/config/, relative paths validated against --dir)
  • Smart defaults for common use cases (cca-3world.yaml, buildroot.yaml, planes.yaml)
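
The path resolution rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper name `resolve_config_path` is hypothetical, and "validated against --dir" is assumed here to mean "joined to --dir and rejected if it tries to escape via `..`".

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolve a `--platform` / `--overlay` argument.
/// The rules follow the PR description; the validation step is an assumption.
fn resolve_config_path(dir: &Path, arg: &str) -> Result<PathBuf, String> {
    let path = Path::new(arg);

    // Absolute paths are used as given.
    if path.is_absolute() {
        return Ok(path.to_path_buf());
    }

    // A simple filename is a single normal component, e.g. "planes.yaml".
    let mut components = path.components();
    let is_simple_filename = matches!(
        (components.next(), components.next()),
        (Some(Component::Normal(_)), None)
    );
    if is_simple_filename {
        return Ok(dir.join("shrinkwrap").join("config").join(path));
    }

    // Assumption: a relative path must stay inside `dir`, so `..` is rejected.
    if path.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!("path escapes --dir: {}", arg));
    }
    Ok(dir.join(path))
}
```

With `--dir target/cca-fvp`, a bare `cca-3world.yaml` would resolve under `target/cca-fvp/shrinkwrap/config/`, while an absolute path passes through untouched.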

Code Quality Improvements:

  • Refactored repeated git clone operations into reusable helper function
  • Extracted kernel config and Rust binary build logic into dedicated functions
  • Fixed kernel config command to enable configs individually (avoids shell parsing issues)
  • Robust filesystem unmount logic with retry and lazy unmount for busy devices
  • Python venv creation with proper validation

Implementation Details

Kernel Configuration:

  • Enables required CCA guest support (CONFIG_VIRT_DRIVERS, CONFIG_ARM_CCA_GUEST)
  • Enables 9P filesystem support for host file sharing
  • Enables Hyper-V/MSHV support for paravisor functionality

TMK Integration:

  • Builds simple_tmk for plane0 (aarch64-minimal_rt-none target)
  • Builds tmk_vmm for host (aarch64-unknown-linux-gnu target)
  • Injects both binaries into rootfs at /cca/ directory

Shrinkwrap Integration:

  • Auto-clones planes.yaml configuration for planes-enabled stack
  • Modifies rootfs.ext2 using Docker-based e2fsck and resize2fs tools
  • Handles busy mount points with retry logic and lazy unmount

Compatibility:

  • Updated to work with RustRuntimeServices
  • Uses flowey::shell_cmd! macro and rt.sh for shell operations
  • No direct xshell dependency required

Usage

# Basic usage (uses defaults)
cargo xflowey cca-fvp

# With custom configuration
cargo xflowey cca-fvp \
  --dir target/my-fvp \
  --platform cca-3world.yaml \
  --overlay buildroot.yaml \
  --overlay planes.yaml \
  --rootfs ~/.shrinkwrap/package/cca-3world/rootfs.ext2

Testing
Successfully builds and runs on Ubuntu dev container with:

  • ARM GNU Toolchain 14.3
  • Docker for rootfs modification
  • Python 3 with venv support

weiding-msft and others added 18 commits March 2, 2026 13:21
Use an absolute path for the shrinkwrap directory, since shrinkwrap
changes directory during execution.
Fix a potential bug that lets jobs execute out of order. The order of
.dep_on() calls doesn't guarantee execution order; a dependency must be
expressed through an artifact or a side effect
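
The ordering point in this commit can be illustrated with a small, generic scheduler. It is only a sketch of the idea and does not use Flowey's API: the `Job` struct, the `schedule` function, and the artifact names are all hypothetical. Execution order is derived from which artifacts each job produces and consumes, so the order in which jobs are declared does not matter.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical job description: ordering comes only from artifacts.
struct Job {
    name: &'static str,
    produces: Vec<&'static str>,
    consumes: Vec<&'static str>,
}

/// Topologically order jobs so each consumer runs after its producers.
/// Returns None if an artifact has no producer or the graph has a cycle.
fn schedule(jobs: &[Job]) -> Option<Vec<&'static str>> {
    // Map each artifact to the index of the job that produces it.
    let mut producer: HashMap<&'static str, usize> = HashMap::new();
    for (i, job) in jobs.iter().enumerate() {
        for artifact in &job.produces {
            producer.insert(*artifact, i);
        }
    }

    // Build edges producer -> consumer and count incoming edges per job.
    let mut indegree = vec![0usize; jobs.len()];
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); jobs.len()];
    for (i, job) in jobs.iter().enumerate() {
        for artifact in &job.consumes {
            let p = *producer.get(artifact)?;
            dependents[p].push(i);
            indegree[i] += 1;
        }
    }

    // Kahn's algorithm: repeatedly run jobs whose inputs are all available.
    let mut ready: VecDeque<usize> =
        (0..jobs.len()).filter(|&i| indegree[i] == 0).collect();
    let mut order = Vec::new();
    while let Some(i) = ready.pop_front() {
        order.push(jobs[i].name);
        for &d in &dependents[i] {
            indegree[d] -= 1;
            if indegree[d] == 0 {
                ready.push_back(d);
            }
        }
    }
    if order.len() == jobs.len() { Some(order) } else { None }
}
```

Declaring the run job first and the install job last still yields install, build, run, because the artifacts carry the dependency.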
  - for unused destructured fields, add ': _' to ignore them
  - delete cca_fvp_test.rs, which is unused; we should center
    FVP logic inside cca_fvp.rs

We should probably even split cca_fvp.rs into fvp.rs and cca.rs:
fvp.rs to implement FVP install/build, and cca.rs for CCA-related
configuration, because FVP can be used for testing other architecture
features
1. Git clone openvmm and set up environment + build openvmm and tmk_vmm
2. Fix an issue for fetching incorrect OHCL-Linux-Kernel branch
A temporary clone of openvmm with CCA support is used. This duplication
is intentional for the initial implementation and will be removed once
upstream openvmm gains full ARM CCA support.
For the initial implementation, planes.yaml is hosted in a remote
repository so cca-fvp can clone and use it directly. This avoids manual
configuration.
Extract duplicated logic into a function to improve reuse and
maintainability.
Factor out common xshell command execution into helpers and organize
kernel config flags into logical groups (CCA, 9P, Hyper-V). This
reduces duplication and improves maintainability without changing
behavior. Same for the rust binary build.
Remove the hardcoded ~/.shrinkwrap rootfs path and resolve it via
configuration/environment variables to support portable setups.
Add a helper to resolve platform/overlay paths consistently by
supporting absolute paths, legacy target/cca-fvp/shrinkwrap paths,
and relative paths assumed to live under shrinkwrap/config.
The build failed with:
  Failed to enable CCA kernel configs

because multiple CONFIG options were passed as a single --enable
argument to scripts/config. Split options correctly so CONFIG_VIRT_DRIVERS
and CONFIG_ARM_CCA_GUEST are enabled independently.
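
As a rough sketch of the fix described in this commit, the argument list for the kernel's scripts/config helper can be built so that every option gets its own `--enable`. The function name `enable_args` is hypothetical and this is not the PR's code; only the two option names come from the commit message.

```rust
/// Hypothetical helper: build the argument list for `scripts/config`,
/// emitting a separate `--enable <OPTION>` pair for every option.
///
/// The failing form described in the commit passed several options as one
/// argument, i.e. ["--enable", "CONFIG_VIRT_DRIVERS CONFIG_ARM_CCA_GUEST"].
fn enable_args(options: &[&str]) -> Vec<String> {
    options
        .iter()
        .flat_map(|opt| ["--enable".to_string(), opt.to_string()])
        .collect()
}
```

Keeping each option as its own argument means no shell word splitting is involved, which is the parsing issue the commit mentions.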
Resolve simple --platform/--overlay filenames via --dir and move
target/cca-fvp output to the repo root.

Accept simple filenames for --platform and --overlay and locate
them relative to <dir>/shrinkwrap/config.
Add default values for common cca-fvp options:
- Default --dir to target/cca-fvp
- Default --platform to cca-3world.yaml
- Provide default overlays and btvars when not specified
- Compute default rootfs path from SHRINKWRAP_PACKAGE or HOME

This improves usability while preserving explicit user overrides.
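
The rootfs default could be computed along these lines. This is a sketch under assumptions: the function name `default_rootfs` is hypothetical, and the `<package>/cca-3world/rootfs.ext2` layout is inferred from the usage example in the PR description rather than taken from the PR's code.

```rust
use std::path::PathBuf;

/// Hypothetical helper: pick the default rootfs path.
/// Prefers SHRINKWRAP_PACKAGE, then falls back to $HOME/.shrinkwrap/package.
/// Environment values are passed in so the logic is easy to test.
fn default_rootfs(shrinkwrap_package: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    let package_dir = match shrinkwrap_package {
        // An explicit, non-empty SHRINKWRAP_PACKAGE wins.
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Otherwise derive the location from HOME; give up if HOME is unset.
        _ => PathBuf::from(home?).join(".shrinkwrap").join("package"),
    };
    // Assumed layout: <package>/cca-3world/rootfs.ext2
    Some(package_dir.join("cca-3world").join("rootfs.ext2"))
}
```

A caller would pass `std::env::var("SHRINKWRAP_PACKAGE").ok().as_deref()` and `std::env::var("HOME").ok().as_deref()`, and an explicit `--rootfs` would still override the result.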

Remove unused variables.
Improve handling of "Device or resource busy" errors by:
- Attempting a normal unmount first, then falling back to a lazy unmount (-l)
- Allowing the script to continue even if unmount attempts fail
- Retrying unmount operations with delays to handle transient usage

This reduces flakiness during mount cleanup and file injection.
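
The unmount strategy in this commit can be sketched as below. It is an illustration only: `unmount_with_retry` and `UnmountOutcome` are hypothetical names, and the command runner is passed in as a closure so the retry logic stays independent of how commands are actually executed (in Flowey that would presumably go through its own shell helpers).

```rust
use std::time::Duration;

/// How the unmount ended; `GaveUp` is reported rather than treated as fatal.
#[derive(Debug, PartialEq)]
enum UnmountOutcome {
    Clean,
    Lazy,
    GaveUp,
}

/// Hypothetical helper: try a normal `umount`, fall back to `umount -l`,
/// and retry with a delay. `run` executes one command line and returns
/// true on success.
fn unmount_with_retry<F>(
    mount_point: &str,
    attempts: u32,
    delay: Duration,
    mut run: F,
) -> UnmountOutcome
where
    F: FnMut(&[&str]) -> bool,
{
    for attempt in 0..attempts {
        if run(&["umount", mount_point]) {
            return UnmountOutcome::Clean;
        }
        // "Device or resource busy": detach lazily instead.
        if run(&["umount", "-l", mount_point]) {
            return UnmountOutcome::Lazy;
        }
        // Wait before the next round, but not after the last one.
        if attempt + 1 < attempts {
            std::thread::sleep(delay);
        }
    }
    // The caller decides whether to continue; the commit chooses to continue.
    UnmountOutcome::GaveUp
}
```

Returning an outcome instead of an error matches the commit's choice to let the script continue even when every unmount attempt fails.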
…gration

Adapts local_install_shrinkwrap to use the new RustRuntimeServices.sh pattern
instead of direct xshell::Shell::new(). Updates all shell commands to use
flowey::shell_cmd! macro.
@weiding-msft weiding-msft reopened this Mar 2, 2026
@weiding-msft
Author

@chris-oo Can you please take a look at the draft PR? Thank you!

@weiding-msft
Author

@microsoft-github-policy-service agree company="Microsoft"

@weiding-msft
Author

@microsoft-github-policy-service agree

@weiding-msft weiding-msft marked this pull request as ready for review March 3, 2026 16:04
@weiding-msft weiding-msft requested a review from a team as a code owner March 3, 2026 16:04
@chris-oo
Member

chris-oo commented Mar 3, 2026

I'll try to review this today. @justus-camp-microsoft can you also help take a look?

Contributor

@justus-camp-microsoft justus-camp-microsoft left a comment

Left some comments regarding flowey best practices. I see a couple of things point to your own forks - are there more people than just you that will use this pipeline?

let venv_dir = shrinkwrap_dir.join("venv");
let venv_bin = venv_dir.join("bin");

let mut cmd = std::process::Command::new(&shrinkwrap_exe);
Contributor

We shouldn't use std::process::Command anywhere. You should use flowey::shell_cmd! any place you're using std::process::Command and then use the builder args to get the same behavior you're getting here. flowey::shell_cmd! is relatively new (before I would've told you to use xshell::Cmd) so let me know if any of the builder commands you need aren't available.

done,
} = request;

ctx.emit_rust_step("install shrinkwrap", |ctx| {
Contributor

This step is basically a "mega-step" that does everything. Flowey best practices dictate the use of reusable, composable nodes and to leverage them when you can. For instance, there are several existing nodes that could be used here to significantly simplify the node:

  1. The inline apt-get install commands should be using the install_dist_pkg node.
  2. clone_or_update_repo is essentially a re-implementation of the git_checkout node.
  3. I believe the rustup target add commands are covered by the install_rust node.
  4. The cargo build commands should be using the run_cargo_build node.
  5. The cross-compilation setup should be handled by the init_cross_build node.

A few things here should likely have their own nodes, but I wouldn't block the PR on doing them inline for now: compiling the kernel, installing Python, Docker setup, and downloading the ARM GNU toolchain. For the ARM GNU toolchain, there's an argument that it should be downloaded as part of the restore-packages pipeline and then have a corresponding magicpath node (there are other magicpath nodes lying around to show what I mean) that puts it in a known place for use here.

Author

You’re right that this node is currently acting as a “mega-step”, and it would align better with Flowey’s composable node model to leverage the existing reusable nodes.

I’ll refactor this to:
• Replace inline apt-get install calls with install_dist_pkg
• Use git_checkout instead of the custom clone_or_update_repo
• Replace the rustup target add logic with install_rust
• Use run_cargo_build for the cargo builds
• Move the cross-compilation setup into init_cross_build

For the remaining inline steps (kernel build, Python setup, Docker setup, ARM GNU toolchain download), I agree these likely deserve dedicated nodes or integration into existing pipelines. I probably won’t block this PR on introducing new nodes, but I’ll structure the code so they can be cleanly extracted later.

For the ARM GNU toolchain specifically, integrating it into the restore-packages pipeline with a corresponding magicpath node sounds like the right long-term solution. I’ll leave a TODO and follow up with a separate proposal for that.

Appreciate the guidance — these should make the flow much more maintainable and consistent with the rest of the project.

verbose,
} = self;

let openvmm_repo = flowey_lib_common::git_checkout::RepoSource::ExistingClone(
Contributor

We should add a bail_if_running_in_ci() guard here (assuming this isn't supposed to be run as part of CI)

use std::path::Path;

const ARM_GNU_TOOLCHAIN_URL: &str = "https://developer.arm.com/-/media/Files/downloads/gnu/14.3.rel1/binrel/arm-gnu-toolchain-14.3.rel1-x86_64-aarch64-none-elf.tar.xz";
const OHCL_LINUX_KERNEL_REPO: &str = "https://github.com/weiding-msft/OHCL-Linux-Kernel.git";
Contributor

Is this intentionally a forked repo?

Author

Yes, it is intentional for the initial implementation and will be removed once upstream openvmm gains full ARM CCA support.

Comment on lines +40 to +45
#[clap(long, default_value_t = true)]
pub install_missing_deps: bool,

/// If repo already exists, attempt `git pull --ff-only`
#[clap(long, default_value_t = true)]
pub update_shrinkwrap_repo: bool,
Contributor

I don't think there's actually a way to turn these off if you default them to true (I think usage would be --install-missing-deps which is already true so this would be a no-op). We should have these default to false or flip the flag to --no-install-missing-deps.
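
To make the suggested flag flip concrete, here is a small std-only sketch of the intended semantics. It deliberately does not use clap, so it says nothing about the exact attribute syntax; `parse_flags` and `Options` are hypothetical, and `--no-update-shrinkwrap-repo` is an assumed counterpart to the `--no-install-missing-deps` flag suggested above.

```rust
/// Both behaviors are on by default and can only be switched off.
#[derive(Debug, PartialEq)]
struct Options {
    install_missing_deps: bool,
    update_shrinkwrap_repo: bool,
}

/// Hypothetical hand-rolled parser showing negative flags.
/// A positive `--install-missing-deps` flag would be a no-op when the
/// default is already true, so only the `--no-...` forms are accepted.
fn parse_flags<'a>(args: impl IntoIterator<Item = &'a str>) -> Result<Options, String> {
    let mut opts = Options {
        install_missing_deps: true,
        update_shrinkwrap_repo: true,
    };
    for arg in args {
        match arg {
            "--no-install-missing-deps" => opts.install_missing_deps = false,
            // Assumed name, mirroring the flag above.
            "--no-update-shrinkwrap-repo" => opts.update_shrinkwrap_repo = false,
            other => return Err(format!("unrecognized flag: {}", other)),
        }
    }
    Ok(opts)
}
```

With this shape, running with no flags keeps the current behavior, and each default can be turned off explicitly.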

Comment on lines +29 to +31
pub mod local_install_shrinkwrap;
pub mod local_shrinkwrap_build;
pub mod local_shrinkwrap_run;
Contributor

nit: alphabetize the mod.rs file

@chris-oo
Member

chris-oo commented Mar 3, 2026

The intention here is to eventually stand up a CI pass so we can validate CCA changes in CI until we have hardware available, if that gives you more context?

@justus-camp-microsoft
Contributor

The intention here is to eventually stand up a CI pass so we can validate CCA changes in CI until we have hardware available, if that gives you more context?

Ok that makes sense - we'll have to add it to .flowey.toml for it to actually generate yaml that will run in CI. We should try that locally to see if it has any issues generating (even if we don't commit it here).

@chris-oo
Member

chris-oo commented Mar 4, 2026

Right, I think it's fine to stage this first PR to make local work, then we can iterate on another PR to stand up the right CI pass. It probably needs to be its own new pass since it takes a while to set up all the deps & run the simulator.
