Add EP cache versioning to avoid crashes from outdated caches after ORT updates #27660

Open
yiliangbetter wants to merge 1 commit into microsoft:main from
yiliangbetter:feature_univeral_cache_invalidation

yiliangbetter commented Mar 15, 2026

Description

This PR adds support for execution provider (EP) cache versioning so that cache directories are tied to the ONNX Runtime version. When enabled, known EP cache paths are automatically suffixed with the ORT version (e.g. .caches becomes .caches/1.20.0). That invalidates old caches when ORT is upgraded and reduces crashes or incorrect behavior from loading incompatible cached artifacts.

Changes:

  • Session option: session.ep_cache_use_ort_version (C++: kOrtSessionOptionsEpCacheUseOrtVersion). When set to "1", known EP cache directory options are suffixed with the ORT version. Set this before appending execution providers.
  • Supported cache options: CoreML ModelCacheDirectory; TensorRT trt_engine_cache_path and trt_timing_cache_path; MIGraphX migraphx_model_cache_dir; NvTensorRtRtx nv_runtime_cache_path.
  • Python helper: onnxruntime.get_versioned_ep_cache_path(base_path) for building versioned cache paths manually (e.g. for provider options).
  • Core logic: New ep_cache_versioning.cc / .h apply version suffixes using the ORT_VERSION macro; wired into SessionOptionsAppendExecutionProvider and InitializeSession so both config options and provider options get versioned paths when the flag is on.
  • Docs: InferenceSession docstring updated with a short “EP cache versioning” section and usage notes.
  • Tests: Unit tests in test_session_options.cc for config and provider path versioning (on/off, multiple EPs, empty paths, unknown providers/options, case insensitivity, no mutation of inputs).
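The behavior described in the list above can be sketched in Python as follows. This is only an illustration of the idea, not the PR's C++ implementation: the `KNOWN_CACHE_OPTIONS` table and the `apply_cache_versioning` function are hypothetical names, and case-insensitive matching of both provider and option names is an assumption based on the test description.

```python
import os

# Hypothetical mirror of the supported (provider, option) pairs listed above.
# Names are stored lower-cased so lookups can be case-insensitive (an assumption).
KNOWN_CACHE_OPTIONS = {
    "coreml": {"modelcachedirectory"},
    "tensorrt": {"trt_engine_cache_path", "trt_timing_cache_path"},
    "migraphx": {"migraphx_model_cache_dir"},
    "nvtensorrtrtx": {"nv_runtime_cache_path"},
}


def apply_cache_versioning(provider, options, ort_version, enabled=True):
    """Return a new options dict with known cache paths suffixed by ort_version."""
    result = dict(options)  # copy so the caller's dict is never mutated
    if not enabled:
        return result
    known = KNOWN_CACHE_OPTIONS.get(provider.lower(), set())
    for key, value in options.items():
        # Unknown options and empty paths are passed through unchanged.
        if key.lower() in known and value:
            result[key] = os.path.join(value, ort_version)
    return result
```

With the flag off, or for a provider that is not in the table, the function returns an unchanged copy, which matches the "on/off" and "unknown providers/options" cases named in the test list.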

Motivation and Context

Many execution providers (CoreML, TensorRT, MIGraphX, etc.) compile and cache artifacts to speed up session creation. Those caches are tied to the ORT build; after an upgrade, loading them can cause crashes, wrong results, or ABI issues. See #27487 for details.

This change:

  1. Reduces crashes by storing caches in versioned directories so old caches are not loaded after an upgrade.
  2. Makes upgrades safer by keeping cache layout under ORT’s control when the option is enabled.
  3. Keeps flexibility by offering both a global session option and a Python helper for manual path construction.

How to use

Automatic (session option):

import onnxruntime as ort

sess_options = ort.SessionOptions()
# Set the flag before execution providers are appended.
sess_options.add_session_config_entry("session.ep_cache_use_ort_version", "1")
session = ort.InferenceSession("model.onnx", sess_options, providers=[...])
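For the manual route, a versioned path can be placed directly into the provider options. The sketch below uses a local stand-in, `versioned_cache_path`, because `onnxruntime.get_versioned_ep_cache_path` exists only with this PR applied; the stand-in's name and its join-with-version behavior are assumptions drawn from the description, and the version string is hard-coded purely for illustration.

```python
import os


def versioned_cache_path(base_path, ort_version):
    """Hypothetical stand-in for the PR's helper: append the ORT version to base_path."""
    if not base_path:
        return base_path  # assumption: empty paths are left untouched
    return os.path.join(base_path, ort_version)


# Provider list in the shape InferenceSession accepts: (name, options) tuples or plain names.
providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_engine_cache_enable": True,
            "trt_engine_cache_path": versioned_cache_path(".caches", "1.20.0"),
        },
    ),
    "CPUExecutionProvider",
]
```

In a real script the version would more likely come from `onnxruntime.__version__` than from a literal, and `providers` would then be passed to `ort.InferenceSession`.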

…T version

When set to "1", execution provider cache directory options (CoreML
ModelCacheDirectory, TensorRT trt_engine_cache_path, MIGraphX
migraphx_model_cache_dir, etc.) are automatically suffixed with the
ONNX Runtime version string. This invalidates caches when ORT is updated
and avoids crashes from loading outdated EP caches.

Also adds get_versioned_ep_cache_path() in Python for manual path
versioning.

Made-with: Cursor
yiliangbetter force-pushed the feature_univeral_cache_invalidation branch from 1d69fdd to 33a90e8 on March 15, 2026 03:38
yiliangbetter marked this pull request as ready for review on March 15, 2026 03:41
henryruhs commented Mar 15, 2026

@yiliangbetter Disclaimer: I have never written a single line of C++, so my review may be biased.

  1. Did you consolidate this with the existing caching logic for TensorRT? As far as I know, there is already something in the codebase.

  2. As I read the code changes, the caching module has knowledge of the EPs it deals with. In my opinion, all EP-related logic should live in the EP code itself, and the cache paths, or even the need for caching, should be exposed via a hook. That means the caching module should be "dumb" and iterate over the loaded EPs rather than relying on hard-coded values like this:

const std::array<std::pair<std::string, std::string>, 5> kEpCachePathOptions = {
    std::pair<std::string, std::string>{"coreml", "ModelCacheDirectory"},
    std::pair<std::string, std::string>{"tensorrt", "trt_engine_cache_path"},
    std::pair<std::string, std::string>{"tensorrt", "trt_timing_cache_path"},
    std::pair<std::string, std::string>{"migraphx", "migraphx_model_cache_dir"},
    std::pair<std::string, std::string>{"nvtensorrtrtx", "nv_runtime_cache_path"},
};
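One possible reading of the hook idea, sketched in Python for brevity rather than in the PR's C++: each provider reports its own cache path option names, and the versioning code only iterates over what it is given. The `CacheAwareProvider` protocol, the `cache_path_options` method, and the two provider classes are hypothetical and do not correspond to existing ORT interfaces.

```python
import os
from typing import Dict, Iterable, Protocol


class CacheAwareProvider(Protocol):
    """Hypothetical hook: a provider lists the options that hold cache paths."""

    def cache_path_options(self) -> Iterable[str]: ...


class TensorRtProvider:
    def cache_path_options(self) -> Iterable[str]:
        return ("trt_engine_cache_path", "trt_timing_cache_path")


class CpuProvider:
    def cache_path_options(self) -> Iterable[str]:
        return ()  # no cache, so nothing to version


def version_cache_paths(
    provider: CacheAwareProvider, options: Dict[str, str], ort_version: str
) -> Dict[str, str]:
    """The 'dumb' module: no per-EP table, it only asks the provider."""
    result = dict(options)  # leave the caller's dict untouched
    for name in provider.cache_path_options():
        path = result.get(name)
        if path:  # skip options that are absent or empty
            result[name] = os.path.join(path, ort_version)
    return result
```

Under this design, adding a new EP with a cache would mean implementing the hook in that EP, with no change to the versioning module.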
