
Add support to fetch device type from EP subgraph assignment#27610

Draft
adrastogi wants to merge 1 commit into main from adrastogi/device-type-subgraph

Conversation

@adrastogi
Contributor

Description

#26781 added support for retrieving subgraph metadata for the assigned EPs in the session. There was a request via #27167 to add device type support, so this change attempts to implement that suggestion.

Motivation and Context

See #27167 for details.

(Will add more details to the description if I am able to move this out of draft status.)

* Returns the default device type of the execution provider that claimed this subgraph.
* This is the device type the EP registered with (e.g., CPU, GPU, or NPU).
*
* \note For execution providers that internally manage multiple device types (e.g., OpenVINO in
Contributor Author

This is the part of the design I am somewhat stuck on: for multi-device EPs (like OpenVINO), I don't think it is sufficient to pick the default registered type on the EP. But I couldn't figure out where the mapping happens between a subgraph and a device.

  1. ComputeCapability - has an IndexedSubGraph (node indices + optional MetaDef for fusion), but no device field.
  2. IndexedSubGraph - has nodes, MetaDef, and resource accounting, but no device field.
  3. OrtEpGraphSupportInfo (plugin EP API) - the newer plugin API lets EPs declare supported node groupings (single nodes or fused groups). Each NodeGrouping has a kind + nodes + fusion options. No device field.
  4. IExecutionProvider::GetCapability() - Returns a vector of ComputeCapability entries. The EP says "I can handle these subgraphs" but doesn't say "this subgraph goes to the GPU, that one to the NPU".

Contributor

@adrianlizarraga Mar 14, 2026

As you alluded, we could add an OrtHardwareDevice field to OrtNodeFusionOptions. A plugin EP can optionally set it to indicate the exact hardware device that executes the subgraph (for observability purposes). I don't think we can force EPs to set the device, since that would break existing plugin EPs.

If the plugin EP does not set the OrtHardwareDevice on the OrtNodeFusionOptions, then maybe we fall back to my idea in this comment: #27610 (comment). Perhaps we just take the first ep_device[0].hardware_device, which should be correct 99% of the time.

// Call custom function provided by the owner of GraphPartitioner whenever a subgraph is assigned to an EP.
// This can be used, for example, to collect partitioning information.
- on_partition_assignment_fn(graph, *capability, type);
+ on_partition_assignment_fn(graph, *capability, type, current_ep.GetDevice());
Contributor

@adrianlizarraga Mar 14, 2026

Unfortunately, I don't think using OrtDevice will work as we want. For example, QNN and the other NPU-based EPs use the default "cpu" OrtDevice to ensure that their inputs are placed on CPU.

I think we may be able to do the following:

  • Add a new virtual function IExecutionProvider::GetEpDevices() that returns an empty result by default.
  • Update the PluginExecutionProvider class to define a PluginExecutionProvider::GetEpDevices() implementation that returns all of its OrtEpDevices (which are already stored in a member field).
  • We pass the OrtEpDevices to on_partition_assignment_fn.
  • Add one (or more!) OrtHardwareDevices to OrtEpAssignedSubgraph.

Hopefully the above is somewhat right :)
