Add QDQFloatActivationsTransformer to remove activation Q→DQ pairs and enable MatMulNBits fusion #27636
Conversation
Pull request overview
Adds a new “float activations” mode for QDQ models by introducing a dedicated Level2 transformer and a new session option to control behavior. This fits into the existing QDQ optimization pipeline by (a) preserving Q/DQ wrappers around data-movement ops and (b) removing remaining activation Q->DQ pairs after compute-op QDQ fusions.
Changes:
- Add `QDQFloatActivationsTransformer` to remove eligible activation Q->DQ pairs and re-attempt MatMulNBits fusion after activation dequant removal.
- Extend `QDQSelectorActionTransformer` with an option to skip data-movement QDQ rules when float-activations mode is enabled.
- Introduce the `session.qdq_float_activations` config key and add unit tests covering key scenarios (simple removal, chained pairs, graph outputs, Conv fusion interplay, the MatMulNBits enabling case).
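Conceptually, removing an activation Q→DQ pair just rewires the DQ's consumers back to the Q node's float input. The sketch below illustrates this on a toy node-list graph; it is a hypothetical helper for illustration only, not ONNX Runtime's actual `Graph` API, and it skips the eligibility checks (graph outputs, multiple consumers) the real transformer performs.

```python
def remove_qdq_pairs(nodes):
    """Toy sketch: drop QuantizeLinear->DequantizeLinear pairs and reconnect
    the DQ's consumers to the Q node's original float input. Real eligibility
    checks (graph outputs, shared Q outputs, etc.) are omitted."""
    by_output = {out: n for n in nodes for out in n["outputs"]}
    removed = set()
    for node in nodes:
        if node["op"] != "DequantizeLinear":
            continue
        producer = by_output.get(node["inputs"][0])
        if producer is None or producer["op"] != "QuantizeLinear":
            continue
        float_input = producer["inputs"][0]  # Q's original float tensor
        for consumer in nodes:  # rewire anything reading the DQ output
            consumer["inputs"] = [float_input if t in node["outputs"] else t
                                  for t in consumer["inputs"]]
        removed.add(id(producer))
        removed.add(id(node))
    return [n for n in nodes if id(n) not in removed]

nodes = [
    {"op": "QuantizeLinear", "inputs": ["x"], "outputs": ["x_q"]},
    {"op": "DequantizeLinear", "inputs": ["x_q"], "outputs": ["x_dq"]},
    {"op": "Relu", "inputs": ["x_dq"], "outputs": ["y"]},
]
print(remove_qdq_pairs(nodes))  # Relu now reads "x" directly in float
```

After the pass, the unfused Relu consumes the float tensor directly, which is the "float activations" behavior the PR describes.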
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| onnxruntime/test/optimizer/qdq_float_activations_transformer_test.cc | New unit tests for float-activations mode behavior and interactions with existing fusions. |
| onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selector_action_transformer.h | Add constructor parameter to optionally skip data-movement QDQ rules. |
| onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selector_action_transformer.cc | Gate Split/DropQDQNodes rules behind the new “skip data movement” flag. |
| onnxruntime/core/optimizer/qdq_transformer/qdq_float_activations_transformer.h | New transformer interface and high-level behavior documentation. |
| onnxruntime/core/optimizer/qdq_transformer/qdq_float_activations_transformer.cc | Implementation of activation Q->DQ removal and post-removal MatMulNBits fusion attempt. |
| onnxruntime/core/optimizer/graph_transformer_utils.cc | Wire new session option into Level2 pipeline; pass through to QDQSelectorActionTransformer; add new transformer when enabled. |
| include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h | Add public session option key and explanatory comment for float-activations mode. |
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Description

Adds a new Level 2 graph transformer (`QDQFloatActivationsTransformer`) that removes activation Q→DQ pairs from fully quantized (QDQ) models, allowing unfused ops to run in float precision. This is gated behind the `session.qdq_float_activations` session option.

Motivation
In fully QDQ models, after the Level 1 `QDQSelectorActionTransformer` fuses compute ops (e.g., Conv→QLinearConv), leftover activation Q→DQ pairs remain around ops that don't have QDQ fusions. These pairs add unnecessary quantize/dequantize overhead. Additionally, the `DQMatMulToMatMulNBits` fusion at Level 1 requires exactly one DQ input to the MatMul; when an activation Q→DQ pair is present, the MatMul sees two DQ inputs and the fusion is rejected.

Changes
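The rejection condition can be pictured as a simple input count, as in this toy check (hypothetical helper name, not the actual ORT selector code):

```python
def matmul_nbits_fusion_eligible(matmul_inputs, producers):
    """Toy sketch of the DQMatMulToMatMulNBits-style precondition: exactly one
    MatMul input (the weight) may come from a DequantizeLinear node. An
    activation Q->DQ pair adds a second DQ input and blocks the fusion."""
    dq_count = sum(1 for t in matmul_inputs
                   if producers.get(t) == "DequantizeLinear")
    return dq_count == 1

# Before float-activations mode: activation arrives through its own DQ.
both_dq = {"act_dq": "DequantizeLinear", "w_dq": "DequantizeLinear"}
print(matmul_nbits_fusion_eligible(["act_dq", "w_dq"], both_dq))  # False

# After removing the activation Q->DQ pair: only the weight DQ remains.
print(matmul_nbits_fusion_eligible(["act", "w_dq"],
                                   {"w_dq": "DequantizeLinear"}))  # True
```

This is why the transformer re-attempts the MatMulNBits fusions after removing activation Q→DQ pairs: patterns that previously had two DQ inputs become eligible.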
- New transformer (`qdq_float_activations_transformer.h/.cc`): removes eligible activation Q→DQ pairs, then re-attempts the `DQMatMulToMatMulNBits` and `DQCastMatMulToMatMulNBits` fusions on newly eligible patterns.
- Session option (`session.qdq_float_activations`): when set to `"1"`, enables the transformer and also skips `DropQDQNodesRules`/`SplitQDQRules` in `QDQSelectorActionTransformer` so data-movement ops keep their adjacent Q/DQ wrappers.
- Pipeline integration (`graph_transformer_utils.cc`): registered at Level 2, ordered after `MatMulNBitsFusion` and before `QDQFinalCleanupTransformer`.
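If the option ships as described, enabling it from Python would follow the standard session-config pattern. This is a configuration sketch assuming an ONNX Runtime build that includes this PR; the model filename is a placeholder.

```python
import onnxruntime as ort

# Enable float-activations mode for QDQ models.
# Requires an ONNX Runtime build containing this PR's session option.
so = ort.SessionOptions()
so.add_session_config_entry("session.qdq_float_activations", "1")

# sess = ort.InferenceSession("model_qdq.onnx", sess_options=so)
```

Unrecognized config keys are ignored by older builds, so verify the transformer actually ran (e.g., via optimizer logging or by inspecting the optimized graph) rather than assuming the option took effect.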