fix qlinear nan bug #4914

Draft
aarushjain29 wants to merge 7 commits into develop from fix-qlinear-nan-to-min

@aarushjain29 aarushjain29 commented May 27, 2026

Motivation

ONNX Runtime's CPUExecutionProvider and MIGraphX disagreed on QuantizeLinear output when the input is NaN:

input: [nan]
scale: 0.5
onnxruntime out: [-128]
migraphx out: [ 127] ← MIGraphX

Technical Details

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

Copilot AI review requested due to automatic review settings May 27, 2026 15:05
@aarushjain29 aarushjain29 requested a review from causten as a code owner May 27, 2026 15:05

Copilot AI left a comment

Pull request overview

This PR adjusts MIGraphX’s quantizelinear operator to handle NaN inputs deterministically during quantization, aligning behavior with common ONNX runtime expectations (notably ONNX Runtime CPU EP).

Changes:

  • Add explicit NaN handling for integral quantization outputs: NaN inputs now saturate to the quantized type’s minimum value.
  • Prevent NaN from flowing through the clamp logic (std::min/std::max) where it would otherwise silently select the max bound.
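
The clamp behaviour described in the second bullet can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: the helper names `quantize_clamp_only` and `quantize_nan_to_min` are hypothetical, the `int8` output type is assumed from the values in the PR description, and the clamp order `std::max(lo, std::min(hi, q))` is an assumption about how the operator clamps.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: round, add the zero point, then clamp with std::min/std::max.
// std::min(hi, q) evaluates (q < hi) ? q : hi. Every comparison with NaN is false,
// so a NaN q makes std::min return hi, and the result saturates to the maximum.
inline std::int8_t quantize_clamp_only(float x, float scale, std::int8_t zero_point)
{
    const double lo = std::numeric_limits<std::int8_t>::min();
    const double hi = std::numeric_limits<std::int8_t>::max();
    const double q  = static_cast<double>(std::nearbyint(x / scale)) + zero_point;
    return static_cast<std::int8_t>(std::max(lo, std::min(hi, q)));
}

// Hypothetical helper: check for NaN first and saturate to the minimum value,
// which is the behaviour the overview attributes to this PR.
inline std::int8_t quantize_nan_to_min(float x, float scale, std::int8_t zero_point)
{
    if(std::isnan(x))
        return std::numeric_limits<std::int8_t>::min();
    return quantize_clamp_only(x, scale, zero_point);
}

// Usage: the same NaN input gives opposite bounds depending on the guard.
inline void check_nan_handling()
{
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(quantize_clamp_only(nan, 0.5f, 0) == 127);  // clamp alone picks the max bound
    assert(quantize_nan_to_min(nan, 0.5f, 0) == -128); // explicit guard picks the min bound
    assert(quantize_nan_to_min(1.0f, 0.5f, 0) == 2);   // finite inputs are unaffected
}
```

Note that the `std::isnan` check is only reliable when the code is not built with flags such as `-ffast-math`, which allow the compiler to assume NaN never occurs.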

Comment thread src/include/migraphx/op/quantizelinear.hpp Outdated
Comment thread src/include/migraphx/op/quantizelinear.hpp Outdated
@aarushjain29 aarushjain29 changed the title from "fix qlinear nan" to "fix qlinear nan bug" May 27, 2026
aarushjain29 and others added 2 commits May 27, 2026 13:57
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4914   +/-   ##
========================================
  Coverage    92.66%   92.66%           
========================================
  Files          588      588           
  Lines        30412    30415    +3     
========================================
+ Hits         28180    28183    +3     
  Misses        2232     2232           
Files with missing lines Coverage Δ
src/include/migraphx/op/quantizelinear.hpp 97.56% <100.00%> (+0.19%) ⬆️

@aarushjain29 aarushjain29 marked this pull request as draft May 27, 2026 19:45
@gh-app-migraphx-bot-pr-write

| Test | Batch | New Rate (4e97ef) | Old Rate (283f84) | Diff | Status |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,115.57 | 3,154.40 | -1.23% | |
| torchvision-resnet50_fp16 | 64 | 3,632.26 | 6,643.79 | -45.33% | 🔴 |
| torchvision-densenet121 | 32 | 2,135.17 | 2,697.51 | -20.85% | 🔴 |
| torchvision-densenet121_fp16 | 32 | 2,018.27 | 4,525.49 | -55.40% | 🔴 |
| torchvision-inceptionv3 | 32 | 1,658.46 | 1,795.56 | -7.64% | 🔴 |
| torchvision-inceptionv3_fp16 | 32 | 2,829.85 | 2,809.78 | 0.71% | |
| cadene-inceptionv4 | 16 | 793.97 | 824.23 | -3.67% | |
| cadene-resnext64x4 | 16 | 342.22 | 783.73 | -56.33% | 🔴 |
| slim-mobilenet | 64 | 6,457.25 | 8,391.24 | -23.05% | 🔴 |
| slim-nasnetalarge | 64 | 207.46 | 228.48 | -9.20% | 🔴 |
| slim-resnet50v2 | 64 | 3,246.33 | 3,315.41 | -2.08% | |
| bert-mrpc-onnx | 8 | 1,128.78 | 1,172.27 | -3.71% | |
| bert-mrpc-tf | 1 | 485.95 | 489.81 | -0.79% | |
| pytorch-examples-wlang-gru | 1 | 385.60 | 410.78 | -6.13% | 🔴 |
| pytorch-examples-wlang-lstm | 1 | 571.01 | 495.68 | 15.20% | 🔆 |
| torchvision-resnet50_1 | 1 | 730.08 | 786.81 | -7.21% | 🔴 |
| cadene-dpn92_1 | 1 | 464.07 | 443.27 | 4.69% | |
| cadene-resnext101_1 | 1 | 364.08 | 363.99 | 0.02% | |
| onnx-taau-downsample | 1 | 403.48 | 399.95 | 0.88% | |
| dlrm-criteoterabyte | 1 | 32.17 | 32.43 | -0.81% | |
| dlrm-criteoterabyte_fp16 | 1 | 31.31 | 51.89 | -39.66% | 🔴 |
| agentmodel | 1 | 3,815.76 | 8,911.54 | -57.18% | 🔴 |
| unet_fp16 | 2 | 49.25 | 56.90 | -13.44% | 🔴 |
| resnet50v1_fp16 | 1 | 210.63 | 945.91 | -77.73% | 🔴 |
| resnet50v1_int8 | 1 | 934.91 | 924.22 | 1.16% | |
| bert_base_cased_fp16 | 64 | 788.74 | 1,098.59 | -28.20% | 🔴 |
| bert_large_uncased_fp16 | 32 | 347.80 | 346.40 | 0.40% | |
| bert_large_fp16 | 1 | 204.86 | 204.56 | 0.14% | |
| distilgpt2_fp16 | 16 | 2,097.84 | 2,096.23 | 0.08% | |
| yolov5s | 1 | 571.01 | 564.81 | 1.10% | |
| tinyllama | 1 | 46.15 | 45.97 | 0.39% | |
| vicuna-fastchat | 1 | 44.00 | 44.04 | -0.10% | |
| whisper-tiny-encoder | 1 | 420.56 | 417.87 | 0.64% | |
| whisper-tiny-decoder | 1 | 176.43 | 413.65 | -57.35% | 🔴 |
| llama2_7b | 1 | 20.50 | 20.33 | 0.81% | |
| qwen1.5-7b | 1 | 23.73 | 23.60 | 0.54% | |
| phi3-3.8b | 1 | 16.18 | 26.73 | -39.48% | 🔴 |
| llama3-8b | 1 | 9.80 | 21.73 | -54.93% | 🔴 |
| whisper-large-encoder | 1 | 9.92 | 10.28 | -3.58% | |
| whisper-large-decoder | 1 | 106.73 | 105.60 | 1.07% | |
| mistral-7b | 1 | 23.93 | 23.77 | 0.66% | |
| FLUX.1-schnell | 1 | 753.89 | 764.64 | -1.41% | |

Regressions detected 🔴

@gh-app-migraphx-bot-pr-write

Test Status Result
bert-mrpc-onnx PASSED: MIGraphX meets tolerance
bert-mrpc-tf ERROR - check error output
traceback
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 377, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 313, in main
    import tensorflow as tf
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/__init__.py", line 38, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/__init__.py", line 36, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 26, in <module>
    self_check.preload_check()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/self_check.py", line 63, in preload_check
    from tensorflow.python.platform import _pywrap_cpu_feature_guard
ImportError: libamdhip64.so.6: cannot open shared object file: No such file or directory
pytorch-examples-wlang-gru PASSED: MIGraphX meets tolerance
pytorch-examples-wlang-lstm PASSED: MIGraphX meets tolerance
dlrm-criteoterabyte PASSED: MIGraphX meets tolerance
agentmodel PASSED: MIGraphX meets tolerance
unet PASSED: MIGraphX meets tolerance
resnet50v1 PASSED: MIGraphX meets tolerance
bert_base_cased_fp16 PASSED: MIGraphX meets tolerance
bert_large_uncased_fp16 🔴 FAILED: MIGraphX is not within tolerance - check verbose output
bert_large PASSED: MIGraphX meets tolerance
yolov5s PASSED: MIGraphX meets tolerance
tinyllama PASSED: MIGraphX meets tolerance
vicuna-fastchat PASSED: MIGraphX meets tolerance
whisper-tiny-encoder PASSED: MIGraphX meets tolerance
whisper-tiny-decoder PASSED: MIGraphX meets tolerance
distilgpt2_fp16 PASSED: MIGraphX meets tolerance
llama2_7b PASSED: MIGraphX meets tolerance
qwen1.5-7b PASSED: MIGraphX meets tolerance
phi3-3.8b PASSED: MIGraphX meets tolerance
llama3-8b PASSED: MIGraphX meets tolerance
whisper-large-encoder ERROR - check error output
traceback
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 377, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 224, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /data/src/include/migraphx/op/convolution.hpp:102: normalize_compute_shape: CONVOLUTION: mismatched channel numbers
whisper-large-decoder PASSED: MIGraphX meets tolerance
mistral-7b PASSED: MIGraphX meets tolerance
FLUX.1-schnell PASSED: MIGraphX meets tolerance

output[i] = min_value;
return;
}
auto rounding_mode = fegetround();
Collaborator

Put this and the next line outside of the visit_all, since this does an assignment for each element and is not needed.
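
One possible reading of this suggestion, as a sketch only: save and set the rounding mode once around the whole loop and restore it once afterwards, instead of calling `fegetround`/`fesetround` for every element. The names `rounding_mode_guard` and `round_all` are hypothetical and are not MIGraphX code; only `std::fegetround`, `std::fesetround`, and `std::nearbyint` are standard library calls.

```cpp
#include <cfenv>
#include <cmath>
#include <vector>

// Hypothetical RAII guard: saves the current rounding mode, switches to the
// requested one, and restores the saved mode when it goes out of scope.
struct rounding_mode_guard
{
    int saved;
    explicit rounding_mode_guard(int mode) : saved(std::fegetround())
    {
        std::fesetround(mode);
    }
    ~rounding_mode_guard() { std::fesetround(saved); }
    rounding_mode_guard(const rounding_mode_guard&)            = delete;
    rounding_mode_guard& operator=(const rounding_mode_guard&) = delete;
};

// Hypothetical loop: the rounding mode is set once for all elements rather
// than once per element.
inline std::vector<double> round_all(const std::vector<double>& xs)
{
    rounding_mode_guard guard(FE_TONEAREST);
    std::vector<double> out;
    out.reserve(xs.size());
    for(double x : xs)
        out.push_back(std::nearbyint(x)); // ties round to even under FE_TONEAREST
    return out;
}
```

With this shape the caller's rounding mode is restored even if the loop exits early, and the per-element work is reduced to the `std::nearbyint` call itself.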

{
output[i] = min_value;
return;
}
Collaborator

If this is consistent with the GPU kernels, I think you just need to add this if block and that's it.

}
auto rounding_mode = fegetround();
fesetround(FE_TONEAREST);
auto rounded = std::nearbyint(input[i] / scales[i]);
Collaborator

Remove this, and just do the nearbyint inside the static_cast for quantized below.

auto rounding_mode = fegetround();
fesetround(FE_TONEAREST);
auto rounded = std::nearbyint(input[i] / scales[i]);
fesetround(rounding_mode);
Collaborator

Why are we changing the rounding mode again? Is this because we're calling fesetround() above per item?

4 participants