
Add outputDataType to argmin/argmax #730

Merged 3 commits into webmachinelearning:main on Jul 24, 2024
Conversation

@philloooo (Contributor) commented Jul 18, 2024

Fixes #653

Add outputDataType to MLArgMinMaxOptions, defaulting to int32. Since opSupportLimits has not been added to the spec yet, no validation steps are performed on outputDataType for now.

@fdwr @huningxin


@fdwr (Collaborator) left a comment

👍 Thanks Phillis. I have a thought on what the default should be, but otherwise LGTM.

@huningxin (Contributor) left a comment

> Returns: an {{MLOperand}}. The N-D tensor of the reduced shape. The values must be of type {{MLOperandDataType/"int64"}} in the range [0, N-1] where N is the size of the input dimension specified by axis.

This prose needs to be updated as well. What would be the range for an int32 output value?

@philloooo (Author) commented Jul 19, 2024

@huningxin I think it would still be the [0, N-1] range? Dimensions are of type uint32_t, so N could exceed the int32_t range, but I don't think we have tensors that big in practice?

This also exposes another issue: both the CoreML and TFLite backends actually use int32_t for tensor dimensions, so a tensor that needs that extra bit from uint32_t won't work on those backends. Should we actually consider using int32_t for dimensions?

@huningxin (Contributor) replied:

> i think it would still be [0, N-1] range? The dimensions are of type uint32_t so it could exceed int32_t range,

Yes, then would it be the [0, min(N-1, MAX_INT)] range? And should we add a validation step to ensure the size of the dimension being reduced can be indexed by that range?

> but I don't think in reality we have tensors that big?

I haven't seen any tensor of a model with such a big dimension.

> this also exposes another issue - both CoreML and tflite backends actually use int32_t for tensor dimensions so if we have tensor that needs to use that extra bit from uint32_t they won't work on these backends

Yes, AFAIK, the current Chromium TFLite prototype reports an error when a dimension size is too large for int32: see ToSignedDimensions().

> Should we actually consider use int32_t for dimensions?

FYI, #279 replaced the int32_t dimensions with uint32_t because there are no use cases for negative values.

We may want to reconsider that, given uint32_t dimensions are not widely supported by the currently targeted backends, CoreML / TFLite.
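The backend constraint can be sketched as follows. This is a hypothetical JavaScript model of the kind of check a conversion like ToSignedDimensions() performs, not the actual Chromium code: WebNN dimensions are uint32, but TFLite and CoreML store dimensions as int32, so any dimension above INT32_MAX must be rejected.

```javascript
// Hypothetical sketch: reject WebNN (uint32) dimensions that cannot be
// represented as int32, as int32-based backends must.
const INT32_MAX = 2 ** 31 - 1;

function toSignedDimensions(dims) {
  return dims.map((d) => {
    if (!Number.isInteger(d) || d < 0) {
      throw new RangeError(`invalid dimension: ${d}`);
    }
    if (d > INT32_MAX) {
      throw new RangeError(`dimension ${d} exceeds int32 range`);
    }
    return d; // now known to be representable as int32
  });
}
```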

@a-sully (Contributor) commented Jul 22, 2024

> I haven't seen any tensor of a model has such big dimension.

While we closed issue #456, this table is still relevant: #456 (comment) (and the max for Core ML is 5D). The dimensions will always be non-negative and within MAX_INT.

@a-sully (Contributor) commented Jul 22, 2024

> While we closed issue #456, this table is still relevant: #456 (comment) (and the max for Core ML is 5D). The dimensions will always be non-negative and within MAX_INT

Apologies, that table refers to the max rank, while we're talking about the max dimension size here.

> Yes, then would it be [0, min(N-1, MAX_INT)] range? And should we add a validation step to ensure the size of the dimension being reduced can be indexed by that range?

+1, though we should consider [0, MAX_INT] (or even something less than MAX_INT). In that case, we could make WebNN's argmin/argmax return just int32 or uint32, and backends which natively return other types could losslessly cast the result to int32 (a cast which might be optimized away in many cases).

fdwr pushed a commit to microsoft/onnxruntime that referenced this pull request on Jul 22, 2024:

### Description
The WebNN spec introduces a new option, `outputDataType`, for the `argMax` and `argMin` ops. Its default value is `int32`, so we should explicitly set it to `int64` for the WebNN EP.

Spec CR: "Add outputDataType to argmin/argmax"
webmachinelearning/webnn#730
@philloooo (Author) commented:

@huningxin I've created a separate issue, #734, for int32 vs uint32.
For now, I've added the validation step you suggested, which we can remove once #734 is resolved. Given that validation step, we can keep the return value range as [0, N-1].

@huningxin (Contributor) left a comment

LGTM!

@fdwr (Collaborator) left a comment

👍

@fdwr merged commit 4a2b8ca into webmachinelearning:main on Jul 24, 2024. 2 checks passed.
github-actions bot added a commit that referenced this pull request Jul 24, 2024
SHA: 4a2b8ca
Reason: push, by fdwr

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Merging this pull request closed: "Consider changing output type of ArgMax/Argmin to int32, or allow passing output_type" (#653)