fix(ONNX): avoids resizing fixed dimensions #3945
base: main
Conversation
Force-pushed from bb3f80f to 6baa8d5
Force-pushed from 6baa8d5 to ab7e021
- "result" -> "outputTensor" - "type" -> more like "blueprint" since it includes shape and element data type
Force-pushed from ab7e021 to 7aec80b
I think the main structural question is about the need for adding the BaseTensorType method. If it were useful elsewhere (I have some doubts, since we would need to know too much about the two tensor shapes prior to using it, namely that they are present and that they have the same rank), I would consider keeping it; however, the code is simplified here by not using it, and I suspect the same would be true in other circumstances where it might be used.
auto this_dimensions = /**/ getSizes();
auto that_dimensions = that.getSizes();
BaseTensorType might not have sizes, and this will cause a crash when called. I would do:
- auto this_dimensions = /**/ getSizes();
- auto that_dimensions = that.getSizes();
+ auto selfSizes = getOptionalSizes();
+ auto otherSizes = other.getOptionalSizes();
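To make this concrete, here is a minimal sketch of the guarded comparison, assuming getOptionalSizes() returns a std::optional<ArrayRef<int64_t>> that is populated only when the type has sizes. The method name elementwiseShapeComparison is hypothetical, since the hunk below only shows the return type:

// Hypothetical sketch, not the PR's code. Assumes
// getOptionalSizes() -> std::optional<ArrayRef<int64_t>>.
std::vector<std::optional<bool>>
BaseTensorType::elementwiseShapeComparison(BaseTensorType other) const {
  auto selfSizes = getOptionalSizes();
  auto otherSizes = other.getOptionalSizes();
  // Bail out instead of crashing when either shape is absent or ranks differ.
  if (!selfSizes || !otherSizes || selfSizes->size() != otherSizes->size())
    return {};
  std::vector<std::optional<bool>> result;
  for (auto [selfDim, otherDim] : llvm::zip(*selfSizes, *otherSizes)) {
    if (selfDim == kUnknownSize || otherDim == kUnknownSize)
      result.push_back(std::nullopt); // dynamic dim: unknown at compile time
    else
      result.push_back(selfDim == otherDim); // both static: compare directly
  }
  return result;
}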
Note also the variable naming and camel-casing conventions. The variables self and other are used more typically than this and that in this codebase (easier to distinguish).
@@ -84,6 +84,10 @@ class BaseTensorType : public Type {
   /// Enable isa/dyn_cast for BaseTensorType.
   static bool classof(Type type);

+  /// The element-wise comparison of each dimension/size in `that` tensor
+  std::vector<std::optional<bool>>
Use SmallVector instead of std::vector. The methods are the same, and it is better for small containers like this.
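For illustration, the swap is just at the declaration; llvm::SmallVector picks a sensible inline capacity when the element-count parameter is omitted:

#include "llvm/ADT/SmallVector.h"

// std::vector always heap-allocates; SmallVector stores a few elements
// inline, which suits a result holding one entry per tensor dimension.
llvm::SmallVector<std::optional<bool>> comparison;
comparison.push_back(true);         // same push_back/iteration interface
comparison.push_back(std::nullopt); // as std::vector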
@@ -2686,12 +2686,11 @@ void mlir::torch::onnx_c::populateDefaultDomainQtoZ(
       });
   patterns.onOp(
       "Resize", 11, [](OpBinder binder, ConversionPatternRewriter &rewriter) {
-        Torch::ValueTensorType resultType;
+        Torch::ValueTensorType outputTensor_blueprint;
I don't understand the renaming of this variable. This isn't a blueprint; it's the result type.
return rewriter.notifyMatchFailure(
    binder.op, "Sizes for batch and channel dimensions must be "
               "statically defined");
}
We definitely do not want to constrain this conversion to static batch and channel dims. This was the reason for needing to write asserts into the helper function getValueList in the dynamic case.
It might be fine to just put runtime asserts in right after the last match failure. Something like:
Value inputDimZero = rewriter.create<Torch::AtenSizeIntOp>(loc, input, cstZero);
Value inputDimOne = rewriter.create<Torch::AtenSizeIntOp>(loc, input, cstOne);
Value outputDimZero = rewriter.create<Torch::AtenSizeIntOp>(loc, output, cstZero);
Value outputDimOne = rewriter.create<Torch::AtenSizeIntOp>(loc, output, cstOne);
Value cmpDimZero = rewriter.create<Torch::AtenEqIntOp>(loc, inputDimZero, outputDimZero);
Value cmpDimOne = ...
rewriter.create<Torch::RuntimeAssertOp>(loc, cmpDimZero, rewriter.getStringAttr("message"));
// same for DimOne
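Filled out, that outline might read as follows — a hedged expansion, assuming loc, input, output, and the constant-index values cstZero/cstOne are already in scope; the assert messages are placeholders:

// Hedged expansion of the outline above; not the PR's code.
Value inputDimZero = rewriter.create<Torch::AtenSizeIntOp>(loc, input, cstZero);
Value outputDimZero = rewriter.create<Torch::AtenSizeIntOp>(loc, output, cstZero);
Value cmpDimZero = rewriter.create<Torch::AtenEqIntOp>(loc, inputDimZero, outputDimZero);
rewriter.create<Torch::RuntimeAssertOp>(
    loc, cmpDimZero,
    rewriter.getStringAttr("Resize: batch dimension must not change"));

Value inputDimOne = rewriter.create<Torch::AtenSizeIntOp>(loc, input, cstOne);
Value outputDimOne = rewriter.create<Torch::AtenSizeIntOp>(loc, output, cstOne);
Value cmpDimOne = rewriter.create<Torch::AtenEqIntOp>(loc, inputDimOne, outputDimOne);
rewriter.create<Torch::RuntimeAssertOp>(
    loc, cmpDimOne,
    rewriter.getStringAttr("Resize: channel dimension must not change"));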
By the way, if one of the two dims has input/output sizes that are static and equal, then these asserts will fold out, so there isn't a pressing need to check again for static dims.
for (auto eachDimensionComparison : shapeComparisonForFixedDimensions) {
  if (eachDimensionComparison == std::nullopt) {
Since you need to loop over the result of the shape comparison anyway, it would be more efficient to not define the helper function at all, and do
for (int64_t dim = 0; dim < 2; dim++) {
  if (inputSizes[dim] == Torch::kUnknownSize ||
      outputSizes[dim] == Torch::kUnknownSize)
    continue; // you need to implement the runtime asserts, but at least still check the other dim if static.
  if (inputSizes[dim] != outputSizes[dim])
    return rewriter.notifyMatchFailure(...
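Pieced together, the inline check might look like this sketch, assuming inputSizes and outputSizes are ArrayRef<int64_t> of rank at least 2, with the dynamic case covered by runtime asserts like the ones above (the failure message is illustrative):

// Sketch of the inlined batch/channel check; not the PR's code.
for (int64_t dim = 0; dim < 2; dim++) {
  // Dynamic dims cannot be checked statically; defer to runtime asserts.
  if (inputSizes[dim] == Torch::kUnknownSize ||
      outputSizes[dim] == Torch::kUnknownSize)
    continue;
  // Both sizes are static: a mismatch means the op resizes a fixed dim.
  if (inputSizes[dim] != outputSizes[dim])
    return rewriter.notifyMatchFailure(
        binder.op, "Resize must not change batch or channel dimensions");
}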
Definitely more machine-efficient! That's exactly what I had played around with at first.
Some general first-principles questions:
- At this level of abstraction, are we still able to optimize for minimal cognitive load at the cost of machine efficiency?
- Or is this already at the level where we have to optimize for machine runtime, even if it means more cognitive load for the dev?
I was actually suggesting this for both readability and (very modestly) compiler performance. The runtime performance won't be affected either way.
In general, it is good to have runtime performance in mind at this level, but know that many things do indeed get optimized out later on (see, for example, my comment about the folding of runtime asserts for the static dim case). I tend to take a pessimistic view of what will and won't be optimized away, at least when I don't actually know, and will try to generate a cleaner pattern if it doesn't cost a huge amount in code complexity for the perceived benefit.