From bc84398c513d2dc515cc3aeb29fcb5789bff7c3c Mon Sep 17 00:00:00 2001
From: Sandeep Dasgupta
Date: Tue, 4 Apr 2023 18:22:21 +0000
Subject: [PATCH] Spec: Uniform Quantize/DeQuantize

---
 docs/spec.md | 88 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 5 deletions(-)

diff --git a/docs/spec.md b/docs/spec.md
index 2d98b018fcb..ec5bd58e9be 100644
--- a/docs/spec.md
+++ b/docs/spec.md
@@ -319,11 +319,6 @@ in StableHLO programs. In the meanwhile, here is the list of these operations:
   `dynamic_gather`, `dynamic_iota`, `dynamic_pad`, `dynamic_reshape`,
   `real_dynamic_slice`, `set_dimension_size`
   ([#8](https://github.com/openxla/stablehlo/issues/8)).
-* "Quantization" category of StableHLO operations - they were bootstrapped from
-  MHLO, but we haven't specced them yet: `uniform_quantize`
-  ([#531](https://github.com/openxla/stablehlo/issues/531)) and
-  `uniform_dequantize`
-  ([#530](https://github.com/openxla/stablehlo/issues/530)).
 * Shape computations, including `arith`, `shape` and `tensor` operations
   ([#8](https://github.com/openxla/stablehlo/issues/8)).
 
@@ -5570,6 +5565,89 @@ Produces a `result` tuple from values `val`.
 // %result: ([1.0, 2.0], (3))
 ```
 
+### uniform_dequantize
+
+#### Semantics
+
+Performs element-wise conversion of uniform quantized tensor `operand` to a
+floating point tensor `result` according to the quantization parameters defined
+by the `operand` type.
+
+Formally, `result = (operand - zero_point(operand)) * scale(operand)`.
+
+#### Inputs
+
+| Label | Name      | Type             | Constraints |
+|-------|-----------|------------------|-------------|
+| (I1)  | `operand` | quantized tensor | (C1), (C2)  |
+
+#### Outputs
+
+| Name     | Type                          | Constraints |
+|----------|-------------------------------|-------------|
+| `result` | tensor of floating-point type | (C1), (C2)  |
+
+#### Constraints
+
+* (C1) `expressed_type(operand) = element_type(result)`.
+* (C2) `shape(operand) = shape(result)`.
+
+#### Examples
+
+```mlir
+// %operand: 20
+%result = "stablehlo.uniform_dequantize"(%operand) : (tensor:f32, 0.5:-20>>) -> tensor
+// %result: 20.0
+```
+
+### uniform_quantize
+
+#### Semantics
+
+Performs element-wise conversion of floating-point tensor or uniform quantized
+tensor `operand` to a uniform quantized tensor `result` according to the
+quantization parameters defined by the `result` type.
+
+Formally,
+
+* For `element_type(operand)` a floating-point type,
+  * `rounded_result = round_nearest_even(operand / scale(result))`.
+  * `result = clamp(storage_min(result), rounded_result + zero_point(result), storage_max(result))`.
+
+* For `element_type(operand)` a quantized type,
+  * `float_result = (operand - zero_point(operand)) * scale(operand)`.
+  * `rounded_result = round_nearest_even(float_result / scale(result))`.
+  * `result = clamp(storage_min(result), rounded_result + zero_point(result), storage_max(result))`.
+
+#### Inputs
+
+| Label | Name      | Type                                        | Constraints      |
+|-------|-----------|---------------------------------------------|------------------|
+| (I1)  | `operand` | tensor of floating-point or quantized  type | (C1), (C2), (C3) |
+
+#### Outputs
+
+| Name     | Type                     | Constraints      |
+|----------|--------------------------|------------------|
+| `result` | quantized tensor         | (C1), (C2), (C3) |
+
+#### Constraints
+
+* (C1) If `element_type(operand)` is a floating-point type,
+  * `element_type(operand) = expressed_type(result)`.
+* (C2) If `element_type(operand)` is a quantized type,
+  * `num_bits(storage_type(operand)) >= num_bits(storage_type(result))`.
+  * `expressed_type(operand) = expressed_type(result)`.
+* (C3) `shape(operand) = shape(result)`.
+
+#### Examples
+
+```mlir
+// %operand: 20.0
+%result = "stablehlo.uniform_quantize"(%operand) : (tensor) -> tensor:f32, 0.5:-20>>
+// %result: 20
+```
+
 ### while
 
 #### Semantics
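
The `uniform_dequantize` and `uniform_quantize` formulas that this patch adds to the spec can be sketched in plain Python. This is a minimal illustration under stated assumptions, not part of the patch and not the StableHLO implementation: `QuantParams`, `requantize`, and the list-based tensors are hypothetical names and representations, and the `[-128, 127]` storage range used in the usage note is an assumed signed 8-bit range, since the storage bounds are not visible in the spec examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantParams:
    """Hypothetical stand-in for the parameters of a uniform quantized type."""
    scale: float
    zero_point: int
    storage_min: int
    storage_max: int


def uniform_dequantize(operand, params):
    # result = (operand - zero_point(operand)) * scale(operand)
    return [(q - params.zero_point) * params.scale for q in operand]


def uniform_quantize(operand, result_params):
    result = []
    for x in operand:
        # Python's built-in round() rounds halfway cases to the nearest even
        # integer, which plays the role of round_nearest_even here.
        rounded = round(x / result_params.scale)
        shifted = rounded + result_params.zero_point
        # clamp(storage_min(result), shifted, storage_max(result))
        clamped = max(result_params.storage_min,
                      min(shifted, result_params.storage_max))
        result.append(clamped)
    return result


def requantize(operand, operand_params, result_params):
    # Quantized-to-quantized case: dequantize with the operand's parameters,
    # then quantize with the result's parameters.
    return uniform_quantize(uniform_dequantize(operand, operand_params),
                            result_params)
```

With `params = QuantParams(scale=0.5, zero_point=-20, storage_min=-128, storage_max=127)`, `uniform_dequantize([20], params)` gives `[20.0]` and `uniform_quantize([20.0], params)` gives `[20]`, matching the values in the two spec examples. Inputs far outside the representable range saturate at `storage_min` or `storage_max` rather than wrapping.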