Bug fix: Link operator names to the MLGraphBuilder methods #547

Merged 4 commits on Feb 15, 2024
docs/SpecCodingConventions.md: 2 additions & 0 deletions
@@ -70,6 +70,8 @@ Example:
1. If |shape| is a [=circle=], draw it at |shape|'s [=circle/origin=].
```

+* When referencing an operator in text (e.g. sigmoid, tanh, etc.), link the operator name to the `MLGraphBuilder` methods for creating the corresponding `MLOperand` or `MLActivation`, e.g. `{{MLGraphBuilder/sigmoid()}}`. This provides consistent styling and gives readers a thorough overview of the operator, even if the method itself isn't being discussed.
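For instance, a hypothetical spec sentence following this convention (illustrative, not taken from the spec source) would be written as:

```
Computes the element-wise {{MLGraphBuilder/sigmoid()}} of the input tensor.
```

so that the operator name renders as a link to the builder method.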


### Formatting

index.bs: 9 additions & 9 deletions
@@ -912,12 +912,12 @@ interface MLActivation {
</div>

<div class="note">
-These activations function types are used to create other operations. One such use of this interface is for when an activation function is fused into another operation such as [[#api-mlgraphbuilder-conv2d]] or [[#api-mlgraphbuilder-batchnorm]] during a graph construction session. Such fused activation functions can provide a significant performance improvement when supported natively by the underlying implementation. This is intended as an optimization opportunity for implementers.
+These activation function types are used to create other operations. One such use of this interface is when an activation function is fused into another operation such as {{MLGraphBuilder/conv2d()}} or {{MLGraphBuilder/batchNormalization()}} during a graph construction session. Such fused activation functions can provide a significant performance improvement when supported natively by the underlying implementation. This is intended as an optimization opportunity for implementers.
</div>

### Creating {{MLActivation}} ### {#api-mlactivation-create}
<div class="note">
-The {{MLActivation}} objects (including the ones passed as input to methods) are created by the methods of {{MLGraphBuilder}} and are identified by their name. The |options| dictionary is defined by those methods. The actual creation of the activation function e.g. a [[#api-mlgraphbuilder-sigmoid-method]] or [[#api-mlgraphbuilder-relu-method]] can then be deferred until when the rest of the graph is ready to connect with it such as during the construction of [[#api-mlgraphbuilder-conv2d]] for example.
+The {{MLActivation}} objects (including the ones passed as input to methods) are created by the methods of {{MLGraphBuilder}} and are identified by their name. The |options| dictionary is defined by those methods. The actual creation of the activation function, e.g. {{MLGraphBuilder/sigmoid()}} or {{MLGraphBuilder/relu()}}, can then be deferred until the rest of the graph is ready to connect with it, such as during the construction of {{MLGraphBuilder/conv2d()}}.
</div>
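As a minimal sketch of the creation and deferred fusion described in these two notes, assuming an existing `context` plus hypothetical `input` and `filter` operands:

<pre highlight="js">
const builder = new MLGraphBuilder(context);
// Creates an MLActivation identified by its name ("relu"); no operation
// is built yet.
const relu = builder.relu();
// The activation connects to the rest of the graph only when conv2d() is
// constructed with it, at which point an implementation may fuse the two.
const output = builder.conv2d(input, filter, {activation: relu});
</pre>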

<details open algorithm>
@@ -1486,7 +1486,7 @@ partial interface MLGraphBuilder {
<div class="note">
<details open>
<summary>
-The behavior of this operation when the input tensor is 4-D of the {{MLInputOperandLayout/"nchw"}} layout and the activation is of operator type *relu* can be generically emulated from the usage of other operations as follow. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
+The behavior of this operation when the input tensor is 4-D of the {{MLInputOperandLayout/"nchw"}} layout and the activation is {{MLGraphBuilder/relu()}} can be generically emulated using other operations as follows. However, user agents typically have a more efficient implementation for it, so its direct usage is encouraged from a performance standpoint.
</summary>
<pre highlight="js">
const shape = [1,null,1,1];
@@ -2341,7 +2341,7 @@ partial interface MLGraphBuilder {
</div>

<div class="note">
-Although operations *greaterOrEqual* and *lesserOrEqual* can each be implemented in terms of operations *not*, *lesser*, and *greater* in other words `greater-or-equal(a, b)` is `not(lesser(a, b))`, they are specifically defined to handle NaN cases and for performance reason to avoid double comparisons.
+Although the operations {{MLGraphBuilder/greaterOrEqual()}} and {{MLGraphBuilder/lesserOrEqual()}} can each be implemented in terms of the operations {{MLGraphBuilder/not()}}, {{MLGraphBuilder/lesser()}}, and {{MLGraphBuilder/greater()}} (in other words, `builder.greaterOrEqual(a, b)` is `builder.not(builder.lesser(a, b))`), they are specifically defined to handle NaN cases and, for performance reasons, to avoid double comparisons.
</div>
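A minimal sketch of the equivalence (and the NaN caveat) noted above, assuming hypothetical operands `a` and `b` on an existing `builder`:

<pre highlight="js">
// Direct form: specified to handle NaN and to use a single comparison.
const direct = builder.greaterOrEqual(a, b);
// Emulated form: lesser() evaluates to false wherever an element of either
// input is NaN, so the negation yields true where the direct form yields false.
const emulated = builder.not(builder.lesser(a, b));
</pre>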

<details open algorithm>
@@ -3199,7 +3199,7 @@ partial interface MLGraphBuilder {
<div class="note">
<details open>
<summary>
-The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default {{MLGruWeightLayout/"zrn"}} layout, and the activation functions of the update/reset gate and new gate are of the operator types *sigmoid* and *tanh* respectively.
+The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default {{MLGruWeightLayout/"zrn"}} layout, and the activation functions of the update/reset gate and new gate are {{MLGraphBuilder/sigmoid()}} and {{MLGraphBuilder/tanh()}} respectively.
</summary>
<pre highlight="js">
const one = builder.constant(1);
@@ -3505,7 +3505,7 @@ Create a named {{MLOperand}} based on a descriptor, that can be used as an input
</details>

### instanceNormalization ### {#api-mlgraphbuilder-instancenorm}
-Normalize the input using [[Instance-Normalization]]. Unlike [[#api-mlgraphbuilder-batchnorm]] where the mean and variance values used in the normalization are computed across all the samples in the batch dimension while the model is trained, the mean and variance values used in the instance normalization are computed on the fly for each input feature of each individual sample in the batch.
+Normalize the input using [[Instance-Normalization]]. Unlike {{MLGraphBuilder/batchNormalization()}}, where the mean and variance values used in the normalization are computed across all the samples in the batch dimension while the model is trained, the mean and variance values used in instance normalization are computed on the fly for each input feature of each individual sample in the batch.
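A hedged sketch of that on-the-fly computation for an {{MLInputOperandLayout/"nchw"}} input, decomposed into reduction and element-wise operations on a hypothetical `builder` and `input` (scale and bias are omitted, and the 1e-5 epsilon is illustrative):

<pre highlight="js">
// Mean and variance are computed per sample and per channel, i.e. over the
// spatial dimensions (axes 2 and 3) only.
const mean = builder.reduceMean(input, {axes: [2, 3], keepDimensions: true});
const centered = builder.sub(input, mean);
const variance = builder.reduceMean(builder.mul(centered, centered),
                                    {axes: [2, 3], keepDimensions: true});
const normalized = builder.div(
    centered, builder.sqrt(builder.add(variance, builder.constant(1e-5))));
</pre>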

<script type=idl>
dictionary MLInstanceNormalizationOptions {
@@ -3607,7 +3607,7 @@ partial interface MLGraphBuilder {
</div>

### layerNormalization ### {#api-mlgraphbuilder-layernorm}
-Normalize the input using [[Layer-Normalization]]. Unlike [[#api-mlgraphbuilder-batchnorm]] where the mean and variance values are computed across all the samples in the batch dimension while the model is trained, and in [[#api-mlgraphbuilder-instancenorm]] where the mean and variance values are computed on the fly for each input feature of each individual sample in the batch, the means and variance values of the layer normalization are computed on the fly across all the input features of each individual sample in the batch.
+Normalize the input using [[Layer-Normalization]]. Unlike {{MLGraphBuilder/batchNormalization()}}, where the mean and variance values are computed across all the samples in the batch dimension while the model is trained, and {{MLGraphBuilder/instanceNormalization()}}, where the mean and variance values are computed on the fly for each input feature of each individual sample in the batch, the mean and variance values of layer normalization are computed on the fly across all the input features of each individual sample in the batch.
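The decomposition sketched for instance normalization above applies here with only the reduction axes changed: statistics span all the input features of each individual sample, e.g. axes 1 through 3 for an {{MLInputOperandLayout/"nchw"}} input (illustrative, not normative):

<pre highlight="js">
// Each sample in the batch is normalized by its own across-feature statistics.
const mean = builder.reduceMean(input, {axes: [1, 2, 3], keepDimensions: true});
</pre>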

<script type=idl>
dictionary MLLayerNormalizationOptions {
@@ -4202,7 +4202,7 @@ partial interface MLGraphBuilder {
<div class="note">
<details open>
<summary>
-The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default {{MLLstmWeightLayout/"iofg"}} layout, and the activation functions of the input/forget/output gate and the cell gate/the cell state's filter for the output hidden state are of the operator types *sigmoid* and *tanh* respectively.
+The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default {{MLLstmWeightLayout/"iofg"}} layout, and the activation functions of the input/forget/output gate and the cell gate/the cell state's filter for the output hidden state are {{MLGraphBuilder/sigmoid()}} and {{MLGraphBuilder/tanh()}} respectively.
</summary>
<pre highlight="js">
const zero = builder.constant(0);
@@ -5119,7 +5119,7 @@ partial interface MLGraphBuilder {
<div class="note">
<details open>
<summary>
-Many shape-related operations such as [squeeze](https://pytorch.org/docs/stable/generated/torch.squeeze.html), [unsqueeze](https://pytorch.org/docs/stable/generated/torch.unsqueeze.html), and [flatten](https://pytorch.org/docs/stable/generated/torch.flatten.html) can be generically implemented using the *reshape*}} operation as follows:
+Many shape-related operations such as [squeeze](https://pytorch.org/docs/stable/generated/torch.squeeze.html), [unsqueeze](https://pytorch.org/docs/stable/generated/torch.unsqueeze.html), and [flatten](https://pytorch.org/docs/stable/generated/torch.flatten.html) can be generically implemented using the {{MLGraphBuilder/reshape()}} operation as follows:
</summary>
<pre highlight="js">
// Returns a tensor with all specified dimensions of input of size 1 removed.