Motivation
Allow constant data to be uploaded directly to the device, a capability that Execution Providers (EPs) use to avoid out-of-memory (OOM) errors (ORT example).
Re-use constant buffers in system memory between graphs, particularly for encoder-decoder models like Whisper.
Design
An MLTensor containing constant data will be associated by name upon creating the MLOperand. At build(), the (un-optimized) constant data will be copied to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded as soon as writeBuffer() has been called and build() succeeds.
Example JS
// Upload constant data directly to device
constantTensor = ctx.createTensor({usage: MLTensorUsage.GRAPH_CONSTANT, ...}, new Uint8Array(...), ...); // immutable
builder = new MLGraphBuilder(ctx);
constant = builder.input('myconstant', {dataType: constantTensor.dataType, shape: constantTensor.shape});
...
graph = await builder.build(outputs, {'myconstant': constantTensor});
// Optional: free-up system memory
constantTensor.destroy();
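The lifecycle described in the design (bind by name, copy at build(), then optionally free the original) can be illustrated with a small mock that runs without a WebNN implementation. This is only a sketch under stated assumptions: `MockTensor`, `MockGraphBuilder`, `deviceCopies`, and `demo` are hypothetical stand-ins invented for illustration, not part of the proposed IDL, and the mock `input()` omits the operand descriptor for brevity. It also shows one tensor being bound to two separately built graphs, as in the encoder-decoder case.

```javascript
// Hypothetical stand-ins for the proposed API. None of these classes are
// real WebNN objects; they only mimic the lifecycle described above.
class MockTensor {
  constructor(bytes) {
    this.data = Uint8Array.from(bytes); // stands in for uploaded constant data
    this.destroyed = false;
  }
  destroy() {
    this.data = null; // release the original constant data
    this.destroyed = true;
  }
}

class MockGraphBuilder {
  constructor() {
    this.constantNames = new Set();
  }
  // Declares a named operand; no data is attached at this point.
  input(name) {
    this.constantNames.add(name);
    return { name };
  }
  // Binds tensors to operands by name and copies their bytes into the graph.
  async build(outputs, constants) {
    const deviceCopies = new Map();
    for (const name of this.constantNames) {
      const tensor = constants[name];
      if (!tensor || tensor.destroyed) {
        throw new Error(`No live tensor bound to '${name}'`);
      }
      deviceCopies.set(name, tensor.data.slice()); // the copy made at build()
    }
    return { outputs, deviceCopies };
  }
}

async function demo() {
  const weights = new MockTensor([1, 2, 3, 4]);

  // The same tensor is bound to two graphs, e.g. an encoder and a decoder.
  const encoderBuilder = new MockGraphBuilder();
  const encoderOut = encoderBuilder.input('myconstant');
  const encoder = await encoderBuilder.build({ out: encoderOut }, { myconstant: weights });

  const decoderBuilder = new MockGraphBuilder();
  const decoderOut = decoderBuilder.input('myconstant');
  const decoder = await decoderBuilder.build({ out: decoderOut }, { myconstant: weights });

  // Optional: once every build() has succeeded, the original can be freed.
  weights.destroy();

  return { encoder, decoder, weights };
}
```

Calling `demo()` resolves with both mock graphs still holding their own copy of the bytes after `weights.destroy()`, while a later `build()` against the destroyed tensor rejects. Whether a real implementation copies per graph or shares one device buffer is left open here.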
Definitely interested in this from the point of view of caching models as well (especially weights that might be used by both WebGPU and WebNN implementations).
Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.
Dependent PRs
MLConstantOperand: Do we need an MLConstantOperand? #668 (comment)
MLTensor: Add MLTensor explainer #754
Proposed IDL
Edits:
MLOperandDescriptor as required by MLOperand