Support building graphs from MLTensor containing constants #760

Open
bbernhar opened this issue Sep 12, 2024 · 4 comments
@bbernhar

bbernhar commented Sep 12, 2024

Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.

Dependent PRs

Motivation

  • Allow constant data to be uploaded directly to the device, which is a capability that Execution Providers (EPs) leverage to prevent out-of-memory (OOM) errors (ORT example).
  • Re-use constant buffers in system memory between graphs, particularly for encoder-decoder models like Whisper.

Design

An MLTensor containing constant data is associated by name with its MLOperand when the operand is created. At build(), the (un-optimized) constant data is copied to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded as soon as writeBuffer() has been called and build() succeeds.

Example JS

// Upload constant data directly to device
constantTensor = await ctx.createTensor({usage: MLTensorUsage.GRAPH_CONSTANT, ...}, new Uint8Array(...), ...); // immutable; createTensor() returns a Promise per the proposed IDL

builder = new MLGraphBuilder(ctx);
constant = builder.input('myconstant', {dataType: constantTensor.dataType, shape: constantTensor.shape});
...
graph = await builder.build(outputs, {'myconstant': constantTensor}); // bind the tensor to the operand by name

// Optional: free-up system memory
constantTensor.destroy();

Proposed IDL

partial interface MLContext {
    Promise<MLTensor> createTensor(MLTensorDescriptor descriptor, optional ArrayBufferView sourceData);
};

partial interface MLGraphBuilder {
    Promise<MLGraph> build(MLNamedOperands outputs, optional MLNamedTensors constants = {});
};
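
To make the intended lifetime rules concrete, here is a minimal sketch that runs without a WebNN implementation. Everything in it (MockTensor, MockContext, MockGraphBuilder, demo) is a hypothetical in-memory stand-in that only mimics the shape of the proposed IDL. It assumes build() copies each named constant into the graph, so one tensor can feed two graphs and be destroyed afterwards.

```javascript
// Hypothetical stand-ins that mimic the proposed IDL. They are NOT the WebNN
// implementation; "device memory" here is just a copied typed array.
class MockTensor {
  constructor(descriptor, sourceData) {
    this.dataType = descriptor.dataType;
    this.shape = descriptor.shape;
    // "Uploading" is modelled as a copy, so the caller's buffer stays independent.
    this.data = sourceData ? sourceData.slice() : null;
    this.destroyed = false;
  }
  destroy() {
    this.data = null;
    this.destroyed = true;
  }
}

class MockContext {
  // Mirrors createTensor(descriptor, optional sourceData) returning a Promise.
  async createTensor(descriptor, sourceData) {
    return new MockTensor(descriptor, sourceData);
  }
}

class MockGraphBuilder {
  constructor() {
    this.declared = new Map(); // operand name -> descriptor
  }
  input(name, descriptor) {
    this.declared.set(name, descriptor);
    return { name, ...descriptor };
  }
  // Mirrors build(outputs, optional constants): each constant is matched to a
  // declared operand by name and copied into the graph (assumed behaviour).
  async build(outputs, constants = {}) {
    const graphConstants = new Map();
    for (const [name, tensor] of Object.entries(constants)) {
      if (!this.declared.has(name)) {
        throw new TypeError(`No operand named "${name}" was declared`);
      }
      if (tensor.destroyed) {
        throw new TypeError(`Tensor for "${name}" was destroyed before build()`);
      }
      graphConstants.set(name, tensor.data.slice());
    }
    return { outputs, graphConstants };
  }
}

// Builds two graphs from one constant tensor, then destroys the tensor.
async function demo() {
  const ctx = new MockContext();
  const weights = await ctx.createTensor(
    { dataType: 'uint8', shape: [4] },
    new Uint8Array([1, 2, 3, 4]),
  );

  const graphs = [];
  for (const label of ['encoder', 'decoder']) {
    const builder = new MockGraphBuilder();
    const operand = builder.input('myconstant', {
      dataType: weights.dataType,
      shape: weights.shape,
    });
    graphs.push(await builder.build({ [label]: operand }, { myconstant: weights }));
  }

  // Safe in this sketch because each graph already holds its own copy.
  weights.destroy();
  return graphs.map((g) => Array.from(g.graphConstants.get('myconstant')));
}
```

Calling demo() resolves to one copy of the constant per graph, both still readable after destroy(). Whether a real implementation copies per graph or shares a single device allocation is left open by the proposal; the per-graph copy is only the simplest way to model "discard after build()" here.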

Edits:

  • 9/16: Added MLOperandDescriptor as required by MLOperand
  • 9/18: Added constant-initializer to createTensor()
  • 9/19: Reuse input(..) via constant usage flag
@bbernhar
Author

@a-sully @RafaelCintron @fdwr @huningxin appreciate any feedback

@fdwr
Collaborator

fdwr commented Sep 12, 2024

constant_input -> constantInput (🚫🐍).

I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

graphBuilder.input()
graphBuilder.constant()
graphBuilder.constantInput()

@bbernhar
Author

> I'd need more time to think for meaningful feedback, but it may be rather confusing having this list of methods o_o:

Thanks for the quick feedback. I simplified the proposal even further via initializer + new usage bit.

@mmccool

mmccool commented Sep 23, 2024

Definitely interested in this from the point of view of caching models as well (especially weights that might be used by both WebGPU and WebNN implementations).
