[webgpu] Use subgroup for matmulnbits (#23224)
### Description
This PR uses subgroup operations to implement MatMulNBits when tile_m > 1 on
Intel devices.
With this PR, prefill time for a 500-token prompt with phi3 drops from 8.5s to
3.5s (about 2.4x faster) on Intel Meteor Lake.
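
The following is a minimal sketch, not the shader from this PR, of why subgroup operations can help when tile_m > 1. It simulates a subgroup in plain Python, assuming each lane owns one output column, each lane loads one element of A, and a broadcast shares that element with the other lanes. The names `SUBGROUP_SIZE`, `subgroup_broadcast`, and `tiled_matmul` are hypothetical, and quantization and dequantization of B are omitted.

```python
# Hypothetical simulation of a subgroup-based tiled matmul (not the PR's code).
SUBGROUP_SIZE = 4  # assumed lane count; real hardware typically uses more lanes


def subgroup_broadcast(lane_values, src_lane):
    """Model of a subgroup broadcast: every lane receives src_lane's value."""
    return lane_values[src_lane]


def tiled_matmul(a, b, tile_m):
    """Compute a @ b with one simulated lane per output column.

    a is an M x K list of lists, b is a K x N list of lists.
    Assumes N == SUBGROUP_SIZE and K is a multiple of SUBGROUP_SIZE.
    Returns the product and the number of A elements loaded from memory.
    """
    m, k, n = len(a), len(a[0]), len(b[0])
    assert n == SUBGROUP_SIZE and k % SUBGROUP_SIZE == 0
    out = [[0.0] * n for _ in range(m)]
    loads_of_a = 0
    for row0 in range(0, m, tile_m):
        rows = range(row0, min(row0 + tile_m, m))
        for k0 in range(0, k, SUBGROUP_SIZE):
            # Each lane loads its chunk of B once and reuses it for all rows
            # in the tile; this reuse is what tile_m > 1 buys.
            lane_b = [[b[k0 + s][lane] for s in range(SUBGROUP_SIZE)]
                      for lane in range(n)]
            for r in rows:
                # Each lane loads a different element of A from memory.
                lane_a = [a[r][k0 + lane] for lane in range(SUBGROUP_SIZE)]
                loads_of_a += SUBGROUP_SIZE
                for src in range(SUBGROUP_SIZE):
                    # Lanes exchange A values through the broadcast instead
                    # of each lane re-reading them from memory.
                    a_val = subgroup_broadcast(lane_a, src)
                    for lane in range(n):
                        out[r][lane] += a_val * lane_b[lane][src]
    return out, loads_of_a
```

In this model each element of A is loaded once per subgroup (M x K loads in total), whereas a version without the broadcast would have every lane load every A element it needs (M x K x N loads).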
qjia7 authored Jan 13, 2025
1 parent 73f5b0c commit 80d8931
Showing 4 changed files with 293 additions and 127 deletions.