
Add FAQs and Common Issues doc page #7547

Open
wants to merge 1 commit into base: main

Conversation

@GregoryComer (Member) commented on Jan 8, 2025

Summary

Add "FAQs and Common Issues" page to the ExecuTorch docs. This summarizes common issues that we've seen when users adopt ExecuTorch.

I've tentatively put this under the Getting Started section, as that seems like the most reasonable place to put it, but I'm open to suggestions.

Test plan

New doc page preview: https://docs-preview.pytorch.org/pytorch/executorch/7547/getting-started-faqs.html


pytorch-bot bot commented Jan 8, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7547

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e8e0865 with merge base 6c9b9b6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Jan 8, 2025. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)

We are actively working to improve the out-of-the-box behavior, but the above APIs can be used to improve mobile performance as a workaround until deeper changes for performant core detection land.
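For illustration, here is a minimal sketch of what capping the thread count at startup could look like. It assumes the ExecuTorch threadpool extension is linked in and that the `_unsafe_reset_threadpool` API referenced in the review discussion below is available; the header path, namespace, and signature may differ between releases.

```cpp
// Hypothetical sketch: cap ExecuTorch CPU threads to roughly half the reported cores.
// The header path, namespace, and _unsafe_reset_threadpool signature are assumptions
// and may differ across ExecuTorch releases.
#include <algorithm>
#include <cstdint>
#include <thread>

#include <executorch/extension/threadpool/threadpool.h>

void cap_inference_threads() {
  const uint32_t cores = std::thread::hardware_concurrency();  // may return 0
  const uint32_t num_threads = std::max<uint32_t>(1, cores / 2);

  // Call once at startup, before any inference runs and while no other thread
  // is using the threadpool (hence "unsafe").
  ::executorch::extension::threadpool::get_threadpool()->_unsafe_reset_threadpool(
      num_threads);
}
```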

### Erroa setting input: 0x10 / Attempted to resize a bounded tensor...
Contributor

typo


### Duplicate Kernel Registration Abort

This manifests as a crash with a call stack that includes ExecuTorch kernel registration and fails with an `et_pal_abort`. It typically means there are multiple `gen_operators_lib` targets linked into the application. There must be only one generated lib per target, though each model can have its own `gen_selected_ops`/`generate_bindings_for_kernels` call.
Contributor

generated lib -> generated operator library


### Performance Troubleshooting

Ensure the model is delegated. If not targeting a specific accelerator, use the XNNPACK delegate for CPU performance. Undelegated operators will typically fall back to the ExecuTorch portable library, which is designed as a platform-independent fallback, and is not optimized for specific hardware.
Contributor

ExecuTorch portable library, which is designed as a platform-independent fallback, and is not optimized for specific hardware.

which is designed to serve as a reference implementation/fallback and is not intended to be used in performance-sensitive production scenarios.



Additionally, thread counts are a common source of performance issues. While we are working to improve the default behavior, ExecuTorch will currently use as many threads as there are cores. On some heterogeneous mobile SoCs, this can be slow. Consider setting the thread count to cores / 2, or simply to 4. This will lead to a speedup (or maintain parity) on almost all mobile devices.
Contributor

This will lead to a speedup (or maintain parity) on almost all mobile devices.

This might lead to a speedup?

Because if it always will, why isn't this the default?

Contributor

Also, I would probably add a reference to a function or other document that explains how CPU parallelism can be configured.

Contributor

There is no way to do this in OSS at the moment except for the unsafe API.

Labels: CLA Signed, release notes: misc

5 participants