Bring back Python backend based PyTorch backend #6518
Conversation
```python
import sys
import unittest.mock

import torch

sys.modules["triton_python_backend_utils"] = unittest.mock.MagicMock()
from py_runtime import _gather_torch_tensors, _scatter_torch_tensors
```
Can you add a comment noting that this imports from the Python-based implementation of the PyTorch backend? It wasn't clear to me at first, and I had to look around to figure that out. Could you also mention that the mock is only there to satisfy the import for the Python model implementation, since the test uses just the helpers and not the model itself?
```
backend: "onnxruntime"
runtime: ""
```
Should auto-complete fill in this field with the correct value as well (i.e., `/opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so`)?
That was my original thought, but the issue is: how does auto-complete determine whether the model uses the C++ or the Python runtime? A simple solution is to have a backend-specific auto-complete implementation. For example, if the model uses the PyTorch backend and a `*.pt` model file is provided without any `*.py` file, use C++; if a `*.py` file is provided, use Python. See this commit.

However, from previous discussions with @nnshah1 and @rmccorm4, we want to stop adding more backend-specific logic into auto-complete and opt for a generic implementation, under which the Python runtime falls into the category of a custom backend as far as auto-complete is concerned. Backends can implement either the C++ or the Python runtime (or both), and we do not want to auto-complete to the Python runtime on a C++-only backend, which would require checking the backend installation directory. The current structure limits auto-complete to seeing only the model directory, not the backend directory, because the latter is resolved only when the model is loaded, which happens after auto-complete runs. An easier approach is to resolve the runtime, if not already filled in, when the model is loaded, at which point both the model and backend directory information are available.

Going back to the original question: auto-complete does not touch the runtime field; it is determined and filled in when the model is loaded, if it is not already filled.
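As a rough illustration of the load-time resolution described above, here is a minimal sketch. The function name, config shape, and the `model.py` file-name convention are assumptions for illustration only, not the actual Triton core implementation:

```python
import os

# Hypothetical sketch: resolve the runtime at model load time,
# once both the model and backend directories are known.
def resolve_runtime(config: dict, backend_name: str, backend_dir: str) -> str:
    # Respect a user-specified runtime; load-time resolution only fills gaps.
    if config.get("runtime"):
        return config["runtime"]
    # Prefer the Python runtime only if the backend actually ships one
    # (assumed file name "model.py" for this sketch).
    if os.path.exists(os.path.join(backend_dir, "model.py")):
        return "model.py"
    # Otherwise fall back to the backend's C++ shared library.
    return f"libtriton_{backend_name}.so"
```

This mirrors the approach in the reply: the backend directory is only known at load time, so the gap-filling happens there rather than in auto-complete.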
```
@@ -0,0 +1,147 @@
#!/bin/bash
```
Can you add a generic test that tries to set the `runtime` field to some invalid values, and verify that Triton handles them gracefully?
```
# config.pbtxt
# unexpected/bad value
runtime: "invalid_value"
```
and
```
# config.pbtxt
# no python implementation found
backend: "onnxruntime"
runtime: "model.py"
```
Maybe we can have something like this:

- `L0_runtime_invalid`
- `L0_runtime_pytorch_python`

and future tests can be `L0_runtime_*` for organization.
This could be a quick follow-up if you wanted to separate it.
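A minimal sketch of what such a graceful-handling check could look like, assuming the server is started with one of the bad configs above. The model name and the use of `tritonclient` are illustrative assumptions, not part of this PR:

```python
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# The server should survive the bad "runtime" value rather than crash...
assert client.is_server_live()

# ...and the offending model should simply fail to load.
# "invalid_runtime_model" is a hypothetical model name for this sketch.
assert not client.is_model_ready("invalid_runtime_model")
```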
I think we can simply add the invalid runtime test to `L0_model_config`, because many of the existing Python runtime related model config tests are already there, and the runtime escape test is also added there in this PR.

I like the idea of having `L0_runtime_*` for better organization. I think we can do a larger refactor later on when we see fit.
🚀
* Patch L0_model_config with runtime
* Add L0_pytorch_python_runtime
* Update expected runtime field
* Add test for escaping runtime
* Add comments on unit test imports
* Add invalid runtime test
* User to build PyTorch env
* Update copyright
This reverts commit 64433b3.
…"" This reverts commit 40e8ae8.
Related PRs:

* Minor changes to L0_model_config due to adding the "runtime" field to the model configuration.
* Adds the L0_pytorch_python_runtime test for the Python backend based PyTorch backend.