Add a timeout for shortfin unit tests. (#777)
I'm seeing stalls in `test_invoke_mobilenet_multi_fiber_per_fiber` from
https://github.com/nod-ai/shark-ai/blob/main/shortfin/tests/invocation/mobilenet_program_test.py
when the test program fails numerics checks. The other test cases fail
and terminate as expected, without needing to use a timeout mechanism.

Tested locally on Windows; the timeout worked, though the output isn't
pretty:
```
(.venv) λ pytest tests/ -rA -k test_invoke_mobilenet_multi_fiber_per_fiber --timeout 10
======================================= test session starts =======================================
platform win32 -- Python 3.11.2, pytest-8.3.4, pluggy-1.5.0
rootdir: D:\dev\projects\shark-ai\shortfin
configfile: pyproject.toml
plugins: anyio-4.8.0, timeout-2.3.1
timeout: 10.0s
timeout method: thread
timeout func_only: False
collected 264 items / 263 deselected / 1 selected

tests\invocation\mobilenet_program_test.p
 +++++++++++++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++++++++++++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Captured stdout ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fibers: [Fiber(worker='__init__', devices=[cpu0]), Fiber(worker='__init__', devices=[cpu0]), Fiber(worker='__init__', devices=[cpu0]), Fiber(worker='__init__', devices=[cpu0]), Fiber(worker='__init__', devices=[cpu0])]
Waiting for processes: [Process(pid=1, worker='__init__'), Process(pid=2, worker='__init__'), Process(pid=3, worker='__init__'), Process(pid=4, worker='__init__'), Process(pid=5, worker='__init__')]
Process(pid=1, worker='__init__'): Start
Process(pid=2, worker='__init__'): Start
Process(pid=3, worker='__init__'): Start
Process(pid=4, worker='__init__'): Start
Process(pid=5, worker='__init__'): Start
Process(pid=1, worker='__init__'): Program complete (+116ms)
Process(pid=2, worker='__init__'): Program complete (+111ms)
Process(pid=3, worker='__init__'): Program complete (+107ms)
Process(pid=4, worker='__init__'): Program complete (+101ms)
Process(pid=5, worker='__init__'): Program complete (+97ms)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Captured stderr ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
D:\dev\projects\shark-ai\shortfin\src\shortfin/support/iree_helpers.h:316: UNKNOWN; Unhandled exception: Traceback (most recent call last):
  File "D:\dev\projects\shark-ai\shortfin\tests\invocation\mobilenet_program_test.py", line 77, in assert_mobilenet_ref_output
RuntimeError: Async exception on <Worker '__init__'>): assert 0.8119692911421882 == 5.01964943873882 ± 5.0e-06

  comparison failed
  Obtained: 0.8119692911421882
  Expected: 5.01964943873882 ± 5.0e-06
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-4 () (9816) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "C:\Program Files\Python311\Lib\threading.py", line 995, in _bootstrap
    self._bootstrap_inner()
  File "C:\Program Files\Python311\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "C:\Program Files\Python311\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)

...
```
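
The log above reports `timeout method: thread` and ends with a per-thread stack dump. The general technique is a watchdog timer that fires while the test body is still running and records where every thread is stuck. Below is a rough, stdlib-only sketch of that idea. It is not pytest-timeout's actual code, and the names `format_thread_stacks`, `start_watchdog`, and `run_guarded` are hypothetical helpers introduced only for illustration.

```python
import sys
import threading
import time
import traceback


def format_thread_stacks():
    """Return a text dump of the current stack of every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    parts = []
    for ident, frame in sys._current_frames().items():
        parts.append(f"Stack of {names.get(ident, '?')} ({ident})")
        parts.append("".join(traceback.format_stack(frame)))
    return "\n".join(parts)


def start_watchdog(timeout_s, on_timeout):
    """Start a daemon timer that calls on_timeout after timeout_s seconds."""
    timer = threading.Timer(timeout_s, on_timeout)
    timer.daemon = True
    timer.start()
    return timer


def run_guarded(test_body, timeout_s):
    """Run test_body on the calling thread under a watchdog.

    Returns the list of stack dumps collected (empty if the body finished
    before the deadline). Assumption for this sketch: the watchdog only
    records a dump; it does not terminate anything.
    """
    dumps = []
    timer = start_watchdog(
        timeout_s, lambda: dumps.append(format_thread_stacks())
    )
    try:
        test_body()
    finally:
        timer.cancel()  # Harmless if the timer has already fired.
    return dumps


if __name__ == "__main__":
    # A slow body trips the watchdog; a fast one does not.
    print(len(run_guarded(lambda: time.sleep(0.5), 0.05)))
    print(len(run_guarded(lambda: None, 5.0)))
```

A watchdog that must recover from a genuine stall cannot simply record a dump, because Python threads cannot be forcibly stopped. pytest-timeout's documentation describes its thread method as writing the stacks to stderr and then terminating the whole process, which would be consistent with the abrupt output shown in the log.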
ScottTodd authored Jan 7, 2025
1 parent ab29d88 commit ad236fd
Showing 2 changed files with 2 additions and 1 deletion.
2 changes: 1 addition & 1 deletion .github/workflows/ci-libshortfin.yml
@@ -131,7 +131,7 @@ jobs:
working-directory: ${{ env.LIBSHORTFIN_DIR }}
run: |
ctest --timeout 30 --output-on-failure --test-dir build
-pytest -s --durations=10
+pytest -s --durations=10 --timeout=30
# Depends on all other jobs to provide an aggregate job status.
ci_libshortfin_summary:
1 change: 1 addition & 0 deletions shortfin/requirements-tests.txt
@@ -1,4 +1,5 @@
pytest
+pytest-timeout
requests
fastapi
onnx
