Figure out retry behavior across the board #1001

wolfv · 2024-12-23T10:54:27Z

We currently have the following situation with regard to network retries:

- We use the ClientWithMiddleware everywhere, and a retry middleware may or may not be installed in it.
- We cannot inspect which middlewares are installed: once they are inside the ClientWithMiddleware, the stack is opaque. We definitely need the authentication middlewares.
- The package cache has its own retry mechanism that retries under certain conditions. These conditions differ from those of reqwest-retry. For example, reqwest-retry also retries on a connect issue under certain conditions: https://github.com/TrueLayer/reqwest-middleware/blob/f310cb8604ada3c992ffab370fbdbd50ff108a21/reqwest-retry/src/retryable_strategy.rs#L140
- We do not use a retry strategy by default when downloading sharded repodata, and we have hit some issues in the pixi tests (they became a little flaky). Retrying would definitely be appropriate there.

Now, in pixi and rattler-build we install the (default) retry middleware in the ClientWithMiddleware. However, this means we might retry more often than intended in the package cache, since the two retry layers multiply and we would do 3x3 retries.
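To make the multiplication concrete, here is a minimal, std-only sketch of two stacked retry layers. It assumes "3x3" means three attempts per layer; `with_attempts` and `requests_sent` are hypothetical helpers for illustration, not rattler or reqwest-retry APIs.

```rust
use std::cell::Cell;

/// Runs `op` up to `max_attempts` times, returning the first success
/// or the error from the final attempt. (Hypothetical helper.)
fn with_attempts<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "need at least one attempt");
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}

/// Counts how many requests are sent when an outer retry layer (standing in
/// for the package cache) wraps an inner one (standing in for the retry
/// middleware) and every single request fails.
fn requests_sent(outer_attempts: u32, inner_attempts: u32) -> u32 {
    let sent = Cell::new(0u32);
    let _: Result<(), &str> = with_attempts(outer_attempts, || {
        with_attempts(inner_attempts, || {
            sent.set(sent.get() + 1);
            Err("connection reset")
        })
    });
    sent.get()
}
```

With three attempts per layer, `requests_sent(3, 3)` yields 9 requests for a single failing download, while removing the outer layer (`requests_sent(1, 3)`) brings it back to 3.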

We should decide whether we want custom retry handling in rattler or a custom retry middleware. However, either option might be tricky, because the package cache passes the stream to the extraction functions, and we might have to do more "cleanup" when retrying.

We could also create a new wrapper type around the ClientWithMiddleware that lets each function choose between the client with the built-in retry mechanism and the client without it.
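One possible shape for such a wrapper, as a rough sketch only: `RetryAwareClient` and its method names are hypothetical, and the type is generic over `C` so the example stays self-contained. In rattler, `C` would presumably be `ClientWithMiddleware`, built once with and once without the retry middleware.

```rust
/// Hypothetical wrapper that carries two flavors of the same client:
/// one whose middleware stack includes retries and one whose stack does not.
#[derive(Clone, Debug)]
pub struct RetryAwareClient<C> {
    with_retry: C,
    without_retry: C,
}

impl<C> RetryAwareClient<C> {
    /// Both clients are expected to share everything except the retry layer
    /// (for example, the same authentication middlewares).
    pub fn new(with_retry: C, without_retry: C) -> Self {
        Self { with_retry, without_retry }
    }

    /// Client for callers that have no retry logic of their own.
    pub fn retrying(&self) -> &C {
        &self.with_retry
    }

    /// Client for callers that already retry themselves, such as a
    /// package cache with its own retry loop.
    pub fn non_retrying(&self) -> &C {
        &self.without_retry
    }
}
```

The caller then makes the choice explicit at the call site, which avoids stacking two retry layers by accident.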

wolfv · 2025-01-06T09:45:40Z

Just discussed with @baszalmstra and the plan is as follows:

- Offer a "default middleware stack" as an API, so that users can simply use that to get good behavior.
- Rely on RetryWithMiddleware for network issues, but have our own handler for when the IO fails while streaming contents in the PackageCache (i.e. remove a bunch of stuff there).

baszalmstra · 2025-01-07T10:07:26Z

I noticed that we also see issues here when downloading repodata. RetryMiddleware only retries if there is an issue with either sending the request or with the response. It doesn't provide any retry behavior when streaming the body fails for some reason.
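A minimal sketch of what retrying across the body stream could look like, using only the standard library. `download_with_retry` and `FlakyBody` are hypothetical names, and `std::io::Read` stands in for the async response body. The point is that each attempt reopens the stream and starts from a fresh buffer, so a failure halfway through the body is retried as well.

```rust
use std::io::{self, Read};

/// Opens a fresh body stream via `open` and reads it to the end. If opening
/// or reading fails, the whole operation is retried, up to `max_attempts`.
fn download_with_retry<R: Read>(
    max_attempts: u32,
    mut open: impl FnMut() -> io::Result<R>,
) -> io::Result<Vec<u8>> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        // Fresh buffer per attempt: partial data from a failed attempt is
        // dropped. This is the "cleanup" step in its simplest form.
        let mut buf = Vec::new();
        match open().and_then(|mut body| body.read_to_end(&mut buf)) {
            Ok(_) => return Ok(buf),
            Err(err) => last_err = Some(err),
        }
    }
    Err(last_err.unwrap_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "max_attempts must be > 0")
    }))
}

/// Stand-in for a response body: yields `data` in chunks of at most two
/// bytes and, if `fail` is set, errors after the first chunk.
struct FlakyBody {
    data: &'static [u8],
    pos: usize,
    fail: bool,
}

impl Read for FlakyBody {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if self.fail && self.pos >= 2 {
            return Err(io::Error::new(
                io::ErrorKind::ConnectionReset,
                "stream interrupted",
            ));
        }
        let remaining = &self.data[self.pos..];
        let n = remaining.len().min(out.len()).min(2);
        out[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        Ok(n)
    }
}
```

In the package cache the cleanup would be more involved than dropping a buffer, since the stream is handed to the extraction functions and a partially extracted directory may have to be removed before the next attempt.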

This might be a useful library: https://docs.rs/futures-retry-policies/latest/futures_retry_policies/index.html
