[patch] Introduce caching #395
Conversation
Wrap them in `emit()` and `emitting_channels` instead of manually calling them. This lets us tighten up If-like nodes too.
This shortcuts actually running a node: if its cached input matches its current input (by `==` test), it just returns the existing output.
The cache flag lives on the class, so it can be set at class definition time, even by decorators.
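A minimal sketch of the mechanism these comments describe, assuming a bare-bones `Node` base class; only the `use_cache` flag and the `==` comparison are taken from this PR, the other names are illustrative:

```python
class Node:
    # Class-level default, so subclasses and decorators can override it
    # at class definition time.
    use_cache = True

    def __init__(self):
        self._cached_inputs = None  # last-used dictionary of input values
        self.outputs = None

    def _inputs_to_dict(self):
        """Collect the current input values; illustrative stand-in."""
        raise NotImplementedError

    def _on_run(self):
        """Do the actual work; illustrative stand-in."""
        raise NotImplementedError

    def run(self):
        inputs = self._inputs_to_dict()
        # Shortcut: skip execution entirely when the current inputs `==`
        # the cached inputs, and just return the existing outputs.
        if self.use_cache and inputs == self._cached_inputs:
            return self.outputs
        self.outputs = self._on_run()
        self._cached_inputs = inputs  # save a fresh dictionary of inputs
        return self.outputs
```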
I wasn't going to make caching the default, but since I'm just using a simple `==` comparison, it doesn't seem to introduce any meaningful slowdown. The default is easy enough to change in the future, in any case.

Closes #169
Description copied from the update to the deepdive:
Caching
By default, all nodes exploit caching, i.e. when they run they save a fresh dictionary of their input values; in all subsequent runs, if the dictionary of their current input values matches (`==`) that last-used dictionary, they skip executing altogether and leverage their existing outputs.

Any changes to the inputs will obviously stop the cache from being retrieved, but for `Composite` nodes the cache is also reset if any child nodes are added, removed, or replaced.

Note that since we do a simple `==` on the dictionary of input values, if your workflow non-idempotently passes around mutable data, it's possible you'll wind up in a situation where you get a false cache hit.

Caching behaviour can be defined at the class level as a default, but can be overridden for individual nodes. Let's take a look:
Running the same workflow again, we see that the cached node just keeps returning the same "random" number, while the un-cached node gives us something new:
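For instance (the `node__channel` output labels below are an assumption about the workflow's output naming, not confirmed by this PR):

```python
second = wf()
# Empty inputs still `==` empty inputs, so the cached node hits its cache:
assert second["cached__number"] == first["cached__number"]
# The un-cached node re-executes and draws a fresh number:
assert second["uncached__number"] != first["uncached__number"]
```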
If we look into the caching data, we can see that the non-caching node has not stored any inputs and does not register a cache hit; even if we had previously cached something, after switching to `use_cache = False` we won't even look for a cache hit, but will just give new data!
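Roughly like this (the `cached_inputs` and `cache_hit` attribute names are illustrative guesses, not confirmed API):

```python
print(wf.cached.cached_inputs)    # the stored input dictionary
print(wf.cached.cache_hit)        # True -- current inputs == stored inputs
print(wf.uncached.cached_inputs)  # None -- nothing was ever stored
print(wf.uncached.cache_hit)      # False -- with use_cache = False we never
                                  # even look for a hit
```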