
[patch] Make pickle the default storage backend #412

Merged
25 commits merged into patch_bump from pickle_as_default_backend on Aug 9, 2024

Conversation

liamhuber (Member) commented Aug 8, 2024

Instead of only looking for and loading storage content that matches your chosen storage backend, the node now searches that backend first and then falls back on looking for content from any other allowed backend.

This means that when you re-instantiate a node with the new default "pickle" backend, it automatically loads data saved with the old "tinybase" backend, and subsequent saves will re-save it with the new "pickle" backend. I.e., running the instantiate/load -> save cycle will automatically convert old save files to the new format. (Technically it adds new save files, as the old ones won't get deleted unless you explicitly delete them.)
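The load-then-fall-back behaviour described above can be sketched roughly as follows. This is a hedged illustration, not the actual pyiron_workflow API: the backend names, `load_node`, and `_read_storage` helpers are all hypothetical stand-ins.

```python
# Hypothetical sketch of backend fallback on load (not the real pyiron_workflow API).
ALLOWED_BACKENDS = ["pickle", "tinybase"]

def load_node(node_name, preferred_backend="pickle"):
    """Try the preferred backend first, then fall back on any other allowed one."""
    # Search order: the node's own backend first, then the remaining allowed backends.
    search_order = [preferred_backend] + [
        b for b in ALLOWED_BACKENDS if b != preferred_backend
    ]
    for backend in search_order:
        content = _read_storage(node_name, backend)  # None when no file exists
        if content is not None:
            return content, backend
    raise FileNotFoundError(f"No saved content found for {node_name}")

def _read_storage(node_name, backend):
    # Stand-in for real backend-specific file lookup: pretend only an old
    # "tinybase" save file exists for the workflow "wf".
    fake_store = {("wf", "tinybase"): {"x": 1}}
    return fake_store.get((node_name, backend))

# A node saved under the old backend still loads under the new default:
content, backend = load_node("wf")
```

A subsequent save would then write the loaded content back out with the preferred "pickle" backend, which is what makes the instantiate/load -> save cycle a format converter.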

This isn't technically fully backwards compatible because of the change in default behaviour; however, since we're still in the wild west of v0.x.y, I deem it sufficiently friendly to count as a patch.

liamhuber and others added 19 commits August 6, 2024 20:07
With the introduction of a pickle backend, this test wound up failing. I haven't been able to work out exactly why, but since pickle is much faster, my guess is that something in the test parallelization runs into a race condition with deleting a saved workflow from another test. Simply renaming the workflow here solved the issue, which is an innocuous change.
Ok, the reason workflow storage tests were failing after introducing pickle was that the original version of the test had the wrong order of the try/finally scope and the for-loop scope. I got away with it earlier because of interplay between the outer-most member of the loop and the default storage backend. Actually fixing the problem is as simple as ensuring the "finally delete" clause happens after _each_ loop step. This does that and reverts the renaming of the workflow.
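The scoping fix this commit describes can be shown in miniature: with the try/finally wrapped around the whole loop, cleanup only runs once at the end, so saves from earlier iterations linger; moving the try/finally inside the loop deletes after each step. The `save`/`delete_saved` helpers below are placeholders, not the real test code.

```python
saved = []  # stand-in for save files on disk

def save(backend):
    saved.append(backend)

def delete_saved():
    saved.clear()

# Broken ordering: the "finally delete" clause runs once, after the whole
# loop, so the file from the previous backend lingers into the next step.
broken_seen = []
try:
    for backend in ["tinybase", "pickle"]:
        save(backend)
        broken_seen.append(len(saved))  # grows: stale saves accumulate
finally:
    delete_saved()

# Fixed ordering: cleanup happens after each loop step, so every iteration
# starts from a clean slate.
fixed_seen = []
for backend in ["tinybase", "pickle"]:
    try:
        save(backend)
        fixed_seen.append(len(saved))  # always 1: no stale saves
    finally:
        delete_saved()
```

Here `broken_seen` comes out as `[1, 2]` while `fixed_seen` is `[1, 1]`, which is why the broken version only got away with it when the loop's final backend happened to match the default.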
You're only supposed to use it as a decorator to start with, so the kwarg was senseless
This is necessary (but not sufficient) to get `as_dataclass_node` decorated classes to pickle. The field name is a bit off compared to `Function` and `Macro`, as now we decorate a class definition instead of a function definition, but it's close.
Now that it's pickling as well as anything else
When the pickle backend fails
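The cloudpickle fallback this commit refers to plausibly follows the usual pattern: try the standard-library pickle first and reach for cloudpickle only when it fails. A hedged sketch under that assumption, not the actual pyiron_workflow implementation:

```python
import pickle

def robust_dumps(obj):
    """Serialize with stdlib pickle; fall back on cloudpickle when pickle fails.

    cloudpickle (third-party) can handle lambdas, locally defined classes,
    and other objects the stdlib pickler rejects.
    """
    try:
        return "pickle", pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        import cloudpickle  # only needed on the fallback path
        return "cloudpickle", cloudpickle.dumps(obj)

# Ordinary, module-level data takes the fast stdlib path:
tag, payload = robust_dumps({"x": 1})
```

Something like `robust_dumps(lambda x: x + 1)` would then take the cloudpickle branch, since the stdlib pickler cannot serialize lambdas.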

github-actions bot commented Aug 8, 2024

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_workflow/pickle_as_default_backend

liamhuber changed the title from [minor] Make pickle the default storage backend to [patch] Make pickle the default storage backend Aug 8, 2024
liamhuber added the format_black (trigger the Black formatting bot) label Aug 8, 2024
Base automatically changed from fallback_on_cloudpickle to patch_bump August 9, 2024 16:51
liamhuber merged commit 39e7544 into patch_bump Aug 9, 2024
1 of 2 checks passed
liamhuber deleted the pickle_as_default_backend branch August 9, 2024 16:52
liamhuber added a commit that referenced this pull request Aug 13, 2024
* [patch] Pickle storage backend (#408)

* Refactor storage test

* Remove empty line

* Rename workflow in tests

With the introduction of a pickle backend, this test wound up failing. I haven't been able to work out exactly why, but since pickle is much faster my guess is that something in the test parallelization gets in a race condition with deleting a saved workflow from another test. Simply renaming it here solved the issue, which is an innocuous change.

* Introduce `pickle` as a backend for saving

* Fix root cause of storage conflict

Ok, the reason workflow storage tests were failing after introducing pickle was that the original version of the test had the wrong order of the try/finally scope and the for-loop scope. I got away with it earlier because of interplay between the outer-most member of the loop and the default storage backend. Actually fixing the problem is as simple as ensuring the "finally delete" clause happens after _each_ loop step. This does that and reverts the renaming of the workflow.

* Again, correctly order try/finally and for-loops

* [patch] `as_dataclass_node` pickle compatibility (#410)

* Remove keyword argument from pure-decorator

You're only supposed to use it as a decorator to start with, so the kwarg was senseless

* Add factory import source

This is necessary (but not sufficient) to get `as_dataclass_node` decorated classes to pickle. The field name is a bit off compared to `Function` and `Macro`, as now we decorate a class definition instead of a function definition, but it's close.

* Bring the dataclass node in line with function and macro

* Leverage new pyiron_snippets.factory stuff to find the class

* Mangle the stored dataclass qualname so it can be found later

* Add tests

* Update docs examples to reflect new naming

* Update snippets dependency

* [dependabot skip] Update env file

* Use new pyiron_snippets syntax consistently

* Expose `as_dataclass_node` in the API

Now that it's pickling as well as anything else

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] Fall back on cloudpickle (#411)

* [patch] Fall back on cloudpickle

When the pickle backend fails

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] Make pickle the default storage backend (#412)

* [minor] Make pickle the default storage backend

* Format black

* Fall back on loading any storage contents

Regardless of what the specified storage backend was.

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* Format black

* Update pyiron deps

* [dependabot skip] Update env file

---------

Co-authored-by: pyiron-runner <[email protected]>