
[patch] Bump development status to Beta (#482)
* [patch] Bump development status to Beta

The filesystem interaction is stable and robust (i.e. recovery files get written when things go wrong), which has the knock-on effect that you can scale workflows to remote processes. Support for individual nodes running on remote processes beyond the lifetime of the parent python process is pretty rough, but also there.

* Extend docs

* Extend readme
liamhuber authored Sep 30, 2024
1 parent c57d387 commit d8728f1
Showing 4 changed files with 25 additions and 4 deletions.
5 changes: 3 additions & 2 deletions docs/README.md
@@ -17,18 +17,19 @@
`pyiron_workflow` is a framework for constructing workflows as computational graphs from simple python functions. Its objective is to make it as easy as possible to create reliable, reusable, and sharable workflows, with a special focus on research workflows for HPC environments.

Nodes are formed from python functions with simple decorators, and the resulting nodes can have their data inputs and outputs connected.
Unlike regular python functions, nodes execute in a delayed way: building and connecting them only constructs the graph, and computation happens when a node or workflow is actually run.
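
A minimal sketch of this pattern (the `Workflow.wrap.as_function_node` decorator follows the project's documented usage; exact names and signatures should be treated as assumptions):

```python
from pyiron_workflow import Workflow

# Wrap a plain python function as a node
@Workflow.wrap.as_function_node
def AddOne(x: int = 0) -> int:
    y = x + 1
    return y

node = AddOne(x=41)  # builds the node; nothing is computed yet
print(node.run())    # computation happens only now -> 42
```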

Users may specify the execution flow explicitly (although for data DAGs this is optional), so both cyclic and acyclic graphs are supported.
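
A sketch of explicit flow control (the `>>` execution-signal syntax and the `starting_nodes` attribute are assumptions based on the project docs):

```python
wf = Workflow("flow_example")
wf.first = AddOne(x=0)
wf.second = AddOne(x=wf.first)  # data connection: first's output feeds second

wf.first >> wf.second           # explicit execution order (optional for data DAGs)
wf.starting_nodes = [wf.first]  # where execution begins
wf.run()
```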

Type hints scraped from the decorated functions are (optionally) enforced on both new data values and new graph connections, making workflows strongly typed.
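
For example, with the `x: int` hint from the sketch above, an incompatible value can be rejected up front (the exact exception type is an assumption):

```python
node = AddOne()
try:
    node.inputs.x = "not an int"  # conflicts with the x: int hint
except (TypeError, ValueError):  # assumption: the precise exception may differ
    print("value rejected by type checking")
```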

Individual node computations can be shipped off to parallel processes for scalability. (This is a beta-feature at time of writing; standard python executors like `concurrent.futures.ThreadPoolExecutor` and `ProcessPoolExecutor` work, and the `Executor` executor from [`executorlib`](https://github.com/pyiron/executorlib) is supported and tested; `executorlib`'s more powerful flux- and slurm- based executors have not been tested and may fail.)
Individual node computations can be shipped off to parallel processes for scalability. Standard python executors like `concurrent.futures.ThreadPoolExecutor` and `ProcessPoolExecutor` work, but so does, e.g., the `Executor` executor from [`executorlib`](https://github.com/pyiron/executorlib), which facilitates running on HPC. It is also straightforward to run an entire graph on a remote process, e.g. a SLURM allocation, by locally saving the graph and remotely loading, running, and re-saving. Cf. [this notebook](../notebooks/hpc_example.ipynb) for some simple examples.
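
A sketch of per-node parallelism with a standard library executor (the assignment-style `executor` attribute follows the project docs; treat the details as assumptions):

```python
from concurrent.futures import ProcessPoolExecutor

wf = Workflow("parallel_example")
wf.a = AddOne(x=0)
wf.b = AddOne(x=wf.a)
wf.a.executor = ProcessPoolExecutor()  # this node's computation runs in a subprocess
wf.run()
```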

Once you're happy with a workflow, it can easily be turned into a macro for use in other workflows. This allows the clean construction of increasingly complex computation graphs by composing simpler graphs.
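
A sketch of such composition, assuming an `as_macro_node` counterpart to the function-node decorator:

```python
@Workflow.wrap.as_macro_node
def AddTwo(self, x=0):
    # Compose existing nodes into a reusable macro
    self.first = AddOne(x=x)
    self.second = AddOne(x=self.first)
    y = self.second
    return y

wf = Workflow("composed")
wf.add_two = AddTwo(x=40)
wf.run()
```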

Nodes (including macros) can be stored in plain text as python code, and imported by future workflows for easy access. This encourages and supports an ecosystem of useful nodes, so you don't need to re-invent the wheel. When these python files are in a properly managed git repository and released in a stable channel (e.g. conda-forge), they fulfill most requirements of the [FAIR](https://en.wikipedia.org/wiki/FAIR_data) principles.

Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. These can be reloaded (automatically on instantiation, in the case of workflows) and examined/rerun, etc.
Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. These can be reloaded (automatically on instantiation, in the case of workflows) and examined/rerun, etc. If your workflow fails, it will (by default) save a recovery file for you to restore it at the time of failure.
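
A sketch of the save/load cycle described above (method names and the reload-on-instantiation behavior are assumptions based on that description):

```python
wf = Workflow("storage_example")
wf.step = AddOne(x=1)
wf.run()
wf.save()  # explicit save; a checkpoint flag can also save automatically after runs

# A fresh instance with the same label finds the save and reloads automatically
reloaded = Workflow("storage_example")
print(reloaded.step.outputs.y.value)  # 2
```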

## Installation

12 changes: 12 additions & 0 deletions pyiron_workflow/mixin/run.py
@@ -33,6 +33,18 @@ class Runnable(UsesState, HasLabel, HasRun, ABC):
Child classes can optionally override :meth:`process_run_result` to do something
with the returned value of :meth:`on_run`, but by default the returned value just
passes cleanly through the function.
The `run` cycle is broken down into sub-steps:
- `_before_run`: prior to the `running` status being set to `True`
- `_run`: after the `running` status has been set to `True`
- `_finish_run`: what is done to the results of running, and when `running` is
set to `False`
- `_run_exception`: what to do if an exception is encountered while running
- `_run_finally`: what to do after _every_ run, regardless of whether an exception
was encountered
Child classes can extend the behavior of these sub-steps, including introducing
new keyword arguments.
"""

def __init__(self, *args, **kwargs):
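To make the sub-step decomposition in the docstring above concrete, a self-contained toy version of such a run cycle might look like this (an illustration of the pattern only, not pyiron_workflow's actual implementation):

```python
class ToyRunnable:
    """A toy illustration of the run cycle, not the real Runnable class."""

    def __init__(self):
        self.running = False
        self.failed = False

    def run(self, **kwargs):
        self._before_run(**kwargs)  # before the running status is set
        self.running = True
        try:
            result = self._run(**kwargs)     # the actual computation
            return self._finish_run(result)  # process results, unset running
        except Exception as error:
            self._run_exception(error)       # e.g. record a failed status
            raise
        finally:
            self._run_finally()              # always executed, exception or not

    def _before_run(self, **kwargs):
        pass  # e.g. readiness checks; child classes may extend with new kwargs

    def _run(self, **kwargs):
        return None

    def _finish_run(self, result):
        self.running = False
        return result

    def _run_exception(self, error):
        self.failed = True

    def _run_finally(self):
        self.running = False
```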
10 changes: 9 additions & 1 deletion pyiron_workflow/node.py
@@ -76,6 +76,8 @@ class Node(
- In addition to operations, some methods exist for common routines, e.g.
casting the value as `int`.
- When running their computation, nodes may or may not:
- If already running, check for serialized results left by an executor process
that survived the death of its parent python process
- First update their input data values using kwargs
- (Note that since this happens first, if the "fetching" step later occurs,
any values provided here will get overwritten by data that is flowing
@@ -100,10 +102,12 @@
the execution flow
- Running the node (and all aliases of running) return a representation of data
held by the output channels (or a futures object)
- If an error is encountered _after_ reaching the state of actually computing the
- If an error is encountered _after_ reaching the state of actually running the
node's task, the status will get set to failure
- Nodes can be instructed to run at the end of their initialization, but will exit
cleanly if they get to checking their readiness and find they are not ready
- Nodes can suppress raising errors they encounter by setting a runtime keyword
argument.
- Nodes have a label by which they are identified within their scope, and a full
label which is unique among the entire semantic graph they exist within
- Nodes can run their computation using remote resources by setting an executor
@@ -140,6 +144,10 @@
IO data is not pickle-able.
- Saving is triggered manually, or by setting a flag to make a checkpoint save
of the entire graph after the node runs.
- Saving the entire graph can be set to happen at the end of a particular
node's run with a checkpoint flag.
- A specially named recovery file for the entire graph will (by default) be
automatically saved if the node raises an exception.
- The pickle storage interface comes with all the same caveats as pickle and
is not suitable for storage over indefinitely long time periods.
- E.g., if the source code (cells, `.py` files...) for a saved graph is
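A sketch of the failure-handling behavior listed above (the recovery-file default comes from the diff; the `raise_run_exceptions` keyword name is an assumption):

```python
@Workflow.wrap.as_function_node
def Fails(x: int = 0) -> int:
    y = x // 0  # deliberately raises ZeroDivisionError at run time
    return y

wf = Workflow("failure_example")
wf.bad = Fails()

try:
    wf.run()  # default: status set to failed, recovery file saved, error re-raised
except Exception:
    print("graph failed; a recovery file was written by default")

# The raise can be suppressed with a runtime keyword argument
# (the name raise_run_exceptions is an assumption):
wf.bad.run(raise_run_exceptions=False)
```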
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -19,7 +19,7 @@ readme = "docs/README.md"
keywords = [ "pyiron",]
requires-python = ">=3.10, <3.13"
classifiers = [
"Development Status :: 3 - Alpha",
"Development Status :: 4 - Beta",
"Topic :: Scientific/Engineering",
"License :: OSI Approved :: BSD License",
"Intended Audience :: Science/Research",
