Merge branch 'master' into benc-dependency-error-rendering
benclifford authored Jan 8, 2025
2 parents ce429bb + 34a2890 commit f1eaa9b
Showing 115 changed files with 3,987 additions and 3,726 deletions.
5 changes: 4 additions & 1 deletion README.rst
@@ -1,6 +1,6 @@
Parsl - Parallel Scripting Library
==================================
|licence| |docs| |NSF-1550588| |NSF-1550476| |NSF-1550562| |NSF-1550528| |CZI-EOSS|
|licence| |docs| |NSF-1550588| |NSF-1550476| |NSF-1550562| |NSF-1550528| |NumFOCUS| |CZI-EOSS|

Parsl extends parallelism in Python beyond a single computer.

@@ -64,6 +64,9 @@ then explore the `parallel computing patterns <https://parsl.readthedocs.io/en/s
.. |CZI-EOSS| image:: https://chanzuckerberg.github.io/open-science/badges/CZI-EOSS.svg
   :target: https://czi.co/EOSS
   :alt: CZI's Essential Open Source Software for Science
.. |NumFOCUS| image:: https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A
   :target: https://numfocus.org
   :alt: Powered by NumFOCUS


Quickstart
2 changes: 1 addition & 1 deletion docs/devguide/roadmap.rst
@@ -15,7 +15,7 @@ Code Maintenance
* **Type Annotations and Static Type Checking**: Add static type annotations throughout the codebase and add typeguard checks.
* **Release Process**: `Improve the overall release process <https://github.com/Parsl/parsl/issues?q=is%3Aopen+is%3Aissue+label%3Arelease_process>`_ to synchronize docs and code releases, automatically produce changelog documentation.
* **Components Maturity Model**: Defines the `component maturity model <https://github.com/Parsl/parsl/issues/2554>`_ and tags components with their appropriate maturity level.
* **Define and Document Interfaces**: Identify and document interfaces via which `external components <https://parsl.readthedocs.io/en/stable/userguide/plugins.html>`_ can augment the Parsl ecosystem.
* **Define and Document Interfaces**: Identify and document interfaces via which `external components <https://parsl.readthedocs.io/en/stable/userguide/advanced/plugins.html>`_ can augment the Parsl ecosystem.
* **Distributed Testing Process**: All tests should be run against all possible schedulers, using different executors, on a variety of remote systems. Explore the use of containerized schedulers and remote testing on real systems.

New Features and Integrations
10 changes: 5 additions & 5 deletions docs/index.rst
@@ -4,7 +4,7 @@ Parsl - Parallel Scripting Library
Parsl extends parallelism in Python beyond a single computer.

You can use Parsl
`just like Python's parallel executors <userguide/workflow.html#parallel-workflows-with-loops>`_
`just like Python's parallel executors <userguide/workflows/workflow.html#parallel-workflows-with-loops>`_
but across *multiple cores and nodes*.
However, the real power of Parsl is in expressing multi-step workflows of functions.
Parsl lets you chain functions together and will launch each function as inputs and computing resources are available.
@@ -37,8 +37,8 @@ Parsl lets you chain functions together and will launch each function as inputs
Start with the `configuration quickstart <quickstart.html#getting-started>`_ to learn how to tell Parsl how to use your computing resource,
see if `a template configuration for your supercomputer <userguide/configuring.html>`_ is already available,
then explore the `parallel computing patterns <userguide/workflow.html>`_ to determine how to use parallelism best in your application.
see if `a template configuration for your supercomputer <userguide/configuration/examples.html>`_ is already available,
then explore the `parallel computing patterns <userguide/workflows/workflows.html>`_ to determine how to use parallelism best in your application.

Parsl is an open-source code, and available on GitHub: https://github.com/parsl/parsl/

@@ -57,7 +57,7 @@ Parsl works everywhere

*Parsl can run parallel functions on a laptop and the world's fastest supercomputers.*
Scaling from laptop to supercomputer is often as simple as changing the resource configuration.
Parsl is tested `on many of the top supercomputers <userguide/configuring.html>`_.
Parsl is tested `on many of the top supercomputers <userguide/configuration/examples.html>`_.

Parsl is flexible
-----------------
@@ -107,7 +107,7 @@ Table of Contents
quickstart
1-parsl-introduction.ipynb
userguide/index
userguide/glossary
userguide/glossary
faq
reference
devguide/index
14 changes: 7 additions & 7 deletions docs/quickstart.rst
@@ -70,7 +70,7 @@ We describe these components briefly here, and link to more details in the `User

.. note::

Parsl's documentation includes `templates for many supercomputers <userguide/configuring.html>`_.
Parsl's documentation includes `templates for many supercomputers <userguide/configuring/examples.html>`_.
Even though you may not need to write a configuration from a blank slate,
understanding the basic terminology below will be very useful.

@@ -112,7 +112,7 @@ with hello world Python and Bash apps.
    with open('hello-stdout', 'r') as f:
        print(f.read())
Learn more about the types of Apps and their options `here <userguide/apps.html>`__.
Learn more about the types of Apps and their options `here <userguide/apps/index.html>`__.

Executors
^^^^^^^^^
@@ -127,7 +127,7 @@ You can dynamically set the number of workers based on available memory and
pin each worker to specific GPUs or CPU cores
among other powerful features.

Learn more about Executors `here <userguide/execution.html#executors>`__.
Learn more about Executors `here <userguide/configuration/execution.html#executors>`__.

Execution Providers
^^^^^^^^^^^^^^^^^^^
@@ -141,7 +141,7 @@ Another key role of Providers is defining how to start an Executor on a remote c
Often, this simply involves specifying the correct Python environment and
(described below) how to launch the Executor on each acquired computer.

Learn more about Providers `here <userguide/execution.html#execution-providers>`__.
Learn more about Providers `here <userguide/configuration/execution.html#execution-providers>`__.

Launchers
^^^^^^^^^
@@ -151,7 +151,7 @@ A common example is an :class:`~parsl.launchers.launchers.MPILauncher`, which us
for starting a single program on multiple computing nodes.
Like Providers, Parsl comes packaged with Launchers for most supercomputers and clouds.

Learn more about Launchers `here <userguide/execution.html#launchers>`__.
Learn more about Launchers `here <userguide/configuration/execution.html#launchers>`__.


Benefits of a Data-Flow Kernel
@@ -164,7 +164,7 @@ and performs the many other functions needed to execute complex workflows.
The flexibility and performance of the DFK enables applications with
intricate dependencies between tasks to execute on thousands of parallel workers.

Start with the Tutorial or the `parallel patterns <userguide/workflow.html>`_
Start with the Tutorial or the `parallel patterns <userguide/workflows/workflow.html>`_
to see the complex types of workflows you can make with Parsl.

Starting Parsl
@@ -210,7 +210,7 @@ An example which launches 4 workers on 1 node of the Polaris supercomputer looks
)
The documentation has examples for other supercomputers `here <userguide/configuring.html>`__.
The documentation has examples for other supercomputers `here <userguide/configuration/examples.html>`_.

The next step is to load the configuration

File renamed without changes.
9 changes: 9 additions & 0 deletions docs/userguide/advanced/examples/library/app.py
@@ -0,0 +1,9 @@
"""Functions used as part of the workflow"""
from typing import List, Tuple

from .logic import convert_to_binary


def convert_many_to_binary(xs: List[int]) -> List[Tuple[bool, ...]]:
    """Convert a list of nonnegative integers to binary"""
    return [convert_to_binary(x) for x in xs]
23 changes: 23 additions & 0 deletions docs/userguide/advanced/examples/library/config.py
@@ -0,0 +1,23 @@
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider


def make_local_config(cores_per_worker: int = 1) -> Config:
    """Generate a configuration which runs all tasks on the local system

    Args:
        cores_per_worker: Number of cores to dedicate for each task
    Returns:
        Configuration object with the requested settings
    """
    return Config(
        executors=[
            HighThroughputExecutor(
                label="htex_local",
                cores_per_worker=cores_per_worker,
                cpu_affinity='block',
                provider=LocalProvider(),
            )
        ],
    )
15 changes: 15 additions & 0 deletions docs/userguide/advanced/examples/library/logic.py
@@ -0,0 +1,15 @@
from typing import Tuple


def convert_to_binary(x: int) -> Tuple[bool, ...]:
    """Convert a nonnegative integer into binary

    Args:
        x: Number to be converted
    Returns:
        The binary number represented as a tuple of booleans
    """
    if x < 0:
        raise ValueError('`x` must be nonnegative')
    bin_as_string = bin(x)
    return tuple(i == '1' for i in bin_as_string[2:])
7 changes: 7 additions & 0 deletions docs/userguide/advanced/examples/pyproject.toml
@@ -0,0 +1,7 @@
[project]
name = "library"
version = '0.0.0'
description = 'Example library for Parsl documentation'

[tool.setuptools.packages.find]
include = ['library*']
35 changes: 35 additions & 0 deletions docs/userguide/advanced/examples/run.py
@@ -0,0 +1,35 @@
from argparse import ArgumentParser

import parsl

from library.config import make_local_config
from library.app import convert_many_to_binary
from parsl.app.python import PythonApp

# Protect the script from running twice.
# See "Safe importing of main module" in Python multiprocessing docs
# https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming
if __name__ == "__main__":
    # Get user instructions
    parser = ArgumentParser()
    parser.add_argument('--numbers-per-batch', default=8, type=int)
    parser.add_argument('numbers', nargs='+', type=int)
    args = parser.parse_args()

    # Prepare the workflow functions
    convert_app = PythonApp(convert_many_to_binary, cache=False)

    # Load the configuration
    # As a context manager so resources are shutdown on exit
    with parsl.load(make_local_config()):

        # Spawn tasks
        futures = [
            convert_app(args.numbers[start:start + args.numbers_per_batch])
            for start in range(0, len(args.numbers), args.numbers_per_batch)
        ]

        # Retrieve task results
        for future in futures:
            for x, b in zip(future.task_record['args'][0], future.result()):
                print(f'{x} -> {"".join("1" if i else "0" for i in b)}')
13 changes: 13 additions & 0 deletions docs/userguide/advanced/index.rst
@@ -0,0 +1,13 @@
Advanced Topics
===============

More to learn about Parsl after starting a project.

.. toctree::
    :maxdepth: 2

    modularizing
    usage_tracking
    monitoring
    parsl_perf
    plugins
109 changes: 109 additions & 0 deletions docs/userguide/advanced/modularizing.rst
@@ -0,0 +1,109 @@
.. _codebases:

Structuring Parsl programs
--------------------------

While it is convenient to build simple Parsl programs as a single Python file,
splitting a Parsl program into multiple files and a Python module
has significant benefits, including:

1. Better readability
2. Logical separation of components (e.g., apps, config, and control logic)
3. Ease of reuse of components

Large applications that use Parsl often divide into several core components:

.. contents::
    :local:
    :depth: 2

The following sections use an example where each component is in a separate file:

.. code-block::

    library/logic.py
    library/app.py
    library/config.py
    library/__init__.py
    run.py
    pyproject.toml

Run the application by first installing the Python library and then executing the "run.py" script.

.. code-block:: bash

    pip install .  # Install module so it can be imported by workers
    python run.py

Core application logic
======================

The core application logic should be developed without regard to Parsl.
Implement capabilities, write unit tests, and prepare documentation
in whichever way works best for the problem at hand.

Parallelization with Parsl will be easy if the software already follows best practices.

The example defines a function to convert a single integer into binary.

.. literalinclude:: examples/library/logic.py
    :caption: library/logic.py
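
Because the core logic does not touch Parsl, it can be exercised with plain assertions. The following is a minimal sketch, not part of the example library: it repeats ``convert_to_binary`` from ``library/logic.py`` so that it runs on its own, and the two test function names are made up for illustration.

```python
from typing import Tuple


def convert_to_binary(x: int) -> Tuple[bool, ...]:
    """Copy of the function in library/logic.py, repeated so this sketch is standalone."""
    if x < 0:
        raise ValueError('`x` must be nonnegative')
    return tuple(i == '1' for i in bin(x)[2:])


def test_known_values() -> None:
    # Hypothetical test name; 5 is 101 in binary
    assert convert_to_binary(5) == (True, False, True)
    # bin(0) is '0b0', so zero becomes a single False digit
    assert convert_to_binary(0) == (False,)


def test_negative_input_rejected() -> None:
    # Hypothetical test name; negative input should raise ValueError
    try:
        convert_to_binary(-1)
    except ValueError:
        return
    raise AssertionError('expected ValueError for negative input')


# Run the checks directly; a test runner such as pytest could collect them instead
test_known_values()
test_negative_input_rejected()
```

No Parsl configuration is loaded at any point, which is the benefit of keeping the core logic separate.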

Workflow functions
==================

Tasks within a workflow may require unique combinations of core functions.
Functions to be run in parallel must also meet :ref:`specific requirements <function-rules>`
that may complicate writing the core logic effectively.
As such, separating functions to be used as Apps is often beneficial.

The example includes a function to convert many integers into binary.

Key points to note:

- It is not necessary to have import statements inside the function.
Parsl will serialize this function by reference, as described in :ref:`functions-from-modules`.

- The function is not yet marked as a Parsl PythonApp.
Keeping Parsl out of the function definitions simplifies testing
because you will not need to run Parsl when testing the code.

- *Advanced*: Consider including Parsl decorators in the library if using complex workflow patterns,
such as :ref:`join apps <label-joinapp>` or functions which take :ref:`special arguments <special-kwargs>`.

.. literalinclude:: examples/library/app.py
    :caption: library/app.py
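
To give a rough sense of what serializing a function "by reference" means, the sketch below uses the standard library's ``pickle`` on ``json.dumps``, a function that lives in an importable module. This only illustrates the general idea with stock Python tools and is an assumption-laden stand-in; how Parsl itself serializes functions is described in the section referenced above.

```python
import json
import pickle

# Pickling a module-level function stores its module and name, not its code
payload = pickle.dumps(json.dumps)

# The payload names the module and the function
assert b'json' in payload
assert b'dumps' in payload

# Unpickling imports the module and looks the function up again,
# so the receiving process must be able to import that module
restored = pickle.loads(payload)
assert restored is json.dumps
```

This is why the example installs the library with ``pip install .`` before running: a function referenced by name can only be resolved where its module is importable.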


Parsl configuration functions
=============================

Create Parsl configurations specific to your application needs as functions.
While not necessary, including the Parsl configuration functions inside the module
ensures they can be imported into other scripts easily.

Generating Parsl :class:`~parsl.config.Config` objects from a function
makes it possible to change the configuration without editing the module.

The example function provides a configuration suited for a single node.

.. literalinclude:: examples/library/config.py
    :caption: library/config.py

Orchestration Scripts
=====================

The last file defines the workflow itself.

Such orchestration scripts perform at least four tasks:

1. *Load execution options* using a tool like :mod:`argparse`.
2. *Prepare workflow functions for execution* by creating :class:`~parsl.app.python.PythonApp` wrappers over each function.
3. *Create a configuration, then start Parsl* with the :meth:`parsl.load` function.
4. *Launch tasks and retrieve results* depending on the needs of the application.

An example run script is as follows:

.. literalinclude:: examples/run.py
    :caption: run.py
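
The batching and result-printing pattern in the run script can be tried without a Parsl installation. The sketch below is a stand-in under stated assumptions: it swaps Parsl for the standard library's ``ThreadPoolExecutor``, and ``make_batches`` and ``run`` are hypothetical helper names, not part of the example. It also omits the negative-number check for brevity, and it keeps each batch next to its future instead of reading inputs back from ``future.task_record``.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def convert_many_to_binary(xs: List[int]) -> List[Tuple[bool, ...]]:
    """Simplified copy of the workflow function (no negative-number check)."""
    return [tuple(c == '1' for c in bin(x)[2:]) for x in xs]


def make_batches(numbers: List[int], per_batch: int) -> List[List[int]]:
    """Hypothetical helper: slice the inputs into chunks of at most per_batch."""
    return [numbers[start:start + per_batch]
            for start in range(0, len(numbers), per_batch)]


def run(numbers: List[int], per_batch: int) -> List[str]:
    """Hypothetical helper: submit one task per batch, then format the results."""
    batches = make_batches(numbers, per_batch)
    lines = []
    # ThreadPoolExecutor stands in for Parsl here; it is not how Parsl runs tasks
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(convert_many_to_binary, batch) for batch in batches]
        for batch, future in zip(batches, futures):
            for x, bits in zip(batch, future.result()):
                lines.append(f'{x} -> {"".join("1" if b else "0" for b in bits)}')
    return lines


print("\n".join(run([1, 2, 3, 4, 5], per_batch=2)))
```

Submitting all batches before collecting any result, as the run script does, lets the tasks execute concurrently while the results are still printed in input order.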