Merge branch 'demo-firecrestspawner' into 'master'
Add JupyterHub demo

See merge request firecrest/firecrest!327
rsarm committed Dec 6, 2024
2 parents f51384a + 3dffa8a commit bcd6ce2
Showing 7 changed files with 359 additions and 0 deletions.
9 changes: 9 additions & 0 deletions doc/source/usecases.rst
@@ -20,3 +20,12 @@ FirecREST Operators for Airflow
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this `example <https://github.com/eth-cscs/firecrest/examples/UI-code-flow>`__ we define an Airflow graph combining small tasks which run locally on a laptop with compute-intensive tasks that must run on an HPC system. The idea is to add support in Airflow for executing the compute-intensive tasks on a supercomputer via FirecREST. For that we are going to write custom Airflow operators that will use FirecREST to access the HPC system.

JupyterHub with FirecRESTSpawner
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is a `tutorial <https://github.com/eth-cscs/firecrest/examples/jupyterhub>`__ on how to run `JupyterHub <https://jupyterhub.readthedocs.io/en/stable/>`__ with `FirecRESTSpawner <https://github.com/eth-cscs/firecrestspawner>`__ using the `Docker demo of FirecREST <https://github.com/eth-cscs/firecrest/tree/master/deploy/demo>`__.

FirecRESTSpawner is a tool for launching Jupyter Notebook servers from JupyterHub on HPC clusters through FirecREST.
It can be deployed on Kubernetes as part of JupyterHub and configured to target different systems.
In this tutorial, we will set up a simplified environment on a local machine, including a `Docker Compose <https://docs.docker.com/compose>`__ deployment of FirecREST, a single-node Slurm cluster and a `Keycloak <https://www.keycloak.org>`__ server which will be used as identity provider for the authentication. Then we will install JupyterHub locally and configure it to launch notebooks on the Slurm cluster.
36 changes: 36 additions & 0 deletions examples/jupyterhub/Dockerfile
@@ -0,0 +1,36 @@
FROM --platform=linux/amd64 f7t-cluster

RUN yum update -y && \
yum install -y sqlite-devel openssl-devel bzip2-devel libffi-devel wget && \
yum groupinstall -y "Development Tools"

RUN wget https://github.com/openssl/openssl/releases/download/OpenSSL_1_1_1/openssl-1.1.1.tar.gz && \
tar -xzvf openssl-1.1.1.tar.gz && \
cd openssl-1.1.1 && \
./config --prefix=/usr/openssl1.1 --openssldir=/etc/ssl --libdir=lib no-shared zlib-dynamic && \
make && \
make install && \
rm -r ../openssl-1.1.1.tar.gz ../openssl-1.1.1

RUN wget https://www.python.org/ftp/python/3.10.2/Python-3.10.2.tgz && \
tar -xzf Python-3.10.2.tgz && \
cd Python-3.10.2 && \
export TCLTK_LIBS='-ltk8.5 -ltcl8.5' && \
./configure --enable-shared --with-openssl=/usr/openssl1.1 --with-openssl-rpath=auto --enable-optimizations && \
make altinstall

ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

RUN python3.10 -m venv /opt/jhub-env

RUN . /opt/jhub-env/bin/activate && \
python3.10 -m pip install --no-cache jupyterlab jupyterhub==4.1.6 pyfirecrest==2.1.0 SQLAlchemy==1.4.52 oauthenticator==16.3.1 notebook==7.2.1

RUN . /opt/jhub-env/bin/activate && \
git clone https://github.com/eth-cscs/firecrestspawner.git && \
cd firecrestspawner && \
pip install --no-cache . && \
cd .. && \
rm -r firecrestspawner

RUN localedef -i en_US -f UTF-8 en_US.UTF-8
158 changes: 158 additions & 0 deletions examples/jupyterhub/README.md
@@ -0,0 +1,158 @@
# FirecRESTSpawner on the Docker demo


This tutorial explains how to run [JupyterHub](https://jupyterhub.readthedocs.io/en/stable/) with [FirecRESTSpawner](https://github.com/eth-cscs/firecrestspawner) using the [Docker demo of FirecREST](https://github.com/eth-cscs/firecrest/tree/master/deploy/demo).

FirecRESTSpawner is a tool for launching Jupyter Notebook servers from JupyterHub on HPC clusters through [FirecREST](https://firecrest.readthedocs.io/en/stable/).
It can be deployed on Kubernetes as part of JupyterHub and configured to target different systems.

In this tutorial, we will set up a simplified environment on a local machine, including:

- a [Docker Compose](https://docs.docker.com/compose) deployment of FirecREST, a single-node Slurm cluster and a [Keycloak](https://www.keycloak.org) server, which will be used as the identity provider for authentication
- a local installation of JupyterHub, configured to launch notebooks on the Slurm cluster

This deployment not only demonstrates the use case but also serves as a platform for testing and developing FirecRESTSpawner.


## Requirements

For this tutorial you will need

* a recent installation of Docker, which includes the `docker compose` command (or the older `docker-compose` command line tool)
* a Python installation (version 3.9 or higher)


## Setup


### Building images from FirecREST's Docker Compose demo

This tutorial builds on the Docker demo of FirecREST.
We will use a small [docker-compose.yml](docker-compose.yml) file to override some settings in the FirecREST demo.
This can be done by passing both files to the `docker compose` command.
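
As a rough mental model of what passing two files does, the sketch below merges an override mapping on top of a base mapping: nested mappings are merged key by key, and any other value from the override replaces the base value. This is a simplification written for illustration, not Docker Compose's actual implementation (Compose applies additional per-field rules, for example for lists such as `ports`). The function name `merge_override` and the `hostname` entry are hypothetical.

```python
def merge_override(base, override):
    """Return a new dict where `override` is merged on top of `base`.

    Simplified model: nested dicts are merged recursively, any other
    value in `override` replaces the one in `base`.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_override(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical excerpt of a base file and of the override used in this tutorial
base = {"services": {"cluster": {"image": "f7t-cluster", "hostname": "cluster"}}}
override = {"services": {"cluster": {"image": "f7t-cluster:jhub"}}}

result = merge_override(base, override)
print(result["services"]["cluster"]["image"])  # f7t-cluster:jhub
```

Settings that the override does not mention (here `hostname`) are kept from the base file, which is why the override file can stay small.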

To get started, let's clone the FirecREST repository

```bash
git clone https://github.com/eth-cscs/firecrest.git
```

and build the images used in its [Docker Compose demo](https://github.com/eth-cscs/firecrest/tree/master/deploy/demo):

```bash
cd firecrest/deploy/demo/
docker compose build
```

This step takes a few minutes. In the meantime we can install JupyterHub in a local virtual environment.


### Install JupyterHub and FirecRESTSpawner

An easy way to install JupyterHub is via [Miniconda](https://docs.anaconda.com/miniconda/install/).
We need to [download the Miniconda installer](https://docs.anaconda.com/miniconda/install/) for our platform and install it using the following command

```bash
bash Miniconda3-latest-<arch>.sh -p /path/to/mc-jhub -b
```

Here we use `-p` to pass the absolute path to the install directory and `-b` to accept the [terms of service](https://legal.anaconda.com/policies/en/).

We can activate our conda base environment and install configurable-http-proxy, JupyterHub and FirecRESTSpawner

```bash
. /path/to/mc-jhub/bin/activate
conda install -y configurable-http-proxy
pip install --no-cache jupyterhub==4.1.6 pyfirecrest==2.6.0 SQLAlchemy==1.4.52 oauthenticator==16.3.1 python-hostlist==1.23.0

git clone https://github.com/eth-cscs/firecrestspawner.git
cd firecrestspawner
pip install --no-cache .
```
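
To confirm that the pinned versions ended up in the environment, a small check with the standard library's `importlib.metadata` can help. This is an optional sketch: the helper name `find_mismatches` is chosen for this example, and the expected versions are copied from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in the pip command above
EXPECTED = {
    "jupyterhub": "4.1.6",
    "pyfirecrest": "2.6.0",
    "SQLAlchemy": "1.4.52",
    "oauthenticator": "16.3.1",
    "python-hostlist": "1.23.0",
}


def find_mismatches(expected, get_version=version):
    """Map each package whose installed version differs to what was found."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    print(problems or "all pinned packages match")
```

The `get_version` parameter exists only so the function can be exercised without touching the real environment.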

## Deployment of FirecREST and Slurm cluster

Once all the images have been built, we can move to the tutorial directory and deploy the [docker-compose.yml](docker-compose.yml).

```bash
cd firecrest/examples/jupyterhub
chmod 400 ../../deploy/test-build/environment/keys/ca-key ../../deploy/test-build/environment/keys/user-key
export JHUB_DOCKERFILE_DIR=$PWD
docker compose -f ../../deploy/demo/docker-compose.yml -f docker-compose.yml up --build
```

The `chmod` command we run before `docker compose` comes from the Docker demo of FirecREST.
It's needed so that the SSH private keys for accessing the Slurm cluster are readable only by their owner on the host machine.
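
The effect of `chmod 400` can be verified from Python with the standard `os` and `stat` modules. The sketch below uses a temporary file instead of the real keys so that it can run on any POSIX system; `permission_bits` is a helper name chosen for this example.

```python
import os
import stat
import tempfile


def permission_bits(path):
    """Return the permission bits of `path`, e.g. 0o400."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Use a throwaway file to stand in for a private key
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    key_path = tmp.name

os.chmod(key_path, 0o400)       # same as `chmod 400 <file>`
mode = permission_bits(key_path)
print(oct(mode))                # 0o400: read-only for the owner, nothing for others

os.remove(key_path)
```

Calling `permission_bits` on the two key files named in the `chmod` command should give the same value once the command has been run.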

This step will create a new image that extends the `f7t-cluster` image from the Docker demo of FirecREST to include JupyterLab and other requirements.
The process may take a few minutes, as some dependencies for JupyterLab need to be built from source.

Once that's finished, you can check that all containers are running

```bash
docker compose -p demo ps --format 'table {{.ID}}\t{{.Name}}\t{{.State}}'
```

That should show something like this

```bash
CONTAINER ID NAME STATE
fa355219633c certificator running
8dada9a2f57a cluster running
bd5f33b3b34e compute running
8b8029c9bec2 fckeycloak running
e66970df55a8 jaeger running
1be08e3707f4 kong running
9dd5a68a84b0 minio running
33ce4e9df9c5 opa running
b0cfba2eb816 openapi running
974356ee229a reservations running
143375c02912 status running
c424bca5efef storage running
5004cc49e1b8 taskpersistence running
163d91b0bd8d tasks running
5239294e62bb utilities running
```
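
With this many containers it is easy to overlook one that has stopped. As a sketch, the table printed by the command above can be scanned programmatically; `containers_not_running` is a hypothetical helper, and the sample text (including the `exited` row) is made up for illustration.

```python
def containers_not_running(table_text):
    """Return the names of containers whose STATE column is not 'running'.

    Expects the three-column table (ID, name, state) printed by the
    `docker compose ps` command above.
    """
    not_running = []
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore lines that do not look like table rows
        _container_id, name, state = fields
        if state != "running":
            not_running.append(name)
    return not_running


sample = """\
CONTAINER ID   NAME      STATE
8dada9a2f57a   cluster   running
1be08e3707f4   kong      exited
"""
print(containers_not_running(sample))  # ['kong']
```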

When we are done with the tutorial, the deployment can be shut down by pressing `Ctrl+C` and then running

```bash
cd firecrest/examples/jupyterhub
docker compose -f ../../deploy/demo/docker-compose.yml -f docker-compose.yml down
```

### Setting up the authorization

A requirement for running JupyterHub with FirecRESTSpawner is to use an authenticator that prompts users for a username and password and obtains an access token in exchange.
That token is then passed to the spawner, allowing users to authenticate with FirecREST when submitting, stopping or polling for jobs.
For this purpose, we will use an Authorization Code Flow client, which we need to create on the Keycloak web interface.
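
For orientation, the first step of the Authorization Code Flow is a browser redirect to the identity provider's authorization endpoint with a handful of standard OAuth 2.0 query parameters. The sketch below builds such a URL from the endpoint, client ID, callback URL and scopes used in this tutorial's configuration. The function name `build_authorize_url` is ours; in practice the authenticator constructs this request for us.

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "http://localhost:8080/auth/realms/kcrealm/protocol/openid-connect/auth"


def build_authorize_url(client_id, redirect_uri, scopes, state):
    """Build the URL that starts the Authorization Code Flow."""
    params = {
        "response_type": "code",      # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # scopes are space-separated
        "state": state,               # opaque value echoed back to the client
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


url = build_authorize_url(
    client_id="jhub-client",
    redirect_uri="http://localhost:8003/hub/oauth_callback",
    scopes=["openid", "profile", "firecrest"],
    state=secrets.token_urlsafe(16),
)
print(url)
```

After a successful login, Keycloak redirects the browser back to the callback URL with a `code` parameter, which the client then exchanges for the access token at the token endpoint.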

Let's go to the [Clients page](http://localhost:8080/auth/admin/master/console/#/realms/kcrealm/clients) in Keycloak (username: admin, password: admin2) within the `kcrealm` realm.
We click on "Create" and then on "Select file".
A file system explorer will open.
Navigate to the tutorial's directory, choose the [jhub-client.json](jhub-client.json) file and click on "Save".

Once that's done, the client `jhub-client` can be seen listed on the "Clients" tab of the side panel.
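
It can be useful to check that the client definition agrees with the JupyterHub configuration, since an authorization server rejects requests whose redirect URI is not registered for the client. This is a sketch: `check_client` is a hypothetical helper, and the dictionary below reproduces only the relevant fields of `jhub-client.json`.

```python
def check_client(client, callback_url, required_scopes):
    """Return a list of problems found in a Keycloak client definition."""
    problems = []
    if not client.get("standardFlowEnabled", False):
        problems.append("standard (authorization code) flow is disabled")
    if callback_url not in client.get("redirectUris", []):
        problems.append(f"redirect URI {callback_url} is not registered")
    available = set(client.get("defaultClientScopes", [])) | set(
        client.get("optionalClientScopes", [])
    )
    for scope in required_scopes:
        if scope not in available:
            problems.append(f"scope {scope} is not assigned to the client")
    return problems


# Relevant fields copied from jhub-client.json
client = {
    "clientId": "jhub-client",
    "standardFlowEnabled": True,
    "redirectUris": ["http://localhost:8003/hub/oauth_callback"],
    "defaultClientScopes": ["web-origins", "acr", "firecrest", "roles", "profile", "email"],
    "optionalClientScopes": ["address", "phone", "offline_access", "microprofile-jwt"],
}

# Only the scopes that appear in the client's scope lists are checked here
print(check_client(client, "http://localhost:8003/hub/oauth_callback", ["profile", "firecrest"]))  # []
```

To run the check against the real file, the dictionary can be replaced with the result of `json.load` on `jhub-client.json`.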


### Launching JupyterHub

The [configuration file](jupyterhub-config.py) provided in this tutorial has all the settings needed for using JupyterHub with our deployment.

> Depending on the platform and Docker setup, you may need to adjust a few lines in the configuration to set the correct host IP address for the Docker bridge network.
> On most Linux systems, you can find this address with `ip addr show docker0`.
> It's typically `172.17.0.1`.
> If JupyterHub gets a timeout when launching a notebook, you can try replacing the two instances of `host.docker.internal` in the configuration with that IP address.

Now we can run JupyterHub with

```bash
. /path/to/mc-jhub/bin/activate
. env.sh
jupyterhub --config jupyterhub-config.py --port 8003 --ip 0.0.0.0
```
Here we are sourcing the file [env.sh](env.sh), which defines environment variables needed by the spawner (more information can be found [here](https://firecrestspawner.readthedocs.io/en/latest/authentication.html)).
We use port `8003` for JupyterHub since the default one, `8000`, is already used by FirecREST in the deployment.
The IP address `0.0.0.0` is necessary to allow JupyterLab to connect back to JupyterHub.
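
If JupyterHub is started from a shell where `env.sh` has not been sourced, the spawner will be missing these settings. A quick check can be sketched as follows; `missing_variables` is a name chosen for this example, and the variable names are the ones exported in `env.sh`.

```python
import os

# Variables exported by env.sh
REQUIRED_VARS = ("FIRECREST_URL", "SA_CLIENT_ID", "SA_CLIENT_SECRET", "SA_AUTH_TOKEN_URL")


def missing_variables(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Example with an incomplete environment
example_env = {
    "FIRECREST_URL": "http://localhost:8000",
    "SA_CLIENT_ID": "firecrest-sample",
}
print(missing_variables(example_env))  # ['SA_CLIENT_SECRET', 'SA_AUTH_TOKEN_URL']
```

Calling `missing_variables()` without arguments inspects the current process environment instead of the example dictionary.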

JupyterHub should be accessible in the browser at [http://localhost:8003](http://localhost:8003/) (username: test1, password: test11) and it should be possible to launch notebooks on the Slurm cluster.
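
If notebooks time out as described in the note about `host.docker.internal`, the substitution in the configuration can be done by hand or scripted. The sketch below shows the idea on a single line of the batch script; `use_bridge_ip` is a hypothetical helper and `172.17.0.1` is only the typical bridge address mentioned in that note.

```python
def use_bridge_ip(config_text, bridge_ip, placeholder="host.docker.internal"):
    """Replace every occurrence of `placeholder` with `bridge_ip`.

    Returns the new text and the number of replacements made.
    """
    count = config_text.count(placeholder)
    return config_text.replace(placeholder, bridge_ip), count


snippet = 'export JUPYTERHUB_API_URL="http://host.docker.internal:8003/hub/api"'
new_text, replaced = use_bridge_ip(snippet, "172.17.0.1")
print(new_text)  # export JUPYTERHUB_API_URL="http://172.17.0.1:8003/hub/api"
```

Applied to the whole of `jupyterhub-config.py`, the returned count should be 2, matching the two instances mentioned in the note.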
9 changes: 9 additions & 0 deletions examples/jupyterhub/docker-compose.yml
@@ -0,0 +1,9 @@
services:
  cluster:
    image: "f7t-cluster:jhub"
    build:
      context: .
      dockerfile: $JHUB_DOCKERFILE_DIR/Dockerfile
      network: host
    ports:
      - "56123:56123"
4 changes: 4 additions & 0 deletions examples/jupyterhub/env.sh
@@ -0,0 +1,4 @@
export FIRECREST_URL=http://localhost:8000
export SA_CLIENT_ID=firecrest-sample
export SA_CLIENT_SECRET=b391e177-fa50-4987-beaf-e6d33ca93571
export SA_AUTH_TOKEN_URL=http://localhost:8080/auth/realms/kcrealm/protocol/openid-connect/token
74 changes: 74 additions & 0 deletions examples/jupyterhub/jhub-client.json
@@ -0,0 +1,74 @@
{
"clientId": "jhub-client",
"name": "jhub-client",
"surrogateAuthRequired": false,
"enabled": true,
"alwaysDisplayInConsole": false,
"clientAuthenticatorType": "client-secret",
"secret": "Ap45Agq2KpnbTXUGQxaUF1WiVOPm8Wf0",
"redirectUris": [
"http://localhost:8003/hub/oauth_callback"
],
"webOrigins": [],
"notBefore": 0,
"bearerOnly": false,
"consentRequired": false,
"standardFlowEnabled": true,
"implicitFlowEnabled": false,
"directAccessGrantsEnabled": false,
"serviceAccountsEnabled": false,
"publicClient": false,
"frontchannelLogout": false,
"protocol": "openid-connect",
"attributes": {
"saml.force.post.binding": "false",
"saml.multivalued.roles": "false",
"frontchannel.logout.session.required": "false",
"oauth2.device.authorization.grant.enabled": "false",
"backchannel.logout.revoke.offline.tokens": "false",
"saml.server.signature.keyinfo.ext": "false",
"use.refresh.tokens": "true",
"oidc.ciba.grant.enabled": "false",
"backchannel.logout.session.required": "true",
"client_credentials.use_refresh_token": "false",
"require.pushed.authorization.requests": "false",
"saml.client.signature": "false",
"saml.allow.ecp.flow": "false",
"id.token.as.detached.signature": "false",
"saml.assertion.signature": "false",
"client.secret.creation.time": "1731487602",
"saml.encrypt": "false",
"saml.server.signature": "false",
"exclude.session.state.from.auth.response": "false",
"saml.artifact.binding": "false",
"saml_force_name_id_format": "false",
"acr.loa.map": "{}",
"tls.client.certificate.bound.access.tokens": "false",
"saml.authnstatement": "false",
"display.on.consent.screen": "false",
"token.response.type.bearer.lower-case": "false",
"saml.onetimeuse.condition": "false"
},
"authenticationFlowBindingOverrides": {},
"fullScopeAllowed": true,
"nodeReRegistrationTimeout": -1,
"defaultClientScopes": [
"web-origins",
"acr",
"firecrest",
"roles",
"profile",
"email"
],
"optionalClientScopes": [
"address",
"phone",
"offline_access",
"microprofile-jwt"
],
"access": {
"view": true,
"configure": true,
"manage": true
}
}
69 changes: 69 additions & 0 deletions examples/jupyterhub/jupyterhub-config.py
@@ -0,0 +1,69 @@
import secrets
from firecrestspawner.spawner import SlurmSpawner
from oauthenticator.generic import GenericOAuthenticator


def gen_hex_string(num_bytes=32, num_hex_strings=4):
    """Generate keys to encode the auth state"""
    hex_strings = [secrets.token_hex(num_bytes)
                   for i in range(num_hex_strings)]
    return hex_strings


c = get_config()

c.JupyterHub.authenticator_class = GenericOAuthenticator

# Keycloak setup
c.Authenticator.client_id = "jhub-client"
c.Authenticator.client_secret = "Ap45Agq2KpnbTXUGQxaUF1WiVOPm8Wf0"
c.Authenticator.oauth_callback_url = "http://localhost:8003/hub/oauth_callback"
c.Authenticator.authorize_url = "http://localhost:8080/auth/realms/kcrealm/protocol/openid-connect/auth"
c.Authenticator.token_url = "http://localhost:8080/auth/realms/kcrealm/protocol/openid-connect/token"
c.Authenticator.userdata_url = "http://localhost:8080/auth/realms/kcrealm/protocol/openid-connect/userinfo"
c.Authenticator.login_service = "http://localhost:8080"
c.Authenticator.username_claim = "preferred_username"
c.Authenticator.userdata_params = {"state": "state"}
c.Authenticator.scope = ["openid", "profile", "firecrest"]

# Hub access
c.Authenticator.admin_users = {"test1"}
c.Authenticator.allow_all = True

# Auth state enabled
c.Authenticator.enable_auth_state = True
c.CryptKeeper.keys = gen_hex_string()

c.JupyterHub.default_url = "/hub/home"

# Spawner setup
c.JupyterHub.spawner_class = SlurmSpawner
c.Spawner.req_host = "cluster"
c.Spawner.cmd = "firecrestspawner-singleuser jupyterhub-singleuser"
c.Spawner.enable_aux_fc_client = True
c.Spawner.node_name_template = "localhost"
c.Spawner.port = 56123
c.Spawner.batch_script = """#!/bin/bash
#SBATCH --job-name=jhub
export JUPYTERHUB_API_URL="http://host.docker.internal:8003/hub/api"
export JUPYTERHUB_ACTIVITY_URL="http://host.docker.internal:8003/hub/api/users/${USER}/activity"
export JUPYTERHUB_OAUTH_ACCESS_SCOPES=$(echo $JUPYTERHUB_OAUTH_ACCESS_SCOPES | base64 --decode)
export JUPYTERHUB_OAUTH_SCOPES=$(echo $JUPYTERHUB_OAUTH_SCOPES | base64 --decode)
export JUPYTERHUB_CRYPT_KEY=$(/usr/openssl1.1/bin/openssl rand -hex 32)
export PATH=/usr/openssl1.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
set -euo pipefail
. /opt/jhub-env/bin/activate
trap 'echo SIGTERM received' TERM
# {{prologue}}
{% if srun %}{{srun}}{% endif %} {{cmd}}
echo "jupyterhub-singleuser ended gracefully"
# {{epilogue}}
"""
