From 85b10309eda9efda2e7f0ac0ecf9476ae59c193f Mon Sep 17 00:00:00 2001
From: shane knapp
Date: Thu, 24 Oct 2024 10:45:19 -0700
Subject: [PATCH] updating the ORM database cleanup doc

---
 docs/tasks/remove-users-orm.qmd | 56 +++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 10 deletions(-)

diff --git a/docs/tasks/remove-users-orm.qmd b/docs/tasks/remove-users-orm.qmd
index 6440f0476..c91d0eb16 100644
--- a/docs/tasks/remove-users-orm.qmd
+++ b/docs/tasks/remove-users-orm.qmd
@@ -15,19 +15,55 @@ a while. Note that this does not delete the user's storage.
 The script `scripts/delete-unused-users.py` will delete anyone who hasn't
 registered any activity in a given period of time, double checking to
 make sure they aren't active right now. This will require users to log in again the next
-time they use the hub, but that is probably fine.
+time they use the hub.
 
 This should be done before the start of each semester, particularly on hubs
 with a lot of users.
 
-## Run the script
+## Running the script
 
-You can run the script on your own device. The script depends on the
-`jhub_client` python library. This can be installed with
-`pip install jhub_client`.
+```
+./delete-unused-users.py --help
+usage: delete-unused-users.py [-h] [-c CREDENTIALS] [-H HUB_URL] [--dry_run]
+                              [--inactive_since INACTIVE_SINCE] [-v] [-d]
 
-1. You will need to acquire a JupyterHub API token with administrative
-   rights. A hub admin can go to `{hub_url}/hub/token` to create a new
-   one.
-2. Set the environment variable `JUPYTERHUB_API_TOKEN` to the token.
-3. Run `python scripts/delete-unused-users.py --hub_url {hub_url}`
+options:
+  -h, --help            show this help message and exit
+  -c CREDENTIALS, --credentials CREDENTIALS
+                        Path to a json file containing hub url and api keys.
+                        Format is: {"hub1_url": "hub1_key", "hub2_url":, "hub2_key"}
+  -H HUB_URL, --hub_url HUB_URL
+                        Fully qualified URL to the JupyterHub. You must also
+                        set the JUPYTERHUB_API_TOKEN environment variable with
+                        the API key.
+  --dry_run             Dry run without deleting users.
+  --inactive_since INACTIVE_SINCE
+                        Period of inactivity after which users are considered
+                        for deletion (literal string constructor values for
+                        timedelta objects).
+  -v, --verbose         Set info log level.
+  -d, --debug           Set debug log level.
+```
+
+The 'best' way to run this script is to log in to each hub and in the Admin
+page, generate a token. The URL will be `{hub_url}/hub/token`. You can store
+the tokens in a JSON configuration file on your device with the following
+format:
+
+```
+{
+  "https://a11y.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://astro.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://biology.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://cee.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://data8.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://data100.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
+  "https://data101.datahub.berkeley.edu": "XXXXXXXXXXXXXXXXXXXXXXXXXXX"
+}
+```
+
+Then you can execute the script as such:
+
+```
+./delete-unused-users.py -c ~/.datahub/hub-api-tokens.json -v --inactive_since=days=30
+```
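
The help text describes `--inactive_since` as "literal string constructor values for timedelta objects", and the example invocation passes `days=30`. Below is a minimal sketch of how such a value could be turned into a `timedelta`, assuming the option accepts comma-separated `key=value` pairs named after `timedelta` keyword arguments. The function name `parse_timedelta_spec` is hypothetical and is not taken from `delete-unused-users.py`; the script's own parsing may differ.

```python
from datetime import timedelta


def parse_timedelta_spec(spec: str) -> timedelta:
    """Turn a string such as 'days=30' or 'days=1,hours=12' into a timedelta.

    Hypothetical helper for illustration only: it assumes comma-separated
    key=value pairs whose keys are valid timedelta keyword arguments.
    """
    kwargs = {}
    for part in spec.split(","):
        # Split each pair on the first '=' only, e.g. 'days=30' -> ('days', '30')
        key, _, value = part.partition("=")
        kwargs[key.strip()] = float(value)
    # timedelta raises TypeError for unknown keys, which surfaces typos early
    return timedelta(**kwargs)


if __name__ == "__main__":
    print(parse_timedelta_spec("days=30"))
```

Under these assumptions, `--inactive_since=days=30` would correspond to `timedelta(days=30)`, and a value like `days=1,hours=12` would combine both units.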