Skip to content

Commit

Permalink
docs/document how to setup hf oauth (#5125)
Browse files Browse the repository at this point in the history
# Pull Request Template
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

COMMENTS:
- I'm not sure about the index shown due to the installation.md file (as
it encompass the library installation, mention the server and some
developer info), maybe it would be useful to also add a basic docker.md
as they are the two basic ways of deploying.
- I also thought of mentioning that they can use Oauth with other apps,
but while we don't add other type of deployments (community ones) I
decided not to confuse.

Closes #5091 

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->
mkdocs serve

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation

---------

Co-authored-by: David Berenstein <[email protected]>
  • Loading branch information
sdiazlor and davidberenstein1957 authored Jul 5, 2024
1 parent 1e6cb47 commit b31e607
Show file tree
Hide file tree
Showing 7 changed files with 155 additions and 7 deletions.
7 changes: 2 additions & 5 deletions argilla-server/docker/quickstart/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -132,8 +132,5 @@ This will run the latest quickstart docker image with 3 users `owner`, `admin`,
- `ANNOTATOR_PASSWORD`: This sets a custom password for login into the app with the `argilla` username. The default password
is `12345678`. By setting up a custom password you can use your own password to login into the app.
- `ARGILLA_WORKSPACE`: The name of a workspace that will be created and used by default for admin and annotator users. The default value will be the one defined by `ADMIN_USERNAME` environment variable.
- `LOAD_DATASETS`: This variables will allow you to load sample datasets. The default value will be `full`. The
supported values for this variable is as follows:
1. `single`: Load single datasets for Feedback task.
2. `full`: Load all the sample datasets for NLP tasks (Feedback, TokenClassification, TextClassification, Text2Text)
3. `none`: No datasets being loaded.
- `OAUTH2_HUGGINGFACE_CLIENT_ID`: The Client ID of your connected app.
- `OAUTH2_HUGGINGFACE_CLIENT_SECRET`: The app secret of your connected app.
147 changes: 147 additions & 0 deletions argilla/docs/getting_started/huggingface-spaces.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
---
description: In this section, we will provide a step-by-step guide for setting up Argilla in Hugging Face Spaces.
---

# Set up in Hugging Face Spaces

This guide explains how to deploy your Argilla app in a Hugging Face Space and configure it according to your preferences.

> For more information about Hugging Face Spaces, check the [documentation](https://huggingface.co/docs/hub/en/spaces-overview).

## Deploy Argilla on Spaces

1. Access the [Hugging Face template](https://huggingface.co/new-space?template=argilla/argilla-template-space). The template will show the Argilla Space SDK preselected.
2. Fill in the template. Take into account the following:

- Enable **Persistent Storage** not to lose your data. Unless you’re using a paid Space upgrade, any Space will be shut down after 48 hours of inactivity, resulting in data loss. You can do it later as detailed in ["Enable persistent storage"](#enable-persistent-storage), but all the data previous to the set up will be lost.
- Configure the **Space secrets** to manage the credentials. The default credentials and instructions for adding them later are described in ["Configure the Space secrets"](#configure-the-space-secrets).
- Set up the **visibility** to make your Space `public` or `private`. To access your private Space from the Argilla SDK, you will need to [configure and provide a `HF_TOKEN`](https://huggingface.co/settings/tokens)

3. Click "Create Space". The status will change to `Building` and, once it switches to `Running`, your Space will be ready. If you don't see the Argilla login UI, refresh the page.

!!! note "Argilla Space direct URL"

The Argilla Space direct URL provides access to a full-screen, stable Argilla instance.

It also serves as the `api_url` for reading and writing datasets using the Argilla SDK.

This URL is constructed as follows: `https://[your-owner-name]-[your_space_name].hf.space`. For instance, if the owner of the Space is `argilla` and your Space name is `demo`, it will be `https://argilla-demo.hf.space`.


## Enable persistent storage

!!! warning
Activate persistent storage not to loose your data. If you are not using a paid Space upgrade, any Space will be shut down after 48 hours of inactivity. When restarted, it will undergo a factory reset, resulting in the loss of all your data.

Make sure to enable persistent storage before adding any data, as the Space will restart, removing all the previous data.

If you haven’t enabled persistent storage, Argilla will display a warning message by default. To prevent this warning from appearing if you don’t need persistent storage, set the environment variable `ARGILLA_SHOW_HUGGINGFACE_SPACE_PERSISTENT_STORAGE_WARNING` to `false` in "Settings" > "New variable". This will suppress the warning message.

To enable [persistent storage](https://huggingface.co/docs/hub/spaces-storage#persistent-storage), go to the "Settings" tab on your created Space and click on the desired plan on the "Persistent Storage" section.

## Configure the Space secrets

[Secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets) can be configured as optional settings to secure your Argilla Space.

The Argilla template allows you to configure the credentials for the default users: `owner`, `admin` and `annotator`. It will also allow you to configure the name of the default workspace.

> Check this [guide](../how_to_guides/user.md) to learn more about users, and refer to this [guide](../how_to_guides/workspace.md) for more information about workspaces in Argilla.
Moreover, it provides two secrets to configure the Hugging Face Oauth. For more information, check the [next section](#sign-in-with-hugging-face).

You can add the available secrets in "Settings" > "New secret".

??? note "Available secrets"

| Secret name | Type | Description | Default value |
|-------------------------------|----------|-------------|----------------|
| `OWNER_USERNAME` | string | A username to log in with owner permissions | owner |
| `OWNER_PASSWORD` | string | A password for the `OWNER_USERNAME` | 12345678 |
| `OWNER_API_KEY` | string or online generator |An `api_key` to interact with the SDK with owner permissions | owner.apikey |
| `ADMIN_USERNAME` | string |A username to log in with admin permissions | admin |
| `ADMIN_PASSWORD` | string | A password for the `ADMIN_USERNAME` | 12345678 |
| `ADMIN_API_KEY` | string or online generator|An `api_key` to interact with the SDK with admin permissions | admin.apikey |
| `ANNOTATOR_USERNAME` | string |A username to log in with annotator permissions | argilla |
| `ANNOTATOR_PASSWORD` | string |A password for the `ANNOTATOR_PASSWORD` | 12345678 |
| `ARGILLA_WORKSPACE` | string | The name of the default workspace | admin |
| `OAUTH2_HUGGINGFACE_CLIENT_ID` | ID | The Client ID of your connected app | |
| `OAUTH2_HUGGINGFACE_CLIENT_SECRET` | ID | The app secret of your connected app | |


## Sign in with Hugging Face

The Hugging Face authentication allows you to give access to your Argilla Space to users that are logged in to the Hugging Face Hub.

!!! tip
This feature is especially helpful for public crowdsourcing projects. The users logging in with OAuth will have the annotator role.

To have more control over who can log in to the Space, you can set this up in a private Space so that only members of your organization can sign in.

To create users with admin or owner roles, you can [create users](../how_to_guides/user.md) and access with their credentials.

!!! note
If you duplicated an Argilla Space with OAuth instead of creating one from the template, make sure to also follow the next steps.

!!! warning
If persistent storage is not enabled, make sure to activate the Hugging Face authentication before adding any data, as the Space will need to be restarted with a "Factory build", removing all the previous data.

1. [Create an OAuth App in Hugging Face](https://huggingface.co/settings/applications/new) and fill in the information:

| Field | Value |
|--------------|----------|
| Homepage URL | [`[Your Argilla Space direct URL]`](#deploy-argilla-on-spaces) |
| Logo URL | `[Your Argilla Space direct URL]/favicon.ico` |
| Scopes | `openid` and `profile` |
| Redirect URL | `[Your Argilla Space Direct URL]/oauth/huggingface/callback`

2. Add the created Client ID and App Secret to your Space. It can be done in two different ways.

=== "As secrets"

Add them to your Space in "Settings" > "New secret".

| Secret name | Value |
|-------------------------------|----------|
| `OAUTH2_HUGGINGFACE_CLIENT_ID` | [Your Client ID] |
| `OAUTH2_HUGGINGFACE_CLIENT_SECRET` | [Your App Secret] |

=== "As environment variables in `.oauth.yaml`"

Add them to your Space in "Files" > `.oauth.yaml`.

!!! warning
Be aware that the `.oauth.yaml` file is public in public spaces or may be accessible by other members of your organization in private spaces.

Therefore, we recommend setting these variables as environment secrets.

??? quote "Example code in `.oauth.yaml`"
```yaml
# This attribute will enable or disable the Hugging Face authentication
enabled: true

providers:
# The OAuth provider setup
# For now, only Hugging Face is supported
- name: huggingface
# This is the client ID of the OAuth app. You can find it in your Hugging Face settings.
# see https://huggingface.co/docs/hub/oauth#creating-an-oauth-app for more info.
# You can also provide it by using the env variable `OAUTH2_HUGGINGFACE_CLIENT_ID`
client_id: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX

# This is the client secret of the OAuth app. You can find it in your Hugging Face settings.
# See https://huggingface.co/docs/hub/oauth#creating-an-oauth-app for more info.
# We encourage you to provide it by using the env variable `OAUTH2_HUGGINGFACE_CLIENT_SECRET`
client_secret: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXXX

# The scope of the OAuth app. At least `openid` and `profile` are required.
scope: openid profile

# This section defines the allowed workspaces for the oauth users.
# Workspaces defined here must exist in Argilla.
allowed_workspaces:
- name: admin
```

3. Check that the `enabled` parameter is set to `true` in "Files" > `.oauth.yaml`.
4. Go back to "Settings" and do a "Factory rebuild". Once the Space is restarted, you and your collaborators can sign and log in to your Space using their Hugging Face accounts.
2 changes: 1 addition & 1 deletion argilla/docs/getting_started/quickstart.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ description: Quickstart of Argilla on how to create your first dataset.

This guide provides a quick overview of the Argilla SDK and how to create your first dataset.

## Setting up your Argilla project
## Set up your Argilla project

### Install the SDK with pip

Expand Down
2 changes: 2 additions & 0 deletions argilla/docs/how_to_guides/user.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,8 @@ Argilla provides a default user with the `owner` role to help you get started in
| Quickstart Docker and HF Space | owner | 12345678 | owner.apikey |
| Server image | argilla | 1234 | argilla.apikey |

For the new users, the username and password are set during the creation process. The API key is automatically generated and can be copied from the "Settings" section of the UI.

!!! info "Main Class"

```python
Expand Down
1 change: 1 addition & 0 deletions argilla/docs/scripts/gen_popular_issues.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ def fetch_data_from_github(repository, auth_token):
.reset_index()
)

df["Milestone"] = df["Milestone"].astype(str).fillna("")
planned_issues = df[
((df["Milestone"].str.startswith("v2")) & (df["State"] == "open"))
| ((df["Milestone"].str.startswith("2")) & (df["State"] == "open"))
Expand Down
2 changes: 1 addition & 1 deletion argilla/docs/tutorials/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,6 @@ These are the tutorials for *the Argilla SDK*. They provide step-by-step instruc
---

Learn about a standard workflow to improve data quality for a text classification task.
[:octicons-arrow-right-24: How-to guide](text_classification.ipynb)
[:octicons-arrow-right-24: Tutorial](text_classification.ipynb)

</div>
1 change: 1 addition & 0 deletions argilla/mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,7 @@ nav:
- Getting started:
- Quickstart: getting_started/quickstart.md
- Installation: getting_started/installation.md
- Set up in Hugging Face Spaces: getting_started/huggingface-spaces.md
- FAQ: getting_started/faq.md
- How-to guides:
- how_to_guides/index.md
Expand Down

0 comments on commit b31e607

Please sign in to comment.