[ENHANCEMENT] improve argilla deployments when running on spaces (#5255)
# Description

This is the main feature branch to improve the Argilla deployments when
running on spaces.

The main features included in this PR are:

- The OAuth client configuration is read automatically from the HF
environment (the `hf_oauth` flag must be set to `true`; see the
[docs](https://huggingface.co/docs/hub/en/spaces-oauth#create-an-oauth-app)
for more details).
- If users want to create a specific owner without using the OAuth flow,
the `USERNAME` and `PASSWORD` env variables are available for that purpose
(by default, `USERNAME` is filled with the `SPACE_AUTHOR_NAME`
value).
- The Argilla template runs Argilla with OAuth enabled by default
(users don't need to configure anything after cloning the
template).
- Workspaces defined in `.oauth.yaml` are created automatically (by
default, the template provides the `argilla` workspace).
- When the space author is an HF user, that user becomes the
`owner` of the Argilla server.
- When the space author is an HF organization, user roles are
computed from the roles in the org (for now, admin roles in the HF org
are mapped to the `owner` role in Argilla; all other roles are mapped to
the `annotator` role).
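
The org-to-Argilla role mapping described in the last bullet can be sketched as a small shell function. This is only an illustration and not the code shipped in this PR: `map_hf_org_role` is a hypothetical name, and passing the HF org role as a plain string such as `admin` or `write` is an assumption.

```shell
# Hypothetical helper (not part of this PR): map an HF org role to an Argilla role.
# Assumption: the HF org role is available as a plain string, e.g. "admin".
map_hf_org_role() {
  case "$1" in
    admin) echo "owner" ;;      # org admins are mapped to the Argilla owner role
    *)     echo "annotator" ;;  # every other org role is mapped to annotator
  esac
}

map_hf_org_role "admin"   # prints: owner
map_hf_org_role "write"   # prints: annotator
```

Keeping the fallback in the `*)` branch means any org role that is not `admin` ends up as an annotator, which matches the "for now" behavior described above.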


## Tasks
- [X] Update image README.md
- [x] [Simplify environment
variables](#5256)
- [x] [Create a single user providing USERNAME and PASSWORD env
variables (temporary
solution)](#5256)
- [x] [Reading injected `OAUTH_CLIENT_ID` and `OAUTH_CLIENT_SECRET` to
avoid the OAuth app configuration
step.](#5262)
- [x] [Create the user with roles depending on the space privileges
(user space VS org
space)](#5299)
- [x] [Create workspaces configured in `.oauth.yaml::allowed_workspaces`
](#5287)
- [ ] Update docs
- [x] [Rename quickstart image to
`argilla-hf-spaces`](#5307)

**Type of change**

- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: José Francisco Calvo <[email protected]>
frascuchon and jfcalvo authored Jul 24, 2024
1 parent 122a17a commit 2774686
Showing 36 changed files with 677 additions and 364 deletions.
29 changes: 13 additions & 16 deletions .github/workflows/argilla-server.build-docker-images.yml
@@ -49,14 +49,14 @@ jobs:
echo "PLATFORMS=linux/amd64,linux/arm64" >> $GITHUB_ENV
echo "IMAGE_TAG=v$PACKAGE_VERSION" >> $GITHUB_ENV
echo "SERVER_DOCKER_IMAGE=argilla/argilla-server" >> $GITHUB_ENV
echo "QUICKSTART_DOCKER_IMAGE=argilla/argilla-quickstart" >> $GITHUB_ENV
echo "HF_SPACES_DOCKER_IMAGE=argilla/argilla-hf-spaces" >> $GITHUB_ENV
echo "DOCKER_USERNAME=$DOCKER_USERNAME" >> $GITHUB_ENV
echo "DOCKER_PASSWORD=$DOCKER_PASSWORD" >> $GITHUB_ENV
else
echo "PLATFORMS=linux/amd64" >> $GITHUB_ENV
echo "IMAGE_TAG=$DOCKER_IMAGE_TAG" >> $GITHUB_ENV
echo "SERVER_DOCKER_IMAGE=argilladev/argilla-server" >> $GITHUB_ENV
echo "QUICKSTART_DOCKER_IMAGE=argilladev/argilla-quickstart" >> $GITHUB_ENV
echo "HF_SPACES_DOCKER_IMAGE=argilladev/argilla-hf-spaces" >> $GITHUB_ENV
echo "DOCKER_USERNAME=$DOCKER_USERNAME_DEV" >> $GITHUB_ENV
echo "DOCKER_PASSWORD=$DOCKER_PASSWORD_DEV" >> $GITHUB_ENV
fi
@@ -92,7 +92,6 @@ jobs:
uses: docker/build-push-action@v5
with:
context: argilla-server/docker/server
file: argilla-server/docker/server/Dockerfile
platforms: ${{ env.PLATFORMS }}
tags: ${{ env.SERVER_DOCKER_IMAGE }}:${{ env.IMAGE_TAG }}
labels: ${{ steps.meta.outputs.labels }}
@@ -103,35 +102,33 @@ jobs:
uses: docker/build-push-action@v5
with:
context: argilla-server/docker/server
file: argilla-server/docker/server/Dockerfile
platforms: ${{ env.PLATFORMS }}
tags: ${{ env.SERVER_DOCKER_IMAGE }}:latest
labels: ${{ steps.meta.outputs.labels }}
push: true

- name: Build and push `argilla-quickstart` image
- name: Build and push `argilla-hf-spaces` image
uses: docker/build-push-action@v5
with:
context: argilla-server/docker/quickstart
file: argilla-server/docker/quickstart/Dockerfile
context: argilla-server/docker/argilla-hf-spaces
platforms: ${{ env.PLATFORMS }}
tags: ${{ env.QUICKSTART_DOCKER_IMAGE }}:${{ env.IMAGE_TAG }}
tags: ${{ env.HF_SPACES_DOCKER_IMAGE }}:${{ env.IMAGE_TAG }}
labels: ${{ steps.meta.outputs.labels }}
build-args: |
ARGILLA_SERVER_IMAGE=${{ env.SERVER_DOCKER_IMAGE }}
ARGILLA_VERSION=${{ env.IMAGE_TAG }}
push: true

- name: Push latest `argilla-quickstart` image
- name: Push latest `argilla-hf-spaces` image
if: ${{ inputs.is_release && inputs.publish_latest }}
uses: docker/build-push-action@v5
with:
context: argilla-server/docker/quickstart
file: argilla-server/docker/quickstart/Dockerfile
context: argilla-server/docker/argilla-hf-spaces
platforms: ${{ env.PLATFORMS }}
tags: ${{ env.QUICKSTART_DOCKER_IMAGE }}:latest
tags: ${{ env.HF_SPACES_DOCKER_IMAGE }}:latest
labels: ${{ steps.meta.outputs.labels }}
build-args: |
ARGILLA_SERVER_IMAGE=${{ env.SERVER_DOCKER_IMAGE }}
ARGILLA_VERSION=${{ env.IMAGE_TAG }}
push: true

@@ -141,14 +138,14 @@ jobs:
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_PASSWORD }}
repository: argilla/argilla-server
repository: ${{ env.SERVER_DOCKER_IMAGE }}
readme-filepath: argilla-server/README.md

- name: Docker Hub Description for `argilla-quickstart`
- name: Docker Hub Description for `argilla-hf-spaces`
uses: peter-evans/dockerhub-description@v4
if: ${{ inputs.is_release && inputs.publish_latest }}
with:
username: ${{ secrets.AR_DOCKER_USERNAME }}
password: ${{ secrets.AR_DOCKER_PASSWORD }}
repository: argilla/argilla-quickstart
readme-filepath: argilla-server/docker/quickstart/README.md
repository: ${{ env.HF_SPACES_DOCKER_IMAGE }}
readme-filepath: argilla-server/docker/argilla-hf-spaces/README.md
2 changes: 2 additions & 0 deletions argilla-server/CHANGELOG.md
@@ -29,6 +29,7 @@ These are the section headers that we use:
- Added new `ARGILLA_DATABASE_POSTGRESQL_MAX_OVERFLOW` environment variable allowing to set the number of connections that can be opened above and beyond the `ARGILLA_DATABASE_POSTGRESQL_POOL_SIZE` setting. ([#5220](https://github.com/argilla-io/argilla/pull/5220))
- Added new `Server-Timing` header to all responses with the total time in milliseconds the server took to generate the response. ([#5239](https://github.com/argilla-io/argilla/pull/5239))
- Added `REINDEX_DATASETS` environment variable to Argilla server Docker image. ([#5268](https://github.com/argilla-io/argilla/pull/5268))
- Added `argilla-hf-spaces` docker image for running Argilla server in HF spaces. ([#5307](https://github.com/argilla-io/argilla/pull/5307))

### Changed

@@ -51,6 +52,7 @@ These are the section headers that we use:
- [breaking] Removed support for `response_status` query param for endpoints `POST /api/v1/me/datasets/:dataset_id/records/search` and `POST /api/v1/datasets/:dataset_id/records/search`. ([#5163](https://github.com/argilla-io/argilla/pull/5163))
- [breaking] Removed support for `metadata` query param for endpoints `POST /api/v1/me/datasets/:dataset_id/records/search` and `POST /api/v1/datasets/:dataset_id/records/search`. ([#5156](https://github.com/argilla-io/argilla/pull/5156))
- [breaking] Removed support for `sort_by` query param for endpoints `POST /api/v1/me/datasets/:dataset_id/records/search` and `POST /api/v1/datasets/:dataset_id/records/search`. ([#5166](https://github.com/argilla-io/argilla/pull/5166))
- Removed argilla quickstart docker image (Older versions are still available). ([#5307](https://github.com/argilla-io/argilla/pull/5307))

## [2.0.0rc1](https://github.com/argilla-io/argilla/compare/v1.29.0...v2.0.0rc1)

@@ -6,7 +6,7 @@ FROM ${ARGILLA_SERVER_IMAGE}:${ARGILLA_VERSION}
USER root

# Copy Argilla distribution files
COPY scripts/start_quickstart_argilla.sh /home/argilla
COPY scripts/start.sh /home/argilla
COPY scripts/start_argilla_server.sh /home/argilla
COPY Procfile /home/argilla
COPY requirements.txt /packages/requirements.txt
@@ -31,7 +31,7 @@ RUN \
chown argilla:argilla /etc/default/elasticsearch && \
# Install quickstart image dependencies
pip install -r /packages/requirements.txt && \
chmod +x /home/argilla/start_quickstart_argilla.sh && \
chmod +x /home/argilla/start.sh && \
chmod +x /home/argilla/start_argilla_server.sh && \
# Give ownership of the data directory to the argilla user
chown -R argilla:argilla /data && \
@@ -52,20 +52,9 @@ USER argilla
ENV ELASTIC_CONTAINER=true
ENV ES_JAVA_OPTS="-Xms1g -Xmx1g"

ENV OWNER_USERNAME=owner
ENV OWNER_PASSWORD=12345678
ENV OWNER_API_KEY=owner.apikey

ENV ADMIN_USERNAME=admin
ENV ADMIN_PASSWORD=12345678
ENV ADMIN_API_KEY=admin.apikey

ENV ANNOTATOR_USERNAME=argilla
ENV ANNOTATOR_PASSWORD=12345678
ENV USERNAME=""
ENV PASSWORD=""

ENV ARGILLA_HOME_PATH=/data/argilla
ENV ARGILLA_WORKSPACE=$ADMIN_USERNAME

ENV UVICORN_PORT=6900

CMD ["/bin/bash", "start_quickstart_argilla.sh"]
CMD ["/bin/bash", "start.sh"]
File renamed without changes.
15 changes: 15 additions & 0 deletions argilla-server/docker/argilla-hf-spaces/README.md
@@ -0,0 +1,15 @@
<h1 align="center">
<a href=""><img src="https://github.com/dvsrepo/imgs/raw/main/rg.svg" alt="Argilla" width="150"></a>
<br>
Argilla
<br>
</h1>

> This Docker image corresponds to the **Argilla Hugging Face Spaces deployment** and **can only be used to deploy Argilla inside the Hugging Face Hub**. For other types of deployments, check the Argilla docs.

Argilla is a **collaboration tool for AI engineers and domain experts** that require **high-quality outputs, data ownership, and overall efficiency**.

## Why use Argilla?

Whether you are working on monitoring and improving complex **generative tasks** involving LLM pipelines with RAG, or on a **predictive task** such as A/B testing of span- and text-classification models, our versatile platform helps you ensure **your data work pays off**.
@@ -0,0 +1,35 @@
#!/usr/bin/env bash

set -e

# Preset oauth env vars based on injected space variables.
# See https://huggingface.co/docs/hub/en/spaces-oauth#create-an-oauth-app
export OAUTH2_HUGGINGFACE_CLIENT_ID=$OAUTH_CLIENT_ID
export OAUTH2_HUGGINGFACE_CLIENT_SECRET=$OAUTH_CLIENT_SECRET
export OAUTH2_HUGGINGFACE_SCOPE=$OAUTH_SCOPES

echo "Running database migrations"
python -m argilla_server database migrate

# Set the space author name as the username if none is provided.
# See https://huggingface.co/docs/hub/en/spaces-overview#helper-environment-variables for more details
USERNAME="${USERNAME:-$SPACE_AUTHOR_NAME}"

if [ -n "$USERNAME" ] && [ -n "$PASSWORD" ]; then
  echo "Creating owner user with username ${USERNAME}"
  python -m argilla_server database users create \
    --first-name "$USERNAME" \
    --username "$USERNAME" \
    --password "$PASSWORD" \
    --role owner
else
  echo "No username and password was provided. Skipping user creation"
fi

# Forcing reindex on restart since elasticsearch data could be allocated in a non-persistent volume
echo "Reindexing existing datasets"
python -m argilla_server search-engine reindex

# Start Argilla
echo "Starting Argilla"
python -m uvicorn argilla_server:app --host "0.0.0.0"
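
The username fallback and the owner-creation guard in the script above can be exercised in isolation with a minimal sketch like the following. `resolve_owner` is a hypothetical helper, not part of the commit; it takes the three values as positional arguments instead of reading the real environment, and it only prints the decision rather than calling `argilla_server`.

```shell
# Hypothetical helper (not part of this commit) mirroring the checks in start.sh.
# $1: USERNAME, $2: PASSWORD, $3: SPACE_AUTHOR_NAME
# The body runs in a subshell so the variables do not leak into the caller.
resolve_owner() (
  # ":-" falls back when $1 is unset OR empty, which matters because the
  # Dockerfile sets USERNAME="" by default.
  username="${1:-$3}"
  password="$2"
  if [ -n "$username" ] && [ -n "$password" ]; then
    echo "create owner: $username"
  else
    echo "skip"
  fi
)

resolve_owner "" "secret" "alice"     # prints: create owner: alice
resolve_owner "bob" "secret" "alice"  # prints: create owner: bob
resolve_owner "" "" "alice"           # prints: skip
```

Note that a password is always required: with only `SPACE_AUTHOR_NAME` available, no owner is created, and users are expected to sign in through the OAuth flow instead.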
134 changes: 0 additions & 134 deletions argilla-server/docker/quickstart/README.md

This file was deleted.

40 changes: 0 additions & 40 deletions argilla-server/docker/quickstart/scripts/start_argilla_server.sh

This file was deleted.

4 changes: 2 additions & 2 deletions argilla-server/pyproject.toml
@@ -173,5 +173,5 @@ server-dev.composite = [
]
test = { cmd = "pytest", env_file = ".env.test" }

build-server-image = { shell = "cp -R dist docker/server && docker build -t argilla/argilla-server:local docker/server" }
build-quickstart-image = { shell = "docker build --build-arg ARGILLA_VERSION=local -t argilla/argilla-quickstart:local docker/quickstart" }
docker-build-argilla-server = { shell = "pdm build && cp -R dist docker/server && docker build -t argilla/argilla-server:local docker/server" }
docker-build-argilla-hf-spaces = { shell = "pdm run docker-build-argilla-server && docker build --build-arg ARGILLA_VERSION=local -t argilla/argilla-hf-spaces:local docker/argilla-hf-spaces" }