Merge branch 'develop' into feat/argilla-direct-feature-branch
frascuchon authored Oct 7, 2024
2 parents c25c88c + 485f3ff commit 6cc4d07
Showing 15 changed files with 71 additions and 136 deletions.
1 change: 1 addition & 0 deletions argilla-server/CHANGELOG.md
@@ -26,6 +26,7 @@ These are the section headers that we use:
- Removed name pattern restrictions for Questions. ([#5573](https://github.com/argilla-io/argilla/pull/5573))
- Removed name pattern restrictions for Metadata Properties. ([#5573](https://github.com/argilla-io/argilla/pull/5573))
- Removed name pattern restrictions for Vector Settings. ([#5573](https://github.com/argilla-io/argilla/pull/5573))
- Removed name pattern validation for Workspaces, Datasets, and Users. ([#5575](https://github.com/argilla-io/argilla/pull/5575))

## [2.3.0](https://github.com/argilla-io/argilla/compare/v2.2.0...v2.3.0)

4 changes: 3 additions & 1 deletion argilla-server/docker/argilla-hf-spaces/Dockerfile
@@ -44,7 +44,9 @@ RUN \
apt-get remove -y wget gnupg && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /packages
rm -rf /packages && \
# Install pwgen curl and jq
apt-get update && apt-get install -y curl jq pwgen

COPY config/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml

7 changes: 5 additions & 2 deletions argilla-server/docker/argilla-hf-spaces/scripts/start.sh
@@ -8,8 +8,11 @@ export OAUTH2_HUGGINGFACE_CLIENT_ID=$OAUTH_CLIENT_ID
export OAUTH2_HUGGINGFACE_CLIENT_SECRET=$OAUTH_CLIENT_SECRET
export OAUTH2_HUGGINGFACE_SCOPE=$OAUTH_SCOPES

# Set the space author name as username if no provided.
# Set the space creator name as username if no name is provided; if the user is not found, use the provided space author name
# See https://huggingface.co/docs/hub/en/spaces-overview#helper-environment-variables for more details
export USERNAME="${USERNAME:-$SPACE_AUTHOR_NAME}"
DEFAULT_USERNAME=$(curl -L -s https://huggingface.co/api/users/${SPACES_CREATOR_USER_ID}/overview | jq -r '.user' || echo "${SPACE_AUTHOR_NAME}")
export USERNAME="${USERNAME:-$DEFAULT_USERNAME}"
DEFAULT_PASSWORD=$(pwgen -s 16 1)
export PASSWORD="${PASSWORD:-$DEFAULT_PASSWORD}"

honcho start
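
The precedence that the updated `start.sh` describes (explicit `USERNAME`, then the Space creator looked up by ID, then `SPACE_AUTHOR_NAME`, plus a generated password when `PASSWORD` is unset) can be sketched in Python as follows. This is an illustrative analogue under stated assumptions, not the script itself: `resolve_username`, `resolve_password`, and the injected `lookup_creator` callable are hypothetical names, treating an empty or `null` lookup result as "not found" is an assumption about the intended behavior, and the letters-and-digits alphabet only approximates what `pwgen -s 16 1` produces.

```python
import secrets
import string
from typing import Callable, Mapping, Optional


def resolve_username(
    env: Mapping[str, str],
    lookup_creator: Callable[[str], Optional[str]],
) -> str:
    """Pick the owner username: USERNAME, then creator lookup, then SPACE_AUTHOR_NAME."""
    explicit = env.get("USERNAME")
    if explicit:  # like ${USERNAME:-...}: unset or empty falls through
        return explicit

    creator_id = env.get("SPACES_CREATOR_USER_ID", "")
    found = lookup_creator(creator_id) if creator_id else None
    # Assumption: an empty or literal "null" result counts as "user not found".
    if found and found != "null":
        return found

    return env.get("SPACE_AUTHOR_NAME", "")


def resolve_password(env: Mapping[str, str], length: int = 16) -> str:
    """Use PASSWORD when set, otherwise generate a random alphanumeric one."""
    explicit = env.get("PASSWORD")
    if explicit:
        return explicit
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    env = {"SPACES_CREATOR_USER_ID": "123", "SPACE_AUTHOR_NAME": "my-org"}
    # The lambda stands in for the HTTP lookup done with curl and jq in the script.
    print(resolve_username(env, lambda _id: "alice"))
    print(len(resolve_password(env)))
```

Injecting the lookup as a callable keeps the sketch free of network access, so the fallback order can be checked in isolation.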
6 changes: 4 additions & 2 deletions argilla-server/src/argilla_server/api/schemas/v1/datasets.py
@@ -25,14 +25,16 @@
except ImportError:
from typing_extensions import Annotated

DATASET_NAME_REGEX = r"^(?!-|_)[a-zA-Z0-9-_ ]+$"
DATASET_NAME_MIN_LENGTH = 1
DATASET_NAME_MAX_LENGTH = 200
DATASET_GUIDELINES_MIN_LENGTH = 1
DATASET_GUIDELINES_MAX_LENGTH = 10000

DatasetName = Annotated[
constr(regex=DATASET_NAME_REGEX, min_length=DATASET_NAME_MIN_LENGTH, max_length=DATASET_NAME_MAX_LENGTH),
constr(
min_length=DATASET_NAME_MIN_LENGTH,
max_length=DATASET_NAME_MAX_LENGTH,
),
Field(..., description="Dataset name"),
]
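
To illustrate what dropping `DATASET_NAME_REGEX` changes in practice, the sketch below compares the removed pattern with a length-only check using just the standard library. It is a minimal approximation, assuming the length bounds behave like a plain `len()` comparison; `accepted_before`, `accepted_after`, and `length_ok` are hypothetical helpers for illustration and are not part of Argilla.

```python
import re

# Pattern removed by this commit (copied from the diff above).
OLD_DATASET_NAME_REGEX = r"^(?!-|_)[a-zA-Z0-9-_ ]+$"
DATASET_NAME_MIN_LENGTH = 1
DATASET_NAME_MAX_LENGTH = 200


def length_ok(name: str) -> bool:
    """Length-only check mirroring the min/max bounds kept in the schema."""
    return DATASET_NAME_MIN_LENGTH <= len(name) <= DATASET_NAME_MAX_LENGTH


def accepted_before(name: str) -> bool:
    """Approximates the old rule: length bounds plus the name pattern."""
    return length_ok(name) and re.match(OLD_DATASET_NAME_REGEX, name) is not None


def accepted_after(name: str) -> bool:
    """Approximates the new rule: length bounds only."""
    return length_ok(name)


if __name__ == "__main__":
    for candidate in ["my dataset", "-draft", "data.set/v2", "", "a" * 201]:
        label = candidate if len(candidate) <= 20 else candidate[:20] + "..."
        print(repr(label), accepted_before(candidate), accepted_after(candidate))
```

Names such as `-draft` or `data.set/v2` fail the old pattern but pass the length-only check, while empty and over-long names are rejected in both cases.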

3 changes: 1 addition & 2 deletions argilla-server/src/argilla_server/api/schemas/v1/users.py
@@ -19,7 +19,6 @@
from argilla_server.enums import UserRole
from argilla_server.pydantic_v1 import BaseModel, Field, constr

USER_USERNAME_REGEX = "^(?!-|_)[A-za-z0-9-_]+$"
USER_PASSWORD_MIN_LENGTH = 8
USER_PASSWORD_MAX_LENGTH = 100

@@ -43,7 +42,7 @@ class Config:
class UserCreate(BaseModel):
first_name: constr(min_length=1, strip_whitespace=True)
last_name: Optional[constr(min_length=1, strip_whitespace=True)]
username: str = Field(regex=USER_USERNAME_REGEX, min_length=1)
username: str = Field(..., min_length=1)
role: Optional[UserRole]
password: str = Field(min_length=USER_PASSWORD_MIN_LENGTH, max_length=USER_PASSWORD_MAX_LENGTH)
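
One detail of the removed `USER_USERNAME_REGEX` is worth seeing concretely: the range `A-z` in its character class spans ASCII `A` through `z`, which also covers the characters `[`, `\`, `]`, `^`, `_` and the backtick. The short check below, using only the standard `re` module, shows what that pattern matched and rejected; `matched_old_pattern` is a hypothetical helper written for this illustration, not an Argilla function.

```python
import re

# Pattern removed by this commit (copied from the diff above).
OLD_USER_USERNAME_REGEX = "^(?!-|_)[A-za-z0-9-_]+$"


def matched_old_pattern(username: str) -> bool:
    """Return True when the username matches the removed pattern."""
    return re.match(OLD_USER_USERNAME_REGEX, username) is not None


if __name__ == "__main__":
    for candidate in ["mark", "user name", "user.name", "-mark", "_mark", "user^name"]:
        print(repr(candidate), matched_old_pattern(candidate))
```

With the pattern gone, only the `min_length=1` constraint on `username` remains in the schema shown above.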

@@ -16,11 +16,8 @@
from typing import List
from uuid import UUID

from argilla_server.constants import ES_INDEX_REGEX_PATTERN
from argilla_server.pydantic_v1 import BaseModel, Field

WORKSPACE_NAME_REGEX = ES_INDEX_REGEX_PATTERN


class Workspace(BaseModel):
id: UUID
@@ -33,7 +30,7 @@ class Config:


class WorkspaceCreate(BaseModel):
name: str = Field(regex=WORKSPACE_NAME_REGEX, min_length=1)
name: str = Field(min_length=1)


class Workspaces(BaseModel):
@@ -18,8 +18,6 @@
import typer
import yaml

from argilla_server.api.schemas.v1.users import USER_USERNAME_REGEX
from argilla_server.api.schemas.v1.workspaces import WORKSPACE_NAME_REGEX
from argilla_server.database import AsyncSessionLocal
from argilla_server.models import User, UserRole
from argilla_server.pydantic_v1 import BaseModel, Field, constr
@@ -31,12 +29,12 @@


class WorkspaceCreate(BaseModel):
name: str = Field(..., regex=WORKSPACE_NAME_REGEX, min_length=1)
name: str = Field(..., min_length=1)


class UserCreate(BaseModel):
first_name: constr(strip_whitespace=True)
username: str = Field(..., regex=USER_USERNAME_REGEX, min_length=1)
username: str = Field(..., min_length=1)
role: UserRole
api_key: constr(min_length=1)
password_hash: constr(min_length=1)
2 changes: 0 additions & 2 deletions argilla-server/src/argilla_server/constants.py
@@ -39,6 +39,4 @@
# The metadata field name prefix defined for protected (non-searchable) values
PROTECTED_METADATA_FIELD_PREFIX = "_"

ES_INDEX_REGEX_PATTERN = r"^(?!-|_)[a-z0-9-_]+$"

JS_MAX_SAFE_INTEGER = 9007199254740991
49 changes: 49 additions & 0 deletions argilla-server/tests/unit/api/handlers/v1/test_datasets.py
@@ -912,6 +912,28 @@ async def test_create_dataset_with_invalid_length_guidelines(
assert response.status_code == 422
assert (await db.execute(select(func.count(Dataset.id)))).scalar() == 0


@pytest.mark.parametrize(
"dataset_json",
[
{"name": ""},
{"name": "a" * (DATASET_NAME_MAX_LENGTH + 1)},
{"name": "test-dataset", "guidelines": ""},
{"name": "test-dataset", "guidelines": "a" * (DATASET_GUIDELINES_MAX_LENGTH + 1)},
],
)
async def test_create_dataset_with_invalid_settings(
self, async_client: "AsyncClient", db: "AsyncSession", owner_auth_header: dict, dataset_json: dict
):
workspace = await WorkspaceFactory.create()
dataset_json.update({"workspace_id": str(workspace.id)})

response = await async_client.post("/api/v1/datasets", headers=owner_auth_header, json=dataset_json)

assert response.status_code == 422
assert (await db.execute(select(func.count(Dataset.id)))).scalar() == 0


async def test_create_dataset_without_authentication(self, async_client: "AsyncClient", db: "AsyncSession"):
workspace = await WorkspaceFactory.create()
dataset_json = {"name": "name", "workspace_id": str(workspace.id)}
@@ -4486,6 +4508,33 @@ async def test_update_dataset(self, async_client: "AsyncClient", db: "AsyncSessi
assert dataset.guidelines == guidelines
assert dataset.allow_extra_metadata is allow_extra_metadata


@pytest.mark.parametrize(
"dataset_json",
[
{"name": None},
{"name": ""},
{"name": "a" * (DATASET_NAME_MAX_LENGTH + 1)},
{"name": "test-dataset", "guidelines": ""},
{"name": "test-dataset", "guidelines": "a" * (DATASET_GUIDELINES_MAX_LENGTH + 1)},
{"allow_extra_metadata": None},
],
)
@pytest.mark.asyncio
async def test_update_dataset_with_invalid_settings(
self, async_client: "AsyncClient", db: "AsyncSession", owner_auth_header: dict, dataset_json: dict
):
dataset = await DatasetFactory.create(
name="Current Name", guidelines="Current Guidelines", status=DatasetStatus.ready
)

response = await async_client.patch(
f"/api/v1/datasets/{dataset.id}", headers=owner_auth_header, json=dataset_json
)

assert response.status_code == 422


@pytest.mark.asyncio
async def test_update_dataset_with_invalid_payload(self, async_client: "AsyncClient", owner_auth_header: dict):
dataset = await DatasetFactory.create()
@@ -201,23 +201,6 @@ async def test_create_user_with_existent_username(

assert (await db.execute(select(func.count(User.id)))).scalar() == 2

async def test_create_user_with_invalid_username(
self, db: AsyncSession, async_client: AsyncClient, owner_auth_header: dict
):
response = await async_client.post(
self.url(),
headers=owner_auth_header,
json={
"first_name": "First name",
"last_name": "Last name",
"username": "invalid username",
"password": "12345678",
},
)

assert response.status_code == 422
assert (await db.execute(select(func.count(User.id)))).scalar() == 1

async def test_create_user_with_invalid_min_length_first_name(
self, db: AsyncSession, async_client: AsyncClient, owner_auth_header: dict
):
@@ -87,18 +87,6 @@ async def test_create_workspace_with_existent_name(

assert (await db.execute(select(func.count(Workspace.id)))).scalar() == 1

async def test_create_workspace_with_invalid_name(
self, db: AsyncSession, async_client: AsyncClient, owner_auth_header: dict
):
response = await async_client.post(
self.url(),
headers=owner_auth_header,
json={"name": "invalid name"},
)

assert response.status_code == 422
assert (await db.execute(select(func.count(Workspace.id)))).scalar() == 0

async def test_create_workspace_with_invalid_min_length_name(
self, db: AsyncSession, async_client: AsyncClient, owner_auth_header: dict
):
27 changes: 2 additions & 25 deletions argilla-server/tests/unit/cli/database/users/test_create.py
@@ -15,11 +15,11 @@
from typing import TYPE_CHECKING

import pytest
from argilla_server.contexts import accounts
from argilla_server.models import User, UserRole, Workspace
from click.testing import CliRunner
from typer import Typer

from argilla_server.contexts import accounts
from argilla_server.models import User, UserRole, Workspace
from tests.factories import UserSyncFactory, WorkspaceSyncFactory

if TYPE_CHECKING:
@@ -131,17 +131,6 @@ def test_create_with_input_username(sync_db: "Session", cli_runner: CliRunner, c
assert sync_db.query(User).filter_by(username="username").first()


def test_create_with_invalid_username(sync_db: "Session", cli_runner: CliRunner, cli: Typer):
result = cli_runner.invoke(
cli,
"database users create --first-name first-name --username -Invalid-Username --password 12345678 --role owner",
)

assert result.exit_code == 1
assert sync_db.query(User).count() == 0
assert sync_db.query(Workspace).count() == 0


def test_create_with_existing_username(sync_db: "Session", cli_runner: CliRunner, cli: Typer):
UserSyncFactory.create(username="username")

@@ -243,15 +232,3 @@ def test_create_with_existent_workspaces(sync_db: "Session", cli_runner: CliRunn
user = sync_db.query(User).filter_by(username="username").first()
assert user
assert [ws.name for ws in user.workspaces] == ["workspace-a", "workspace-b", "workspace-c"]


def test_create_with_invalid_workspaces(sync_db: "Session", cli_runner: CliRunner, cli: Typer):
result = cli_runner.invoke(
cli,
"database users create --first-name first-name --username username --role owner --password 12345678 "
"--workspace workspace-a --workspace 'invalid workspace' --workspace workspace-c",
)

assert result.exit_code == 1
assert sync_db.query(User).count() == 0
assert sync_db.query(Workspace).count() == 0
24 changes: 0 additions & 24 deletions argilla-server/tests/unit/cli/database/users/test_migrate.py
@@ -96,30 +96,6 @@ def test_migrate_with_one_user_file(monkeypatch, sync_db: "Session", cli_runner:
assert [ws.name for ws in user.workspaces] == ["john", "argilla", "team"]


def test_migrate_with_invalid_user(monkeypatch, sync_db: "Session", cli_runner: CliRunner, cli: Typer):
mock_users_file = os.path.join(os.path.dirname(__file__), "test_user_files", "users_invalid_user.yml")

with mock.patch.dict(os.environ, {"ARGILLA_LOCAL_AUTH_USERS_DB_FILE": mock_users_file}):
result = cli_runner.invoke(cli, "database users migrate")

assert result.exit_code == 1
assert sync_db.query(User).count() == 0
assert sync_db.query(Workspace).count() == 0
assert sync_db.query(WorkspaceUser).count() == 0


def test_migrate_with_invalid_workspace(monkeypatch, sync_db: "Session", cli_runner: CliRunner, cli: Typer):
mock_users_file = os.path.join(os.path.dirname(__file__), "test_user_files", "users_invalid_workspace.yml")

with mock.patch.dict(os.environ, {"ARGILLA_LOCAL_AUTH_USERS_DB_FILE": mock_users_file}):
result = cli_runner.invoke(cli, "database users migrate")

assert result.exit_code == 1
assert sync_db.query(User).count() == 0
assert sync_db.query(Workspace).count() == 0
assert sync_db.query(WorkspaceUser).count() == 0


def test_migrate_with_nonexistent_file(monkeypatch, sync_db: "Session", cli_runner: CliRunner, cli: Typer):
with mock.patch.dict(os.environ, {"ARGILLA_LOCAL_AUTH_USERS_DB_FILE": "nonexistent.yml"}):
result = cli_runner.invoke(cli, "database users migrate")
27 changes: 1 addition & 26 deletions argilla-server/tests/unit/security/test_model.py
@@ -14,19 +14,13 @@


import pytest

from argilla_server.api.schemas.v1.users import User, UserCreate
from argilla_server.api.schemas.v1.workspaces import WorkspaceCreate

from tests.factories import UserFactory
from tests.pydantic_v1 import ValidationError


@pytest.mark.parametrize("invalid_name", ["work space", "work/space", "work.space", "_", "-"])
def test_workspace_create_invalid_name(invalid_name: str):
with pytest.raises(ValidationError):
WorkspaceCreate(name=invalid_name)


@pytest.mark.parametrize(
"username",
[
@@ -49,25 +43,6 @@ def test_user_create(username: str):
assert UserCreate(first_name="first-name", username=username, password="12345678")


@pytest.mark.parametrize(
"invalid_username",
[
"user name",
"user/name",
"user.name",
"_",
"-",
"-1234",
"_1234",
"_mark",
"-mark",
],
)
def test_user_create_invalid_username(invalid_username: str):
with pytest.raises(ValidationError):
UserCreate(first_name="first-name", username=invalid_username, password="12345678")


@pytest.mark.asyncio
async def test_user_first_name():
user = await UserFactory.create(first_name="first-name", workspaces=[])
@@ -82,10 +82,7 @@ Creating an Argilla Space within an organization is useful for several scenarios
- **You want to manage the Space together with other users** (e.g., Space settings). Note that if you just want to manage your Argilla datasets and workspaces, you can achieve this by adding other users with the `owner` role to your Argilla Server.
- **More generally, you want to make your Space available under an organization/community umbrella**.

The steps are very similar to the [Quickstart guide](quickstart.md) with two important differences:

!!! tip "Setup USERNAME"
You need to **set up the `USERNAME` Space Secret with your Hugging Face username**. This way, the first time you enter with the `Hugging Face Sign in` button, you'll be granted the `owner` role.
The steps are very similar to the [Quickstart guide](quickstart.md) with one important difference:

!!! tip "Enable Persistent Storage `SMALL`"
Not setting persistent storage to `Small` means that **you will lose your data when the Space restarts**.
@@ -118,16 +115,6 @@ client = rg.Argilla(

## Space Secrets overview

There are two optional secrets to set up the `USERNAME` and `PASSWORD` of the `owner` of the Argilla Space. Remember that, by default, Argilla Spaces are configured with a *Sign in with Hugging Face* button, which is also used to grant the `owner` role to the creator of the Space for personal spaces.

The `USERNAME` and `PASSWORD` are only useful in a couple of scenarios:

- You have [disabled Hugging Face OAuth](#how-to-configure-and-disable-oauth-access).
- You want to [set up Argilla under an organization](#how-to-deploy-argilla-under-a-hugging-face-organization) and want your Hugging Face username to be granted the `owner` role.
Remember that, by default, Argilla Spaces are configured with a *Sign in with Hugging Face* button, which is also used to grant the `owner` role to the creator of the Space. There are two optional secrets to set up the `USERNAME` and `PASSWORD` of the `owner` of the Argilla Space. These are useful when you want to create a different owner user to log in to Argilla.

In summary, when setting up a Space:
!!! info "Creating a Space under your personal account"
If you are creating the Space under your personal account, **don't insert any value for `USERNAME` and `PASSWORD`**. Once you launch the Space you will be able to Sign in with your Hugging Face username and the `owner` role.

!!! info "Creating a Space under an organization"
If you are creating the Space under an organization **make sure to insert your Hugging Face username in the secret `USERNAME`**. In this way, you'll be able to Sign in with your Hugging Face user.
