From 9fb55b0526ceb5148e4ffdaf6bb18322f436d89f Mon Sep 17 00:00:00 2001
From: Ben Burtenshaw
Date: Wed, 12 Jun 2024 13:43:17 +0200
Subject: [PATCH 1/7] docs: add migration notebook to docs format

---
 .../migrate_from_legacy_datasets.md           | 200 ++++++++++++++++++
 1 file changed, 200 insertions(+)
 create mode 100644 argilla/docs/how_to_guides/migrate_from_legacy_datasets.md

diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md
new file mode 100644
index 0000000000..e429a0954f
--- /dev/null
+++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md
@@ -0,0 +1,200 @@
+# Migrate your legacy datasets to Argilla V2
+
+This guide will help you migrate task-specific datasets to Argilla V2. These do not include the `FeedbackDataset`, which is just an interim naming convention for the latest extensible dataset. Task-specific datasets are datasets used for a single task, such as text classification or token classification. If you would like to learn about the backstory of this SDK migration, please refer to the [SDK migration blog post](https://argilla.io/blog/introducing-argilla-new-sdk/).
+
+!!! note
+    Legacy datasets include: `DatasetForTextClassification`, `DatasetForTokenClassification`, and `DatasetForText2Text`.
+
+    `FeedbackDataset`s do not need to be migrated as they are already in the Argilla V2 format.
+
+To follow this guide, you will need to have the following prerequisites:
+
+- An Argilla 1.* server instance running with legacy datasets.
+- An Argilla 2.* server instance running. If you don't have one, you can create one by following the [Argilla installation guide](../../getting_started/installation.md).
+- The `argilla` SDK package installed in your environment.
+
+## Steps
+
+The guide will take you through three steps:
+
+1. **Retrieve the legacy dataset** from the Argilla V1 server using the new `argilla` package.
+2. **Define the new dataset** in the Argilla V2 format.
+3. **Upload the dataset records** to the new Argilla V2 dataset, mapping them to the new format and attributes.
+
+
+### Step 1: Retrieve the legacy dataset
+
+Connect to the Argilla V1 server via the new `argilla` package. The new SDK contains a `v1` module that allows you to connect to the Argilla V1 server:
+
+```python
+import argilla.v1 as rg_v1
+
+# Initialize the API with an Argilla server older than 2.0
+api_url = ""
+api_key = ""
+rg_v1.init(api_url, api_key)
+```
+
+Next, load the dataset settings and records from the Argilla V1 server:
+
+```python
+dataset_name = "news-programmatic-labeling"
+workspace = "demo"
+
+settings_v1 = rg_v1.load_dataset_settings(dataset_name, workspace)
+records_v1 = rg_v1.load(dataset_name, workspace, limit=100, query="_exists_:annotated_by")
+hf_dataset = records_v1.to_datasets()
+```
+
+Your legacy dataset is now loaded into the `hf_dataset` object.
+
+### Step 2: Define the new dataset
+
+Define the new dataset in the Argilla V2 format. The new dataset format is defined in the `argilla` package.
+You can create a new dataset with the `Settings` and `Dataset` classes:
+
+First, instantiate the `Argilla` class to connect to the Argilla V2 server:
+
+```python
+import argilla as rg
+
+client = rg.Argilla()
+```
+
+Next, define the new dataset settings:
+
+```python
+settings = rg.Settings(
+    fields=[
+        rg.TextField(name="text"),  # (1)
+    ],
+    questions=[
+        rg.LabelQuestion(name="label", labels=settings_v1.label_schema),  # (2)
+    ],
+    metadata=[
+        rg.TermsMetadataProperty(name="split"),  # (3)
+    ],
+    vectors=[
+        rg.VectorField(name='mini-lm-sentence-transformers', dimensions=384),  # (4)
+    ],
+)
+```
+
+1. The default field name for text classification is `text`, but you should provide a field for every key included in `record.inputs`.
+
+2. The main question for text classification is a `LabelQuestion` for single-label classification or a `MultiLabelQuestion` for multi-label classification.
+
+3. Here, we need to provide all relevant metadata fields.
+
+4. The vector fields available in the dataset.
+
+Finally, create the new dataset on the Argilla V2 server:
+
+```python
+dataset = rg.Dataset(name=dataset_name, settings=settings)
+dataset.create()
+```
+
+!!! note
+    If a dataset with the same name already exists, the `create` method will raise an exception. You can check whether the dataset exists and delete it before creating a new one.
+
+    ```python
+    dataset = client.datasets(name=dataset_name)
+
+    if dataset.exists():
+        dataset.delete()
+    ```
+
+### Step 3: Upload the dataset records
+
+To upload the records to the new server, we need to convert them from the Argilla V1 format to the Argilla V2 format. The new `argilla` SDK package uses a generic `Record` class, but legacy datasets have task-specific record classes, so each record needs to be mapped to the generic `Record` class.
+
+Here is a set of example functions to convert the records for single-label and multi-label classification. You can modify these functions to suit your dataset.
+
+=== "For single-label classification"
+
+    ```python
+    def map_to_record_for_single_label(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record:
+        """ This function maps a text classification record dictionary to the new Argilla record."""
+        suggestions = []
+        responses = []
+        vectors = []
+        if data.get("prediction"):
+            # From data["prediction"]
+            label, score = data["prediction"][0].values()
+            agent = data.get("prediction_agent")
+            suggestions.append(rg.Suggestion(question_name="label", value=label, score=score, agent=agent))
+        if data.get("annotation"):
+            # From data[annotation] and data[annotation_agent]
+            user_id = users_by_name.get(data["annotation_agent"], current_user).id
+            responses.append(rg.Response(question_name="label", value=data["annotation"], user_id=user_id))
+        if data.get("vectors"):
+            # From data["vectors"]
+            vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()]
+
+        return rg.Record(
+            id=data["id"],
+            fields=data["inputs"],
+            # The inputs field should be a dictionary with the same keys as the `fields` in the settings
+            metadata=data["metadata"],
+            # The metadata field should be a dictionary with the same keys as the `metadata` in the settings
+            vectors=vectors,
+            suggestions=suggestions,
+            responses=responses,
+        )
+    ```
+
+=== "For multi-label classification"
+
+    ```python
+    def map_to_record_for_multi_label(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record:
+        suggestions = []
+        responses = []
+        vectors = []
+        if data.get("prediction"):
+            # From data["prediction"]
+            labels = [label["label"] for label in data["prediction"]]
+            scores = [label["score"] for label in data["prediction"]]
+            agent = data.get("prediction_agent")
+            suggestions.append(rg.Suggestion(question_name="labels", value=labels, score=scores, agent=agent))
+        if data.get("annotation"):
+            # From data[annotation] and data[annotation_agent]
+            user_id = users_by_name.get(data["annotation_agent"], current_user).id
+            responses.append(rg.Response(question_name="label", value=data["annotation"], user_id=user_id))
+
+        if data.get("vectors"):
+            # From data["vectors"]
+            vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()]
+
+        return rg.Record(
+            id=data["id"],
+            fields=data["inputs"],
+            # The inputs field should be a dictionary with the same keys as the `fields` in the settings
+            metadata=data["metadata"],
+            # The metadata field should be a dictionary with the same keys as the `metadata` in the settings
+            vectors=vectors,
+            # The vectors field should be a dictionary with the same keys as the `vectors` in the settings
+            suggestions=suggestions,
+            responses=responses,
+        )
+    ```
+
+The functions above depend on the `users_by_name` dictionary and the `current_user` object to assign responses to users, so we need to load the existing users. You can retrieve the users from the Argilla V2 server and the current user as follows:
+
+```python
+# Index the users on the Argilla V2 server by username, and get the current user
+users_by_name = {user.username: user for user in client.users}
+current_user = client.me
+```
+
+Finally, upload the records to the new dataset using the `log` method and the mapping functions.
+
+```python
+records = []
+
+for data in hf_dataset:
+    records.append(map_to_record_for_single_label(data, users_by_name, current_user))
+
+# Upload the records to the new dataset
+dataset.records.log(records)
+```
+You have now successfully migrated your legacy dataset to Argilla V2. For more guides on how to use the Argilla SDK, please refer to the [How to guides](index.md).
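+
+Optionally, you can sanity-check the migration by reading a few records back from the new dataset. The snippet below is a minimal sketch: it assumes the `dataset` object created in step 2 and that the record attributes mirror the `rg.Record` constructor arguments, so adjust the attributes you inspect to your own settings and question names.
+
+```python
+# Read back a handful of migrated records to verify that fields, suggestions,
+# and responses were mapped as expected (only the first three records are shown).
+for i, record in enumerate(dataset.records(with_suggestions=True, with_responses=True)):
+    print(record.id)
+    print(record.fields)
+    print(record.suggestions)
+    print(record.responses)
+    if i >= 2:
+        break
+```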
From fd979ddff3115cc3f20fb0c8287f69177053efd6 Mon Sep 17 00:00:00 2001
From: Ben Burtenshaw
Date: Thu, 13 Jun 2024 10:58:45 +0200
Subject: [PATCH 2/7] docs: add information for feedback datasets on existing servers

---
 .../migrate_from_legacy_datasets.md           | 76 ++++++++++++++++++-
 1 file changed, 74 insertions(+), 2 deletions(-)

diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md
index e429a0954f..8b76656890 100644
--- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md
+++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md
@@ -10,9 +10,11 @@ This guide will help you migrate task specific datasets to Argilla V2. These do
 To follow this guide, you will need to have the following prerequisites:
 
 - An Argilla 1.* server instance running with legacy datasets.
-- An Argilla 2.* server instance running. If you don't have one, you can create one by following the [Argilla installation guide](../../getting_started/installation.md).
+- An Argilla >=1.29 server instance running. If you don't have one, you can create one by following the [Argilla installation guide](../../getting_started/installation.md).
 - The `argilla` SDK package installed in your environment.
 
+If your current legacy datasets are on a server running Argilla 1.29 or later, you could choose to recreate them as new datasets on the same server. You could then upgrade the server to Argilla 2.0 and carry on working there. Your legacy datasets will not be visible after the upgrade, but they will remain in the storage layer in case you need to access them.
+
 ## Steps
 
 The guide will take you through three steps:
@@ -42,7 +44,7 @@ dataset_name = "news-programmatic-labeling"
 workspace = "demo"
 
 settings_v1 = rg_v1.load_dataset_settings(dataset_name, workspace)
-records_v1 = rg_v1.load(dataset_name, workspace, limit=100, query="_exists_:annotated_by")
+records_v1 = rg_v1.load(dataset_name, workspace)
 hf_dataset = records_v1.to_datasets()
 ```
 
@@ -177,6 +179,48 @@ Here are a set of example functions to convert the records for single-label and
             responses=responses,
         )
     ```
+=== "For token classification"
+
+    ```python
+
+    def map_to_record_for_span(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record:
+        suggestions = []
+        responses = []
+
+        if data.get("prediction"):
+            # Prediction for token classification will be a list of tuple (label, start, end)
+            spans = [
+                {"start": start, "end": end, "label": label}
+                for label, start, end in data["prediction"]
+            ]
+            agent = data.get("prediction_agent")
+            suggestions.append(rg.Suggestion(question_name="labels", value=spans, agent=agent))
+
+        if data.get("annotation"):
+            # From data[annotation] and data[annotation_agent]
+            spans = [
+                {"start": start, "end": end, "label": label}
+                for label, start, end in data["annotation"]
+            ]
+            user_id = users_by_name.get(data["annotation_agent"], current_user).id
+            responses.append(rg.Response(question_name="label", value=spans, user_id=user_id))
+
+        if data.get("vectors"):
+            # From data["vectors"]
+            vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()]
+
+        return rg.Record(
+            id=data["id"],
+            fields=data["inputs"],
+            # The inputs field should be a dictionary with the same keys as the `fields` in the settings
+            metadata=data["metadata"],
+            # The metadata field should be a dictionary with the same keys as the `metadata` in the settings
+            vectors=vectors,
+            # The vectors field should be a dictionary with the same keys as 
the `vectors` in the settings + suggestions=suggestions, + responses=responses, + ) + ``` The functions above depend on the `users_by_name` dictionary and the `current_user` object to assign responses to users, we need to load the existing users. You can retrieve the users from the Argilla V2 server and the current user as follows: @@ -198,3 +242,31 @@ for data in hf_records: dataset.records.log(records) ``` You have now successfully migrated your legacy dataset to Argilla V2. For more guides on how to use the Argilla SDK, please refer to the [How to guides](index.md). + +## Migrating Feedback Datasets on your Argilla 1.* server + +As mentioned above, `FeedbackDataset`'s are compatible with Argilla V2 and do not need to be reformatted. However, you may want to migrate your feedback datasets to the new server so that you can deprecate your Argilla 1.* server. Here is a guide on how to migrate your feedback datasets: + +```python +import argilla.v1 as rg_v1 + +# Initialize the API with an Argilla server less than 2.0 +old_client = rg.Argilla(old_server_api_url, old_server_api_key) +new_client = rg.Argilla(new_server_api_url, new_server_api_key) + +dataset_name = "feedback-dataset" +old_dataset = old_client.datasets(name=dataset_name) +new_dataset = new_client.datasets.add(dataset) + +# Load the records from the old server +new_dataset.records.log( + old_dataset.records( + with_responses=True, # (1) + with_suggestions=True, + with_vectors=True, + with_metadata=True, + ) +) +``` + +1. The `with_responses`, `with_suggestions`, `with_vectors`, and `with_metadata` flags are used to load the records with the responses, suggestions, vectors, and metadata respectively. \ No newline at end of file From f5794e3a3027b144bd9086d3e8c59f96abf1a079 Mon Sep 17 00:00:00 2001 From: Paco Aranda Date: Thu, 13 Jun 2024 15:52:26 +0200 Subject: [PATCH 3/7] [DOCS] Update and define all record mapping functions (#5009) Helper functions for record mapping have been reviewed and tested with several datasets. 
--- .../migrate_from_legacy_datasets.md | 121 ++++++++++-------- 1 file changed, 68 insertions(+), 53 deletions(-) diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md index e67ae2e6e1..cb19f9b187 100644 --- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md +++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md @@ -119,27 +119,24 @@ Here are a set of example functions to convert the records for single-label and """ This function maps a text classification record dictionary to the new Argilla record.""" suggestions = [] responses = [] - vectors = [] - if data.get("prediction"): - # From data["prediction"] - label, score = data["prediction"][0].values() - agent = data.get("prediction_agent") + + if prediction := data.get("prediction"): + label, score = prediction[0].values() + agent = data["prediction_agent"] suggestions.append(rg.Suggestion(question_name="label", value=label, score=score, agent=agent)) - if data.get("annotation"): - # From data[annotation] and data[annotation_agent] + + if annotation := data.get("annotation"): user_id = users_by_name.get(data["annotation_agent"], current_user).id - responses.append(rg.Response(question_name="label", value=data["annotation"], user_id=user_id)) - if data.get("vectors"): - # From data["vectors"] - vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()] - + responses.append(rg.Response(question_name="label", value=annotation, user_id=user_id)) + + vectors = (data.get("vectors") or {}) return rg.Record( id=data["id"], fields=data["inputs"], # The inputs field should be a dictionary with the same keys as the `fields` in the settings metadata=data["metadata"], # The metadata field should be a dictionary with the same keys as the `metadata` in the settings - vectors=vectors, + vectors=[rg.Vector(name=name, values=value) for name, value in vectors.items()], suggestions=suggestions, responses=responses, ) @@ -149,73 +146,91 @@ Here are a set of example functions to convert the records for single-label and ```python def map_to_record_for_multi_label(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record: + """ This function maps a text classification record dictionary to the new Argilla record.""" suggestions = [] responses = [] - vectors = [] - if data.get("prediction"): - # From data["prediction"] - labels = [label["label"] for label in data["prediction"]] - scores = [label["score"] for label in data["prediction"]] - agent = data.get("prediction_agent") + + if prediction := data.get("prediction"): + labels, scores = zip(*[(pred["label"], pred["score"]) for pred in prediction]) + agent = data["prediction_agent"] suggestions.append(rg.Suggestion(question_name="labels", value=labels, score=scores, agent=agent)) - if data.get("annotation"): - # From data[annotation] and data[annotation_agent] + + if annotation := data.get("annotation"): user_id = users_by_name.get(data["annotation_agent"], current_user).id - responses.append(rg.Response(question_name="label", value=data["annotation"], user_id=user_id)) - - if data.get("vectors"): - # From data["vectors"] - vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()] - + responses.append(rg.Response(question_name="label", value=annotation, user_id=user_id)) + + vectors = data.get("vectors") or {} return rg.Record( id=data["id"], fields=data["inputs"], # The inputs field should be a dictionary with the same keys as the `fields` in 
the settings metadata=data["metadata"], # The metadata field should be a dictionary with the same keys as the `metadata` in the settings - vectors=vectors, - # The vectors field should be a dictionary with the same keys as the `vectors` in the settings + vectors=[rg.Vector(name=name, values=value) for name, value in vectors.items()], suggestions=suggestions, responses=responses, ) ``` + === "For token classification" ```python - def map_to_record_for_span(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record: + """ This function maps a token classification record dictionary to the new Argilla record.""" suggestions = [] responses = [] - - if data.get("prediction"): - # Prediction for token classification will be a list of tuple (label, start, end) - spans = [ - {"start": start, "end": end, "label": label} - for label, start, end in data["prediction"] - ] - agent = data.get("prediction_agent") - suggestions.append(rg.Suggestion(question_name="labels", value=spans, agent=agent)) - - if data.get("annotation"): - # From data[annotation] and data[annotation_agent] - spans = [ - {"start": start, "end": end, "label": label} - for label, start, end in data["annotation"] - ] + + if prediction := data.get("prediction"): + scores = [span["score"] for span in prediction] + agent = data["prediction_agent"] + suggestions.append(rg.Suggestion(question_name="labels", value=prediction, score=scores, agent=agent)) + + if annotation := data.get("annotation"): user_id = users_by_name.get(data["annotation_agent"], current_user).id - responses.append(rg.Response(question_name="label", value=spans, user_id=user_id)) - - if data.get("vectors"): - # From data["vectors"] - vectors = [rg.Vector(name=name, values=value) for name, value in data["vectors"].items()] + responses.append(rg.Response(question_name="spans", value=annotation, user_id=user_id)) + + vectors = data.get("vectors") or {} + return rg.Record( + id=data["id"], + fields={"text": data["text"]}, + # The inputs field should be a dictionary with the same keys as the `fields` in the settings + metadata=data["metadata"], + # The metadata field should be a dictionary with the same keys as the `metadata` in the settings + vectors=[rg.Vector(name=name, values=value) for name, value in vectors.items()], + # The vectors field should be a dictionary with the same keys as the `vectors` in the settings + suggestions=suggestions, + responses=responses, + ) + ``` + +=== "For Text generation" + ```python + def map_to_record_for_text_generation(data: dict, users_by_name: dict, current_user: rg.User) -> rg.Record: + """ This function maps a text2text record dictionary to the new Argilla record.""" + suggestions = [] + responses = [] + + if prediction := data.get("prediction"): + first = prediction[0] + agent = data["prediction_agent"] + suggestions.append( + rg.Suggestion(question_name="text_generation", value=first["text"], score=first["score"], agent=agent) + ) + + if annotation := data.get("annotation"): + # From data[annotation] + user_id = users_by_name.get(data["annotation_agent"], current_user).id + responses.append(rg.Response(question_name="text_generation", value=annotation, user_id=user_id)) + + vectors = (data.get("vectors") or {}) return rg.Record( id=data["id"], - fields=data["inputs"], + fields={"text": data["text"]}, # The inputs field should be a dictionary with the same keys as the `fields` in the settings metadata=data["metadata"], # The metadata field should be a dictionary with the same keys as the `metadata` in the settings - 
vectors=vectors, + vectors=[rg.Vector(name=name, values=value) for name, value in vectors.items()], # The vectors field should be a dictionary with the same keys as the `vectors` in the settings suggestions=suggestions, responses=responses, From 7aebd54f5e041a74bf48f0189e61ed2915405619 Mon Sep 17 00:00:00 2001 From: Ben Burtenshaw Date: Thu, 20 Jun 2024 10:42:59 +0200 Subject: [PATCH 4/7] fix: fix errors in code snippets --- argilla/docs/how_to_guides/migrate_from_legacy_datasets.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md index cb19f9b187..d220649b90 100644 --- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md +++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md @@ -183,7 +183,7 @@ Here are a set of example functions to convert the records for single-label and if prediction := data.get("prediction"): scores = [span["score"] for span in prediction] agent = data["prediction_agent"] - suggestions.append(rg.Suggestion(question_name="labels", value=prediction, score=scores, agent=agent)) + suggestions.append(rg.Suggestion(question_name="spans", value=prediction, score=scores, agent=agent)) if annotation := data.get("annotation"): user_id = users_by_name.get(data["annotation_agent"], current_user).id @@ -271,7 +271,7 @@ new_client = rg.Argilla(new_server_api_url, new_server_api_key) dataset_name = "feedback-dataset" old_dataset = old_client.datasets(name=dataset_name) -new_dataset = new_client.datasets.add(dataset) +new_dataset = new_client.datasets.add(old_dataset) # Load the records from the old server new_dataset.records.log( From 5022318bf43b12db63c6a493e82afe98885ceb5d Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Thu, 20 Jun 2024 11:03:34 +0200 Subject: [PATCH 5/7] Update argilla/docs/how_to_guides/migrate_from_legacy_datasets.md Co-authored-by: Paco Aranda --- argilla/docs/how_to_guides/migrate_from_legacy_datasets.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md index d220649b90..bbd1017df6 100644 --- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md +++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md @@ -263,7 +263,7 @@ You have now successfully migrated your legacy dataset to Argilla V2. For more g As mentioned above, `FeedbackDataset`'s are compatible with Argilla V2 and do not need to be reformatted. However, you may want to migrate your feedback datasets to the new server so that you can deprecate your Argilla 1.* server. 
Here is a guide on how to migrate your feedback datasets: ```python -import argilla.v1 as rg_v1 +import argilla as rg # Initialize the API with an Argilla server less than 2.0 old_client = rg.Argilla(old_server_api_url, old_server_api_key) From 41ac96ae60d7988f9c48371bf27eb05c74a4bff8 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Thu, 20 Jun 2024 11:03:53 +0200 Subject: [PATCH 6/7] Update argilla/docs/how_to_guides/migrate_from_legacy_datasets.md Co-authored-by: Paco Aranda --- argilla/docs/how_to_guides/migrate_from_legacy_datasets.md | 1 - 1 file changed, 1 deletion(-) diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md index bbd1017df6..f410a82ce5 100644 --- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md +++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md @@ -279,7 +279,6 @@ new_dataset.records.log( with_responses=True, # (1) with_suggestions=True, with_vectors=True, - with_metadata=True, ) ) ``` From 9c0768e7571dfa5f9ef4e7fe2d8860baeb319983 Mon Sep 17 00:00:00 2001 From: Ben Burtenshaw Date: Thu, 20 Jun 2024 12:15:44 +0200 Subject: [PATCH 7/7] docs: drop migration of feedback datasets between servers --- .../migrate_from_legacy_datasets.md | 26 ------------------- 1 file changed, 26 deletions(-) diff --git a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md index f410a82ce5..4d01713bf6 100644 --- a/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md +++ b/argilla/docs/how_to_guides/migrate_from_legacy_datasets.md @@ -258,29 +258,3 @@ dataset.records.log(records) ``` You have now successfully migrated your legacy dataset to Argilla V2. For more guides on how to use the Argilla SDK, please refer to the [How to guides](index.md). -## Migrating Feedback Datasets on your Argilla 1.* server - -As mentioned above, `FeedbackDataset`'s are compatible with Argilla V2 and do not need to be reformatted. However, you may want to migrate your feedback datasets to the new server so that you can deprecate your Argilla 1.* server. Here is a guide on how to migrate your feedback datasets: - -```python -import argilla as rg - -# Initialize the API with an Argilla server less than 2.0 -old_client = rg.Argilla(old_server_api_url, old_server_api_key) -new_client = rg.Argilla(new_server_api_url, new_server_api_key) - -dataset_name = "feedback-dataset" -old_dataset = old_client.datasets(name=dataset_name) -new_dataset = new_client.datasets.add(old_dataset) - -# Load the records from the old server -new_dataset.records.log( - old_dataset.records( - with_responses=True, # (1) - with_suggestions=True, - with_vectors=True, - ) -) -``` - -1. The `with_responses`, `with_suggestions`, `with_vectors`, and `with_metadata` flags are used to load the records with the responses, suggestions, vectors, and metadata respectively.