refactor(sdk): apply minor improvements (#15)
* feat: add folders api support

* refactor(sdk): apply minor improvements

* ci(fix): ubuntu 24 (latest) fails for ecosystem tests
Reverting CI to ubuntu 22.04 for LTS according to actions/runner-images#10636
floherent authored Jan 24, 2025
1 parent d6ad632 commit 1a0af1f
Showing 26 changed files with 494 additions and 273 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -43,7 +43,7 @@ jobs:

  ecosystem-test:
    needs: unit-test
    runs-on: ubuntu-latest
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        python-version: ['3.7', '3.8', '3.9', '3.10', '3.12']
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,14 @@ All notable changes to this project will be documented in this file.
See [standard-version](https://github.com/conventional-changelog/standard-version)
for commit guidelines.

## 0.1.12 (2025-01-23)

- Add support for folder API
- Update `create_chunks` method to handle columnar format only (headers + inputs)
- Remove `None` values from the request payload (e.g., `Services.execute()`)
- Support extra metadata in the request payload (e.g., `Services.execute(extras={...})`)
- Update documentation and examples

## 0.1.11 (2024-12-23)

- Apply bug fixes and enhancements
4 changes: 2 additions & 2 deletions README.md
@@ -235,8 +235,8 @@ OAuth2.0 Client Credentials flow:

[ImpEx API](./docs/impex.md) - imports and exports Spark services:

- `Spark.impex.export(data)` exports Spark entities (versions, services, or folders).
- `Spark.impex.import_(data)` imports previously exported Spark entities into the platform.
- `Spark.impex.exp(data)` exports Spark entities (versions, services, or folders).
- `Spark.impex.imp(data)` imports previously exported Spark entities into the platform.

[Other APIs](./docs/misc.md) - for other functionality:

209 changes: 108 additions & 101 deletions docs/batches.md

Large diffs are not rendered by default.

34 changes: 17 additions & 17 deletions docs/history.md
@@ -89,7 +89,7 @@ when successful, this method returns:
}
```

Here's a full example how to harness this method:
Here's a full example of how to harness this method:

```python
import cspark.sdk as Spark
@@ -105,38 +105,38 @@ with spark.logs as logs:

## Download service execution logs

This method allows you to download the service execution logs as a CSV or JSON file
to your local machine. Unlike the `rehydrate` method, this one initiates a download
job and continuously checks the status until the job is completed and finally downloads
the zip file. It throws a `SparkError` if the download job fails to produce a downloadable
file.
This method allows you to export service execution logs in either CSV or JSON
format to your local machine. It streamlines the download process by handling the
complete workflow: initiating the download job, monitoring its status, and retrieving
the final zip file once ready. If the download process encounters any issues or
fails to generate a downloadable file, the method raises a `SparkError`.

If you want more fine-grained control over the download process, you can use
the `Spark.logs.downloads.initiate(uri, [type])` and
`Spark.logs.downloads.get_status(uri, [type])` methods to initiate a download
job and check its status until it's finished. Do note that the status check is
subject to a timeout when it reaches the maximum number of retries.
subject to `RetryTimeoutError` when it reaches the maximum number of retries.
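
For illustration, a minimal sketch of that manual flow is shown below. The client settings and the positional `uri` form (`folder/service`) are assumptions drawn from the surrounding examples, so verify the exact signatures against the API reference.

```python
import cspark.sdk as Spark

# Placeholder settings; configure the client for your own environment and tenant.
spark = Spark.Client(env='my-env', tenant='my-tenant', token='bearer token')

with spark.logs as logs:
    # Start a download job (CSV in this sketch), then poll its status separately.
    job = logs.downloads.initiate('my-folder/my-service', type='csv')
    status = logs.downloads.get_status('my-folder/my-service', type='csv')
    print(status.data)
```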

### Arguments

This method accepts the following keyword arguments:

| Property | Type | Description |
| ----------------- | -------------------- | ---------------------------------------------------------------- |
| _folder_ | `str` | The folder name. |
| _service_ | `str` | The service name. |
| _version\_id_ | `None \| string` | The particular service version for the download. |
| _type_ | `csv \| json` | The file type (defaults to `json`). |
| _call\_ids_ | `None \| List[str]` | An array of call IDs to download logs for. |
| Property | Type | Description |
| ----------------- | -------------------- | ----------------------------------------------------- |
| _folder_ | `str` | The folder name. |
| _service_ | `str` | The service name. |
| _version\_id_ | `None \| string` | The particular service version for the download. |
| _type_ | `csv \| json` | The file type (defaults to `json`). |
| _call\_ids_ | `None \| List[str]` | An array of call IDs to download logs for. |
| _start\_date_ | `None \| str \| int \| datetime` | The start date (format: `YYYY-MM-DD[THH:MM:SS.SSSZ]`).|
| _end\_date_ | `None \| str \| int \| datetime` | The end date (format: `YYYY-MM-DD[THH:MM:SS.SSSZ]`). |
| _correlation\_id_ | `string`             | The correlation ID (possible fallback for `call_ids`). |
| _source\_system_ | `string` | The source system (possible fallback for `call_ids`). |
| _correlation\_id_ | `string`             | The correlation ID (possible fallback for `call_ids`).|
| _source\_system_ | `string` | The source system (possible fallback for `call_ids`). |
| _max\_retries_ | `None \| int` | The number of retries to attempt (defaults to `Config.max_retries`).|
| _retry\_interval_ | `None \| float` | The interval between retries in seconds (defaults to `Config.retry_interval`).|

```python
logs.download(
spark.logs.download(
    folder='my-folder',
    service='my-service',
    call_ids=['uuid1', 'uuid2', 'uuid3'],
6 changes: 3 additions & 3 deletions docs/hybrid.md
@@ -17,8 +17,8 @@ support the Hybrid Runner API. To install it, run:
pip install cspark
```

Obviously, a runner offers a smaller subset of functionality compared to the SaaS API,
however, extending `cspark.sdk` to support the Hybrid Runner API is a good way
Hybrid runners offer a smaller subset of functionality compared to the SaaS API.
However, extending `cspark.sdk` to support the Hybrid Runner API is a good way
to keep the codebase consistent and maintainable. This also means that you may
want to check its [documentation][cspark] to learn about its client options,
error handling, and other features.
@@ -52,7 +52,7 @@ Explore the [examples] and [docs] folders to find out more about its capabilities.
<!-- References -->

[cspark]: https://pypi.org/project/cspark/
[version-img]: https://badge.fury.io/py/cspark.svg
[version-img]: https://img.shields.io/pypi/v/cspark
[version-url]: https://pypi.python.org/pypi/cspark
[user-guide]: https://docs.coherent.global/hybrid-runner/introduction-to-the-hybrid-runner
[hybrid-runner]: https://github.com/orgs/Coherent-Partners/packages/container/package/nodegen-server
39 changes: 18 additions & 21 deletions docs/impex.md
@@ -2,10 +2,10 @@

# ImpEx API

| Verb | Description |
| ----------------------------- | --------------------------------------------------------------------------------- |
| `Spark.impex.export(data)` | [Export Spark entities (versions, services, or folders)](#export-spark-entities). |
| `Spark.impex.import_(data)`| [Import exported Spark entities into your workspace](#import-spark-entities). |
| Verb | Description |
| ----------------------- | --------------------------------------------------------------------------------- |
| `Spark.impex.exp(data)` | [Export Spark entities (versions, services, or folders)](#export-spark-entities). |
| `Spark.impex.imp(data)` | [Import exported Spark entities into your workspace](#import-spark-entities). |

## Export Spark entities

@@ -23,7 +23,7 @@ The expected keyword arguments are as follows:
| _folders_ | `None \| list[str]` | 1+ folder name(s). |
| _services_ | `None \| list[str]` | 1+ service URI(s). |
| _version\_ids_ | `None \| list[str]` | 1+ version UUID(s) of the desired service. |
| _file\_filter_ | `migrate \| onpremises` | For data migration or hybrid deployments (defaults to `migrate`). |
| _file\_filter_ | `'migrate' \| 'onpremises'` | For data migration or hybrid deployments (defaults to `migrate`). |
| _version\_filter_ | `latest \| all` | Which version of the file to export (defaults to `latest`). |
| _source\_system_ | `None \| str` | Source system name to export from (e.g., `Spark Python SDK`). |
| _correlation\_id_ | `None \| str` | Correlation ID for the export (useful for tagging). |
@@ -41,7 +41,7 @@ Check out the [API reference](https://docs.coherent.global/spark-apis/impex-apis
for more information.

```python
spark.impex.export(
spark.impex.exp(
    services=['my-folder/my-service[0.4.2]', 'my-other-folder/my-service-2'],
    file_filter='onpremises',
    max_retries=5,
@@ -54,16 +54,15 @@ spark.impex.export(
When successful, this method returns an array of exported entities, where each entity
is an `HttpResponse` object with the buffer containing the exported entity.

### Non-Transactional Methods

This method is transactional. It will initiate an export job, poll its status
until it completes, and download the exported files. If you need more control over
these steps, consider using the `exports` resource directly. You may use the following
methods:

- `Spark.impex.exports.initiate(data)` creates an export job.
- `Spark.impex.exports.get_status(job_id)` gets an export job's status.
- `Spark.impex.exports.download(urls)` downloads the exported files as a ZIP.
> [!TIP]
> This method is transactional. It will initiate an export job, poll its status
> until it completes, and download the exported files. If you need more control over
> these steps, consider using the `exports` resource directly. You may use the following
> methods:
>
> - `Spark.impex.exports.initiate(data)` creates an export job.
> - `Spark.impex.exports.get_status(job_id)` gets an export job's status.
> - `Spark.impex.exports.download(urls)` downloads the exported files as a ZIP.
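
A rough sketch of that manual flow is shown below; the keyword arguments passed to `initiate` and the field names read off the intermediate responses (`id`, `outputs`) are assumptions, so confirm them against the API reference before relying on them.

```python
# Illustrative only: initiate an export job, check its status, then download the files.
exports = spark.impex.exports
job = exports.initiate(services=['my-folder/my-service'])
status = exports.get_status(job.data['id'])         # 'id' is an assumed field name
archive = exports.download(status.data['outputs'])  # 'outputs' is an assumed field name
```
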
## Import Spark entities

Expand All @@ -78,7 +77,7 @@ The expected keyword arguments are as follows:
| --------------- | ------------------ | ------------------------------------------------------------ |
| _file_ | `BinaryIO` | The ZIP file containing the exported entities. |
| _destination_ | `str \| List[str] \| Mapping[str, str] \| List[Mapping[str, str]]`| The destination service URI(s). |
| _if\_present_ | `abort \| replace \| add_version` | What to do if the entity already exists in the destination (defaults to `add_version`). |
| _if\_present_ | `'abort' \| 'replace' \| 'add_version'` | What to do if the entity already exists in the destination (defaults to `add_version`). |
| _source\_system_ | `None \| str` | Source system name to export from (e.g., `Spark Python SDK`).|
| _correlation\_id_ | `None \| str` | Correlation ID for the export (useful for tagging). |
| _max\_retries_ | `None \| int` | Maximum number of retries when checking the export status. |
@@ -96,13 +95,13 @@ any of the formats indicated below:
| --------- | ------- | ------------------------------------------ |
| _source_ | `str` | The service URI of the source tenant. |
| _target_ | `str \| None`| The service URI of the destination tenant (defaults to `source`) |
| _upgrade_ | `major \| minor \| patch` | The version upgrade strategy (defaults to `minor`). |
| _upgrade_ | `'major' \| 'minor' \| 'patch'` | The version upgrade strategy (defaults to `minor`). |

Check out the [API reference](https://docs.coherent.global/spark-apis/impex-apis/import#request-body)
for more information.

```python
spark.impex.import_(
spark.impex.imp(
    destination={'source': 'my-folder/my-service', 'target': 'this-folder/my-service', 'upgrade': 'patch'},
    file=open('exported.zip', 'rb'),
    max_retries=7,
@@ -175,8 +174,6 @@ See the sample response below.
}
```

### Non-Transactional Methods

Being transactional, this method will create an import job, and poll its status
continuously until it completes the import process. You may consider using the
`imports` resource directly and control the import process manually:
27 changes: 12 additions & 15 deletions docs/misc.md
@@ -12,17 +12,14 @@
This method helps you download a service's [WebAssembly](https://webassembly.org/)
module.

Roughly speaking, WebAssembly (or WASM) is a binary instruction format
for a stack-based virtual machine. It's designed as a portable compilation target
for programming languages, enabling deployment on the web for client and server
applications.

In the context of Spark, a WebAssembly module refers to a cohesive bundle of
files designed for portability and execution across web and Node.js environments.
This bundle typically includes the WebAssembly representation of the Spark service's
encapsulated logic along with associated JavaScript files. By packaging these
components together, a Spark WASM module becomes executable within both browser and
Node environments.
[WebAssembly](https://webassembly.org/) (WASM) is a low-level binary format for
executing code in a stack-based virtual machine. It serves as a compilation target
for high-level programming languages, enabling efficient execution across web platforms.

In Spark's context, a WebAssembly module is a self-contained package that bundles
the compiled service logic with its supporting files. This modular approach ensures
consistent execution in both web browsers and Node.js environments, making Spark
services highly portable and performant.

Check out the [API reference](https://docs.coherent.global/spark-apis/webassembly-module-api)
for more information.
@@ -31,9 +28,9 @@ for more information.

You may pass in the service URI as `string` in the following format:

- `version/uuid` (e.g., `version/123e4567-e89b-12d3-a456-426614174000`) - **preferred**
- `service/uuid` (e.g., `service/123e4567-e89b-12d3-a456-426614174000`)
- `folder/service` (e.g., `my-folder/my-service`)
- `version/{version_id}` - **preferred**
- `service/{service_id}`
- `{folder}/{service}`

```python
spark.wasm.download('version/uuid')
@@ -46,7 +43,7 @@ Alternatively, you can pass in the following parameters as an `object`.
| _folder_ | `str \| None` | The folder name. |
| _service_ | `str \| None` | The service name. |
| _version\_id_ | `str \| None` | The version UUID of the service. |
| _service\_id_ | `str \| None` | The service UUID. |
| _service\_id_ | `str \| None` | The service UUID (points to the latest version).|

> [!NOTE]
> As of now, only the `version_id` should be used to download the WebAssembly module.
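
For instance, assuming the parameters in the table above map to keyword arguments, the call could look like the sketch below (the UUID is a placeholder):

```python
# Placeholder UUID; replace it with the actual version ID of your service.
spark.wasm.download(version_id='123e4567-e89b-12d3-a456-426614174000')
```
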
31 changes: 14 additions & 17 deletions docs/readme.md
@@ -60,8 +60,8 @@ which provides an elegant, feature-rich HTTP module. The SDK built a layer
on top of it to simplify the process of making HTTP requests to the Spark platform.

Presently, only the synchronous HTTP methods are supported. Hence, all the methods
under `Spark.Client()` are synchronous and return an `HttpResponse` object with
the following properties:
under `Spark.Client()` are synchronous (i.e., blocking) and return an `HttpResponse`
object with the following properties:

- `status`: HTTP status code
- `data`: Data returned by the API if any (usually JSON)
@@ -74,18 +74,19 @@ the following properties:
> when accessing the response data.
>
> As a side note, we intend to leverage the asynchronous methods in the future
> to provide a more efficient way to interact with the Spark platform.
> to provide a more efficient (i.e., non-blocking) way to interact with the Spark platform.
## HTTP Error

When attempting to communicate with the API, the SDK will wrap any sort of failure
When attempting to communicate with the API, the SDK will wrap any failure
(any error during the roundtrip) into a `SparkApiError`, which will include
the HTTP `status` code of the response and the `request_id`, a unique identifier
of the request. The most common errors are:

- `UnauthorizedError`: when the user is not authenticated/authorized
- `NotFoundError`: when the requested resource is not found
- `BadRequestError`: when the request or payload is invalid.
- `BadRequestError`: when the request or payload is invalid
- `RetryTimeoutError`: when the maximum number of retries is reached.

The following properties are available in a `SparkApiError`:

@@ -101,23 +102,18 @@ as well as the obtained response, if available.
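
By way of illustration, the sketch below shows how those details might be surfaced when a call fails. The `execute` invocation and its arguments are only a plausible trigger, and the example assumes `SparkApiError` is exported at the package level; attribute access follows the properties described above.

```python
import cspark.sdk as Spark

# Placeholder settings; configure the client for your own environment and tenant.
spark = Spark.Client(env='my-env', tenant='my-tenant', token='bearer token')

try:
    spark.services.execute('my-folder/my-service', inputs={'value': 42})
except Spark.SparkApiError as err:
    # `status` and `request_id` are the fields called out above.
    print(err.status, err.request_id)
```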

## API Resource

The Spark platform offers a wide range of functionalities that can be accessed
programmatically via RESTful APIs. For now, the SDK only supports [Services API](./services.md)
and [Batches API](./batches.md).
The Spark platform provides extensive functionality through its RESTful APIs,
with over 60 endpoints available. While the SDK currently implements a subset of
these endpoints, it's designed to be extensible.

Since the SDK does not cover all the endpoints in the platform, it provides a way
to cover additional endpoints. So, if there's an API resource you would like to
consume that's not available in the SDK, you can always extend this `ApiResource`
to include it.
If you need to consume an API endpoint that's not yet available in the SDK, you
can easily extend the `ApiResource` class to implement it. Here's how:

```py
from cspark.sdk import Client, Config, ApiResource, Uri
from cspark.sdk import Client, ApiResource, Uri

# 1. Prepare the additional API resource you want to consume (e.g., MyResource).
class MyResource(ApiResource):
    def __init__(self, config: Config):
        super().__init__(config)

    def fetch_data(self):
        url = Uri.of(base_url=self.config.base_url.full, version='api/v4', endpoint='my/resource')
        return self.request(url, method='GET')
@@ -140,7 +136,8 @@ some other goodies like the `base_url`, which can be used to build other URLs
supported by the Spark platform.

The `Uri` class is also available to help you build the URL for your custom resource.
In this particular example, the built URL will be: `https://excel.my-env.coherent.global/my-tenant/api/v4/my/resource`.
In this particular example, the built URL will be:
`https://excel.my-env.coherent.global/my-tenant/api/v4/my/resource`.
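
To wire the custom resource into a client, one plausible approach is the sketch below; it assumes the client exposes its settings as `spark.config`, so double-check that attribute name against the SDK.

```py
# Assumption: the client's configuration is available as `spark.config`.
spark = Client(env='my-env', tenant='my-tenant', token='bearer token')
my_resource = MyResource(spark.config)
response = my_resource.fetch_data()
print(response.status, response.data)
```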

### Error Handling
