-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MAJOR FEATURE] Overhaul library to use Data Validation via Pydantic #58
Comments
I think the 3.6 requirement was due to Ubuntu 18 coming with 3.6 and a lot of folks were still on 18 at the time, but a few years have passed and 18 EOLed last year anyway so we could consider bumping the min python to 3.7 or 3.8. Pydantic is one of those things where it optimizes performance and if you are familiar with pydantic then the code is way simpler and more readable, but for those who are new to pydantic it will cause a higher level of effort to make tweaks and additions to the code, imo. From an additional dependency perspective I personally don't see anything wrong with adding pydantic to the list (which now only includes numpy and jsonschema) although there will always be folks out there who like to keep it minimal for simplicity and security reasons. Ultimately it's going to come down to how @Teque5 and @gmabey feel about Pydantic because they have been the main long-term maintainers of this code and even if you single-handedly refactored the entire code to use Pydantic and no regressions are introduced, if it makes it harder for them to maintain the project then it's a deal-breaker, especially since we don't need Pydantic to do what we want sigmf-python to do. |
Thanks for getting back Marc - really love the PySDR book by the way - it really is one of the only great online resources out there for understanding SDRs, I/Q etc,
Glad to hear. Realistically it would be better if we bumped to v3.8 as I think with Python v3.7 it would use Pydantic V1 which had serious migration issues when moving to Pydantic V2.
Agreed, however I would argue it is good practice to follow model-view-controller design patterns, and Python The burden of implementation would only fall on those maintaining the library, as for usage it would be as simple as: handle = sigmf.sigmffile.fromfile(my_file)
# metadata is a pydantic.BaseModel following the SigMFMeta schema.
index = handle.metadata.captures[0].sample_start Within an IDE this is great as a user you're able to find the properties you want with hints / suggestions, rather than needing to know a priori using the dictionary constants currently implemented : e.g
We could implement it for now in an optional way which avoids minimal changes to the current API. For instance, adding a In terms of security, there are a large number of companies which use Pydantic in their pipelines, so a security breach in Pydantic would affect considerably more than just SigMF users: https://docs.pydantic.dev/latest/why/#using-pydantic
Agreed, I would anticipate that pydantic would significantly reduce the code footprint of the validation code, and Pydantic also works well with But I'll wait until we can agree either to skip or come up with a basic API. |
The choice to implement validation through ps. We officially dropped support for |
OK great. IssuesIssue (1): Shouldn't be a problem as we won't delete the schema.json file for now. PDF / website in future could be produced using a GitHub runner (loading in Pydantic etc) on commit to the main branch to deploy web pages to your chosen URL. Pydantic of course generates JSON schema on command using
UpdateTo address the primary concern of de-sync between the JSON schema and the Pydantic version, there is a hack - the Also good news, looks like Pydantic V2.5 works on Python 3.7+, so no need to bump the minimum version. The base is there, just need to write some tests for it in a couple of days (when I get space between work). |
@gmabey was wondering what your opinions are on pydantic |
Aim
I would propose to overhaul the library as it currently stands to introduce a dependency on
pydantic
, an excellent data validation library that works hand-in-glove with JSON and JSON schemas.Let me know what you think of the below, and I'd consider developing a basic PR.
Note: I'm aware that Pydantic requires Python>=3.7 and there are mentioned comments that need backwards compatibility (specifically Python 3.6). If that is the case, the Pydantic parts could be implemented as optional if desired.
Benefits
This would serve to provide the following benefits:
.sigmf-meta
JSON files. Pydantic would perform type, length and semantic checking on your behalf.pydantic
is written in Rust).IQEngine
package you are also developing, as Pydantic works great with JSON for HTTP requests, FastAPI and other common web libraries.Examples
For example, in defining the schema for a SigMF Global object, one might:
and for a single meta file one would define:
calling
BaseModel.model_dump_json()
automatically converts nested BaseModel objects into a JSON-compliant strings which is trivial to store to disk. Similarly, reading in a.sigmf-meta
file could be parsed by aSigMFFile
object (which extends BaseModel) to validate the JSON according to the schema without any custom checking code.Pydantic can handle custom serialization formats by using aliasing, for example:
helps to handle any discrepancies between the JSON naming format e.g "core:" or "my_custom_extension:<item"> and the Python variable name.
Extensions
Pydantic would be a great way to handle custom extensions for SigMF also, any custom user extension could simply extend a pre-defined extension class.
The text was updated successfully, but these errors were encountered: