-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs(contrib): Start guidelines for schema design #15037
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# Data Schemas | ||
|
||
Cargo reads and writes user and machine facing data formats, like | ||
- `Cargo.toml`, read and written on `cargo package` | ||
- `Cargo.lock`, read and written | ||
- `.cargo/config.toml`, read-only | ||
- `cargo metadata` output | ||
- `cargo build --message-format` output | ||
|
||
## Schema Design | ||
|
||
Generally, | ||
- Fields should be kebab case | ||
- `#[serde(rename_all = "kebab-case")]` should be applied defensively | ||
- Fields should only be present when needed, saving space and parse time | ||
- Also, we can always switch to always outputting the fields but its harder to stop outputting them | ||
- `#[serde(skip_serializing_if = "Default::default")]` should be applied liberally | ||
- For output, prefer [jsonlines](https://jsonlines.org/) as it allows streaming output and flexibility to mix content (e.g. adding diagnostics to output that didn't prevously have it | ||
- `#[serde(deny_unknown_fields)]` should not be used to allow evolution of formats, including feature gating | ||
|
||
## Schema Evolution Strategies | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When reading this, each bullet point is a bit disconnected to me. Also like "When reading data, add new fields", I had a hard time interpreting it as part of evolution strategies. For "Versioned with the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tried to expand on it a bit more. Hope it helps! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks. That is good enough for me as a maintainer. Not sure about casual contributors, but we can always iterate later :) |
||
|
||
When changing a schema for data that is read, some options include: | ||
- Adding new fields is relatively safe | ||
- If the field must not be ignored when present, | ||
have a transition period where it is invalid to use on stable Cargo before stabilizing it or | ||
error if its used before supported within the schema version | ||
(e.g. `edition` requires a minimum `package.rust-version`, if present) | ||
- Adding new values to a field is relatively safe | ||
- Unstable values should fail on stable Cargo | ||
- Version the structure and interpretation of the data (e.g. the `edition` field or `package.resolver` which has an `edition` fallback) | ||
|
||
Note: some formats that are read are also written back out | ||
(e.g. `cargo package` generating a `Cargo.toml` file) | ||
and those strategies need to be considered as well. | ||
|
||
When changing a schema for data that is written, some options include: | ||
- Add new fields if the presence can be ignored | ||
- Infer permission from the users use of the new schema (e.g. a new alias for an `enum` variant) | ||
- Version the structure and interpretation of the format | ||
- Defaulting to the latest version with a warning that behavior may change (e.g. `cargo metadata --format-version`, `edition` in cargo script) | ||
- Defaulting to the first version, eventually warning the user of the implicit stale behavior (e.g. `package.edition` in `Cargo.toml`) | ||
- Without a default (e.g. `package.rust-version`, or a command-line flag like `--format-version`) | ||
|
||
Note: While `serde` makes it easy to support data formats that add new fields, | ||
new data types or supported values for a field are more difficult to future-proof | ||
against. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have an overview of what this "data schemas" refer to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some context