docs(contrib): Start guidelines for schema design (#15037)
### What does this PR try to resolve?

This was inspired by a recent Cargo team discussion on whether we should generally elide default values.

This will also help with https://rust-lang.github.io/rust-project-goals/2025h1/cargo-plumbing.html

Case studies in schema design:
- #14506
- #10543

### How should we test and review this PR?

### Additional information
Showing 2 changed files with 48 additions and 0 deletions.
@@ -0,0 +1,47 @@
# Data Schemas

Cargo reads and writes user- and machine-facing data formats, like
- `Cargo.toml`, read and written on `cargo package`
- `Cargo.lock`, read and written
- `.cargo/config.toml`, read-only
- `cargo metadata` output
- `cargo build --message-format` output

## Schema Design

Generally,
- Fields should be kebab case
  - `#[serde(rename_all = "kebab-case")]` should be applied defensively
- Fields should only be present when needed, saving space and parse time
  - Also, we can always switch to always outputting a field later, but it's harder to stop outputting one
  - `#[serde(skip_serializing_if = "Default::default")]` should be applied liberally
- For output, prefer [jsonlines](https://jsonlines.org/) as it allows streaming output and flexibility to mix content (e.g. adding diagnostics to output that didn't previously have it)
- `#[serde(deny_unknown_fields)]` should not be used, so that formats can evolve, including through feature gating

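As a rough illustration of the output guidelines above, here is a minimal std-only sketch of a writer that uses kebab-case keys, omits fields that hold their default value, and emits one JSON object per line. `BuildMessage`, its fields, and both helper functions are hypothetical and not part of Cargo's real schemas. In real code the same effect would likely come from `serde` attributes; note that `skip_serializing_if` expects the path of a `fn(&T) -> bool` predicate such as `Option::is_none`.

```rust
use std::io::{self, Write};

/// Hypothetical message type, for illustration only (not a real Cargo schema).
///
/// With serde, the behavior hand-written below would roughly correspond to:
///   #[serde(rename_all = "kebab-case")] on the struct and
///   #[serde(skip_serializing_if = "Option::is_none")] on `target_kind`.
#[derive(Debug, Default, PartialEq)]
pub struct BuildMessage {
    pub package_id: String,
    pub target_kind: Option<String>,
    pub fresh: bool,
}

/// Quotes `s` as a JSON string, escaping the characters JSON requires.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl BuildMessage {
    /// Renders a single-line JSON object with kebab-case keys,
    /// skipping fields that hold their default value.
    pub fn to_json_line(&self) -> String {
        let mut fields = vec![format!("\"package-id\":{}", json_string(&self.package_id))];
        if let Some(kind) = &self.target_kind {
            fields.push(format!("\"target-kind\":{}", json_string(kind)));
        }
        if self.fresh {
            fields.push("\"fresh\":true".to_string());
        }
        format!("{{{}}}", fields.join(","))
    }
}

/// Streams messages as JSON Lines: one object per line, written as produced.
pub fn write_json_lines<W: Write>(out: &mut W, messages: &[BuildMessage]) -> io::Result<()> {
    for message in messages {
        writeln!(out, "{}", message.to_json_line())?;
    }
    Ok(())
}
```

Because each line is a complete object, a consumer can process messages as they arrive, and a new kind of message can later be mixed into the stream without changing the framing.
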
## Schema Evolution Strategies

When changing a schema for data that is read, some options include:
- Adding new fields is relatively safe
  - If the field must not be ignored when present,
    have a transition period where it is invalid to use on stable Cargo before stabilizing it or
    error if it's used before it is supported within the schema version
    (e.g. `edition` requires a minimum `package.rust-version`, if present)
- Adding new values to a field is relatively safe
  - Unstable values should fail on stable Cargo
- Version the structure and interpretation of the data (e.g. the `edition` field or `package.resolver` which has an `edition` fallback)

Note: some formats that are read are also written back out
(e.g. `cargo package` generating a `Cargo.toml` file)
and the strategies for written data need to be considered for them as well.

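The read-side options above can be sketched roughly as follows. This is a simplified, std-only illustration, not Cargo's actual parser: `read_package`, `ReadError`, the `allow_unstable` flag, and the `"future"` edition value are all hypothetical, and the `BTreeMap` stands in for already-parsed manifest keys. The sketch ignores unknown fields, rejects an unstable value unless unstable features are allowed, and falls back to `edition` when `resolver` is absent.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// The value exists but is not yet stabilized.
    UnstableValue { field: &'static str, value: String },
    /// The value is not recognized at all.
    UnknownValue { field: &'static str, value: String },
}

#[derive(Debug, PartialEq)]
pub struct Package {
    pub edition: String,
    pub resolver: String,
}

/// Reads a hypothetical `[package]` table from pre-parsed key/value pairs.
pub fn read_package(
    raw: &BTreeMap<&str, &str>,
    allow_unstable: bool,
) -> Result<Package, ReadError> {
    // A missing `edition` defaults to the first version.
    let edition = raw.get("edition").copied().unwrap_or("2015");
    match edition {
        "2015" | "2018" | "2021" | "2024" => {}
        // Hypothetical unstable value: accepted only when explicitly allowed.
        "future" if allow_unstable => {}
        "future" => {
            return Err(ReadError::UnstableValue { field: "edition", value: edition.to_string() })
        }
        other => {
            return Err(ReadError::UnknownValue { field: "edition", value: other.to_string() })
        }
    }

    // Versioned interpretation: `resolver` falls back to a default chosen by `edition`.
    let resolver = match raw.get("resolver").copied() {
        Some(explicit) => explicit.to_string(),
        None => match edition {
            "2015" | "2018" => "1",
            "2021" => "2",
            _ => "3",
        }
        .to_string(),
    };

    // Any other keys in `raw` are ignored, so adding fields stays relatively safe.
    Ok(Package { edition: edition.to_string(), resolver })
}
```

Calling `read_package` with a map that contains an extra, unrecognized key still succeeds, while `edition = "future"` with `allow_unstable` set to `false` returns `ReadError::UnstableValue`.
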
When changing a schema for data that is written, some options include:
- Add new fields if their presence can be ignored
- Infer permission from the user's use of the new schema (e.g. a new alias for an `enum` variant)
- Version the structure and interpretation of the format
  - Defaulting to the latest version with a warning that behavior may change (e.g. `cargo metadata --format-version`, `edition` in cargo script)
  - Defaulting to the first version, eventually warning the user of the implicit stale behavior (e.g. `package.edition` in `Cargo.toml`)
  - Without a default (e.g. `package.rust-version`, or a command-line flag like `--format-version`)

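The three defaulting choices for a versioned format can be compared side by side in a small sketch. Everything here (`DefaultPolicy`, `Selected`, `select_version`, and the warning texts) is hypothetical and only meant to make the trade-off concrete; it assumes versions are numbered consecutively from 1.

```rust
/// What to do when the user does not request a format version.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DefaultPolicy {
    /// Use the latest version and warn that behavior may change.
    Latest,
    /// Use the first version and warn about implicit stale behavior.
    First,
    /// No default: the user must choose.
    Required,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Selected {
    pub version: u32,
    pub warning: Option<String>,
}

/// Picks the version to write, given the user's request and the policy.
pub fn select_version(
    requested: Option<u32>,
    policy: DefaultPolicy,
    latest: u32,
) -> Result<Selected, String> {
    // An explicit request is honored if supported and never warns.
    if let Some(version) = requested {
        return if (1..=latest).contains(&version) {
            Ok(Selected { version, warning: None })
        } else {
            Err(format!("unsupported version {}; supported versions are 1..={}", version, latest))
        };
    }
    match policy {
        DefaultPolicy::Latest => Ok(Selected {
            version: latest,
            warning: Some("no version requested; using the latest, whose behavior may change".to_string()),
        }),
        DefaultPolicy::First => Ok(Selected {
            version: 1,
            // Only warn once a newer version actually exists.
            warning: if latest > 1 {
                Some("no version requested; using the first version, but newer ones exist".to_string())
            } else {
                None
            },
        }),
        DefaultPolicy::Required => Err("a version must be specified".to_string()),
    }
}
```

In this sketch `DefaultPolicy::Latest` keeps new users on current behavior at the cost of silent changes for anyone who never pins a version, `DefaultPolicy::First` keeps old output stable but drifts stale, and `DefaultPolicy::Required` pushes the decision onto every caller.
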
Note: While `serde` makes it easy to support data formats that add new fields,
new data types or supported values for a field are more difficult to future-proof
against.
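
One possible mitigation for new values, which is an assumption of this sketch and not something the guidelines above prescribe, is to give an enum a catch-all variant that preserves values the reader does not recognize. `TargetKind` and its variants are hypothetical names used only for illustration.

```rust
/// Hypothetical field type with a catch-all for values added by newer writers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TargetKind {
    Lib,
    Bin,
    /// A value this reader does not know; kept verbatim so it can be written back out.
    Unknown(String),
}

impl TargetKind {
    /// Parses a raw value, never failing on unrecognized input.
    pub fn parse(raw: &str) -> Self {
        match raw {
            "lib" => TargetKind::Lib,
            "bin" => TargetKind::Bin,
            other => TargetKind::Unknown(other.to_string()),
        }
    }

    /// Returns the serialized form, round-tripping unknown values unchanged.
    pub fn as_str(&self) -> &str {
        match self {
            TargetKind::Lib => "lib",
            TargetKind::Bin => "bin",
            TargetKind::Unknown(raw) => raw.as_str(),
        }
    }
}
```

The cost is that every consumer must decide what to do with `Unknown`, which is part of why new values are harder to future-proof against than new fields.
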