Fix (and test) codegen issues w/ types that conflict with Rust types inc Result
#705
Thanks for this. The short of it seems to be to prefix std types with their full path. Any other subtlety beyond the testing kludge?
typify-impl/src/structs.rs
Outdated
@@ -373,12 +373,12 @@ pub(crate) fn generate_serde_attr(
 let default_fn = match (state, &prop_type.details) {
 (StructPropertyState::Optional, TypeEntryDetails::Option(_)) => {
 serde_options.push(quote! { default });
-serde_options.push(quote! { skip_serializing_if = "Option::is_none" });
+serde_options.push(quote! { skip_serializing_if = "std::option::Option::is_none" });
Should we prefix with :: ?
Good catch, yes we should. There's also a `std::vec::Vec` right below it which also needs to be fully qualified.
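To make the reason for the leading `::` concrete, here is a minimal sketch. It is not typify output: the module `schema_types`, the struct `Option`, the nested `std` module, and both functions are hypothetical names chosen for illustration. It shows that a schema-derived type named `Option` shadows the prelude type, that a local module named `std` would capture a bare `std::` path, and that `::std::option::Option` still resolves to the standard library in both cases.

```rust
// Illustrative only: `schema_types` stands in for a module of schema-derived types.
pub mod schema_types {
    // A schema type named `Option` shadows the prelude's `Option` inside this module.
    #[derive(Debug, PartialEq)]
    pub struct Option {
        pub value: ::std::string::String,
    }

    // A local module named `std` would capture any path that starts with a bare `std::`.
    pub mod std {
        pub struct Placeholder;
    }

    // The leading `::` anchors the path at the standard library crate, so neither the
    // local `Option` struct nor the local `std` module interferes.
    pub fn first_char(opt: &Option) -> ::std::option::Option<char> {
        opt.value.chars().next()
    }

    // Mirrors the fully qualified function path used in the serde attribute above.
    pub fn is_absent(c: &::std::option::Option<char>) -> bool {
        ::std::option::Option::is_none(c)
    }
}
```

Inside `schema_types`, writing `std::option::Option` without the leading `::` would look in the local `std` module first and fail to compile, which is the situation the prefix guards against.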
typify-impl/Cargo.toml
Outdated
@@ -17,7 +17,7 @@ schemars = "0.8.21"
 semver = "1.0.23"
 serde = "1.0.215"
 serde_json = "1.0.133"
-syn = { version = "2.0.89", features = ["full"] }
+syn = { version = "2.0.89", features = ["full", "visit-mut"] }
Is this new feature only needed for tests?
Yes, good catch, I moved that feature to the `dev-dependencies`.
typify/tests/schemas.rs
Outdated
@@ -23,6 +23,8 @@ fn test_schemas() {
 }

 fn validate_schema(path: std::path::PathBuf) -> Result<(), Box<dyn Error>> {
+println!("Testing the processing of schema: {}", path.display());
Did you mean to leave this?
I did mean to since I found it helpful when troubleshooting, but I'll remove it.
If it was useful feel free to leave it
typify-impl/src/test_util.rs
Outdated
&& !path.segments.is_empty()
&& path.segments[0].ident == "std"
{
// Fun additional hack to keep you on your toes:
Can you check the history on that? It might have been during a previous round of prefixing
Yes you are right, it was done in #647.
The tests that have the fully-qualified type paths in them don't seem wrong or irrelevant. It's not a very common way to declare Rust types outside of proc macros and other codegen, but it's perfectly valid.
I dislike the fact that a special case is needed, and if anyone ever decides to extend those tests with a field with a fully-qualified type other than `HashMap<String, String>` then the tests will fail in a way that's probably going to be surprising.
On the other hand, it's actually not a realistic invariant that a Rust -> JSON Schema -> Rust transform must always exactly reproduce the input Rust AST. Codegen needs to fully qualify types that would not normally be fully qualified. So another solution might be to run the decanonicalization pass on both `actual` and `expected`. Now that I think of it, that seems like the better approach. What do you think?
I like the idea of running it on both actual and expected
I just pushed def2173 which does just that. It's definitely better that way.
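For readers following along, the idea of normalizing both sides can be sketched without `syn`. The snippet below is a toy model, not the code in `test_util.rs`: `TyPath` is a hypothetical stand-in for a type path, and this `decanonicalize_std_types` only borrows the name of the real helper. It reduces any path rooted at `std` to its last segment, recurses into generic arguments, and is applied to both the expected and the actual type before comparing.

```rust
/// Toy stand-in for a type path such as `::std::collections::HashMap<String, String>`.
/// (Hypothetical type for illustration; the real tests operate on `syn` ASTs.)
#[derive(Debug, Clone, PartialEq)]
pub struct TyPath {
    pub leading_colon: bool,
    pub segments: Vec<String>,
    pub args: Vec<TyPath>,
}

impl TyPath {
    pub fn new(leading_colon: bool, segments: &[&str], args: Vec<TyPath>) -> Self {
        TyPath {
            leading_colon,
            segments: segments.iter().map(|s| s.to_string()).collect(),
            args,
        }
    }
}

/// Reduces any path whose first segment is `std` to its final segment,
/// recursing into generic arguments. Other paths are left unchanged.
pub fn decanonicalize_std_types(ty: TyPath) -> TyPath {
    let is_std = ty.segments.first().map_or(false, |s| s.as_str() == "std");
    let args: Vec<TyPath> = ty.args.into_iter().map(decanonicalize_std_types).collect();
    let segments: Vec<String> = if is_std {
        // Keep only the last segment, e.g. `HashMap`.
        ty.segments.into_iter().last().into_iter().collect()
    } else {
        ty.segments
    };
    TyPath {
        leading_colon: ty.leading_colon && !is_std,
        segments,
        args,
    }
}
```

Because the same pass runs on both inputs, a test author can write either `HashMap<String, String>` or the fully qualified form in the source type and the comparison behaves the same way.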
Yes that's pretty much the whole fix. A few words to describe, and only +28,792 −13,699 lines to implement ;)
- how do we handle testing
- are there other functional changes (and if so, can we revert?)
model-context-protocol.json is large; is it adding value beyond what we get from rust-collisions?
Can you please take a look at rustdoc before and after? I don't recall if we'll see simplified or fully qualified types. The latter would be fine... but perhaps distracting / ugly.
Thanks.
// Decanonicalize the types generated by typify so that we can compare them to the original
// Rust types' ASTs that definitely do not use canonicalized std types.
let actual = decanonicalize_std_types(actual);

// Make sure they match.
if let Err(err) = expected.syn_cmp(&actual, ignore_variant_names) {
Would another approach be to change the implementation of `impl SynCompare for TypePath` to be fuzzier, i.e. to identify `::std::...::Foo` as equivalent to `Foo`?
Alternatively, I'm inclined towards changing the tests to just use the fully qualified type name.
What do you think of these options?
> would another approach be to change the implementation of `impl SynCompare for TypePath` to be fuzzier i.e. to identify `::std::...::Foo` as equivalent to `Foo`?

Yes I suppose you could change the `SynCompare` impl to strip path components any time it sees `::std::` before the comparison. My personal preference is to not do that, since it's more spooky action at a distance, while this impl is very explicit about what's happening.

> Alternatively, I'm inclined towards changing the tests to just use the fully qualified type name.

That would certainly work, but if anyone adds new test cases will they remember that the types have to be fully-qualified, and if they forget will the resulting test failure make it obvious what their mistake was?
It's your codebase, your call, but speaking in terms of my personal preference, I prefer the explicit decanonicalization approach.
> Yes I suppose you could change the `SynCompare` impl to strip path components any time it sees `::std::` before the comparison. My personal preference is to not do that, since it's more spooky action at a distance, while this impl is very explicit about what's happening.

Fair enough.

> That would certainly work, but if anyone adds new test cases will they remember that the types have to be fully-qualified, and if they forget will the resulting test failure make it obvious what their mistake was?

I think so.

> It's your codebase, your call, but speaking in terms of my personal preference, I prefer the explicit decanonicalization approach.

Let's go with your approach.
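For contrast, the alternative that was set aside (a fuzzier comparison) could look roughly like the sketch below. It models paths as plain slices of segment names rather than `syn::TypePath`, and `comparison_key` and `fuzzy_path_eq` are hypothetical names, not part of `SynCompare`.

```rust
/// Segments actually compared: a path rooted at `std` is reduced to its last segment.
/// (Hypothetical helper for illustration only.)
fn comparison_key<'a>(segments: &'a [&'a str]) -> &'a [&'a str] {
    if segments.first() == Some(&"std") {
        &segments[segments.len() - 1..]
    } else {
        segments
    }
}

/// Treats `std::...::Foo` and `Foo` as the same path; all other paths compare exactly.
pub fn fuzzy_path_eq(a: &[&str], b: &[&str]) -> bool {
    comparison_key(a) == comparison_key(b)
}
```

The normalization here lives inside the comparison itself, so a caller reading the test sees no explicit step, which is the "action at a distance" concern raised in the thread.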
typify-impl/src/type_entry.rs
Outdated
e.to_string(),
)
})
Self::try_from(
I'd prefer we keep this as `parse()` -- I have some ambivalence about the use of To/From in these cases.
Are there other instances (I may have missed) where there's a functional change?
No that should be the only one. I've reverted this in a subsequent commit.
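As a small illustration of the two spellings being discussed, the sketch below uses only the standard library. `ShortName`, `ConversionError`, and `decode` are hypothetical names, not typify's generated code. The `FromStr` impl is what `.parse()` dispatches to, and the `TryFrom<String>` impl simply delegates to it, so both routes perform the same validation.

```rust
/// Hypothetical constrained string newtype, used only for illustration.
#[derive(Debug, PartialEq)]
pub struct ShortName(pub String);

/// Hypothetical error type standing in for a generated conversion error.
#[derive(Debug, PartialEq)]
pub struct ConversionError(pub &'static str);

impl std::fmt::Display for ConversionError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(self.0)
    }
}

// `str::parse` is driven by this impl.
impl std::str::FromStr for ShortName {
    type Err = ConversionError;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        if value.len() > 8 {
            Err(ConversionError("longer than 8 bytes"))
        } else {
            Ok(ShortName(value.to_string()))
        }
    }
}

// The `TryFrom` route delegates to `parse`, so both spellings share one code path.
impl std::convert::TryFrom<String> for ShortName {
    type Error = ConversionError;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        value.parse()
    }
}

/// Mirrors the `.parse().map_err(...)` shape of the reviewed code, with a plain
/// `String` error in place of a serde error.
pub fn decode(raw: String) -> Result<ShortName, String> {
    raw.parse().map_err(|e: ConversionError| e.to_string())
}
```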
I should have been clearer; I was summarizing my code review feedback.
Thanks!
I'm just a little reluctant to extend testing time (etc.) if there's not a ton of value; please think it over and do what you think is best.
Thanks for checking on that!
I just noticed in my comment #705 (comment), I just happened to pick at random one of the generated builder structs, and there is a I looked into it further, it appears that Note that as a result of this, the generated code for the test case
I'm not sure if this is considered acceptable for the generated code or not. It does not appear to be caused by any of my changes, it's just coming to light now that I've enabled generating builder structs in the test case. To confirm this, I checked out |
I did some testing (all on my local M1 Pro MBP).
|
some suggestions regarding comments; feel free to consider the spirit and not the literal wording
::std::string::String::deserialize(deserializer)?
.parse()
.map_err(|e: self::error::ConversionError| {
<D::Error as ::serde::de::Error>::custom(
e.to_string(),
)
})
rustfmt is happy with this change in indentation?
I'm surprised as well, but I've run `cargo fmt` before committing. I just re-ran it to make sure; no files were changed.
@@ -64,7 +64,8 @@ fn validate_schema(path: std::path::PathBuf) -> Result<(), Box<dyn Error>> {
 "std",
 typify::CrateVers::Version("1.0.0".parse().unwrap()),
 None,
-),
+)
+.with_struct_builder(true),
is this to improve test coverage?
See my comment #705 (comment) for a detailed explanation. There was one place left in the codegen where `String` was used, and all of the tests passed because that one place was in the codegen for optional struct builders. So I enabled that for all of the schema tests, the tests started to fail, then I fixed that one remaining use of a simple `String` type and verified that the tests passed. I left this in place because it seems like these tests are meant to exercise the code generation in its entirety.
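The kind of breakage described here can be reproduced in miniature. The sketch below is an assumption about the general shape of a builder, not typify's actual output; `generated`, `Widget`, and `WidgetBuilder` are hypothetical names. A schema type named `String` shadows the prelude type, so the builder's error type has to be spelled `::std::string::String` to keep compiling.

```rust
// Illustrative only: a module of schema-derived types plus a hand-written builder sketch.
pub mod generated {
    /// Hypothetical schema type named `String`; inside this module it shadows
    /// the prelude's `String`.
    #[derive(Debug, Clone, PartialEq)]
    pub struct String {
        pub text: ::std::string::String,
    }

    #[derive(Debug, PartialEq)]
    pub struct Widget {
        // This is the schema's `String`, not the standard library's.
        pub label: String,
    }

    /// Builder sketch: the field holds either a value or an error message.
    /// An unqualified `String` as the error type would silently mean the schema type.
    pub struct WidgetBuilder {
        label: ::std::result::Result<String, ::std::string::String>,
    }

    impl WidgetBuilder {
        pub fn new() -> Self {
            WidgetBuilder {
                label: ::std::result::Result::Err("no value supplied for label".to_string()),
            }
        }

        pub fn label(mut self, value: String) -> Self {
            self.label = ::std::result::Result::Ok(value);
            self
        }

        pub fn build(self) -> ::std::result::Result<Widget, ::std::string::String> {
            ::std::result::Result::Ok(Widget { label: self.label? })
        }
    }
}
```

If the builder path is never generated in a test, a collision like this stays invisible, which matches why enabling struct builders for every schema test surfaced the remaining unqualified `String`.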
I ran into oxidecomputer#568 when trying to use typify to generate Rust bindings for the [Model Context Protocol](https://spec.modelcontextprotocol.io). Anthropic have published a [JSON Schema specification for this protocol](https://github.com/modelcontextprotocol/specification/blob/main/schema/schema.json), but as soon as I tried to codegen Rust bindings for it, I ran into the aforementioned issue. I added the entire MCP JSON schema to `typify/tests/schemas/model-context-protocol.json`, which as expected immediately caused the `schemas` test to fail. Fixing this was a relatively simple matter of finding every place in the codegen that referred to the Rust `std::result::Result` type as simply `Result`. That was enough to make the MCP schema work right. But then I looked a bit closer and realized there were some other Rust built-in types that were being referenced in such a way as to cause either errors during codegen or generation of code that doesn't compile. So I made a diabolical test schema `rust-collisions.json` that is truly cursed. It defines a bunch of types that collide with Rust built-in types and keywords, and made the `schemas` test explode with such force that there may have been inquiries from the neighbors. Upon reassuring them that that's just how the Rust compiler expresses affection, I made some more codegen changes to resolve all of the issues that my cursed schema exposed. I've tried to make the changes minimal, and to my knowledge none of the changes I have made would be breaking from the perspective of any user of the generated Rust code. I had to modify several existing tests in `typify-impl` that broke because they were expecting the non-canonicalized versions of Rust standard types. Pay particular attention to the hack I put in `test_util::validate_output_impl` which is not something I'm particularly proud of but I think it retains the spirit of the test. Closes oxidecomputer#568
Remove some `dbg!` that I left in by accident. Revert a couple of unintentional changes to the indentation of the JSON in the test.
Based on a PR comment from @ahl. Remove some ugly special-case handling, and unconditionally decanonicalize all `std`-prefixed types in both the generated code as well as the original types' ASTs. With this in place, we can expect the two ASTs to match exactly, which some of the tests assert.
This will result in a lot of other changes in the generated code; those changes are in a separate commit to make code review easier.
This was in the codegen for struct builders, which `schemas.rs` does not enable by default, so it didn't fail with my cursed schema. This change enables generating struct builders for *all* of the test cases in `schemas.rs`, and verifies that the generated code properly compiles now that the remaining `String` is canonicalized. This also changes all of the generated tests but that will be in a separate commit for easier reviewing.
See the prior commit, Fix (and test) a missing fully-qualified `String` use. This commit is just the changes to the automatically-generated test outputs.
The bug this originally reproduced is also reproduced by the `rust-collisions.json` schema, which is much simpler. Having the MCP schema slows down clean builds by 31% on my M1 Pro MBP, so it's not worth it.
Co-authored-by: Adam Leventhal <[email protected]>
8e75438
to
a0cdeff
Compare
one fix, merge Cargo.toml, and we'll merge
typify-impl/src/test_util.rs
Outdated
 /// Our code generation logic, always canonicalizes
 /// any standard type (eg, `Option` is output as `::std::option::Option`), to avoid potential
 /// conflicts with types in the JSON schema with conflicting names like `Option`. Unfortunately,
-/// this complicates the test cases that start with a Rust type that implements `JsonSchema`, use
-/// that to generate a JSON schema for that type, then use typify to generate a Rust binding for
-/// that type, and expects that the Rust AST for the source type and the generated type are exactly
-/// the same.
+/// this complicates the test cases that validate that a round-trip from Rust to JSON back to Rust
+/// are exactly the same.
wrap, remove comma
I rewrapped and removed the extraneous comma.
I rebased against upstream `main` to resolve the conflict with `Cargo.toml`.
Hopefully this is good to go now...
Thanks so much for this contribution! Greatly appreciated!
I ran into #568 when trying to use typify to generate Rust bindings for the Model Context Protocol. Anthropic have published a JSON Schema specification for this protocol, but as soon as I tried to codegen Rust bindings for it, I ran into the aforementioned issue.
I added the entire MCP JSON schema to `typify/tests/schemas/model-context-protocol.json`, which as expected immediately caused the `schemas` test to fail. Fixing this was a relatively simple matter of finding every place in the codegen that referred to the Rust `std::result::Result` type as simply `Result`, and modifying the codegen to use the fully-qualified `::std::result::Result` type name instead. That was enough to make the MCP schema work right.
But then I looked a bit closer and realized there were some other Rust built-in types that were being referenced in such a way as to cause either errors during codegen or generation of code that doesn't compile.
So I made a diabolical test schema `rust-collisions.json` that is truly cursed. It defines a bunch of types that collide with Rust built-in types and keywords, and made the `schemas` test explode with such force that there may have been inquiries from the neighbors. Upon reassuring them that that's just how the Rust compiler expresses affection, I made some more codegen changes to resolve all of the issues that my cursed schema exposed, by using fully-qualified type paths `::std::....` for every built-in Rust type that is referenced in the generated code.
I've tried to make the changes minimal, and to my knowledge none of the changes I have made would be breaking from the perspective of any user of the generated Rust code.
I had to modify several existing tests in `typify-impl` that broke because they were expecting the non-canonicalized versions of Rust standard types. Pay particular attention to the hack I put in `test_util::validate_output_impl` which is not something I'm particularly proud of but I think it retains the spirit of the test.
Closes #568