Review of number deserialization #4484

Open: davidmoten wants to merge 4 commits into base: 2.18

davidmoten
Contributor

@davidmoten davidmoten commented Apr 18, 2024

This PR is for discussion only, not for merging (though it could be merged if deemed useful).

Related to discussions started in #4453, this PR includes a NumericDeserializationReviewMain class that checks various number deserialization scenarios using both @JsonProperty deserialization and single-arg constructor deserialization.

After running that Main class and inspecting the output this is my summary:

Conclusions for property deserialization

  • integer types (byte, short, int, long) throw exception on overflow (good)
  • integer in exponential form 1e2 (or 1.0e2) not parsed by any integer type (fail)
  • integer with decimal point (100.0) not parsed by any integer type (fail)
  • cannot parse Integer.MAX_VALUE or Long.MAX_VALUE to float (fail, should accept precision loss)
  • decimal types go to Infinity on overflow (acceptable)
  • decimal types lose precision rather than throw (good)
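
For comparison, the behaviours rated "good" or "acceptable" in the list above are also what the JDK's own parsers do. The following is a minimal sketch using only the JDK, not Jackson; the class and method names are illustrative and do not come from the PR.

```java
final class JdkNumberParsingDemo {
    /** Returns true if the text cannot be parsed as a byte (overflow or bad syntax). */
    static boolean byteParseFails(String text) {
        try {
            Byte.parseByte(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    /** Parses integer-valued text as a float, rounding to the nearest representable value. */
    static float lossyFloat(String text) {
        return Float.parseFloat(text);
    }

    /** Parses text as a double; values beyond the double range become Infinity. */
    static double overflowingDouble(String text) {
        return Double.parseDouble(text);
    }

    public static void main(String[] args) {
        // Integer overflow is reported as an exception rather than silently wrapping.
        System.out.println(byteParseFails("32767"));
        // Integer.MAX_VALUE is not exactly representable as a float; it is rounded, not rejected.
        System.out.println(lossyFloat("2147483647"));
        // A magnitude beyond Double.MAX_VALUE yields Infinity instead of an error.
        System.out.println(Double.isInfinite(overflowingDouble("1e400")));
    }
}
```

Under this model, the two `Err FloatProperty` results for maxInt and maxLong would instead be rounded values, which is the precision loss the list above asks to accept.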

Conclusions for single-arg deserialization

  • no support for byte, short, float (fail)
  • integer in exponential form 1e2 (or 1.0e2) not parsed by any integer type (fail)
  • 1e2 not parsed by BigDecimal (fail)
  • 9223372036854775807 (Long.MAX_VALUE) not parsed by BigDecimal (fail)
  • 100 not parsed by BigDecimal (fail)
  • large decimal (not exponential form) not parsed by BigDecimal (fail)
  • large decimal (not exponential form) when parsed to Double should be Infinity but throws (fail)

I'd like to get some agreement from project owners about what is good and what is bad behaviour (my opinions are above), and then we can plan some fixes.

Output:


------ Property maxByte-----------
  value 127

Ok ByteProperty 127
Ok ShortProperty 127
Ok IntProperty 127
Ok LongProperty 127
Ok FloatProperty 127.0
Ok DoubleProperty 127.0
Ok BigIntegerProperty 127
Ok BigDecimalProperty 127

------ Property smallDecimalFormInteger-----------
  value 100.0

Err ByteProperty Cannot deserialize value of type `byte` from String "100.0": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "100.0": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "100.0": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "100.0": not a valid `long` value
Ok FloatProperty 100.0
Ok DoubleProperty 100.0
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "100.0": not a valid representation
Ok BigDecimalProperty 100.0

------ Property smallExponentialFormInteger-----------
  value 1e2

Err ByteProperty Cannot deserialize value of type `byte` from String "1e2": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "1e2": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "1e2": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "1e2": not a valid `long` value
Ok FloatProperty 100.0
Ok DoubleProperty 100.0
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "1e2": not a valid representation
Ok BigDecimalProperty 1E+2

------ Property maxShort-----------
  value 32767

Err ByteProperty Cannot deserialize value of type `byte` from String "32767": overflow, value cannot be represented as 8-bit value
Ok ShortProperty 32767
Ok IntProperty 32767
Ok LongProperty 32767
Ok FloatProperty 32767.0
Ok DoubleProperty 32767.0
Ok BigIntegerProperty 32767
Ok BigDecimalProperty 32767

------ Property maxInt-----------
  value 2147483647

Err ByteProperty Cannot deserialize value of type `byte` from String "2147483647": overflow, value cannot be represented as 8-bit value
Err ShortProperty Cannot deserialize value of type `short` from String "2147483647": overflow, value cannot be represented as 16-bit value
Ok IntProperty 2147483647
Ok LongProperty 2147483647
Err FloatProperty Non-terminating decimal expansion; no exact representable decimal result.
Ok DoubleProperty 2.147483647E9
Ok BigIntegerProperty 2147483647
Ok BigDecimalProperty 2147483647

------ Property maxLong-----------
  value 9223372036854775807

Err ByteProperty Cannot deserialize value of type `byte` from String "9223372036854775807": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "9223372036854775807": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "9223372036854775807": Overflow: numeric value (9223372036854775807) out of range of int (-2147483648 -2147483647)
Ok LongProperty 9223372036854775807
Err FloatProperty Non-terminating decimal expansion; no exact representable decimal result.
Err DoubleProperty Non-terminating decimal expansion; no exact representable decimal result.
Ok BigIntegerProperty 9223372036854775807
Ok BigDecimalProperty 9223372036854775807

------ Property bigInteger-----------
  value 9223372036854775808

Err ByteProperty Cannot deserialize value of type `byte` from String "9223372036854775808": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "9223372036854775808": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "9223372036854775808": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "9223372036854775808": not a valid `long` value
Prec FloatProperty 9.223372E18
Prec DoubleProperty 9.223372036854776E18
Ok BigIntegerProperty 9223372036854775808
Ok BigDecimalProperty 9223372036854775808

------ Property bigIntegerExponentialForm-----------
  value 1.23e56

Err ByteProperty Cannot deserialize value of type `byte` from String "1.23e56": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "1.23e56": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "1.23e56": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "1.23e56": not a valid `long` value
Err FloatProperty Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
Ok DoubleProperty 1.23E56
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "1.23e56": not a valid representation
Ok BigDecimalProperty 1.23E+56

------ Property maxFloat-----------
  value 3.4028235E38

Err ByteProperty Cannot deserialize value of type `byte` from String "3.4028235E38": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "3.4028235E38": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "3.4028235E38": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "3.4028235E38": not a valid `long` value
Ok FloatProperty 3.4028235E38
Ok DoubleProperty 3.4028235E38
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "3.4028235E38": not a valid representation
Ok BigDecimalProperty 3.4028235E+38

------ Property maxDouble-----------
  value 1.7976931348623157E308

Err ByteProperty Cannot deserialize value of type `byte` from String "1.7976931348623157E308": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "1.7976931348623157E308": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "1.7976931348623157E308": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "1.7976931348623157E308": not a valid `long` value
Err FloatProperty Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
Ok DoubleProperty 1.7976931348623157E308
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "1.7976931348623157E308": not a valid representation
Ok BigDecimalProperty 1.7976931348623157E+308

------ Property bigDecimal-----------
  value 1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2

Err ByteProperty Cannot deserialize value of type `byte` from String "1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2": not a valid `byte` value
Err ShortProperty Cannot deserialize value of type `short` from String "1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2": not a valid `short` value
Err IntProperty Cannot deserialize value of type `int` from String "1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2": not a valid `int` value
Err LongProperty Cannot deserialize value of type `long` from String "1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2": not a valid `long` value
Err FloatProperty Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
Err DoubleProperty Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
Err BigIntegerProperty Cannot deserialize value of type `java.math.BigInteger` from String "1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2": not a valid representation
Ok BigDecimalProperty 1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2

------ SingleArg maxByte-----------
  value 127

Err OfByte no creators
Err OfShort no creators
Ok OfInt 127
Ok OfLong 127
Err OfFloat no creators
Ok OfDouble 127.0
Ok OfBigInteger 127
Err OfBigDecimal no creators

------ SingleArg smallDecimalFormInteger-----------
  value 100.0

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (100.0)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (100.0)
Err OfFloat no creators
Ok OfDouble 100.0
Err OfBigInteger no creators
Ok OfBigDecimal 100.0

------ SingleArg smallExponentialFormInteger-----------
  value 1e2

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (100.0)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (100.0)
Err OfFloat no creators
Ok OfDouble 100.0
Err OfBigInteger no creators
Ok OfBigDecimal 100.0

------ SingleArg maxShort-----------
  value 32767

Err OfByte no creators
Err OfShort no creators
Ok OfInt 32767
Ok OfLong 32767
Err OfFloat no creators
Ok OfDouble 32767.0
Ok OfBigInteger 32767
Err OfBigDecimal no creators

------ SingleArg maxInt-----------
  value 2147483647

Err OfByte no creators
Err OfShort no creators
Ok OfInt 2147483647
Ok OfLong 2147483647
Err OfFloat no creators
Ok OfDouble 2.147483647E9
Ok OfBigInteger 2147483647
Err OfBigDecimal no creators

------ SingleArg maxLong-----------
  value 9223372036854775807

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no long/Long-argument constructor/factory method to deserialize from Number value (9223372036854775807)
Ok OfLong 9223372036854775807
Err OfFloat no creators
Err OfDouble Non-terminating decimal expansion; no exact representable decimal result.
Ok OfBigInteger 9223372036854775807
Err OfBigDecimal no creators

------ SingleArg bigInteger-----------
  value 9223372036854775808

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no BigInteger-argument constructor/factory method to deserialize from Number value (9223372036854775808)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no BigInteger-argument constructor/factory method to deserialize from Number value (9223372036854775808)
Err OfFloat no creators
Err OfDouble Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfDouble` (although at least one Creator exists): no BigInteger-argument constructor/factory method to deserialize from Number value (9223372036854775808)
Ok OfBigInteger 9223372036854775808
Err OfBigDecimal no creators

------ SingleArg bigIntegerExponentialForm-----------
  value 1.23e56

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (1.23E56)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (1.23E56)
Err OfFloat no creators
Ok OfDouble 1.23E56
Err OfBigInteger no creators
Ok OfBigDecimal 1.23E+56

------ SingleArg maxFloat-----------
  value 3.4028235E38

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (3.4028235E38)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (3.4028235E38)
Err OfFloat no creators
Ok OfDouble 3.4028235E38
Err OfBigInteger no creators
Ok OfBigDecimal 3.4028235E+38

------ SingleArg maxDouble-----------
  value 1.7976931348623157E308

Err OfByte no creators
Err OfShort no creators
Err OfInt Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfInt` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (1.7976931348623157E308)
Err OfLong Cannot construct instance of `com.fasterxml.jackson.databind.deser.std.NumericDeserializationReviewMain$OfLong` (although at least one Creator exists): no double/Double-argument constructor/factory method to deserialize from Number value (1.7976931348623157E308)
Err OfFloat no creators
Ok OfDouble 1.7976931348623157E308
Err OfBigInteger no creators
Ok OfBigDecimal 1.7976931348623157E+308

------ SingleArg bigDecimal-----------
  value 1797693134862315700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001.2

Err OfByte no creators
Err OfShort no creators
Err OfInt infinity
Err OfLong infinity
Err OfFloat no creators
Err OfDouble Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
Err OfBigInteger no creators
Err OfBigDecimal Character I is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
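
Two messages in the output above do not have the usual "Cannot deserialize" or "Cannot construct" shape: "Non-terminating decimal expansion" and "Character I is neither a decimal digit number". Both match exception messages produced by `java.math.BigDecimal`. The sketch below reproduces the two exception types with only the JDK. It is an illustration of where such messages can originate, not a claim about which Jackson code path triggers them, and the exact message text may vary between JDK versions. The class and helper names are hypothetical.

```java
import java.math.BigDecimal;

final class BigDecimalMessageDemo {
    /** Runs the action and returns "ExceptionType: message", or null if nothing was thrown. */
    static String messageOf(Runnable action) {
        try {
            action.run();
            return null;
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Exact division whose quotient has an infinite decimal expansion (1/3)
        // throws ArithmeticException because no MathContext bounds the precision.
        System.out.println(messageOf(() -> BigDecimal.ONE.divide(new BigDecimal("3"))));

        // BigDecimal cannot represent infinity, so the text "Infinity"
        // (what String.valueOf gives for an infinite double) is rejected.
        String infinityText = String.valueOf(Double.POSITIVE_INFINITY);
        System.out.println(messageOf(() -> new BigDecimal(infinityText)));
    }
}
```

If this is indeed the origin, the "Character I" failures would suggest that a value overflows to Infinity as a double before being handed to BigDecimal, but that remains an assumption until confirmed in the Jackson source.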

@cowtowncoder
Member

cowtowncoder commented Apr 18, 2024

Couple of random notes:

  • Exponent and "plain" forms of floating-point numbers are not distinguished: both are exposed by JsonParser as JsonToken.VALUE_NUMBER_FLOAT. So 1e2 and 100.0 are the same for any and all handling.
  • There is no (and, IMO, should be no) automatic support for trying to auto-detect "integer-like" FP numbers like 100.0 (or 1e2) -- second case mostly because parser does not expose it differently. Put another way, JSON-spec level distinction between integral and floating-point numbers is followed to avoid level of complexity needed for other type of handling.
  • Support for BigDecimal might be the next most desirable thing
  • Supporting all number types for 1-arg (delegating) creators might require bigger refactoring, generalization of handling.
  • Failures for float/Float are probably due to a general lack of support, as each target type needs to be supported separately (so breaking down the kinds of things not supported may not make sense)

@davidmoten
Contributor Author

Put another way, JSON-spec level distinction between integral and floating-point numbers is followed to avoid level of complexity needed for other type of handling

1e2 and 1.01e2 are definitely integers and it's easy to determine that a number in exponential form is an integer (1-based position of last non-zero figure after decimal point must be <= exponent). We should treat them as such IMO. The JSON spec describes a range of number formats but this should have no impact on what we do with them, we should just deal with the mathematical number presented regardless of the format. I'll have to take your word for it in terms of complexity but I imagine there would be a very small performance hit to do those extra checks.
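
One way to make this check concrete is to let `java.math.BigDecimal` do the work: after stripping trailing zeros, a value is a mathematical integer exactly when its scale is not positive. This is a minimal sketch of the idea under that assumption, not code from the PR or from Jackson, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class IntegralNumberCheck {
    /** True if the decimal text denotes a mathematical integer, e.g. "1e2", "100.0", "1.01e2". */
    static boolean isIntegral(String text) {
        BigDecimal value = new BigDecimal(text);
        // With trailing zeros removed, scale <= 0 means no fractional digits remain.
        return value.signum() == 0 || value.stripTrailingZeros().scale() <= 0;
    }

    /** Converts integral text to a BigInteger; throws ArithmeticException if a fraction remains. */
    static BigInteger toIntegerExact(String text) {
        return new BigDecimal(text).toBigIntegerExact();
    }

    public static void main(String[] args) {
        System.out.println(isIntegral("1e2"));        // exponent form, integral
        System.out.println(isIntegral("1.01e2"));     // 101, integral
        System.out.println(isIntegral("1.001e2"));    // 100.1, not integral
        System.out.println(toIntegerExact("1.01e2")); // the exact integer value
    }
}
```

One design caveat: expanding a value with a very large exponent into a BigInteger can be expensive, so a real implementation would probably bound the exponent before converting.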

@cowtowncoder
Member

cowtowncoder commented Apr 18, 2024

1e2 and 1.01e2 are definitely integers and it's easy to determine that a number in exponential form is an integer (1-based position of last non-zero figure after decimal point must be <= exponent). We should treat them as such IMO. The JSON spec describes a range of number formats but this should have no impact on what we do with them, we should just deal with the mathematical number presented regardless of the format. I'll have to take your word for it in terms of complexity but I imagine there would be a very small performance hit to do those extra checks.

Hmmh. No, I think I disagree here... this path leads to more complexity overall.
Not on the fact that these are actual integers "in disguise", but on whether it's worthwhile to try to deduce that fact from them being syntactically FP tokens.

Or at the very least, let's tackle other issues first -- I think this is not one of the lower-hanging fruits and there are other fish to fry.

@davidmoten
Contributor Author

davidmoten commented Apr 18, 2024

Yep, I agree. Can you pick which of the items I reported you'd like to progress, and an order? Happy to make PRs. BigDecimal fixes?

@cowtowncoder
Member

@davidmoten yes, I think BigDecimal sounds like a good place to start.

And one more thing on my thinking wrt Int-vs-Float: I think of Java/C style coercion, so:

  1. From syntax level we get type (int or FP) that matches "naturally" without coercion (into int or FP Java type, respectively)
  2. Beyond natural match there are coercions: one (int -> FP) is allowed by default; the other (FP->int) requires setting of CoercionConfig to allow

and this somewhat rigid system is predictable and easy/easier to explain. It's not quite as convenient as more advanced detection of "actually integral number despite using E-notation" or "integral since fractional part is all zeroes", but at this point implementation is complicated enough that I prefer not adding this complexity. Even if I see why others would choose differently.
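
The two-step rule described above can be modelled in a few lines of plain Java. This is only a sketch of the rule as stated in this comment, not Jackson's implementation: the `Shape` enum, the method names, and the `allowFloatToInt` flag (standing in for the CoercionConfig setting) are all hypothetical, and using `longValueExact` for the opt-in conversion is a choice made for the sketch.

```java
import java.math.BigDecimal;

final class CoercionRuleSketch {
    enum Shape { INT, FLOAT }

    /** Syntactic classification only: any '.', 'e' or 'E' makes the token floating-point. */
    static Shape shapeOf(String jsonNumber) {
        for (int i = 0; i < jsonNumber.length(); i++) {
            char c = jsonNumber.charAt(i);
            if (c == '.' || c == 'e' || c == 'E') {
                return Shape.FLOAT;
            }
        }
        return Shape.INT;
    }

    /** Reads a long; FLOAT-shaped input is accepted only when allowFloatToInt is set. */
    static long readLong(String jsonNumber, boolean allowFloatToInt) {
        if (shapeOf(jsonNumber) == Shape.INT) {
            return Long.parseLong(jsonNumber); // step 1: natural match, no coercion
        }
        if (!allowFloatToInt) {
            throw new IllegalArgumentException("FP token not coerced to integer: " + jsonNumber);
        }
        // Step 2, opt-in: FP -> int. longValueExact rejects fractions and overflow.
        return new BigDecimal(jsonNumber).longValueExact();
    }

    /** Reads a double; INT-shaped input is widened without any configuration. */
    static double readDouble(String jsonNumber) {
        return Double.parseDouble(jsonNumber); // step 2, default: int -> FP
    }

    public static void main(String[] args) {
        System.out.println(shapeOf("1e2"));         // classified by syntax, not by value
        System.out.println(readDouble("100"));      // int -> FP allowed by default
        System.out.println(readLong("1e2", true));  // FP -> int only when enabled
    }
}
```

In this model `1e2` and `100.0` are both FLOAT-shaped, so neither reaches an integer target unless the flag is set, which mirrors the "predictable but less convenient" trade-off described above.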

But that's enough about philosophical/design aspect. :)

@davidmoten
Contributor Author

Thanks @cowtowncoder, sounds good.

BTW I found this quote from json-schema.org that explains my position on "actually an integer" better than I did:

JSON does not have distinct types for integers and floating-point values. Therefore, the presence or absence of a decimal point is not enough to distinguish between integers and non-integers. For example, 1 and 1.0 are two ways to represent the same value in JSON. JSON Schema considers that value an integer no matter which representation was used.
