Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect nanosecond serialization for very large or small values #771

Open
matt-phylum opened this issue Jul 19, 2024 · 3 comments · May be fixed by #772
Open

Incorrect nanosecond serialization for very large or small values #771

matt-phylum opened this issue Jul 19, 2024 · 3 comments · May be fixed by #772
Labels
bug Something isn't working

Comments

@matt-phylum
Copy link

TimestampNanoSeconds has the following serialization behavior:

  • ..1385-06-12T00:25:26.290448385Z: panic overflow when multiplying duration by scalar
  • 1385-06-12T00:25:26.290448385Z..1677-09-21T00:12:43.145224192Z: incorrect positive value
  • 1677-09-21T00:12:43.145224192Z: panic attempt to negate with overflow
  • 1677-09-21T00:12:43.145224193Z..2262-04-11T23:47:16.854775808Z: correct
  • 2262-04-11T23:47:16.854775808Z..2554-07-21T23:34:33.709551616Z: incorrect negative value
  • 2554-07-21T23:34:33.709551616Z..: panic overflow when multiplying duration by scalar

Not being able to serialize values outside the range 1677-09-21T00:12:43.145224193Z..2262-04-11T23:47:16.854775808Z is expected since it's serializing to an i64, but if the value is outside that range it should return an error, not panic or return incorrect values.

use chrono::{DateTime, Utc};
use serde::Serialize;
use serde_with::serde_as;

fn main() {
    #[serde_as]
    #[derive(Serialize)]
    struct S(#[serde_as(as = "serde_with::TimestampNanoSeconds")] DateTime<Utc>);

    fn try_serialize(v: DateTime<Utc>) {
        let s = std::panic::catch_unwind(|| {
            serde_json::to_string(&S(v)).unwrap_or_else(|e| format!("Serialization error: {e}"))
        })
        .unwrap_or_else(|p| {
            format!(
                "Panic: {}",
                p.downcast_ref::<&str>()
                    .map(|s| &**s)
                    .unwrap_or_else(|| p.downcast_ref::<String>().map(|s| &**s).unwrap())
            )
        });
        println!("{v}: {s}");
    }

    try_serialize(DateTime::<Utc>::MIN_UTC);
    try_serialize("1385-06-12T00:25:26.290448384Z".parse().unwrap());
    try_serialize("1385-06-12T00:25:26.290448385Z".parse().unwrap());
    try_serialize("1677-09-21T00:12:43.145224191Z".parse().unwrap());
    try_serialize("1677-09-21T00:12:43.145224192Z".parse().unwrap());
    try_serialize("1677-09-21T00:12:43.145224193Z".parse().unwrap());
    try_serialize("2262-04-11T23:47:16.854775807Z".parse().unwrap());
    try_serialize("2262-04-11T23:47:16.854775808Z".parse().unwrap());
    try_serialize("2554-07-21T23:34:33.709551615Z".parse().unwrap());
    try_serialize("2554-07-21T23:34:33.709551616Z".parse().unwrap());
    try_serialize(DateTime::<Utc>::MAX_UTC);
}
-262143-01-01 00:00:00 UTC: Panic: overflow when multiplying duration by scalar
1385-06-12 00:25:26.290448384 UTC: Panic: overflow when multiplying duration by scalar
1385-06-12 00:25:26.290448385 UTC: 1
1677-09-21 00:12:43.145224191 UTC: 9223372036854775807
1677-09-21 00:12:43.145224192 UTC: Panic: attempt to negate with overflow
1677-09-21 00:12:43.145224193 UTC: -9223372036854775807
2262-04-11 23:47:16.854775807 UTC: 9223372036854775807
2262-04-11 23:47:16.854775808 UTC: -9223372036854775808
2554-07-21 23:34:33.709551615 UTC: -1
2554-07-21 23:34:33.709551616 UTC: Panic: overflow when multiplying duration by scalar
+262142-12-31 23:59:59.999999999 UTC: Panic: overflow when multiplying duration by scalar

A similar problem exists with TimestampNanoSecondsWithFrac:

  • ..1385-06-12T00:25:26.290448385Z: panic overflow when multiplying duration by scalar
  • 1385-06-12T00:25:26.290448385Z..2554-07-21T23:34:33.709551616Z: correct
  • 2554-07-21T23:34:33.709551616Z..: panic overflow when multiplying duration by scalar

TimeStampNanoSecondsWithFrac should never fail because it is serializing to f64 and f64 can represent much larger values, at the expense of precision.

The other TimeStamp* serializers are not affected because the range of values that can be represented by DateTime is smaller than the range of values that can be represented by an i64.

DurationNanoSeconds and DurationNanoSecondsFrac also have a similar problem:

  • ..18446744073.709551616s: correct
  • 18446744073.709551616s..: panic: overflow when multiplying duration by scalar

DurationNanoSeconds should return an error instead of panicking, and DurationNanoSecondsFrac should never fail.

DurationMicroSeconds and DurationMicroSecondsFrac fail the same way for 18446744073709.551616s... 18446744073709551.616s.. for DurationMilliSeconds and DurationMilliSecondsWithFrac. DurationSeconds and DurationSecondsWithFrac are unaffected.

@matt-phylum
Copy link
Author

I was planning to open another issue for having a DefaultOnError that works for serialization, but SerializeAs doesn't have a way to distinguish between errors because the value is unrepresentable and errors because the underlying serializer reported an error. Maybe the fallible (without fraction) versions could to take a type parameter that defines whether an out of range value is skipped or None or 0.

@jonasbb
Copy link
Owner

jonasbb commented Jul 21, 2024

Thank you for the report. Yes, there is indeed a problem here. The code uses the normal math operations and some as casts. Fixing all of the instances will be a bit difficult, since the clippy lints for that will lead to many warnings, but not all of them actually being problematic.
Many of the values will not be supported since they cannot be represented internally.

How did you come up with those time values to test? I wonder if you have a list of such strings/timestamps or if you used automated tools.

I do want to fix the issue, but it might take a while. It seems to require some steps

  • Check/Remove the use of ops::Mul and ops::Neg since they can fail. This covers the panics reported here.
  • Check all as casts. That should address "2262-04-11 23:47:16.854775808 UTC: -9223372036854775808".
  • Check all arithmetic operations. If there is an adjustment necessary, because of sub-unit precision, then the value might need to be adjusted +/- 1, which could lead to over/underflows. Not sure if that applies here, since those are all Nanoseconds, but would apply when using TimestampSeconds with a >500ms subsecond part.
  • Document the supported time ranges for the different types.

@jonasbb jonasbb added the bug Something isn't working label Jul 21, 2024
@matt-phylum
Copy link
Author

I used binary search to find these edges between behaviors. I don't know if there's a list of such timestamps, but 9,223,372,036,854,775,807 is the limit of an i64 so those are probably good values to check for any implementation.

jonasbb added a commit that referenced this issue Dec 25, 2024
jonasbb added a commit that referenced this issue Dec 25, 2024
This removes some potential arithmetic issues

See #771 for reference
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants