-
-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support BigInt for i64 and u64 #24
Conversation
src/de.rs
Outdated
use serde::{de, serde_if_integer128}; | ||
use js_sys::{Array, ArrayBuffer, BigInt, JsString, Number, Object, Reflect, Symbol, Uint8Array}; | ||
use serde::{ | ||
de::{self}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
de::{self}, | |
de, |
src/de.rs
Outdated
|
||
fn deserialize_small_signed<'de, V: de::Visitor<'de>>(&self, visitor: V) -> Result<V::Value> { | ||
match self.as_safe_integer() { | ||
Some(v) => visitor.visit_i32(v as _), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong. With as
you're casting away the whole range between 2^32...2^53 which will lead to acceptance of invalid numbers for the target data type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. My bad
src/de.rs
Outdated
} | ||
let big_int: BigInt = self.value.clone().into(); | ||
// Safe to do unwraps since we know the type is a bigint | ||
let rust_string = big_int.to_string(10).unwrap().as_string().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm it's pretty wasteful to go via string conversion/passing/parsing just to pass numbers. Since there's no built-in API in BigInt to get 64-bit integer yet, I'd declare my own binding here to get number directly and use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think there is any way to do this safely. When converting an i64 into a JsValue it goes first into a string https://rustwasm.github.io/wasm-bindgen/api/src/wasm_bindgen/lib.rs.html#853. I think the same has to be done in the reverse.
A possible optimization is to first check if the number is a safe integer and use the normal f64 conversion if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woah that's bad.
Emscripten does this by either passing 2x 32-bit integers (in standard mode) or just by passing 64-bit integer directly (when using -s WASM_BIGINT
, aka https://github.com/WebAssembly/JS-BigInt-integration proposal). I think this should be doable in wasm-bindgen as well, either by making a PR upstream or doing something similar here inline.
Do you want to give it a try to make an upstream PR? If not, I might give it a go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to give it a try to make an upstream PR? If not, I might give it a go.
You know what, I'm now tempted enough that I'll give it a go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe for now it's better to do it here and then upstream. Anyway, my idea is that you can add bindings like this:
#[wasm_bindgen]
extern {
#[wasm_bindgen(js_name = BigInt)]
fn from_u64(x: u64) -> BigInt;
#[wasm_bindgen(js_name = BigInt)]
fn from_i64(x: i64) -> BigInt;
...
}
Those will bind to the JS BigInt
function with different signatures and use wasm-bindgen's native support for passing 64-bit integers as the argument. This will avoid string representation altogether, so the bound BigInt
function will behave just like a no-op identity function, retrieving bigint and returning one.
Similarly, for the opposite conversion you can do
#[wasm_bindgen(js_name = BigInt)]
fn to_u64(x: BigInt) -> u64;
...
and check that the original number didn't get truncated during conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I made those changes. The bindings seem to work and do not lose any precision. Do you know if wam_bindgen's function support for u64/i64 uses strings or the native BigInt support?
Do you want to give it a try to make an upstream PR? If not, I might give it a go.
I think you probably should do that as I am much less familiar with wasm_bindgen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't use strings, but doesn't use native BigInt support either yet AFAIK - instead, it passes two 32-bit integers and reassembles them on the other side. This should be still a lot faster than string conversion & allocation.
For the serializer, it will need an option as it's a breaking change. For the deserializer though, I'm tempted to say it's safe and easier to just allow both |
- fixed loss of percision - tests for feature flags
I made the deserializer work for both BigInts as well as regular JS numbers |
src/ser.rs
Outdated
@@ -267,7 +267,13 @@ impl<'s> ser::Serializer for &'s Serializer { | |||
serialize_str(&str); | |||
} | |||
|
|||
// TODO: we might want to support `BigInt` here in the future. | |||
#[cfg(feature="bigint_serialization")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making this a compile-time feature, that will affect all usages of serde-wasm-bindgen
across the entire crate tree, can you make this a runtime option on Serializer
like there is already serialize_maps_as_objects
? It would allow each user to decide which format they prefer.
- runtime option to do BigInt conversion
src/de.rs
Outdated
if self.value.is_bigint() { | ||
let big_int: BigInt = self.value.clone().into(); | ||
// Safe to do unwraps since we know the type is a bigint | ||
let rust_string = big_int.to_string(10).unwrap().as_string().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is still using strings, but shouldn't anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops
src/ser.rs
Outdated
@@ -214,6 +215,7 @@ impl ser::SerializeStruct for ObjectSerializer<'_> { | |||
#[derive(Default)] | |||
pub struct Serializer { | |||
serialize_maps_as_objects: bool, | |||
serialize_64_bit_numbers_as_big_int:bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(here and around) please make sure to run cargo fmt
in the end, or, better yet, you might want to configure format-on-save in your editor.
Co-authored-by: Ingvar Stepanyan <[email protected]>
- fixed use of string conversion
tests/serde.rs
Outdated
(-(MAX_SAFE_INTEGER + 1)).serialize(&big_int_serializer).unwrap(); | ||
|
||
// Handle extreme values | ||
std::i64::MIN.serialize(&big_int_serializer).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here about missing assert_eq!
or something.
tests/serde.rs
Outdated
(MAX_SAFE_INTEGER + 1) | ||
.serialize(&big_int_serializer) | ||
.unwrap(); | ||
std::u64::MAX.serialize(&big_int_serializer).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also please add a few tests for conversion from a BigInt
that doesn't actually fit into 64 bits? (you can create one from a string, it's fine to do so in tests)
It should throw an error, but I suspect right now it's not handled correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a js file to do validation on the BigInt's. I think it is the easiest and lowest overhead way of handling the bounds check. It replaces the to_i64
and to_u64
bindings used previously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the JS integration doesn't work in all wasm-bindgen modes, see Caveats. While I agree it would be lower overhead, I'd rather keep this crate working for all output modes, as otherwise it's a pretty significant breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the initial comment where I proposed the bindings idea, you can just compare the resulting u64
with the original BigInt
instance. It will require one more roundtrip to JS but shouldn't be that bad and will be working in all modes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the JS integration doesn't work in all wasm-bindgen modes, see Caveats. While I agree it would be lower overhead, I'd rather keep this crate working for all output modes, as otherwise it's a pretty significant breaking change.
I missed the lack of nodejs support while reading that page of the docs. I appreciate all of the help, I am pretty new to wasm-bindgen.
src/de.rs
Outdated
match self.as_safe_integer() { | ||
Some(v) => visitor.visit_i64(v), | ||
None => self.invalid_type(visitor), | ||
if self.value.is_bigint() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to use .dyn_into()
here instead to combine the is_*
check and .into()
cast (which does its own check internally) into one.
Co-authored-by: Ingvar Stepanyan <[email protected]>
- added more thorough tests
Slightly off-topic for this PR, but what about |
I have given it some thought and I am unsure what the situation looks like for bindings for i128 and u128. Are they the same as i64 and u64 where 32-bit ints are used and assembled? If so it shouldn't be super hard to add support for them. Would you like me to add it to this PR? |
I thought so, but looks like they're not yet correctly supported by wasm-bindgen, so they would have to either go via strings or be implemented upstream first. Sine it's a bit more complicated, I think we can keep their implementation out of this PR, unless you want to implement string-based option for them for now - up to you. Either way, I'd probably at least rename the option to |
- removed js module bindings - added tests for i128, u128
Ok, I enabled serialization support for i128 and u128 when |
src/de.rs
Outdated
if &bindings::big_int_from_i64(converted_number) == big_int { | ||
visitor.visit_i64(converted_number) | ||
} else { | ||
Err(de::Error::custom("i64 attempted to be constructed from Bigint that was larger than i64::MAX or less than i64::MIN")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it's BigInt
(same in the other message below).
src/ser.rs
Outdated
@@ -215,7 +215,7 @@ impl ser::SerializeStruct for ObjectSerializer<'_> { | |||
#[derive(Default)] | |||
pub struct Serializer { | |||
serialize_maps_as_objects: bool, | |||
serialize_64_bit_numbers_as_big_int: bool, | |||
serialize_large_number_types_as_big_ints: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also nitpicking, but JS uses bigint
for primitive type, so I think we should write them together in the API too.
src/ser.rs
Outdated
if self.serialize_large_number_types_as_big_ints { | ||
Ok(JsValue::from(v)) | ||
} else { | ||
Err(Error::custom("i128 is not supported")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, shouldn't this fallback to JS number when option isn't activated? I'm of two minds on this, perhaps it's fine for it to error out, but then it would be good to at least expand the error message to clarify that it's supported, just needs to be enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. The current behavior is to error so I was going to uphold that. In addition, I don't think there are many cases where it would be a good idea to convert a 128-bit integer to a 53 bit one (Js Number).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah erroring is fine if that's current behaviour, just saying the error message probably needs to be clearer and nudge the user in the right direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even something like "i128 is supported only in serialize_large_number_types_as_big_ints
mode" would already go a long way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad fixed the error in one place and not the other. It is updated now
tests/serde.rs
Outdated
{ | ||
const MAX_SAFE_INTEGER: i64 = 9_007_199_254_740_991; | ||
|
||
// Should be bigint | ||
0_i64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: instead of converting, you can probably just do assert!(expr.is_bigint())
. Or does this one provide a better error message or something?
This is looking good! Left last round of nitpicks, but I think we're close :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added slight rewordings to error messages. Otherwise I think this is good to go now.
Co-authored-by: Ingvar Stepanyan <[email protected]>
Co-authored-by: Ingvar Stepanyan <[email protected]>
Co-authored-by: Ingvar Stepanyan <[email protected]>
Co-authored-by: Ingvar Stepanyan <[email protected]>
tests/serde.rs
Outdated
@@ -218,7 +218,7 @@ fn numbers() { | |||
// Big ints that are too large or small should error | |||
let u64_max_bigint = std::u64::MAX.serialize(&bigint_serializer).unwrap(); | |||
from_value::<i64>(u64_max_bigint).unwrap_err(); | |||
let negative_one = -1_i64.serialize(&bigint_serializer).unwrap(); | |||
let negative_one = -(1_i64.serialize(&bigint_serializer).unwrap()); | |||
let to_small = std::u64::MAX.serialize(&bigint_serializer).unwrap() * negative_one; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo for too_small
?
tests/serde.rs
Outdated
@@ -218,7 +218,7 @@ fn numbers() { | |||
// Big ints that are too large or small should error | |||
let u64_max_bigint = std::u64::MAX.serialize(&bigint_serializer).unwrap(); | |||
from_value::<i64>(u64_max_bigint).unwrap_err(); | |||
let negative_one = -1_i64.serialize(&bigint_serializer).unwrap(); | |||
let negative_one = -(1_i64.serialize(&bigint_serializer).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you didn't mean the opposite - (-1_i64).serialize(...)
? I'm not sure if negation operator is overloaded for JsValue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But also, if you use those values only as temporaries and not for testing serialization, you might as well just create BigInt::from(...)
directly like BigInt::from(-i128::from(std::u64::MAX))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, if we go there, even BigInt::from(i128::MIN)
since you only care about value being out of bounds of i64
/u64
and that one is the easiest to get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Changed
Thanks a lot! |
Released in 0.4. |
This is the same as #22. I ended up needing this for a project I am working on so I decided to implement it. Let me know if there are any changes that are needed. Since this is a breaking change for the consumers of the API, should this be behind a feature flag?