
fix: decimal conversion looses value on lower precision #6836

Merged
merged 3 commits into apache:main from fix_decimal_conversion_bug on Dec 12, 2024

Conversation

himadripal
Contributor

@himadripal himadripal commented Dec 4, 2024

Which issue does this PR close?

Closes #6833


Rationale for this change

Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases.

What changes are included in this PR?

This PR adds a decimal precision validation step after conversion, checking whether the converted value fits into the specified precision and scale.
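
For illustration, a minimal hedged sketch of that idea (not the exact kernel code from the PR; the helper name and structure are made up):

use arrow_array::types::{Decimal128Type, DecimalType};
use arrow_schema::ArrowError;

// Hypothetical helper: after rescaling, confirm the unscaled value still fits
// in the target precision instead of silently truncating it.
fn checked_rescale(rescaled: i128, output_precision: u8) -> Result<i128, ArrowError> {
    // `validate_decimal_precision` errors if `rescaled` needs more digits than
    // `output_precision` allows.
    Decimal128Type::validate_decimal_precision(rescaled, output_precision)?;
    Ok(rescaled)
}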

Are there any user-facing changes?

@github-actions github-actions bot added the arrow (Changes to the arrow crate) label on Dec 4, 2024
@himadripal
Contributor Author

@andygrove @viirya @alamb please take a look.

@@ -112,8 +112,19 @@ where
};

Ok(match cast_options.safe {
Member

It seems strange to match on a boolean rather than just using an if statement. I know this is how the existing code was, but perhaps we could improve this while we are here.
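
For illustration, a tiny self-contained sketch of the suggested style (the bodies are placeholders, not the real cast paths):

// Branch with `if` instead of matching on a bool; the two arms correspond to
// the cast's NULL-on-overflow vs. error-on-overflow behaviour.
fn overflow_behaviour(safe: bool) -> &'static str {
    if safe {
        "overflowing values become NULL"
    } else {
        "overflowing values produce an error"
    }
}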

@himadripal himadripal changed the title decimal conversion looses value on lower precision, throws error now … fix: decimal conversion looses value on lower precision Dec 4, 2024
Member

@andygrove andygrove left a comment

LGTM. The logic now matches the logic in cast_floating_point_to_decimal128. Thanks @himadripal

let result = cast_with_options(&array, &output_type, &options);

assert_eq!(result.unwrap_err().to_string(),
"InvalidArgumentError(123456789 is too large to store in a Decimal128 of precision 6. Max is 999999)");
Member

The error message format needs updating:

  left: "Invalid argument error: 123456789 is too large to store in a Decimal256 of precision 6. Max is 999999"
 right: "InvalidArgumentError(123456789 is too large to store in a Decimal256 of precision 6. Max is 999999)"

@andygrove
Member

andygrove commented Dec 4, 2024

There is a regression in test_cast_decimal128_to_decimal256_negative. The new validation check is correctly throwing an error, but it looks like we also need to add this validation when creating decimal arrays since the current test is creating invalid arrays before the cast:

let array = vec![Some(i128::MAX), Some(i128::MIN)];
let input_decimal_array = create_decimal_array(array, 10, 3).unwrap();

I would expect this to fail, so we probably need to add the same validation there.

On second thoughts, it seems it was an intentional design decision not to validate this on array creation. Instead, an array.validate_decimal_precision method can optionally be called on the array to validate it after creation, so we should probably just update the test as needed (@alamb @tustvold perhaps you could correct me if I am wrong about this).
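
For illustration, a hedged sketch of that opt-in validation using the public Decimal128Array API rather than the test helper (the helper's exact behaviour may differ):

use arrow_array::Decimal128Array;
use arrow_schema::ArrowError;

// Creating the array does not validate the values against the precision;
// `validate_decimal_precision` is the explicit opt-in check.
fn validate_after_creation() -> Result<(), ArrowError> {
    let array = Decimal128Array::from(vec![Some(i128::MAX), Some(i128::MIN)])
        .with_precision_and_scale(10, 3)?;
    // i128::MAX needs far more than 10 digits, so the check reports an error.
    assert!(array.validate_decimal_precision(10).is_err());
    Ok(())
}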

Contributor

@tustvold tustvold left a comment

Have we run any benchmarks (I'm not sure if any actually exist) to confirm this doesn't significantly regress performance?

It seems unfortunate to always be performing overflow checks, when in many cases it should be possible to prove that precision overflow can't occur and need not be checked for.
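
For context, a hedged sketch of when the check is provably unnecessary (an illustrative helper, not code from this PR):

// When the scale does not shrink, rescaling multiplies the unscaled value by
// 10^(s_out - s_in), so the result needs at most p_in + (s_out - s_in) digits;
// if the target precision covers that, overflow cannot occur and the
// per-value validation could be skipped.
fn cast_cannot_overflow(p_in: u8, s_in: i8, p_out: u8, s_out: i8) -> bool {
    s_out >= s_in && (p_out as i32) >= (p_in as i32) + (s_out as i32) - (s_in as i32)
}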

@@ -156,19 +179,9 @@ where
T::Native: DecimalCast + ArrowNativeTypeOp,
{
let array: PrimitiveArray<T> = match input_scale.cmp(&output_scale) {
Contributor

This could be changed to <=

array.try_unary(|x| {
    f(x).ok_or_else(|| error(x)).and_then(|v| {
        O::validate_decimal_precision(v, output_precision).map(|_| v)
    })
})
Member

Why not do this when computing the value (the code above), and only when this can fail?
(for example widening precision cannot fail, so no need to spend cycles on validating the produced values)

Contributor Author

I thought about it, but there are other examples where it is done this way, so I kept it here. We could also do this as part of the first error check point, inside this method:

let error = cast_decimal_to_decimal_error::<I, O>(output_precision, output_scale);

@andygrove
Member

Have we run any benchmarks (I'm not sure if any actually exist) to confirm this doesn't significantly regress performance?

It seems unfortunate to always be performing overflow checks, when in many cases it should be possible to prove that precision overflow can't occur and need not be checked for.

I'll create a separate PR (probably tomorrow) to add some criterion benchmarks.

@himadripal
Contributor Author

himadripal commented Dec 5, 2024

There is a regression in test_cast_decimal128_to_decimal256_negative. The new validation check is correctly throwing an error, but it looks like we also need to add this validation when creating decimal arrays since the current test is creating invalid arrays before the cast:

let array = vec![Some(i128::MAX), Some(i128::MIN)];
let input_decimal_array = create_decimal_array(array, 10, 3).unwrap();

I would expect this to fail, so we probably need to add the same validation there.

On second thoughts, it seems it was an intentional design decision not to validate this on array creation. Instead, an array.validate_decimal_precision method can optionally be called on the array to validate it after creation, so we should probably just update the test as needed (@alamb @tustvold perhaps you could correct me if I am wrong about this).

Changed the test so that it passes.

// input_scale < output_scale
let array: PrimitiveArray<T> = if input_scale <= output_scale {
// input_scale <= output_scale
// the scale doesn't change, but precision may change and cause overflow
Member

@viirya viirya Dec 5, 2024

Why don't we also check precision and skip convert_to_bigger_or_equal_scale_decimal if it won't overflow?

Contributor Author

@himadripal himadripal Dec 5, 2024

Won't it still have to go through convert_to_bigger_or_equal_scale_decimal to add the zero at the end for a bigger precision conversion? IMO, we can only skip the call if the precision is smaller and the scales are equal. Please correct me here.
We could do a check at the beginning, like this:

if (array.validate_decimal_precision().is_err())

but we need to return null for safe=true and throw an error for safe=false.
Is that what you are suggesting?

Member

@viirya viirya Dec 5, 2024

For example, the original code simply returns the original array when the scale is the same. If the new precision is bigger, I think we can still do that, no?

Contributor Author

@himadripal himadripal Dec 5, 2024

Let's assume we are converting 12.345 from (5, 3) to (6, 3); then it needs to be 123450. If we do array.clone() as per the original code, I'm wondering how it will add the 0 at the end. Then it would still be 12.345.

Contributor Author

@himadripal himadripal Dec 5, 2024

If this is correct, then we can return the original array if the precision is bigger and the scale is equal.

select arrow_typeof(cast(cast(1.23 as decimal(10,3)) as decimal(12,3))),
       cast(cast(1.23 as decimal(10,3)) as decimal(12,3));
----
Decimal128(12, 3) 1.23
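
A hedged sketch of that shortcut, using Decimal128 concretely (the real kernel is generic over the decimal type, and the names here are illustrative):

use arrow_array::Decimal128Array;
use arrow_schema::ArrowError;

// If the scale is unchanged and the precision only grows, every stored value
// already fits, so the data can be reused with updated precision/scale
// metadata instead of running the conversion kernel.
fn widen_precision_same_scale(
    array: &Decimal128Array,
    input_precision: u8,
    input_scale: i8,
    output_precision: u8,
    output_scale: i8,
) -> Option<Result<Decimal128Array, ArrowError>> {
    if input_scale == output_scale && output_precision >= input_precision {
        Some(array.clone().with_precision_and_scale(output_precision, output_scale))
    } else {
        None // fall back to the general conversion path
    }
}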

Member

12345 in (5, 3) is 12.345. When casting to (6, 3), it is still 12345; why would it need to be 123450? 123450 in (6, 3) is 123.45, if I understand it correctly.
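
In other words (a small illustrative snippet, not code from the PR): the unscaled integer only changes when the scale changes, not when the precision grows.

fn main() {
    // Decimal(p, s) stores an unscaled integer v; the logical value is v * 10^(-s).
    let unscaled: i128 = 12345;      // Decimal128(5, 3) => 12.345
    let same_scale = unscaled;       // cast to (6, 3): still 12345 => 12.345
    let wider_scale = unscaled * 10; // cast to (6, 4): 123450 => 12.3450
    assert_eq!(same_scale, 12345);
    assert_eq!(wider_scale, 123450);
}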

Contributor Author

@himadripal himadripal Dec 5, 2024

Yes, you are right. The fn signature only has output_precision; I can get the input_precision from the calling fn, where it is available, and change the signature to include input_precision. Would that be fine?

Member

Yea

Contributor Author

Done. @viirya please check.

…eded.

revert whitespace changes

formatting check
@himadripal himadripal force-pushed the fix_decimal_conversion_bug branch from 68b0f68 to 6c50fe3 on December 5, 2024 at 17:32
@himadripal
Contributor Author

himadripal commented Dec 6, 2024

Can anyone please let the build run? The workflows are waiting for approval.

@andygrove
Member

Have we run any benchmarks (I'm not sure if any actually exist) to confirm this doesn't significantly regress performance?

It seems unfortunate to always be performing overflow checks, when in many cases it should be possible to prove that precision overflow can't occur and need not be checked for.

I created a simple benchmark for decimal casting in #6850.

Unsurprisingly, validating that the results are correct is slower than not validating the results.

before

cast_decimal            time:   [45.281 ns 45.549 ns 45.871 ns]

after (this PR)

cast_decimal            time:   [247.97 ns 248.78 ns 249.78 ns]
                        change: [+435.06% +439.47% +443.15%] (p = 0.00 < 0.05)

We currently have the safe config option, which can be on or off:

pub struct CastOptions<'a> {
    /// how to handle cast failures, either return NULL (safe=true) or return ERR (safe=false)
    pub safe: bool,
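
For illustration, a hedged sketch of how the two modes behave after this change (the values and target type are made up for the example):

use arrow_array::{Array, Decimal128Array};
use arrow_cast::cast::{cast_with_options, CastOptions};
use arrow_schema::{ArrowError, DataType};

fn demo() -> Result<(), ArrowError> {
    // 12345 with precision 5, scale 2 => 123.45
    let array = Decimal128Array::from(vec![Some(12345_i128)])
        .with_precision_and_scale(5, 2)?;
    // The target can only hold values up to 0.99, so 123.45 overflows.
    let target = DataType::Decimal128(2, 2);

    // safe = true: the overflowing value becomes NULL.
    let safe = CastOptions { safe: true, ..Default::default() };
    let out = cast_with_options(&array, &target, &safe)?;
    assert!(out.is_null(0));

    // safe = false: the overflowing value is reported as an error.
    let strict = CastOptions { safe: false, ..Default::default() };
    assert!(cast_with_options(&array, &target, &strict).is_err());
    Ok(())
}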

So, yes, it is a performance regression, but the previous behavior was incorrect. This PR now makes this work as advertised.

@tustvold Is there a use case we need to support for faster casts without validating results per the CastOptions?

@tustvold
Contributor

tustvold commented Dec 7, 2024

@tustvold Is there a use case we need to support for faster casts without validating results per the CastOptions?

No, the cast should be checked; apologies if that wasn't clear. My concern was that the PR as originally formulated blindly performed the checked conversion regardless of the input type, even when the cast was increasing the precision. Given that the whole purpose of tracking precision is to avoid overflow checks, it seemed a little off.

I'll try to find some time to take another look at this PR, as from a quick scan this looks to have been addressed.

Contributor

@tustvold tustvold left a comment

This is still overly pessimistic; for example, if I increase both the precision and the scale by the same amount, I shouldn't need to perform any checks.

However, the kernel is already fallible (try_unary/unary_opt vs unary), so it already won't vectorize properly, and it isn't as if we're regressing a highly optimised kernel here.

I've therefore opted to file #6877 instead; if people care, they can action it.

@tustvold tustvold merged commit eb7ab83 into apache:main Dec 12, 2024
27 checks passed
andygrove pushed a commit to andygrove/arrow-rs that referenced this pull request Jan 3, 2025
* decimal conversion looses value on lower precision, throws error now on overflow.

* fix review comments and fix formatting.

* for simple case of equal scale and bigger precision, no conversion needed.

revert whitespace changes

formatting check

---------

Co-authored-by: himadripal <[email protected]>
alamb pushed a commit that referenced this pull request Jan 5, 2025
* decimal conversion looses value on lower precision, throws error now on overflow.

* fix review comments and fix formatting.

* for simple case of equal scale and bigger precision, no conversion needed.

revert whitespace changes

formatting check

---------

Co-authored-by: Himadri Pal <[email protected]>
Co-authored-by: himadripal <[email protected]>
Labels
arrow Changes to the arrow crate

Successfully merging this pull request may close these issues.

Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases