Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: cast between Intervals #4182

Merged
merged 5 commits into from
Jun 2, 2023
Merged

feat: cast between Intervals #4182

merged 5 commits into from
Jun 2, 2023

Conversation

izveigor
Copy link
Contributor

@izveigor izveigor commented May 8, 2023

Which issue does this PR close?

Closes #4181

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

Yes

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 8, 2023
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs some review of the overflow behaviour

Comment on lines 470 to 478
let array = array
.as_any()
.downcast_ref::<PrimitiveArray<IntervalYearMonthType>>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Internal Error: Cannot cast interval to IntervalArray of expected type"
.to_string(),
)
})?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let array = array
.as_any()
.downcast_ref::<PrimitiveArray<IntervalYearMonthType>>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Internal Error: Cannot cast interval to IntervalArray of expected type"
.to_string(),
)
})?;
let array = array.as_primitive::<IntervalYearMonthType>();

Comment on lines 493 to 501
let array = array
.as_any()
.downcast_ref::<PrimitiveArray<IntervalDayTimeType>>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Internal Error: Cannot cast interval to IntervalArray of expected type"
.to_string(),
)
})?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let array = array
.as_any()
.downcast_ref::<PrimitiveArray<IntervalDayTimeType>>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Internal Error: Cannot cast interval to IntervalArray of expected type"
.to_string(),
)
})?;
let array = array.as_primitive::<IntervalDayTimeType>();

)
})?;

let iter = array.iter().map(|v| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to perform checked conversion, and can make use of unary_opt and try_unary to achieve this based on the cast options

Comment on lines 8882 to 8889
casted_array
.as_any()
.downcast_ref::<IntervalMonthDayNanoArray>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Failed to downcast to IntervalMonthDayNanoArray".to_string(),
)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
casted_array
.as_any()
.downcast_ref::<IntervalMonthDayNanoArray>()
.ok_or_else(|| {
ArrowError::ComputeError(
"Failed to downcast to IntervalMonthDayNanoArray".to_string(),
)
})
Ok(casted_array.as_primitive::<IntervalMonthDataNanoType>().clone())

arrow-cast/src/cast.rs Show resolved Hide resolved
@alamb
Copy link
Contributor

alamb commented May 9, 2023

Thank you for taking this on @izveigor

@izveigor izveigor marked this pull request as draft May 10, 2023 18:05
)))
} else {
Ok(Arc::new(
array.try_unary::<_, IntervalMonthDayNanoType, ArrowError>(|v| {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an unsafe way needed in this case, if the overflow never happens?

Copy link
Contributor

@tustvold tustvold Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the method is infallible you can just use unary without needing to use either unary_opt or try_unary

|v| {
let (days, ms) = IntervalDayTimeType::to_parts(v);
Some(IntervalMonthDayNanoType::make_value(
0,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the method: months = days * 30 is not entirely accurate?

@izveigor
Copy link
Contributor Author

@tustvold Sorry for the long delay. I left a few questions on this PR. Can you pay attention to them if you have time.

@izveigor izveigor marked this pull request as ready for review May 30, 2023 20:27
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, thinking about this further I think the only well-formed interval casts are from a smaller representation to a larger representation.

I'm not sure there is a "correct" way to go from IntervalUnit::MonthDayNano to IntervalUnit::DayTime.

However, I think just providing the infallible "widening" casts, is good enough, and will be sufficient for most use-cases. At the very least they seem like a good place to start

Some(IntervalMonthDayNanoType::make_value(
0,
days,
(ms * mul).into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(ms * mul).into(),
ms as i64 * mil,

cast_options: &CastOptions,
) -> Result<ArrayRef, ArrowError> {
let array = array.as_primitive::<IntervalDayTimeType>();
let mul = 10_i32.pow(6);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mul = 10_i32.pow(6);
let mul = 1_000_000;

let array = array.as_primitive::<IntervalDayTimeType>();
let mul = 10_i32.pow(6);

if cast_options.safe {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that this actually cannot overflow, and so can just use unary.

It is performing a widening multiplication of i32 milliseconds to i64 nanoseconds

let (_, days, nanos) = IntervalMonthDayNanoType::to_parts(v);
Some(IntervalDayTimeType::make_value(
days,
(nanos / mul).try_into().unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(nanos / mul).try_into().unwrap(),
(nanos / mul).try_into().ok()?,


if cast_options.safe {
Ok(Arc::new(array.unary_opt::<_, IntervalDayTimeType>(|v| {
let (_, days, nanos) = IntervalMonthDayNanoType::to_parts(v);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to discard the months component?

array: &dyn Array,
cast_options: &CastOptions,
) -> Result<ArrayRef, ArrowError> {
let array = array.as_primitive::<IntervalMonthDayNanoType>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think people would find it surprising that this discards any days or nanoseconds. I suspect we either need to error if they are anything other than 0, or make an approximation about the number of hours in a day and the number of days in a month...

@alamb
Copy link
Contributor

alamb commented Jun 1, 2023

I wonder if this PR will make the next release (so we can use it in datafusion) 🤔

@tustvold
Copy link
Contributor

tustvold commented Jun 2, 2023

Thank you

@tustvold tustvold merged commit 863d599 into apache:master Jun 2, 2023
mcheshkov added a commit to cube-js/arrow-rs that referenced this pull request Jul 18, 2024
mcheshkov added a commit to cube-js/arrow-rs that referenced this pull request Jul 22, 2024
mcheshkov added a commit to cube-js/arrow-rs that referenced this pull request Aug 21, 2024
See apache@863d599
See apache#4182

Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
mcheshkov added a commit to cube-js/arrow-rs that referenced this pull request Aug 21, 2024
See apache@863d599
See apache#4182

Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
mcheshkov added a commit to cube-js/arrow-rs that referenced this pull request Aug 21, 2024
See apache@863d599
See apache#4182

Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cast between Intervals
3 participants