-
Notifications
You must be signed in to change notification settings - Fork 867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cast between Intervals
#4182
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs some review of the overflow behaviour
arrow-cast/src/cast.rs
Outdated
let array = array | ||
.as_any() | ||
.downcast_ref::<PrimitiveArray<IntervalYearMonthType>>() | ||
.ok_or_else(|| { | ||
ArrowError::ComputeError( | ||
"Internal Error: Cannot cast interval to IntervalArray of expected type" | ||
.to_string(), | ||
) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let array = array | |
.as_any() | |
.downcast_ref::<PrimitiveArray<IntervalYearMonthType>>() | |
.ok_or_else(|| { | |
ArrowError::ComputeError( | |
"Internal Error: Cannot cast interval to IntervalArray of expected type" | |
.to_string(), | |
) | |
})?; | |
let array = array.as_primitive::<IntervalYearMonthType>(); |
arrow-cast/src/cast.rs
Outdated
let array = array | ||
.as_any() | ||
.downcast_ref::<PrimitiveArray<IntervalDayTimeType>>() | ||
.ok_or_else(|| { | ||
ArrowError::ComputeError( | ||
"Internal Error: Cannot cast interval to IntervalArray of expected type" | ||
.to_string(), | ||
) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let array = array | |
.as_any() | |
.downcast_ref::<PrimitiveArray<IntervalDayTimeType>>() | |
.ok_or_else(|| { | |
ArrowError::ComputeError( | |
"Internal Error: Cannot cast interval to IntervalArray of expected type" | |
.to_string(), | |
) | |
})?; | |
let array = array.as_primitive::<IntervalDayTimeType>(); |
arrow-cast/src/cast.rs
Outdated
) | ||
})?; | ||
|
||
let iter = array.iter().map(|v| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to perform checked conversion, and can make use of unary_opt and try_unary to achieve this based on the cast options
arrow-cast/src/cast.rs
Outdated
casted_array | ||
.as_any() | ||
.downcast_ref::<IntervalMonthDayNanoArray>() | ||
.ok_or_else(|| { | ||
ArrowError::ComputeError( | ||
"Failed to downcast to IntervalMonthDayNanoArray".to_string(), | ||
) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
casted_array | |
.as_any() | |
.downcast_ref::<IntervalMonthDayNanoArray>() | |
.ok_or_else(|| { | |
ArrowError::ComputeError( | |
"Failed to downcast to IntervalMonthDayNanoArray".to_string(), | |
) | |
}) | |
Ok(casted_array.as_primitive::<IntervalMonthDataNanoType>().clone()) |
Thank you for taking this on @izveigor |
… IntervalUnit::DayTime -> IntervalUnit::MonthDayNano
arrow-cast/src/cast.rs
Outdated
))) | ||
} else { | ||
Ok(Arc::new( | ||
array.try_unary::<_, IntervalMonthDayNanoType, ArrowError>(|v| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is an unsafe way needed in this case, if the overflow never happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the method is infallible you can just use unary
without needing to use either unary_opt
or try_unary
arrow-cast/src/cast.rs
Outdated
|v| { | ||
let (days, ms) = IntervalDayTimeType::to_parts(v); | ||
Some(IntervalMonthDayNanoType::make_value( | ||
0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the method: months = days * 30
is not entirely accurate?
@tustvold Sorry for the long delay. I left a few questions on this PR. Can you pay attention to them if you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, thinking about this further I think the only well-formed interval casts are from a smaller representation to a larger representation.
I'm not sure there is a "correct" way to go from IntervalUnit::MonthDayNano
to IntervalUnit::DayTime
.
However, I think just providing the infallible "widening" casts, is good enough, and will be sufficient for most use-cases. At the very least they seem like a good place to start
arrow-cast/src/cast.rs
Outdated
Some(IntervalMonthDayNanoType::make_value( | ||
0, | ||
days, | ||
(ms * mul).into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(ms * mul).into(), | |
ms as i64 * mil, |
arrow-cast/src/cast.rs
Outdated
cast_options: &CastOptions, | ||
) -> Result<ArrayRef, ArrowError> { | ||
let array = array.as_primitive::<IntervalDayTimeType>(); | ||
let mul = 10_i32.pow(6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mul = 10_i32.pow(6); | |
let mul = 1_000_000; |
arrow-cast/src/cast.rs
Outdated
let array = array.as_primitive::<IntervalDayTimeType>(); | ||
let mul = 10_i32.pow(6); | ||
|
||
if cast_options.safe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs to me that this actually cannot overflow, and so can just use unary
.
It is performing a widening multiplication of i32 milliseconds to i64 nanoseconds
arrow-cast/src/cast.rs
Outdated
let (_, days, nanos) = IntervalMonthDayNanoType::to_parts(v); | ||
Some(IntervalDayTimeType::make_value( | ||
days, | ||
(nanos / mul).try_into().unwrap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nanos / mul).try_into().unwrap(), | |
(nanos / mul).try_into().ok()?, |
arrow-cast/src/cast.rs
Outdated
|
||
if cast_options.safe { | ||
Ok(Arc::new(array.unary_opt::<_, IntervalDayTimeType>(|v| { | ||
let (_, days, nanos) = IntervalMonthDayNanoType::to_parts(v); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to discard the months component?
arrow-cast/src/cast.rs
Outdated
array: &dyn Array, | ||
cast_options: &CastOptions, | ||
) -> Result<ArrayRef, ArrowError> { | ||
let array = array.as_primitive::<IntervalMonthDayNanoType>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think people would find it surprising that this discards any days or nanoseconds. I suspect we either need to error if they are anything other than 0, or make an approximation about the number of hours in a day and the number of days in a month...
I wonder if this PR will make the next release (so we can use it in datafusion) 🤔 |
Thank you |
See apache@863d599 See apache#4182 Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
See apache@863d599 See apache#4182 Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
See apache@863d599 See apache#4182 Can drop this after rebase on commit 863d599 "feat: cast between Intervals (apache#4182)", first released in 41.0.0
Which issue does this PR close?
Closes #4181
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?
Yes