-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pydeephaven-ticking arrow type support inconsistent with server #6044
Comments
A temporary workaround might be to translate the LocalDate into an Instant, either on the server, or from the client before subscribing. my_table = (
my_table
.update_view(
[
"my_timestamp=toInstant(my_date, java.time.LocalTime.MIDNIGHT, 'America/New_York')"
]
)
.drop_columns(["my_date"])
) |
I noticed at least we are missing |
Would be a good follow-up for #3455 and related tickets, as we start expanding support for arrow types. |
I am wondering if the same issue exists in the Go/R clients. @alexpeters1208 ? |
@alexpeters1208 and I tested the R client and it suffers almost the same problem and what's different is that the error is reported by the C++ client which the R client relies on. Therefore it seems that changes are needed for both the C++ client and the Python ticking client (cython) to expand the list of supported Arrow types. Once that's achieved, the R client should just work, that would leave us only the Go client to deal with. |
I used the following table from @jmao-denver as the test case for the R client:
Using the R client, I get a TableHandle for this table, which succeeds. Some of that TableHandle's methods fail with the following error:
As it turns out, all of the failing methods have something in common: they all call the Despite this, I can still successfully convert this table from server-side Deephaven to R arrow, and it has the correct types. This is because the particular implementation of this operation does not touch the Deephaven C++ client at all (at least not in the C++ binding I wrote), but rather exclusively relies on Arrow's C++ client. Relevant code is |
In case anyone is interested, here's a full reproducer of this behavior using the R client. library(rdeephaven)
library(arrow)
client <- Client$new("localhost:10000")
script <- "
import pyarrow as pa
from datetime import datetime
from deephaven import arrow as dharrow
pa_types = [
pa.time64('ns'),
pa.date64()
]
pa_data = [
pa.array([1_000_001, 1_000_002]),
pa.array([datetime(2022, 12, 7), datetime(2022, 12, 30)])
]
fields = [pa.field(f'f{i}', ty) for i, ty in enumerate(pa_types)]
schema = pa.schema(fields)
pa_table = pa.table(pa_data, schema=schema)
dh_table = dharrow.to_table(pa_table)
"
# create table on the server with bad types
client$run_script(script=script)
# get table handle
th <- client$open_table("dh_table")
# most methods do not eventually invoke TableHandle::schema()
th$nrow()
new_th <- th.update("X = ii")
new_at <- as_arrow_table(new_th)
# others do call TableHandle::Schema()
th$dim()
th$ncol()
arrow_table(new_th) |
Hi, Thanks @alexpeters1208 for the repros. In my "opinion" supporting three time types is a bit much. I can see why someone would maybe like LocalDate, but supporting the timezoneless LocalTime seems like taking a trip straight to Sad Town. That said, implementing this in C++ is fairly straightforward. I would just have to do the mapping, as above, and then either create some new data types or find some C# data types that are close enough. In any case, having all of our clients support the same set of types is the right thing to do. |
Both LocalDate and LocalTime are inherently timezoneless, and while Instant (and maybe ZonedDateTime) types have their place in terms of "this thing happened (or will/may happen) and here's when", but the fleshy meatbags who use computers insist on all kinds of more specific timezoneless formats for all kinds of stuff... "Set my alarm for 7am" okay, when you travel 'at hour west' (whatever that means), do you get up at 7am in the new timezone? And if you get there a week before a daylight saving time adjustment, why should I have to adjust what time I wake up, the rest of the world should fit my preferred starting time. "My birthday is April 26" sure, but is that from 00:00 to 23:59 in CEST, or australia DST? Or maybe we only celebrate at the Instant you were born, plus N years. I don't disagree that having more "time" types seems worse than fewer types, but having "not enough" types is an even bigger pitfall... |
I agree with that set of observations but would add (hold on while I consult my Euler diagram) the intersection between the kinds of loosey-goosey dates humans use (birthdays etc) and the kinds of data that properly belong in a high-performance streaming join database for financial professionals might be small or empty. |
This PR adds support for `LocalDate` and `LocalTime` to pydeephaven/ticking. Note this PR depends on #6105, so that one should be merged first. It also depends on #6137, which also needs to be merged first. I also add a comment to pydeephaven/README.md to remind people to make a venv. Also note that this is a fix for #6044
There is a mismatch between server side arrow type support and pydeephaven-ticking arrow type support, as evidenced by this error:
On the server, after converting the table the arrow schema includes:
There may be additional types besides date64 that need to be checked.
Relevant code includes
py/server/deephaven/arrow.py
:py/client-ticking/src/pydeephaven_ticking/_core.pyx
:We should also investigate
pydeephaven
(non-ticking) completeness as well.The text was updated successfully, but these errors were encountered: