Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime: support timezone bindings #1588

Merged
merged 1 commit into from
Jul 13, 2022

Conversation

davidhewitt
Copy link
Member

This PR implements safe wrappers for the Python 3.7 timezone bindings described in #207.

Rather than try to do anything complicated with a new type representing exactly datetime.timezone, I return &PyTzInfo from the three new functions timezone_utc(), timezone_from_offset() and timezone_from_offset_and_name(). I think this is good enough to be useful; datetime.timezone is a subclass of datetime.tzinfo so doing this is correct.

I also adjusted a few constructors taking tzinfo arguments to take Option<&PyTzInfo> instead of Option<&PyObject>.

Finally, I tweaked the ffi symbol PyDateTime_TimeZone_UTC I added recently in #1573 to resolve to &'static *mut ffi::PyObject, which I think is a more correct representation of the "unsafe ffi" nature of that global.

Closes #207

cc @pganssle in case you might be interested / have an opinion on this.

@davidhewitt davidhewitt force-pushed the timezone_offset branch 6 times, most recently from 30d35ba to bd6ecde Compare April 30, 2021 22:44
@@ -279,7 +252,7 @@ impl PyTime {
minute: u8,
second: u8,
microsecond: u32,
tzinfo: Option<&PyObject>,
tzinfo: Option<&PyTzInfo>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be a breaking change? It may probably affect pyo3-chrono crate. Maybe need to ping the author if this is indeed the case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and we'll release this as part of the 0.14 release. I've notified this in the CHANGELOG, and we are still pre-1.0, after all. If it turns out to be a big problem we can revert it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't quite remember why I went with &PyObject instead of &PyTzInfo here originally. It's entirely possible that I misunderstood how inheritance in Rust and Python can play together well, The datetime.tzinfo class is an abstract base class (but I think one that was created before Python supported either abstract base classes or Protocols).

In theory it should be a trait, but in practice datetime enforces that it is an actual subclass of datetime.tzinfo and I don't know if I'll be able to change that any time soon. Presumably it is safe to change this from PyTzInfo back to PyObject after the 1.0 release, since all PyTzInfo objects are PyObjects but not vice versa?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I remember why it has to be &PyObject now — tzinfo can also be None (and in fact probably is None in the common case, since datetime.timezone is not a very good time zone representation but datetimes are useful anyway).

I'm actually fairly surprised that this doesn't break any tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry to keep going back and forth here, but I forgot that this is Option<&PyTzInfo>, and that the None is handled by the None case of Option, so I'm back to not remembering why this was PyObject instead of PyTzInfo.

@@ -445,12 +445,11 @@ pub static PyDateTimeAPI: _PyDateTimeAPI_impl = _PyDateTimeAPI_impl {
inner: GILOnceCell::new(),
};

/// Safe wrapper around the Python C-API global `PyDateTime_TimeZone_UTC`. This follows a similar
/// Wrapper around the Python C-API global `PyDateTime_TimeZone_UTC`. This follows a similar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not a safe wrapper now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I made it less safe because it gives you &'static *mut ffi::PyObject, which is a pointer and thus less safe to use. The safe wrapper is now the pyo3::types::timezone_utc() function.

Comment on lines 335 to 509
/// Equivalent to `datetime.timezone.utc`
#[cfg(all(Py_3_7, not(PyPy)))]
pub fn timezone_utc(py: Python) -> &PyTzInfo {
unsafe {
&*(&*ffi::PyDateTime_TimeZone_UTC as *const *mut ffi::PyObject
as *const crate::Py<PyTzInfo>)
}
.as_ref(py)
}

/// Equivalent to `datetime.timezone(offset)`
#[cfg(all(Py_3_7, not(PyPy)))]
pub fn timezone_from_offset<'py>(py: Python<'py>, offset: &PyDelta) -> PyResult<&'py PyTzInfo> {
unsafe { py.from_owned_ptr_or_err(ffi::PyTimeZone_FromOffset(offset.as_ptr())) }
}

/// Equivalent to `datetime.timezone(offset, name)`
#[cfg(all(Py_3_7, not(PyPy)))]
pub fn timezone_from_offset_and_name<'py>(
py: Python<'py>,
offset: &PyDelta,
name: &str,
) -> PyResult<&'py PyTzInfo> {
let name = name.into_py(py);
unsafe {
py.from_owned_ptr_or_err(ffi::PyTimeZone_FromOffsetAndName(
offset.as_ptr(),
name.as_ptr(),
))
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't these be easier to discover if we put it within impl PyTzInfo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on that. Yes, it probably would. However I wasn't sure if that was totally "correct" because they're actually creating a subtype. If you also think it'd be nicer to have them on impl PyTzInfo then I'm persuaded that's the right thing to do 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am just thinking that datetime module could have tons of functions already and separating these would increase those later on, since these will only create PyTzInfo I don't see why not put it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely don't think this should go in impl PyTzInfo. PyTzInfo exists only to be a base class, so you shouldn't really be able to get instances of it. Despite Rust's lack of nominal type hierarchies, it needs Rust wrappers so you can do stuff like this:

#[pyclass(extends=PyTzInfo)]
pub struct TzClass {}
#[pymethods]
impl TzClass {
#[new]
fn new() -> Self {
TzClass {}
}
fn utcoffset<'p>(&self, py: Python<'p>, _dt: &PyDateTime) -> PyResult<&'p PyDelta> {
PyDelta::new(py, 0, 3600, 0, true)
}
fn tzname(&self, _py: Python<'_>, _dt: &PyDateTime) -> String {
String::from("+01:00")
}
fn dst(&self, _py: Python<'_>, _dt: &PyDateTime) -> Option<&PyDelta> {
None
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean with &self, I just mean something like

impl PyTzinfo {
    fn utcoffset<'py>(py: Python<'py>, offset: &PyDelta) -> PyResult<&'py PyTzInfo> {
        ...
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pickfire Regardless, I don't think it's the right abstraction, because it implies that this is a constructor for PyTzInfo, which is an abstract base class.

I do think it would make sense for these methods to turn into a wrapper class for datetime.timezone like the others (minus the _Check methods and such), in which case these could be alternate constructors for PyFixedTimezone or whatever it's called.

@pganssle
Copy link
Member

pganssle commented May 1, 2021

I'm not sure I understand what is complicated about having a new type representing datetime.timezone. Is it the inheritance parts (datetime.timezone is not supposed to be a superclass, so maybe that's why it's weird compared to the other classes?).

All of the other type wrappers we have are Rust types with constructors corresponding to the equivalent C API constructors.

@pickfire
Copy link
Contributor

pickfire commented May 1, 2021

But this one is a python wrapper. Wouldn't PyAny have some performance cost when doing the conversion unlike the C API? Since it uses __slots__ within python do we have any way to convert it directly?

@davidhewitt
Copy link
Member Author

I'm not sure I understand what is complicated about having a new type representing datetime.timezone.

We haven't yet had to model the inheritance hierarchy of types like this. Would PyTimeZone implement Deref<Target=PyTzInfo> instead of Deref<Target=PyAny>? It probably should (otherwise you wouldn't be able to use the return type of PyTimeZone::utc() in a call to PyTime::new, for example), but that'd mean reworking PyO3's internal macros.

... I'm going to work on a couple of PRs to see if implementing Deref like that is practical, please hold with this...

@pganssle
Copy link
Member

pganssle commented May 3, 2021

We haven't yet had to model the inheritance hierarchy of types like this. Would PyTimeZone implement Deref<Target=PyTzInfo> instead of Deref<Target=PyAny>? It probably should (otherwise you wouldn't be able to use the return type of PyTimeZone::utc() in a call to PyTime::new, for example), but that'd mean reworking PyO3's internal macros.

It has unfortunately been a while since I had a halfway decent understanding of the way Python inheritance is modeled in PyO3 (if I ever did). This sounds appealing to me, but is there a more general problem that sometimes your PyTzInfo needs to be a PyAny, so we'd need to do the Deref trick twice?

Is it not sufficient to implement From<&PyFixedTimezone> for both &PyAny and &PyTzInfo and satisfy both requirements?

In this case, it would be really nice if we could use a trait instead, since the actual memory layout of the tzinfo subclass doesn't matter at all for datetime's purposes, but datetime does an isinstance check on the tzinfo object meaning it must be an actual subclass. We could theoretically change that in Python 3.11, but in some ways that feels unnecessary since: 1. no one has actually expressed any desire to write a time zone class that doesn't inherit from datetime.tzinfo and 2. all the PyO3 types will be wrappers around things that actually are tzinfo subclasses, and our only struggle is in convincing Rust that that's the case. So while that change would make things more aligned with reality (we only care that you implement the tzinfo protocol, not that you pass an actual tzinfo object), it's also a totally "on paper" change that would have no practical benefits. Also, that change wouldn't help us in the short-run, since PyO3 obviously supports Python versions earlier than 3.11.

@pganssle
Copy link
Member

pganssle commented May 3, 2021

One other comment on this: Other than exposing the UTC singleton (which is a great convenience), I don't think the datetime.timezone class is terribly important. It's trivially simple to re-implement in Rust if you need something like that (see this, which could easily be tweaked to be equivalent to datetime.timezone), and I can't think of any particular benefits of using datetime.timezone if your tzinfo implementation is already in a compiled language.

Plus, it's generally not a good idea to use datetime.timezone in the first place. datetime.timezone exists so that stuff like the datetime.fromisoformat and datetime.strptime can construct fixed-point datetimes from some serialized datetime format with a fixed offset using just the standard library. If possible, you should strongly prefer to construct datetime objects using zoneinfo. I realize that chrono's various design flaws makes fixed offsets easier to use and it would be nice to have the timezone API exposed, but I'm ambivalent enough about making datetime.timezone easy to use that it's probably worth taking the time to come up with a decent interface rather than rushing to get something implemented.

@davidhewitt
Copy link
Member Author

davidhewitt commented May 3, 2021

This sounds appealing to me, but is there a more general problem that sometimes your PyTzInfo needs to be a PyAny, so we'd need to do the Deref trick twice?

Yes, and if I implemented this we'd have impl Deref<Target = PyTzInfo> for PyTimeZone, and impl Deref<Target = PyAny> for PyTzInfo.

Is it not sufficient to implement From<&PyFixedTimezone> for both &PyAny and &PyTzInfo and satisfy both requirements?

Something like this could be done, yes. However I'm a bit uneasy about adding new trait mappings like From without making this an across-the-board change for PyO3.

I think that changes like #1056 need to be resolved before we commit to more ways to interconvert Python types. I'm similarly uneasy about the Deref design, though that does at least have precedent in web-sys, where Javascript types Deref to their super types.

Plus, it's generally not a good idea to use datetime.timezone in the first place. [...]

Understood, however I think there's also a practical notion that these APIs would be helpful for chrono as current, and these are exposed in the Python C-API so it's reasonable for PyO3 to expose these too.


Overall, I'm leaning towards merging this PR as-is, because it's a simple and practical approach even if the new functions aren't super discoverable. Perhaps their lack of discoverability may be a feature 😉

We can make future improvements to this API as PyO3 and the Python C-API evolve...

@pganssle
Copy link
Member

pganssle commented May 3, 2021

Overall, I'm leaning towards merging this PR as-is, because it's a simple and practical approach even if the new functions aren't super discoverable. Perhaps their lack of discoverability may be a feature

I'm still kinda -1 for the same reason that I didn't like the idea of adding chrono bindings directly to PyO3. It feels out of scope to be adding convenience functions that don't necessarily map to things in the C API. If you just want a nice convenient way to get any fixed offset timezone, it seems like it would be pretty simple to implement a Rust "fixed offset" type in its own crate that doesn't bother with datetime.timezone. If chrono is growing PyO3 support (in this PR from @pickfire), it could also include a simple FixedOffset implementation like the one I linked to from the examples.

In general, it's not going to be useful to implement specifically datetime.timezonechrono support for a number of reasons anyway, so we only need the chronodatetime side, so chrono also has the option of simply calling the datetime.timezone constructor, no?

impl ToPyObject for FixedOffset {
    fn to_object(&self, py: pyo3::Python) -> pyo3::Result<pyo3::PyObject> {
        let dt_module = py.import("datetime")?;

        let seconds_offset = self.utc_minus_local();  // Throw exception if not ±24h?

        let td = pyo3::PyDelta::new(py, seconds_offset, 0, true)?;

        dt_module.call1("timezone", (td,))?
    }
}

This would be slightly slower than calling the C API directly, but if you are concerned about that, a cache would likely make it much faster than even calling the C API directly; since there are a maximum of ~172,800 possible fixed offsets, and in practice there's maybe 30-40 that you'll actually encounter, any process doing a large number of conversions of fixed offsets to datetime.timezone objects would very quickly saturate the cache and you'd go from making python function calls to looking up offsets in a Rust hashmap. On my computer it takes ~500-600ns to call datetime.timezone(datetime.timedelta(seconds=3600)), so construction 172,800 of these would take 95ms, Depending on how much overhead Rust adds, I think it would be 12-14MB of memory to store all those timedeltas and timezones, and that's the worst case scenario where you cache every timezone object you create and you create every possible timezone that represents a chrono::FixedOffset.

So at the end of the day, I'd say that we should concentrate on creating nice, consistent bindings for CPython in pyo3 and leave the "convenient way to create an aware datetime from a chrono datetime" for either chrono or for its own crate.

@pickfire
Copy link
Contributor

pickfire commented May 3, 2021

Plus, it's generally not a good idea to use datetime.timezone in the first place. datetime.timezone exists so that stuff like the datetime.fromisoformat and datetime.strptime can construct fixed-point datetimes from some serialized datetime format with a fixed offset using just the standard library. If possible, you should strongly prefer to construct datetime objects using zoneinfo. I realize that chrono's various design flaws makes fixed offsets easier to use and it would be nice to have the timezone API exposed, but I'm ambivalent enough about making datetime.timezone easy to use that it's probably worth taking the time to come up with a decent interface rather than rushing to get something implemented.

I think you missed the context here. What I am trying to do is provide chrono and pyo3 integration in which the python timezone is almost the same as chrono provided functions. chrono also does not provide timezone, so rather than having users do the conversion, we can just support it on library level so the experience is seamless.

it seems like it would be pretty simple to implement a Rust "fixed offset" type in its own crate that doesn't bother with datetime.timezone.

I think that's probably not a good idea in the long run, keeping it align with the C API might be a better idea.

In general, it's not going to be useful to implement specifically datetime.timezone → chrono support for a number of reasons anyway, so we only need the chrono → datetime side, so chrono also has the option of simply calling the datetime.timezone constructor, no?

Other than chrono there are also other time libraries or other stuff that needs the timezone. I think implementing it here is better than having everyone have their own implementation which they can easily get wrong (like me, I still can't figure out how to do the FFI thing by myself).

The code you showed doesn't seemed like a good idea. pyo3 is exposing safe wrappers over python API. Of course not all parts of C API is necessary but doing it this way seemed weird, the utc_minus_local thing feels like a hack to me.

@pganssle
Copy link
Member

pganssle commented May 3, 2021

Other than chrono there are also other time libraries or other stuff that needs the timezone. I think implementing it here is better than having everyone have their own implementation which they can easily get wrong (like me, I still can't figure out how to do the FFI thing by myself).

The problem is that chrono does everything with fixed offsets, which has a number of problems. It's not a very well-designed library and I am very sad that it is the de facto standard datetime library in Rust. I'm hoping that other libraries handling dates and times would use something other than fixed offsets, in which case datetime.timezone would not be especially useful anyway. For the most part, well-designed uses of datetime will either use UTC, local time (naïve, no offset) or an IANA time zone, so if you have something in Rust using fixed offsets that you want to send over to the Python side, you'll want to convert to one of those three before using it anyway. In any case, I'm mostly just saying that there's no reason to make a quick release to support this since I'm fairly convinced that it will be an attractive nuisance in the long run anyway.

The code you showed doesn't seemed like a good idea. pyo3 is exposing safe wrappers over python API. Of course not all parts of C API is necessary but doing it this way seemed weird, the utc_minus_local thing feels like a hack to me.

The utc_minus_local part of that is not the important part. If you have a preferred way to get the offset in minutes and seconds from a FixedOffset, then do that instead. To put my suggestion another way, any person could easily write their own version of @davidhewitt's timezone_from_offset like so:

fn timezone_from_offset<'py>(py: Python<'py>, offset: &PyDelta) -> PyResult<&'py PyTzInfo> {
    let dt_mod = py.import("datetime")?;
    dt_mod.call1("timezone", (offset))?
}

This is a trivial operation and it should be simple to do it from within your chronopyo3 bindings. The main benefit of using the C API and C struct is probably going to be speed since (I think) there will be fewer interactions between you and the Python interpreter, but as I mentioned earlier a cache-based solution to any speed problems will probably be much more fruitful if constructing datetime.timezone objects ends up as a performance bottleneck.

Additionally, it returns the exact same object that you'd get if you were using the ffi wrappers, meaning that if / when pyo3 works out the right way to represent a datetime.timezone object, it would be a simple matter to replace the contents of the function with something using the C API wrappers.

@pickfire
Copy link
Contributor

pickfire commented May 4, 2021

The problem is that chrono does everything with fixed offsets, which has a number of problems. It's not a very well-designed library and I am very sad that it is the de facto standard datetime library in Rust. I'm hoping that other libraries handling dates and times would use something other than fixed offsets, in which case datetime.timezone would not be especially useful anyway. For the most part, well-designed uses of datetime will either use UTC, local time (naïve, no offset) or an IANA time zone, so if you have something in Rust using fixed offsets that you want to send over to the Python side, you'll want to convert to one of those three before using it anyway. In any case, I'm mostly just saying that there's no reason to make a quick release to support this since I'm fairly convinced that it will be an attractive nuisance in the long run anyway.

I still don't see what is the issue with fixed offsets? Can you please explain it or open an issue with chrono? FixedOffset can be set to the second precision so I am not sure what is the issue with that, which is the same as python timezone.

Ah, I didn't know the code will work. Will try it out later and if this patch didn't get merged I will do that. But if other people come do they need to go search the pull request and find that comment? I think having a function there is simpler such that others does not have to even worry about the low level stuff to have to go search the issues and pull requests and read the source code like I did to figure out how to do the same thing but using a lower-level API, just using a high-level API like the one in python would be easier to find and figure out how it can be used.

@pganssle
Copy link
Member

pganssle commented May 4, 2021

I still don't see what is the issue with fixed offsets? Can you please explain it or open an issue with chrono? FixedOffset can be set to the second precision so I am not sure what is the issue with that, which is the same as python timezone.

Python offsets can have sub-second precision as well, as of Python 3.7 or 3.8 (I think 3.7). It is also the case that in general, Python datetimes should have a zoneinfo.Zoneinfo or equivalent time zone attached, not a fixed offset.

The biggest problem I have with fixed offsets is that they are rarely actually all that useful. Generally speaking, you either want an absolute time, which represents a fixed point in time (e.g. UTC, an epoch timestamp, etc) or you want a civil time, which represents what the clock will say on the wall at a specific time in a specific place. Absolute times are usually useful for recording when something happened in the past (e.g. timestamps on logs) and civil times are usually useful for human-centric events that may take place in the future. When working with absolute times, it's almost always sufficient to record it and work with it in UTC and convert it to local civil time when you want to display it to a human. When working with future times it is not appropriate to use UTC or a fixed offset thereof — imagine we plan an event at 14:00 a month from now, and then the country we're meeting in decides to eliminate DST. If we had recorded the meeting as 14:00-04:00 — in local time that is now 13:00, even though from our perspective we always want the meeting to be at 14:00.

The way Python works is that you attach a tzinfo which is an object mapping "civil" times to UTC (the UTC tzinfo is a no-op). This way what is stored is not "local time + offset" but "local time + rule set", which is much more versatile. Fixed offsets are basically just UTC in disguise, and may not be accurate for a given civil time. They are useful for when you want to be able to easily print a given datetime with a given fixed offset, but it's very rare that you'd want to do this. It's also useful for representing all the information you have from, say, an ISO 8601 datetime — in case you want to losslessly re-serialize it. So, for example, if you are given 2021-05-22T03:21:19+02:00, you don't know what rule set generated that, but you know both the local time (2021-05-22 at 03:21:19) and the offset from UTC (-02:00). datetime.timezone exists to satisfy basically this use case, but usually you'd want to convert this to UTC or an IANA zone as soon as you can before doing much with it.

Ah, I didn't know the code will work. Will try it out later and if this patch didn't get merged I will do that. But if other people come do they need to go search the pull request and find that comment? I think having a function there is simpler such that others does not have to even worry about the low level stuff to have to go search the issues and pull requests and read the source code like I did to figure out how to do the same thing but using a lower-level API, just using a high-level API like the one in python would be easier to find and figure out how it can be used.

First of all, I don't think my suggestion here was terribly complicated. If you do any appreciable amount of coding with Rust ↔ Python or C ↔ Python interop, you'll quickly find that there aren't C API hooks for even most of the standard library, so if you can't find a way to accomplish something using the C API, you just make a Python function call and get a PyObject * in your compiled code. Before Python 3.7, anyone would have to do that to construct a datetime.timezone from C anyway, and you still must do this to construct the much more useful zoneinfo.ZoneInfo object, since zoneinfo doesn't (and probably won't) have C API bindings.

I am in a weird position here because I do think that we should expose datetime.timezone bindings, but I also think that there's a decent chance that they'll do more harm than good 😅. My point is that in general it should be pretty rare that you want to use one of these things in the first place, so as long as the UTC singleton is already exposed, we should probably take our time carefully designing the timezone bindings, rather than introducing convenience functions that give you convenient access to a minor footgun.

@pickfire
Copy link
Contributor

pickfire commented May 4, 2021

I am in a weird position here because I do think that we should expose datetime.timezone bindings, but I also think that there's a decent chance that they'll do more harm than good sweat_smile. My point is that in general it should be pretty rare that you want to use one of these things in the first place, so as long as the UTC singleton is already exposed, we should probably take our time carefully designing the timezone bindings, rather than introducing convenience functions that give you convenient access to a minor footgun.

If it's a bug that people rely on. It's not a bug. It's a feature. -- Linus

At least in this case, based on what you said taking DST into account it is a bad design. But considering that RFC 3339 and ISO 8601 supports fixed offsets by default in datetime but not IANA or something similar, I think it is still good to have an API for it, at least python already does it even though it is not recommended, it is not really a foot gun for this to be exact, even chrono does the same thing. But yeah, in the docs we can just put a note that it is more recommended to use IANA rather than the fixed offset (but we don't have a pointer to say what to use yet).

I think it still depends on @davidhewitt call. If he merged then I will use the convenience functions, if he didn't then I can just use the one you showed.

@pickfire
Copy link
Contributor

Some time has passed, is there any decisions on this or what's pending?

@davidhewitt
Copy link
Member Author

Overall I'm still leaning towards merging this eventually. I understand the concerns that this is not IANA timezone, however we also don't have a C API for the zoneinfo stdlib. The position I have on this one is that PyO3 remains unopinionated and offers the APIs that it can; chrono integration will find these useful even if we really need a zoneinfo API too.

The main reason I have not been rushing to merge this is that it's currently a bit clunky to use custom PyTzInfo . See #1662. I would like to solve that issue at the same time, which will probably be a 0.17 thing at current rate of progress. Having #1662 would make it easier for us (or third-party crate) to define custom zoneinfo for IANA support if they needed it.

@davidhewitt davidhewitt added this to the 0.17 milestone Jan 16, 2022
@pickfire
Copy link
Contributor

Okay, then I guess I will just use the other method to call python code directly for chrono, when I have the time to work on that.

@pganssle
Copy link
Member

Overall I'm still leaning towards merging this eventually. I understand the concerns that this is not IANA timezone, however we also don't have a C API for the zoneinfo stdlib. The position I have on this one is that PyO3 remains unopinionated and offers the APIs that it can; chrono integration will find these useful even if we really need a zoneinfo API too.

The concern is less that this isn't an IANA timezone and more that these shouldn't be useful. I agree that we should probably expose the relevant part of the C API just for completeness, but I don't agree with adding convenience functions for constructing these. Giving people low-friction options for doing things they shouldn't do is a recipe for wide adoption of bad practices.

One reason it seems unlikely that we would bother with zoneinfo bindings for the C API is that constructing them will basically never be the slow part of any operations you are doing, because they've got a built-in cache (plus it's trivial to make your own cache if you don't fit into the 99.9% case where zoneinfo.ZoneInfo construction is not the bottleneck in your code). Whatever we come up with, it may be worth a note in the documentation saying, "This is here for consistency, but you probably shouldn't use it" or something of that nature.

@davidhewitt
Copy link
Member Author

One reason it seems unlikely that we would bother with zoneinfo bindings for the C API is that constructing them will basically never be the slow part of any operations you are doing,

That's fine. I hope that one day we'll have easy ways to generate type-safe wrappers for Python types to reduce the amount of dynamic code users have to write. We could eventually use that to wrap zoneinfo.

I started such a project at https://github.com/davidhewitt/pyo3-pytypes, however I think it'll be a long time before I next get a chance to sit down and try again to make it happen.

@pickfire
Copy link
Contributor

pickfire commented Mar 30, 2022

I think we should also have PyDateTime_DATE_GET_TZINFO to get timezone information as part of this patch? Or maybe I can create another patch for it? I tried doing it but got stuck on conversion.

impl PyTzInfoAccess for PyDateTime {                                        
    fn get_tzinfo(&self) -> Option<PyObject> {                              
        let ptr = unsafe { PyDateTime_DATE_GET_TZINFO(self.as_ptr()) };     
        if unsafe { ffi::Py_None() } == ptr {                               
            None                                                            
        } else {                                                            
            Some(ptr as PyObject)              // this                               
        }                                                                   
    }                                                                       
}                                                                           

@davidhewitt
Copy link
Member Author

I'd suggest opening as a separate PR, it's probably going to be quite a while before I have a chance to finish this off unfortunately.

@pickfire pickfire mentioned this pull request Mar 31, 2022
@davidhewitt davidhewitt force-pushed the timezone_offset branch 8 times, most recently from d1edb34 to 0cdf99b Compare July 4, 2022 21:18
@davidhewitt
Copy link
Member Author

After a long delay, I'm leaning towards merging this for 0.17.

I appreciate the concern that this does not expose IANA timezones. While that is perhaps something that we can consider in future, for now the Python C API lacks any bindings for zoneinfo (as far as I am aware). A future PyO3 could release a statically-typed wrapper around zoneinfo which just makes dynamic Python calls underneath (or this could be added as a community crate). For now, I see that chronotope/chrono#542 has incorrect unsafe code which could have been avoided by this timezone_utc function being available, so I see immediate value to the community in having this safe API added.

@pganssle
Copy link
Member

pganssle commented Jul 4, 2022

I appreciate the concern that this does not expose IANA timezones.

As I said before, this not exposing IANA time zones is not the problem. The problem is that these fixed offsets are both not useful and they seem like they are useful, which is a dangerous combination.

@davidhewitt
Copy link
Member Author

@pganssle Would it resolve your concern if I took the two timezone_from_offset functions out of this PR? (Leaving just timezone_utc, as the most commonly used timezone imo.)

@pganssle
Copy link
Member

Would it resolve your concern if I took the two timezone_from_offset functions out of this PR? (Leaving just timezone_utc, as the most commonly used timezone imo.)

Yes, I think those two are the main problem I have with this change, the other stuff all seems positive.

The only comment I'd have about timezone_utc is that it seems like a bit of an odd interface to use a function to access a singleton like that, but I don't think it hurts anything to have it this way.

src/types/datetime.rs Outdated Show resolved Hide resolved
@davidhewitt
Copy link
Member Author

The only comment I'd have about timezone_utc is that it seems like a bit of an odd interface to use a function to access a singleton like that, but I don't think it hurts anything to have it this way.

The complication for safety on the Rust side is that if the singleton hasn't yet been imported then we need to hold the GIL to safely import. This could be done behind the scenes in a Deref impl, however it seems better for users' control if they explicitly pass the GIL in.

Thanks for all the reviews, proceeding to merge this.

@davidhewitt davidhewitt merged commit 402018e into PyO3:main Jul 13, 2022
@davidhewitt davidhewitt deleted the timezone_offset branch July 13, 2022 22:11
Comment on lines +564 to +571
{
let dt = PyDateTime::from_timestamp(py, 100.0, Some(timezone_utc(py))).unwrap();
py_run!(
py,
dt,
"import datetime; assert dt == datetime.datetime.fromtimestamp(100, datetime.timezone.utc)"
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your indentation here is off.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, #2508 to fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Python 3.7 timezone in datetime bindings
3 participants