Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time + Time::MonthSpan inaccurate around DST change #6522

Closed
HCLarsen opened this issue Aug 11, 2018 · 163 comments
Closed

Time + Time::MonthSpan inaccurate around DST change #6522

HCLarsen opened this issue Aug 11, 2018 · 163 comments
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:time

Comments

@HCLarsen
Copy link
Contributor

When adding or subtracting a Span or MonthSpan to a Time object that isn't UTC, the result is off by an hour. It appears that it's adding a specific number of seconds instead of days, weeks, months or years, as you would expect.

time = Time.utc(1997, 10, 21, 9, 0, 0)
time + 1.weeks  #=> 1997-10-28 09:00:00.0 UTC

time = Time.new(1997, 10, 21, 9, 0, 0)
time + 1.weeks  #=> 1997-10-28 08:00:00.0 -05:00 Local

time = Time.new(1997, 10, 25, 9, 0, 0, location: Time::Location.load("America/Toronto"))
time + 1.days  #=> 1997-10-26 08:00:00.0 -05:00 Local
@asterite
Copy link
Member

I won't comment on whether this is good or not (it does look good to me, though), but yes, time math goes through seconds, always.

@asterite
Copy link
Member

Actually, for months it doesn't go through seconds. So:

time = Time.new(1997, 10, 25, 9, 0, 0, location: Time::Location.load("America/Toronto"))
p time           # => 1997-10-25 09:00:00.0 -04:00 America/Toronto
p time + 1.month # => 1997-11-25 09:00:00.0 -05:00 America/Toronto

So yes, one of the behaviors should maybe be fixed, though I can imagine it being really hard to implement, specially when right now you can combine units (except months).

@HCLarsen
Copy link
Contributor Author

Think about implementation though. When you're thinking of 1 week from today, you expect that to be the same hour, not an hour earlier because you set your clocks forward in the meantime.

@HCLarsen
Copy link
Contributor Author

The workaround I implemented in my project is only 7 lines long. I used a comparison of DST on the receiver and the argument.

@asterite
Copy link
Member

I don't know. Let's say I say "in one hour" and dst starts having effect. Then your 7 lines diff will change the hour again. Then in real life, an hour wouldn't have passed, but maybe zero or two.

Maybe with days and weeks it's different, because they are usually tied to a date.

How does Ruby's activrsupport behave regarding this?

@HCLarsen
Copy link
Contributor Author

No, my workaround doesn't change the hour if the span is 1.hour.

I've been trying to figure out how Ruby's Active Support does this, but I'm not sure yet.

@HCLarsen
Copy link
Contributor Author

Active Support does exactly what one would expect. A week after 9am on October 21st is 9am on October 28th.

@HCLarsen
Copy link
Contributor Author

HCLarsen commented Aug 11, 2018

HOW they do it is more interesting. They actually construct and return a new time object constructed from the values of the old object, substituting day and week values when new values are provided.

@straight-shoota
Copy link
Member

Yes, this problem has already been mentioned before (see #4556), but I don't think there has been a dedicated issue yet.

Time libraries usually solve this by having two different concepts for time span, one accurate, based on elapsed time (like Time::Span) and the other nominal based on calendrical units (like Time::MonthSpan but with higher granularity).

  • Java Date-Time API has Duration and Period. These classes have seconds, nanoseconds and days, months, years properties.
  • Elixirs timex has Duration and Interval. The latter is essentially a Duration anchored at a specific start instance and can such be interpreted as calendrical unit.

Some however only have one, time-based type (for example Rusts chrono Duration, Golang Duration). This obviously makes it hard to use calendrical calculations.

I guess it could also be an option to have one type with individual fields for (nano)seconds, hours, days, months, years and it can be used both ways depending on how you populate the fields (either 1 day or 24 hours). But that's probably going to be more confusing than anything else because you never know if an instance is meant to be time-based or calendrical.

They actually construct and return a new time object constructed from the values of the old object, substituting day and week values when new values are provided.

That's actually the most sane way to do this. Because it is a calendrical concept, it needs to go through the calendrical interface. That's much easier than figuring out the exact amount clock shifts to calculate an offset of elapsed time.

I would suggest to leave Time::Span as is (but maybe rename it to Time::Duration to be more concise) and add an additional Time::Period type, probably with properties nanoseconds, hours, days, months, years. The details need to be discussed, though.

@HCLarsen
Copy link
Contributor Author

Well, it's an important discussion to have, to be certain. I suspect that many will get tripped up by this before it's fixed. I'm more than happy to help in any way I can.

I don't think Time::Period is the best naming, it doesn't read very well as English. Perhaps Time::DaySpan to be analogous to the MonthSpan that already works in the way you'd expect it.

As asterite pointed out, when people are talking about hours, they're going to mean the specific hours. Same goes for smaller values. This behaviour only really comes into play when people are talking about intervals of days or weeks.

@asterite
Copy link
Member

The problem comes with, for example, expiration dates. You can say something expires in 3.days, for example a cookie or a session, and with that you mean 72 hours. The problem is that days is ambiguous.

Maybe we should remove those convenience methods from the standard library, just like in Go. Alternatively we can document it always refers to computed seconds.

@HCLarsen
Copy link
Contributor Author

The thing about that is, 3.days should be treated differently than 72.hours. If you want to describe a span of 72 hours specifically, you're going to say 72.hours, not 3.days. On the other hand, if you want an event to repeat every 3 days, you want it happening at the same time every day. Your work schedule isn't going to shift by an hour just because of daylight savings time.

I vote against removing the convenience methods. I think they're brilliant, and very useful.

How about, expanding on straight-shoota's idea, we leave span the way it is, and add another struct for date spans. I think DateSpan or Interval would be good names for it. While the Span still operates on seconds and milliseconds, this new object would hold 3 values: seconds, days and months. That way, addition and subtraction can work in the ActiveSupport way, with days being added evenly, regardless of DST, and months can be added without worrying about leap years.

@straight-shoota
Copy link
Member

The behaviour and existence of the convenience methods is actually a subordinate issue.

Foremost we need to discuss how to express the different semantics in terms of data structures.

@asterite
Copy link
Member

Yeah, making days and week return a different struct, similar to MonthSpan, sounds good. Or even representing all time spans as a combination of seconds, days and months, as suggested (I don't mind it will be a bit larger, it's not like time spans are stored and used as much as time instances).

@straight-shoota
Copy link
Member

it's not like time spans are stored and used as much as time instances

That depends. If you're doing a lot of time measurements, it might be quite important. In my opinion it doesn't make much sense to merge both concepts into one type. The size is one issue, but more importantly, there should be a clear distinction between a time duration (i.e. what's returned by Time.measure) and an abstract time period.

@asterite
Copy link
Member

@straight-shoota How would you distinguish them when constructing them?

@straight-shoota
Copy link
Member

Time::Duration.new(days: 5) vs Time::Period.new(days: 5). Let's just leave the convenience methods out for now and figure that out later.

@asterite
Copy link
Member

What's wrong with treating days, weeks and months as such, instead of a fixed amount of seconds?

@straight-shoota
Copy link
Member

Not sure I understand you question... there's nothing wrong with that, I actually propose adding a type for treating them individually.

@asterite
Copy link
Member

I mean, 1.day would mean period, 1.hour would mean duration. Those are the constructors. We just need to fix the semantic.

@HCLarsen
Copy link
Contributor Author

I know it's just semantic, but I really don't like the name Period. Elixer's Interval name sounds far better to me.

@HCLarsen
Copy link
Contributor Author

Here are my thoughts on how it would work:

span = Time::Span.new(72, 0, 0)
interval = Time::Interval.new(0, 0, 3, 0, 0, 0)  #=> 3 days

time + span       #=> 2018-11-06 08:00:00.0 -05:00 Local
time + interval   #=> 2018-11-06 09:00:00.0 -05:00 Local

@asterite
Copy link
Member

asterite commented Aug 13, 2018

Having Period, Duration, Interval, or any combination of that is bad, because the names are interchangeable and almost mean the same, so it'll be confusing as hell.

I propose what you, @HCLarsen, proposed: a Time::Span should consist of:

  • seconds, of fraction of seconds which can be specified with seconds, milliseconds, microseconds nanoseconds, minutes and hours
  • days, with days and weeks
  • months, with months and years

Then there's no confusion at all. The only confusion might arise if someone does 3.days expecting that to be exactly 72 hours. For that case, we should document that one must use 72.hours and not 3.days, and that's it. I prefer that small confusion that can be explained in the docs over having two or more new types, which behave slightly different from one another, and where it's not clear what name belongs to which behavior.

@HCLarsen
Copy link
Contributor Author

a Time::Span should consist of:

Actually, my idea was that the Interval class would consist of that, and the Span would remain as is, just seconds and milliseconds. Although, I recognize that this idea has validity as well.

@asterite
Copy link
Member

The problem with having two types is that one can no longer do: 1.day. Is it one type or the other? So we must use the more verbose Time::Interval.new(...) or Time::Period.new(...). Or we could have 1.day(period: true) or something like that, but it's still more complex than just having 1.day as the only possibility with only one possible meaning.

Maybe those 2.days convenience methods are not that important and we could drop them, I don't know. I like them :-)

@HCLarsen
Copy link
Contributor Author

Crystal seems to handle that just fine now with the Span and MonthSpan types. 1.months returns a MonthSpan, and 31.days returns a Span. In my new idea, it would work like this:

72.hours  #=> 3.00:00:00 Time::Span
3.days    #=> 3.00:00:00 Time::Interval

Technically, the same time period, but different methods return different types.

@asterite
Copy link
Member

Sure, that could work.

The only inconvenience is that right now you can do:

# You can add spans
span = 1.hour + 5.seconds

# However, you can't add a span with a month span
span = 1.month + 3.hours # => doesn't compile

With just a single type holding all this information, it could work. And then if you have anything in the "days" or "months" fields, you can treat "hours" as fixed hours, so "1.day + 3.hours" after 9am would always give the next day at 12am, regardless of DST. Or maybe these fields shouldn't be combined to avoid this new confusion.

@HCLarsen
Copy link
Contributor Author

You could still do that if you overloaded the + method.

struct Time::Interval
  def +(span : Time::Span) : Time::Interval
    @seconds + span.total_seconds.to_i
  end
end

@straight-shoota
Copy link
Member

Please let's not worry about the utility constructors right now. This is one step ahead. They're certainly nice and I'd like to find a way to keep them with improved semantics.

@asterite Yes, having two different types with similar names can be confusing. But maybe we can find names that clearly express purpose and avoid that.

Having only one type is very much confusing, because then there's no type safety. If you have two instances, there is no way to tell by the type if they're meant to represent elapsed time or calendrical units. I'd expect this could be a source of even more confusion when applying mathematical operations.
In many applications, such as benchmarking, time measurements, timeouts etc. you really just need to handle elapsed time. If the same type could also have days, month and year fields, all these use cases would have to make sure to properly handle these as well, or raise an error (but that could only happen at runtime).

I haven't found a single instance of a multi-purpose time span type in any time library. I have seen they either have two separate types or only support elapsed-time.

@asterite
Copy link
Member

asterite commented Aug 13, 2018

What about activesupport? According to OP it seems io be doing what he wants, and you only create them with the utility methods.

@HCLarsen
Copy link
Contributor Author

The example mentioned by @HCLarsen about adding 24 hours vs. 1 day near a DST change actually does not work as an argument for separate types. Neither of them defines a fixed amount of time. Instead both, hours and days are calender units and thus would be expressed using a calendar-based type.

Not true. An hour is not a calendar unit. It's a specific passage of time, 3600 seconds to be exact. For example, 1 hour after 1:00AM on March 10, 2019 1:00AM 3:00AM on the same day. Alternatively, 1 day after 1:00AM is always 1:00AM.

While any calendar based app is going to use calendar units, any sort of scientific calculation is going to need to look at the specific passage of time. The actual rotation of the Earth isn't 1 day, it's 86,400 +/-20 seconds. That's the reason for two separate classes, the two very separate use cases.

@anykeyh
Copy link

anykeyh commented Feb 20, 2019

For a consensual approach.

I must confess I overthought the thing the whole night but eventually I thought about a consensual idea.

Assuming below Duration/Span is an absolute time in seconds, while Period is a calendar time in seconds, days and months where the absolute length (in seconds) depends of the context.

I do understand than period must be explicitly converted to span, while the other way is not true. A duration can be a period or composed of a period without any problem, period. (pun intended)

So basically, the arguments are:

  • Period and Span behave differently, they must be treated differently
  • Having them in the same structure is dev-friendly, easy to store and interface with database type "interval" and "duration".
  • If calling 1.seconds return a Span, what if I wanted my second as Period instead?

Below this prosal should answer to all the good points above; tell me if it sounds good:

Structures

2 structures exists:

struct Time::Span
  @nanoseconds: Int64
  @seconds: Int64
end

struct Time::Period
  @nanoseconds : Int64 #Or microseconds? Do we really need nano on period? 
  @days : Int32
  @months : Int32
end

Now, the helpers on Int should behave like this:

  x = 1.hour # => return Time::Span(seconds: 3600)
  y = 1.month # => return Time::Period(month: 1)
  z = x + y # => return Time::Period(month: 1, nanoseconds: 3_600_000_000_000_u64)

Both can be added or subtracted from a date. While both can be mixed, you cannot revert from
Time::Period to Time::Span unless using a referential point in time (see below).
However, you can easily cast from Span to Period:

x = 1.hour.to_period #For storage purpose
# or
x = Time::Period.new(hour: 1)
# or even (tricky one)
x = 0.day + 1.hour

Period have to_period member returning self, which allow Union type to be merged easily:

class MyClass
  @x : Time::Period

  def period=(x)
    @x = x.to_period
  end
end

MyClass.new.period = 1.hour #< Note here we cast the Time::Span
MyClass.new.period = 1.day #< Note here we return self
  • day, week, month and year returns a Period.
  • hours, minutes, seconds returns a Span.

To get a duration from a period, it's also possible:

  (1.month+1.day).to_span(Time.now) # => Equal to Time::Span.new(seconds: ((1.month+1.day).from_now - Time.now))

A referential must be however provided. Idea: This referential can be Time.now by default?

Time::Period#to_s

  • The string output is made of 3 blocks:
  1. years and months, using the month counter
  2. days (and week?), using the day counter
  3. hours, using the seconds counters

Example of output:

13.month + 1.day + 171.hours + 82.minutes # => "1 yr 1mo 1d 172:22:00"
1.day - 1.hour # => "1d -1:00:00"
8.day + 24.hour #=> "1w 1d 24:00:00"

Time::Span#to_s is more simple to compute, as it's an absolute time counter.

IMPORTANT note

It still face the issue of the order of transformation:

 Time.now + (1.day + 1.month)
 # and
 Time.now + 1.day + 1.month
 # MAY NOT RETURN THE SAME TIME!!!!

This is because a Period is applying always month, followed by day,
while applying day then month can lead to different result on month leap.

However this issue exists in all case unless we assume than period is
a list of transformations played in a specific order. Which cause others problem
(how to store it properly, how to export it in JSON or to a DB,
performance issue, memory consumption etc...)

WDYT? Sounds not too complex nor memory/computation heavy and a good compromise?

TLDR:

  • Mixing Calendar Period with Time Span always return Calendar Period.
  • Time Span can be transformed to Calendar Period, while in the other way we need a Time referencial.
  • API are made to make it easy to write whatever the use case using the Int helpers.
    That's mean day,month,year helpers are always treated as Period (calendar units), while hours,minutes and seconds are treated as Time::Span. Scientific calculus must then use 24.hours helper to express one day.

@straight-shoota
Copy link
Member

@asterite

But in Java the days are present in both Period and Duration. In one, a day is a calendric unit. In the other a day is exactly 24 hours.

The Duration class does not represent days, only seconds (and nanoeconds). It provides a few methods for working with days, but they're immediately converted to (nano)seconds. You can't tell a difference whether an instance was created as 1 day or 24 hours or 86400 seconds.

So if you say 1.day, which one is it?

A day is a calendar unit, thus 1.day should obviously be a calendrical type. It can be converted to a time-based type with a conversion method (for example #to_span).

Maybe we can remove the + method and have these methods: plus_duration, plus_period, minus_duration, minus_period?

I don't think that would be a great solution. It just makes the API more complicated. And these are not the only methods. All operations with two duration values essentially would need such a distinction as well.
A better solution is to simply use different types in the first place, giving the benefit of type safety.

Both possibilities seem to have pros and cons, unless there's something I'm not seeing in the "mixed in a single type" proposal.

I'm pretty sure most of the examples we've been discussing here are mostly about calendrical type. The argument for separate types also bases on the use cases for a time-based type. That is, where you only need actual time spans and not have to deal with ambiguous calendrical units. Examples for this are all applications related to Time.monotonic/Time.measure, like measuring elapsed time or calculating timed events.

@anykeyh

Your proposal for a Period class is not sufficient to fully represent the units of the ISO calendar. It goes a part of the way, but it's better to either go fully or don't even try. For example, the value of 1 hour should be represented in a unit of hours, not seconds. 3600 seconds is not equivalent to 1 hour.

If we agree on having two types, one for calendrical and one for fixed-time durations, there are IMHO two alternatives for the calendrical one:
The small option is like Java 8's Period class which only models the date components years, months, days (where years can internally be represented as months). The big option is like the Period class in Joda Time (which the Java 8 Datetime API is based on) supporting years, months, weeks, days, hours, minutes, seconds, nanoseconds.

However, I'd like to not get to much into the details of this, unless we're agreed on the general idea of having two different duration types.

@asterite
Copy link
Member

Yeah, I'm now more inclined to having two types.

@anykeyh
Copy link

anykeyh commented Feb 20, 2019

And about my proposal, what's about the seamless integration of both type with minute and hour etc... with both types?

@straight-shoota
Copy link
Member

@anykeyh What do you mean with "seamless integration"?

@HCLarsen
Copy link
Contributor Author

HCLarsen commented Feb 20, 2019

3600 seconds is not equivalent to 1 hour.

@straight-shoota what exactly are you basing that statement on?

The small option is like Java 8's Period class which only models the date components years, months, days (where years can internally be represented as months).

That's essentially what I've created with the DateSpan class. Internally, it possesses @days and @months, but it takes years in it's constructor, multiplying them by 12 and adding them to @months.

@j8r
Copy link
Contributor

j8r commented Feb 20, 2019

Definition of second:

The duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom

Definition of minute

[...] equal to ​1⁄60 (the first sexagesimal fraction[1]) of an hour, or 60 seconds. In the UTC time standard, a minute on rare occasions has 61 seconds, a consequence of leap seconds [...] minute is accepted for use with SI units

Definition of hour

[...] reckoned as ​1⁄24 of a day and scientifically reckoned as 3,599–3,601 seconds, depending on conditions.

Definition of day

A day, symbol d, defined as 86 400 seconds

Definition of week

A week is a time unit equal to seven days.

TL;DR: we can easily support second (SI) and its derived units: minute, day and week. There values are constant. However, all hour, month and year units varies.

@jgaskins
Copy link
Contributor

@j8r A day isn't 86400 seconds when the boundaries for that day straddle a DST boundary. Leap seconds are less problematic because they're rare and cause calculations to be off by only a second, but 1.day being off by an entire hour for all calculations made for 2 full days each year causes real problems. That's the issue at hand. 😄

@j8r
Copy link
Contributor

j8r commented Feb 20, 2019

@jgaskins that depends of if we are talking of a civil day, which can have plus or minus an hour or a SI day (d), which is always 86 400 seconds.
That's important. Do we stick with SI, or change the behavior depending of the local time?

@HCLarsen
Copy link
Contributor Author

@j8r as the author of this issue, I can tell you that the need for software to match the behaviour of local time is the entire reason for this discussion.

@anykeyh
Copy link

anykeyh commented Feb 21, 2019

Before we move foward, just few false assumptions I want to discuss:

For example, the value of 1 hour should be represented in a unit of hours, not seconds. 3600 seconds is not equivalent to 1 hour.

False assumption. An hour is equivalent to 3600 seconds, even during DST leap.
Adding one hour at DST shifting moment to winter time will return the same hour but will change the timezone.

Basically, you may face 2am (IST) + 1.hour = 2am (GMT). And that's normal behavior.

Because DST is treated as timezone shift, not hour shift 😄. Algorithm is:

  * Add 1 to the hour counter
  * Check whether the modification trigger DST rules
  * If yes, change the Timezone accordingly

It's normal behavior, because the business case of adding or substracting 1 hour is really 1 hour. If I have a system which ping every hour a specific service, I assume it to works even on DST leap, and ping at 2am then 2am then 3am. I expect my system to work the same way if I wrote "add 1 hour", "add 60 minutes" or "add 3600 seconds".

However if I write a system which add a day to a time, I expect it to return the next day, same hour.
Basically in case of DST leap, it will return a duration of 23/25 hours.

To summarize:

  • There's no case where time + 1.hour != time + 3600.seconds.
  • There is case where time + 1.day != time + 24.hours.
  • There's no case where time + 1.week != time + 7.days
  • There is case where time + 30.day != time + 1.month.
  • There's no case where time + 12.months != time + 1.year

The only counters which matters are months, day and [nano]seconds.

So the ISO norm might declare interval of week, day, hours etc... But at it ends, the only interest I see in this system is: Having counter for each and any ISO field keep the "intention" of the user.
ISO is made to normalize systems, and to be very clear with what data you deal. But it's not mean to be implemented as-is, as it may interfers with perfomance and memory.

And the trade is real, should I really do a benchmark to show it to you? The ironical part is that if we use ISO counters storage, we will ends up treating week as 7 days, minutes as 60 seconds and so on for performance reason during addition and substraction.

My proposal above covers ALL cases with the best ratio space/performance and that's why it's what used to store in database software.

I'm not digging into the nanoseconds subject, I may have a lot to say on how much innaccurate and useless it is in a general purpose and standard library time computation system, but this one is not so much of an issue in my opinion.

Leap seconds are less problematic because they're rare and cause calculations to be off by only a second

You must understand that leap seconds should just disappear from the time counter. So basically they are not computed at all in both duration and calendar system.
At the perspective of a computer, they should not exists. That's why UT1 has been created.
So basically they should not even a subject of discussion in this thread.
With or without leap seconds, a normal (non-DST) day will ALWAYS compute to 86 400 seconds, even if the "physical" day will be 86 401 seconds.

It matters only in computing time range when range may be negative and for sorting things by date.

Also, leap seconds happens everyday in computer, when it resynchronize by NTP. Some implementation smear leap seconds, making adjustment to just accelerate (or slowdown) temporary the clock, allowing it to never fallback to a previous state and making it ALMOST IMPOSSIBLE to see at computer-level of awareness.

EDIT: Time class use unix_timestamp, so leap seconds are not covered at all, and thanks god. Using smearing or adjustment must be done on system-wise, not in the Crystal's software. Time.now.to_s should never return 2013/12/31 23:59:60; por favor, this will lead to a lot of more bugs while interfacing with systems than just ignoring the leap second. System relaying on absolute time should not use Time.now but use an additional library.

Also, monotonic clock is made for treating this case. Anyway it's not our subject of discussion.

If someone need to compute a precise interval of time, they need to use something else than the time clock which can leap, end of story.

@anykeyh What do you mean with "seamless integration"?

I mean that a with the false assumption above covered, a Time::Span IS a subset of a Time::Period. And thus, allowing to write this is full of sense:

  period = 1.month + 1.hour

  struct AStoragePlace
     @period : Time::Period 
  end

It has a business sens, it has 100% sens. I really don't want to store two objects while having only a Time object in Crystal.
For me it's really strange and unconsistent: the team agree that Date object should not exists and must relay on Time (which I'm 100% with), but now we need to treat addition and substraction of time with two differents structures, one covering the calendar and one covering the seconds.

Edit: I rewrote a bit, I think I was a bit toxic on the first version, sorry for that ;) 😄

@j8r
Copy link
Contributor

j8r commented Feb 21, 2019

It looks correct to have 2 types like @straight-shoota said:

  • one for Time operations, like scientific, using derived units based on the SI Atomic second. This unit values are constant.
  • an other for Gregorian Calendar operations, which deal with DST, timezones, eventually leap seconds. This unit values varies depending on conditions.

@straight-shoota
Copy link
Member

@anykeyh Time zone shifts are not always +/- 1 hour, thus there are more cases where the typical relations between calendrical units don't apply. Please take a look at #6522 (comment) which provides practical examples for every case you claimed to be impossible (except for the relation 1 year = 12 months, which is uncontested).

You must understand that leap seconds should just disappear from the time counter. So basically they are not computed at all in both duration and calendar system.

This is correct. In fact, Crystal's time library is already based on a calendar which ignores leap seconds and assumes leap smear instead.

I really don't want to store two objects while having only a Time object in Crystal.

Yes, we only have a single Time type, but actually support two different concepts of time. Monotonic time is simply expressed as a Time::Stamp. This makes perfect sense since the monotonic clock returns an amount of elapsed time. That's also why we need a separate type for representing elapsed time vs. conceptual calendar units. And these types can not be subsets of each other, because they represent different concepts. You can convert between them, but this needs to be done explicitly. But that should not be used very often because as I already stated, their applications are pretty distinct.

@HCLarsen
Copy link
Contributor Author

That's also why we need a separate type for representing elapsed time vs. conceptual calendar units. And these types can not be subsets of each other, because they represent different concepts. You can convert between them, but this needs to be done explicitly. But that should not be used very often because as I already stated, their applications are pretty distinct.

Precisely. I can't foresee it being a common situation where one would need a union of the two types, or convert from one to the other, simply because the two types represent two different purposes.

@anykeyh
Copy link

anykeyh commented Feb 21, 2019

Ok, I'm giving up my idea. “If you're alone, that's probably because you're wrong”, say the proverb. I guess I am on this one...
So you want the big real thing. As the proverb say (Yeah I'm proverbial tonight), go big or go home.

Okay then, let's go big. Let's make it real, let's make me seriously feel that I was wrong on the whole line 😐. Now, forgetting our disagreement, I might help you if you want on this crazy idea, because this shit the work is real.

  • At first we should prepare implementation for country-based decision about TZ shift. Since the Samoa Island decide to go -11 to +13, and the Venezuela shifted twice 30mn from their calendar, and probably other decisions like that has been made, we need to prepare the std lib to handle them. For the sake of the arguments which decided to go this way.
  • At second because since 3-characters timezone are now deprecated in favor of region/city, it sounds logicial than a region which has changed their mind the last 15 years should reflect it in the calendar. I don't think we should handle very distant events, yet I think it would be great if we can add them / customize, to handle also the new events which will probably occurs without the need to a new crystal version. I guess the first step should be to decide how to maintain a table of the decisions made, easy enough to populate every time a country decide something (sometime only few weeks prior to the application).
    This table should be also compute-wise fast enough while dealing with calendar shifting, even if it become bloated (humans are good to mess around).

On another subject, let's think how we are going to deal with second leaps and scientific calculus precision, and how you are going to implement TAI timezone. I understand than Time::Span should be an absolute of time spent, and therefore it must deal with leap-seconds, no? One of the problem would lay in the fact than Time is an interface from unix_timestamp, and so the precision relay on the clock under Crystal. To be honest, even if second leaps, according to what I read above I feel like we really should go the precise way.

Also, any plan to deal with others calendar systems? I'm thinking about التقويم الهجري‎ (hijri, sort-of hard) or ปฏิทินสุริยคติ (super easy to deal with, unless we want to handle BEFORE 1940). Both of them are used on daily-basis based on my experience, in addition to Gregorian calendar. Java stdlib implements different calendars, now we are going to add ISO-compliant period type, I think it's useful to deal with different calendars too. In some way, the hijri is used more or less by 1 billion people. If i understand then, adding 1.month to a time should add one islamic month if the calendar is hijri-based; actually that's sounds good 😄.

I think the initial PR should not contains all the things above. It would be a sabotage by overcharge of work, and would let you think I'm not sincere. However it should prepare the land for it; honestly, after all this discussion and the precision given to all the details, there's no middle-ground anymore. Or we decide to go full-support, or we don't. I'm genuinely interested by this subject and willing to help 😄

Cheers,

@straight-shoota
Copy link
Member

straight-shoota commented Feb 21, 2019

@anykeyh You're mostly talking about problems that have already been solved. Time::Location supports zone offset with second precision and time zone data is provided by the operating system, or a custom copy of the time zone database. So by default, no time zone data is baked into the binary (this is obviously possible, though) and can be easily updated externally. Leap seconds are ignored and assume leap smear to be applied by the originating clock.

Currently, there are no plans for supporting anything else but the ISO calendar, and I don't think there will be anytime soon. Making the Time API calendar-agnostic would add immense complexity to all parts of the library. The ISO calendar is the most commonly used calendar and an international standard. That's sufficient for the standard library. Additional calendar implementations can be provided by a shard.

@straight-shoota
Copy link
Member

However, we're drifting away from the topic of this issue which has become way too long anyway.

It seems we're pretty much in agreement on the general idea of having two separate types for time durations. So the next step would be diving deeper into the design questions of a calendrical duration type. I've been collecting some insights and will be ready to submit a propsal in the next days.

@HCLarsen
Copy link
Contributor Author

If I may make a request @straight-shoota, please include in your proposal an explanation of your statement that

3600 seconds is not equivalent to 1 hour.

I'm honestly, extremely confused by this claim, as I know of no situation on this Earth where that is true (with the exception of leap seconds, which we all agree aren't being considered).

@straight-shoota
Copy link
Member

straight-shoota commented Feb 22, 2019

@HCLarsen We're talking about the representation in a calendrical duration type. And there it can make a difference whether the time span of an "hour" is supposed to be treated as a SI unit (then it's simply defined as the duration of 3600 seconds) or as a calendrical unit representing the difference in the hour component of local time (in which case it usually but not always equals to the duration of 3600 seconds).

When the time zone offset changes by an amount other than a full hour, the wall clock will eventually show a difference of one hour but parts of this hour have either been skipped of repeated, which means that the "hour" actually lasted fewer or more than 3600 seconds.

That's exactly the same thing as a day not always lasting 24 hours, just one level below. And not very common. However, I've already cited a real example of a time zone change of 30 minutes ( #6522 (comment)).

So in practice, this distinction is rarely relevant. But sometimes it is. And it's generally important to preserve the intention and be able to tell apart whether a duration instance is meant to represent 3600 seconds or 1 hour.

@HCLarsen
Copy link
Contributor Author

@straight-shoota if I understand correctly, you mean referring to an hour as the time passed between any two full hours on the clock, say between 1:00 and 2:00. I have to disagree that there's any practicality of doing so, as no system, nor person I know, does things that way.

Take a look at RFC5545, the iCalendar specification, which lists hours as an "accurate duration" along with seconds and minutes, as opposed to a "nominal duration" like weeks or days. When talking about time zone changes, the RFC gives a specific example:

March 11, 2007 at 3:30 A.M. EDT (UTC-04:00), one hour after 1:30 A.M. EST (UTC-05:00).

We also have to consider that in some cases, it's impossible to measure a duration in this manner. Take my upcoming DST change that comes in 3 weeks. The clocks will simply go from 1:59:59 to 3:00:00. Now, if you consider what's 1 hour after 1:00, and you look at as 2:00, well, that's impossible, since there is no 2:00 on that particular day.

Consider, if you will, how people talk about the DST changes. In the spring, they refer to it as "losing an hour," and the fall change is referred to as "gaining an hour."

I do hope you'll consider these things when writing your specification.

@straight-shoota
Copy link
Member

Yes, the practical impact is almost negligent. I could still construct an example where it matters, but actually, that's besides the point. This is a technical oddity that's somewhat worth considering, but the argument for having separate fields for every common calendrical unit is even more practically relevant.

Technically, the time span of 1 hour is equivalent to 60 minutes and 3600 seconds thus these values are theoretically interchangeable. But it still makes a semantic difference whether a duration is defined in hours, minutes or seconds. For example, the length of movies is usually measured in minutes, even if the running time is several hours.

If all time units are only stored in seconds, there is no distinction between different input values which equal the same amount of seconds. That effectively changes the semantics of the value. Consider a user entering a value of 120 minutes. I would expect that value to be represented in the same way as it was entered. But when there is no way to tell a difference, it's going to be implicitly converted to 2 hours (or worse: 7200 seconds).

Such a conversion might be intended in some cases, but often it is not. That's why it should only be explicitly applied. It must not be implicitly enforced because the data type can't represent the full data.

@HCLarsen
Copy link
Contributor Author

An example where it matters would actually be very important to this conversation. As it is, both of the examples you provided here are better solved without having separate fields for each unit.

In the case of the running time of movies, it makes more sense for someone just to store/measure that as an integer (or float), and use Time#shift by the number of minutes.

In the case that you want the struct to return the same value that you entered, well, new getters based on each calendrical unit would solve that. Something like Span#total_minutes that returns @seconds / 60, or 60.0 if you want a float. If you're writing software that's going to be looking at one specific unit, it's easy to just use the appropriate getter.

On the other hand, if we implement storage of all the different units; seconds, minutes, hours, etc..., then it becomes a big mess if you want to go the other way. 2.hours + 60.minutes suddenly becomes 2 hours and 60 minutes. Not something that would ever be practical. Whereas 2.hours + 60.minutes = 3.hours makes far more sense.

I don't doubt that the examples you're talking about could happen. However, I think that they're so extremely few and far between that it doesn't make sense to implement it that way in the core library, when the other implementation will be useful far more often. That's why I'm saying that we should stick to the original plan, where Time::Span represents seconds, minutes and hours as the passage of time, and Time::DateSpan represents days, months and years as calendrical units.

@HCLarsen
Copy link
Contributor Author

HCLarsen commented Mar 7, 2019

On my local branch, I've created a Time::DateSpan struct with Time#+(span : DateSpan) and Time#-(span : DateSpan) methods on the time class itself. Currently DateSpan only deals with days, months and years, and internally stores @days and @months, with years just being 12 months.

I'm not going to do a PR, because it seems like there isn't a consensus on whether or not to add smaller units of time to the DateSpan, and how they may be represented internally. Let me know once you have that proposal, and what everyone in the core team agrees should be done.

@HCLarsen
Copy link
Contributor Author

I just realized that this old issue is still open. Since it was fixed with the addition of the Time#shift method, I'm going to close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:time
Projects
None yet
Development

No branches or pull requests

10 participants