Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving the [Date|Time|DateTime]Delta API #110

Open
ariebovenberg opened this issue Apr 19, 2024 · 6 comments
Open

Improving the [Date|Time|DateTime]Delta API #110

ariebovenberg opened this issue Apr 19, 2024 · 6 comments
Labels
discussion Discussion is needed before proceeding

Comments

@ariebovenberg
Copy link
Owner

There are two design questions for DateDelta:

  • Currently, DateDelta stores years, months, weeks, and days separately. However, in the Gregorian calandar, years are always 12 months, and weeks are always 7 days. It's save on storage to just store months and days. The downside is that users may want to deliberately store unnormalized deltas, such that "24 months" isn't the same as "2 years".
  • On a related note, the DateDelta components may have different signs (P4Y-3M+1W0D). While this is certainly flexible, it's unclear whether this functionality is actually used. Having one single sign would simplify things—although it may result in exceptions when doing arithmetic (whether years(a) + months(b) - years(c) would raise an exception depending on whether the result would have mixed signs).
@ariebovenberg ariebovenberg added the discussion Discussion is needed before proceeding label Apr 19, 2024
@ariebovenberg ariebovenberg changed the title Should DateDelta be normalized, and have one sign? Should DateDelta be normalized, and disallow mixed signs? Apr 19, 2024
@JaykeMeijer
Copy link
Collaborator

While I have not encountered the usecases where a delta needs to be the component distinguishing whether it's months or years (this seems to me like something that is defined by the context, not the data object), I am not sure that optimizing for memory at this level and this point is really necessary.

That said (devil's advocating here a little bit), it looks like Python's timedelta class does normalize. Is this usecase useful enough to deviate from what is "accepted"?:

>>> timedelta(hours=24)
datetime.timedelta(days=1)
>>> timedelta(hours=25)
datetime.timedelta(days=1, seconds=3600)

@ariebovenberg ariebovenberg changed the title Should DateDelta be normalized, and disallow mixed signs? Improving the [Date/Time]Delta API Apr 19, 2024
@ariebovenberg ariebovenberg changed the title Improving the [Date/Time]Delta API Improving the [Date|Time|DateTime]Delta API Apr 30, 2024
@bibz
Copy link

bibz commented Jun 3, 2024

I would like to chime in: I also think that normalised deltas are expected/assumed coming from the stdlib.

Having different behaviours for e.g. TimeDelta (normalised) and DateDelta (denormalised) is surprising, at least. (And also means DateTimeDelta has a bit of an identity crisis serving both styles of API.)

Maybe maintaining the denormalised form is out-of-scope for this component? (And could be part of a different module, if necessary?)
My uninformed point-of-view is that this library is an improvement over the behaviour of the stdlib when it comes to date and time related structures. Maintaining and exposing a more "advanced" delta API seems like a different thing altogether. That being said, my main gripe here is the departure from the normalised API in the stdlib.

Rough suggestions to move forward:

  1. Everything consistently normalised (like datetime.timedelta), the smaller delta components roll over to the big ones.
  2. Everything consistently denormalised (like in the canonical format), breaking understanding from stdlib.
  3. Everything consistently denormalised through the current API (again, à la datetime.timedelta), but normalised internally and exposed with a new property/method (e.g. DateDelta.canonical_format() will not be normalised unless called after explicit normalisation DateDelta.normalize().canonical_format()).

@bibz
Copy link

bibz commented Jun 3, 2024

(If that helps, I'd be happy to try my hand with a few test cases to cover an option I suggested.)

@ariebovenberg
Copy link
Owner Author

@bibz thanks for taking the time to chime in. Agree with many of your points. I'm AFK at the moment so will go more in depth in a week or so

@ariebovenberg
Copy link
Owner Author

ariebovenberg commented Jun 13, 2024

@bibz Agree that it's good to default to 'expected behavior' (i.e. normalization) for users coming from the standard library. However: this only makes sense so far it's possible. The issue here is that while hours/minutes/seconds/milliseconds/etc can be normalized, calendar days and months cannot.

To summarize: you always end up with at least 3 "buckets" that can be normalized internally, but not between them:

  • bucket 1: years and months. These can be normalized using the rule: 1 year = 12 months which is always true (in the proleptic gregorian calendar at least)
  • bucket 2: weeks and days. These can be normalized using the rule: 1 week = 7 days which is always true (in the proleptic gregorian calendar at least). Note that days cannot be normalized into hours, since day length may be influenced by DST (making some days 23 or 25 hours long)
  • bucket 3: hours, minutes, seconds, milliseconds, microseconds, nanoseconds. These can be normalized according to standardized conversions.

I'm leaning towards minimizing the 'number of buckets'. The 0.5 release DateTimeDelta currently has 5 buckets: years, months, weeks, days, and hours/minutes/seconds/etc.

Note that pendulum attempts to put everything into one bucket (using general rules like month=30 days), but that results in strange inconsistencies (see their issue tracker for numerous examples)

@bibz
Copy link

bibz commented Jun 14, 2024

The issue here is that while hours/minutes/seconds/milliseconds/etc can be normalized, calendar days and months cannot.

To summarize: you always end up with at least 3 "buckets" that can be normalized internally, but not between them

Yes, this makes complete sense. Thank you for the detailed explanation.
I agree, minimising the number of buckets while maintaining (strict) equality sounds great.

Note that pendulum attempts to put everything into one bucket (using general rules like month=30 days), but that results in strange inconsistencies (see their issue tracker for numerous examples)

That sounds like a very bad decision for anything in production 😱

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion is needed before proceeding
Projects
None yet
Development

No branches or pull requests

3 participants