Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ease conversion between Units of Measure (UoM) and undecorated numerals and simplify casting #892

Open
5 tasks done
abelbraaksma opened this issue Jul 2, 2020 · 17 comments

Comments

@abelbraaksma
Copy link
Member

abelbraaksma commented Jul 2, 2020

Ease of use for Units of Measure (UoM)

After being asked 3-4x in just as many weeks on Slack by different newcomers and experienced programmers alike, I propose we introduce a set of intrinsics related to UoM to make it less painful to convert to undecorated numerals and back, and to simplify conversion between numeric types.

I suggest to do that by adding a few sensibly-named functions to the LanguagePrimitives.

Remove UoM

Pair each LanguagePrimitives.XXXWithMeasure with a LanguagePrimitives.XXXWithoutMeasure.

Remove UoM generically

Add LanguagePrimitives.RemoveMeasure: 'T<UoM> -> 'T.

Add UoM generically

It's already possible to add UoM for specific types, but not generically for any type. We could add LanguagePrimitives.WithMeasure: 'T -> 'T<UoM>, where 'T is one of the signed integral types.

Support conversion

If at all possible, I suggest to make all cast operators sensitive to the target, to make this seamless. That means, allowing:

let x: float<cm> = float 45<cm>

See also SO question: https://stackoverflow.com/questions/1894250/f-unit-of-measure-casting-without-losing-the-measure-type

Allow no-op cast on collections for UoM

It's currently impossible to remove Array<int<cm>> from its cm UoM, unless you loop over it. Same is true for any other collection.

EDIT: the retype trick below can actually be adopted for this, thanks to @BillHally (comment).

let myArray = [|3<cm>; 42<cm>|]
let x: int array = myArray |> Array.map LanguagePrimitives.WithoutMeasure  // this should be optimized away

Existing way of doing this

The existing way of approaching this problem in F# by multiplying by 1<_> or 1.0<_>, but this is type-specific. For generics, you can only do so by using unsafe compiler-specific code, like this (courtesy @gusty):

// use compiler specific code
let inline retype (x: 'T) : 'U = (# "" x: 'U #) 

// remove UoM decoration
let inline removeUoM (x: '``T<'M>``) =
    let _ = x * (LanguagePrimitives.GenericOne : 'T)
    retype x :'T

Conversion from float<cm> to int<cm> etc is currently possible by reinterpreting the cast, but hard to discover even for experienced users:

x |> float |> LanguagePrimitives.FloatWithMeasure

Pros and Cons

The advantages of making this adjustment to F# are numerous, just to name a few:

  • Better adoption of F# and simpler use, and therefor more use, of UoM
  • Solving a hard, and often-asked real-life problem, that really should be trivial to answer
  • Prevent users to ubiquitously use box/unbox tricks to reach the same
  • Allow coding above scenarios without resorting to deprecated code
  • Improve overall code generation by erasing the types when requested
  • Prevent looping over entire collections to just remove UoM

The disadvantages of making this adjustment to F# are:

  • It's a surface area change
  • Leaving this stuff inside LanguagePrimitives may still be hard to discover, but at least it gives a "single, good way" of doing things

Extra information

Estimated cost (XS, S, M, L, XL, XXL): S

I think this isn't very hard to do, except for some optimizations perhaps. The compiler mostly already has ways of coding this internally. Ultimately, we should try to keep the balance of type-safety with UoM, but really shouldn't make it too hard to work with them.

Related library is FSharp.UMX: https://github.com/fsprojects/FSharp.UMX

Related language suggestions:

Related suggestions (and SO questions):

Just a bunch of the many roaming questions related to UoM, illustrating the trouble people perceive with it:

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this: I've looked at the code a bit and would like to tackle this, if an RFC were to be accepted :).
@matthewcrews
Copy link

matthewcrews commented Jul 2, 2020

I would love this. Would greatly simplify some of the work that I am doing. UoM are great but there are some painful edge cases that need to be sanded down and smoothed over.

@abelbraaksma
Copy link
Member Author

@dsyme, there were several discussions on slack recently, which led to this suggestion, and with the awesome work done by @pblasucci on widening the UoM types to all primitives, I'd like to add this as well to the (relatively painless) improvements for UoM's.

Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC for this.

@pblasucci
Copy link

So, recapping, there are 5 different suggestions here, each of which improves quality of life for working with units of measure:

  1. Remove UoM
  2. Remove UoM generically
  3. Add UoM generically
  4. Support conversion
  5. Allow no-op cast on collections for UoM

Of the five, 1 is trivial. 2, 3 seem like they'd be straight-forward -- but I've not done much analysis. 4, 5 are more concerning -- but only in the case of ensuring well-optimized code.

As an aside, for my own work, item 4 (conversions) is the one I smack up against most frequently.

@abelbraaksma @dsyme Might it be worth taking decisions on 1, 2+3, 4+5 separately (as I could see the merits of carving up the RFCs, and the associated work, into those three somewhat more digestible chunks)?

@abelbraaksma
Copy link
Member Author

abelbraaksma commented Sep 1, 2020

I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go. I think only (5) is slightly problematic from an implementation standpoint.

The easiest ones are those that only need library functions.

@pblasucci
Copy link

This is maybe not the best place to ask, but... I have a process question:

Would we "feature gate" these behind multiple different "language features"? Would we reuse one of the existing gates? Does it really matter?

@abelbraaksma
Copy link
Member Author

abelbraaksma commented Sep 1, 2020

A bunch of these will end up in FSharp.Core, which complicates "feature gates" due to it being a compiled lib, and afaik, apart from complex features like string interpolation, a new function is just a new (isolated) function in FSharp.Core, meaning that no "gate" is necessary to enable it: people can opt-in by calling that function.

For the part that requires changes in the compiler, I'd assume a single feature gate for like uom-improvements or something would be more than sufficient, but ultimately that's up to @cartermp and @dsyme, to whom I'd like to kindly ask:

Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC(s) for this.

@pblasucci
Copy link

@abelbraaksma wrote:

I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go.

I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved 😉

@BillHally
Copy link

@abelbraaksma wrote:

  1. Allow no-op cast on collections for UoM

and:

I think only (5) is slightly problematic from an implementation standpoint.

I may be misunderstanding something, but the retype function you provide will also work for collections.

We've been using essentially the same code for a couple of years to convert e.g. int<a>[] to int<b>[] without any problems.

Here's some example code in sharplab.

@abelbraaksma
Copy link
Member Author

@BillHally, thanks for pointing that out. Retyping indeed works, but we may need to ensure only primitives are targeted to prevent illegal conversions. I haven't looked that deep in potential corner cases, but as you said, it should be a noop as long as the primitive types match.

@pblasucci
Copy link

I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved 😉

Ping?

@dsyme
Copy link
Collaborator

dsyme commented Jun 14, 2022

I apologise for missing this despite being prompted several times. Put it down to the pandemic or something.

I can see the issues here. When UoM were originally designed, the aim was for "unit soundness" that is quite a strict criteria. Over time this has been weakened allowing conversions, integers etc.

Concretely I agree with addressing each of the above in LanguagePrimitives but am concerned about the actual details. Specifically anything involving the conversion operators may be problematic, since these are integrated with constraint solving, but perhaps it's doable if a unitized target type is known.

(2) and (3) require 'T<'m> as a construct, or a new kind of constraint, or a special type-checking rule. Thus they look complex.

I'll belatedly mark this as approved but I'm concerned that the details won't satisfy when it comes to an RFC

@abelbraaksma
Copy link
Member Author

abelbraaksma commented Jul 21, 2022

Thanks @dsyme to approve this in principle. I'll set out to write an RFC for this, or perhaps two, so we can split the feature (i.e. to have the complexities that 'T<'m> brings in only one of them).

@pblasucci
Copy link

pblasucci commented Jul 21, 2022

For what it's worth, I was recently playing around with this for something that came up at my day job. I now fully understand what @dsyme means about items 2 and 3. I would definitely split the two of them into a separate RFC.

But stepping back a moment, and thinking about the larger "request", which I understand as "make it easier to do conversions involving units of measure", I'd like to suggest that part of adding items 1, 4, and 5 could be to surface them somewhere more discoverable than LanguagePrimitives. In fact, it probably makes sense to also alias (move?) the existing ___WithMeasure functions. Everything could be neatly packaged into a static type called Units or UnitsOfMeasure or Whatever-is-deemed-most-appropriate. But why a static type? Because something like Units.clear(number : float<'M>) : float, with ~13 overloads (one per numeric primitive) has arguably better "developer ergonomics" than 13 separate ___WithoutMeasure functions.

At least, that's what's come out of the things I've been doing at work lately.

@abelbraaksma
Copy link
Member Author

@pblasucci, I like those suggestions. Ease-of-discovery would certainly help. But I'm not sure what @dsyme, @KevinRansom et. al. would think about adding "another way of doing the same thing". Perhaps LanguagePrimitives in hind-sight was a bad location for any of this, and we should fix that, or we are just stuck with it.

@jwosty
Copy link
Contributor

jwosty commented Dec 7, 2024

This would be some really fantastic additions and would make using units of measure less verbose (especially when interoping with a UoM-less C# library).

I do want to note however that as much as I would like the conversion functions to be made unit-preserving... it is unfortunately a breaking change AFAIK. I think you would have to introduce separate functions. I have in some of my own codebases some functions which do exactly this, and I give them names like "floatm" or "intm". Perhaps something like that could be serviceable.

This may be a separate proposal, but... I also sometimes get bitten by the fact that UoM don't work with functions like round and abs, and even (somewhat surprisingly) LanguagePrimitives.GenericOne. This could definitely be fixed, in theory, without breaking backwards-compatibility, since it's actually just a straight-up compiler error right now (though I'm a little unsure of how the compiler would actually do that given that the current implementation of those functions just directly uses SRTP to call BCL methods on the type itself, which of course lack units of measure).

Let me know if the scope of such a change deserves a separate proposal and I'd be happy to file one.

@T-Gro
Copy link

T-Gro commented Dec 9, 2024

I think the proposal as it stands is good for a proposal - what matters more now are the specifics of an RFC, which should specify a strategy for accomplishing this - incl. surface area changes of Fsharp.Core vs. additions to the compiler itself.

Since you mentioned round,abs etc: there might be another compiler change hidden inside of it => an ability to mark T->T functions to optionally copy over UoM from the input to the output, but also work on inputs without any unit just fine.

@charlesroddie
Copy link

Since you mentioned round,abs etc: there might be another compiler change hidden inside of it => an ability to mark T->T functions to optionally copy over UoM from the input to the output, but also work on inputs without any unit just fine.

The operation needs to commute with multiplication by a positive constant for this to happen. This is true for abs: (abs(λx) = λabs(x) for λ positive). It's not true for round so round should not have this property. An example of the bug then created is that round(50 cm) would evaluate to 50cm and round(0.5m) would evaluate to 0m, which is not equivalent.

Looking through https://fsharp.github.io/fsharp-core-docs/reference/fsharp-core-operators.html the only things that should commute are:

  • numeric conversion preserving a unit of measure (currently destroys the unit)
  • abs (currently works @jwosty : I am getting abs(4.<m>) = 4.<m>)

Things that could interact with units but do not commute:

  • sign(x) = sign(x) (currently works)

Most things fail to compile, correctly. (round, sin, etc.).

So, in my tests, the only thing that needs improvement is this:

I would like the conversion functions to be made unit-preserving

I suggest adding functions Measure.convertToFloat32 which behaves like float32 but preserves units.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants