
Question: Default equality precision #717

Closed
dschuermans opened this issue Oct 30, 2019 · 22 comments
Assignees
Labels
pinned Issues that should not be auto-closed due to inactivity.

Comments

@dschuermans
Contributor

Hey,

I'm wondering if there's anything done concerning default equality functionality.
From the docs I found that precision is tested up to 1E-5.

However, when doing the following:

Length unit1 = Length.FromMeters(10D);
Length unit2 = Length.FromFeet(10D * 3.28084D);

Assert.AreEqual(unit1, unit2);
Assert.IsTrue(unit1.Equals(unit2));

the test fails.

However:
unit1.Equals(unit2, 1E-5, ComparisonType.Absolute)
will return true.

The value in meters for unit2 is 10.00000032

Should we always use the overloaded Equals methods or is there a way we can configure the default Equals?

@angularsen
Owner

angularsen commented Oct 30, 2019 via email

@tmilnthorp
Collaborator

tmilnthorp commented Oct 30, 2019

I would argue that, as a value type, Equals should work similarly to Double.Equals: it does not return true for anything other than absolute equality.

Looking at our Equals(object) implementation, it checks whether the object is the same quantity, converts it to the same unit, and checks for absolute equality. As a value type, we probably shouldn't even do the conversion. It should check that the value and the unit are exactly the same.
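The strict value-and-unit equality described here could be sketched roughly like this (illustrative field and type names, not the actual UnitsNet internals):

```csharp
// Sketch of strict "tuple" equality for a quantity value type: no unit
// conversion, just exact comparison of value and unit, mirroring
// Double.Equals semantics. All names here are illustrative.
public readonly struct SimpleLength : IEquatable<SimpleLength>
{
    public double Value { get; }
    public LengthUnit Unit { get; }

    public SimpleLength(double value, LengthUnit unit) => (Value, Unit) = (value, unit);

    // Equal only if both the numeric value and the unit match exactly.
    public bool Equals(SimpleLength other) =>
        Value.Equals(other.Value) && Unit == other.Unit;

    public override bool Equals(object obj) => obj is SimpleLength other && Equals(other);

    // Consistent with Equals: same (value, unit) pair => same hash.
    public override int GetHashCode() => HashCode.Combine(Value, Unit);
}

public enum LengthUnit { Meter, Foot }
```

With this design, a length constructed from 1000 meters and one constructed from 1 kilometer would compare unequal, since the units differ.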

@lipchev
Collaborator

lipchev commented Oct 30, 2019

I agree with @tmilnthorp - this is also how the GetHashCode function is currently implemented (I meant to open an issue on the subject). We could alternatively devise a consistent GetHashCode implementation, by converting (and possibly rounding to a tolerance, should that become the default equality contract), but I think it should go into the UnitsNet.Comparison class rather than be the default implementation.
It is then up to the domain class that holds the measurement units to redefine Equals/GetHashCode, should that be deemed appropriate. One example (from my experience) is a chemical solution: still a value object (no identity), but considered equal to another solution with the same component(s) and concentration(s) within a margin of error specified by the domain experts.

@angularsen
Owner

angularsen commented Oct 30, 2019

My first reaction also is that it should use double.Equals ("double-equals" from now on) and perhaps even compare both Value and Unit with no conversions, similar to Tuple<double, Enum> and anonymous type equality new { Value, Unit }.Equals() ("tuple-equals" from now on).

It feels more correct. However, I don't think this design is terribly useful, at least not in the way I am using this library.

I don't really have a strong opinion either way, but here is my point of view as a consumer of the library.

Comparing unit + value

I don't immediately see a scenario where tuple-equals is something I want when comparing quantities. Typically I want to compare quantities for their amount of something - regardless of the unit it happened to be constructed with.

Do you have any examples where you think tuple equality would be more intuitive?

Comparing floating precision values

If we had used int or long as the backing value, then fine, compare it exactly like TimeSpan and DateTime do with their long Ticks value.

Since we are using double, then I don't find floating point equality checks very useful - because they have unpredictable rounding errors. The ReSharper plugin will even warn you if you try to compare a float or double with some numeric value.

Also, from the consumer perspective; the fact that we are using double internally is really an implementation detail. We could have used decimal if we wanted, or fixed precision with long or similar. I don't think it is fair to expect the consumer to know and understand that our equality methods are prone to subtle internal rounding errors.

For example:

Length l1 = Length.FromMeters(1.123);
Length l2 = Length.FromMeters(2.456);
Length l3 = Length.FromMeters(3.579);

// False
(l1 + l2 == l3).Dump("l1 + l2 == l3");

Here the equality is false, and I know this can trip people up: minuscule rounding errors occur for this particular choice of values, while other values may work just fine. It is unpredictable.

Proposal: Relative comparison with 1e-5 max error

My argument is that it might be an improvement for the default equality to allow for some rounding errors. The Equals and == operator would then actually be useful for something.

I have no idea what comparison parameters would be a good default choice, or if this even works, but I thought maybe 1e-5 max relative error would be a good starting point?
In theory it should work for both small and large quantities, but we'd need to run some tests to conclude more.

(l1 + l2).Equals(l3, 1e-5, ComparisonType.Relative).Dump(); // True

What do you think?
Can it be a way to go or are there maybe advantages with double-equals or tuple-equals I am missing?
Will it not work for certain values or units?
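For illustration, the relative comparison proposed above boils down to something like this (a plain C# sketch, independent of the actual UnitsNet implementation):

```csharp
// Sketch: equality with a maximum allowed relative error. Both values
// are assumed to be expressed in the same unit already.
static bool RelativeEquals(double a, double b, double maxRelativeError)
{
    if (a == b) return true; // exact matches, including both zero
    double scale = Math.Max(Math.Abs(a), Math.Abs(b));
    return Math.Abs(a - b) <= maxRelativeError * scale;
}

// RelativeEquals(1.123 + 2.456, 3.579, 1e-5) returns true even though
// (1.123 + 2.456) == 3.579 is false due to floating-point rounding.
```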

@tmilnthorp
Collaborator

Remember that if Equals returns true, GetHashCode must return the same value for the two equal objects. Otherwise any hash based structure such as Dictionary would fall apart. Absolute equality is essential.

We could create a custom IEqualityComparer though, which would make the fact you're using relative equality explicit.
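Such a comparer could be sketched as below (a hypothetical type, not an existing UnitsNet API). Because tolerance-based equality is not transitive, the only hash that never violates the Equals/GetHashCode contract is a constant one, which trades hash-table performance for correctness:

```csharp
using System;
using System.Collections.Generic;

// Sketch of an explicit, opt-in tolerance-based comparer. Shown here for
// plain doubles; a quantity version would convert to a common unit first.
sealed class ToleranceDoubleComparer : IEqualityComparer<double>
{
    private readonly double _maxRelativeError;

    public ToleranceDoubleComparer(double maxRelativeError) =>
        _maxRelativeError = maxRelativeError;

    public bool Equals(double x, double y)
    {
        if (x == y) return true;
        double scale = Math.Max(Math.Abs(x), Math.Abs(y));
        return Math.Abs(x - y) <= _maxRelativeError * scale;
    }

    // No non-constant hash can guarantee "Equals => same hash" for a
    // non-transitive relation, so every value hashes to the same bucket
    // and lookups fall back to Equals.
    public int GetHashCode(double value) => 0;
}
```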

@angularsen
Owner

That is a very good point. One that is enough to put all my above arguments to rest.

I will say, however, that due to the floating-point issues I would never use an IQuantity as the key of a Dictionary or in a HashSet or similar. It just seems like a bad idea, but if we implement the tuple equality (or keep the current double equality), then at least it would be consistent in such hash structures. I can go with that.

Good point about the IEqualityComparer, that is something we could provide paired up with our Equals(other, error, comparisonType) overload methods.

@lipchev
Collaborator

lipchev commented Oct 31, 2019 via email

@tmilnthorp
Collaborator

HashSet may have been a better example - for set operations. It too supports an IEqualityComparer to be passed in.

@angularsen
Owner

@lipchev

It's just a matter of rounding off the value used for the hash
calculation with the same tolerance

If I understand correctly, you propose dividing the values into ranges/buckets that produce the same hashcode?

Consider this example:

var l1 = Length.FromMeters(1);
var l2 = Length.FromMeters(1 + 0.1);
var l3 = Length.FromMeters(1 - 0.1);

// Let's say allowed relative error is 0.1 and these are true
l1.Equals(l2); 
l1.Equals(l3);

// But this is false (relative error is ~0.2)
l2.Equals(l3);

// Let's say we compute hashcode by rounding value to nearest 0.1
l1.GetHashCode().Equals(l2.GetHashCode());

// Becomes
(1.0 hashcode).Equals(1.1 hashcode); // False

// If we choose a wider rounding value 0.2, it still fails
(1.0 hashcode).Equals(1.2 hashcode); // False

// If we choose even wider rounding value 0.5 then finally the
// hashcodes are equal and matches the Equals() behavior that allowed 0.1 error
(1.0 hashcode).Equals(1.0 hashcode); // True

Maybe I am misunderstanding, but it seems tricky to get the choice of rounding value right.

At any rate, I am more inclined to provide good IEqualityComparer implementations instead. It is more explicit, and I think we will avoid a lot of weird edge cases trying to implement a default Equals() method. We also already have Equals() overloads that take parameters to control the equality behavior, so that aspect is somewhat discoverable, except maybe when writing unit tests and using Assert.AreEqual(). I think I'm leaning towards keeping it as-is, but let me know if you see it differently! Maybe there is a good compromise lurking here.
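The boundary problem in the example above can also be seen with plain doubles: two values well within tolerance of each other can still round into different buckets (a sketch with bucket size 0.1):

```csharp
// Sketch: a fixed rounding bucket cannot reproduce tolerance-based Equals.
static double Bucket(double x, double size) => Math.Round(x / size) * size;

// 1.04 and 1.06 differ by only 0.02, yet Bucket(1.04, 0.1) gives 1.0
// while Bucket(1.06, 0.1) gives 1.1, so their hashes would differ even
// though the values are within a 0.1 tolerance of each other.
```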

@lipchev
Collaborator

lipchev commented Nov 5, 2019

I tried out a few functions in Excel and managed to get some segments (centered at zero) with ranges of 2, 1 and 0.5 respectively, but couldn't quite figure out a general formula. Note, however, that for the last one, for any input near an integer number (say a nanometer), the same hash is generated for every length within +/- 0.25 of that nanometer. So my intuition tells me there is a function that works for the absolute difference, and likely even one that works with integer arithmetic. Any ideas? A percent-rounding equivalent would also be cool. Finally, a little clarification about the Equals/GetHashCode contract:

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.

@lipchev
Collaborator

lipchev commented Nov 5, 2019

[Figure: Excel plots of the candidate hash-rounding functions]

@angularsen
Owner

Cool figures, but I am a bit in deep waters here and don't understand this well enough to provide any ideas to help move it forward 🤷‍♂ 😄

@stale

stale bot commented Jan 8, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 8, 2020
@stale stale bot closed this as completed Jan 15, 2020
@lipchev lipchev reopened this Jan 15, 2020
@stale stale bot removed the wontfix label Jan 15, 2020
@lipchev
Collaborator

lipchev commented Jan 15, 2020

Sorry, I meant to come back to this (just quite busy right now).

@angularsen angularsen added the pinned Issues that should not be auto-closed due to inactivity. label Jan 15, 2020
@angularsen
Owner

No problem, I can pin it with the pinned label.

@lipchev lipchev pinned this issue Jan 15, 2020
@lipchev lipchev self-assigned this Jan 15, 2020
@angularsen angularsen unpinned this issue Feb 21, 2020
@lipchev
Collaborator

lipchev commented Mar 16, 2020

Ok, I think I just came across the general formula I was looking for: see the Rounding example from the wiki page on Quantization.
I tested it using my old Excel file and at first glance it seems to work, so here is the general formula I suggest we use:
Digits = 6 // this should be part of the constructor of the IEqualityComparer
Delta = 1 / Math.Pow(10, Digits) // or we could pass Delta directly (possibly even as an IQuantity; otherwise we have to assume it's expressed in the unit of the first operand)
Q(x) = Delta * Math.Floor(x / Delta + 0.5) // this is the quantizer formula from the wiki
GetHashCode(x) = Q(x).GetHashCode() // we return the hashcode of the rounded value
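Written out in C#, the suggested hash could look like this (a direct sketch of the formulas above; still subject to the bucket-boundary issue discussed earlier in the thread):

```csharp
using System;

// Sketch of a quantized GetHashCode based on the quantizer
// Q(x) = Delta * Floor(x / Delta + 0.5) from the Wikipedia article.
sealed class QuantizedHasher
{
    private readonly double _delta;

    // digits would be a constructor parameter of the IEqualityComparer.
    public QuantizedHasher(int digits) => _delta = 1 / Math.Pow(10, digits);

    public double Quantize(double x) => _delta * Math.Floor(x / _delta + 0.5);

    // Hash of the rounded value, so nearby values share a bucket.
    public int GetHashCode(double x) => Quantize(x).GetHashCode();
}
```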

@lipchev
Collaborator

lipchev commented May 8, 2020

Aside from rounding with precision, I think there may be another way to fix this. Consider the following two tests for the IComparable interface:

    [TestMethod]
    public void AveragingComparisons_For_EqualQuantities_Returns_Zero()
    {
        var firstMass = Mass.FromGrams(0.001);
        var secondMass = firstMass.ToUnit(MassUnit.Microgram);

        Assert.AreNotEqual(0, firstMass.CompareTo(secondMass));
        Assert.AreEqual(0, secondMass.CompareTo(firstMass));

        Assert.AreEqual(0, (firstMass.CompareTo(secondMass) - secondMass.CompareTo(firstMass)) / 2);
        Assert.AreEqual(0, (secondMass.CompareTo(firstMass) - firstMass.CompareTo(secondMass)) / 2);
    }

    [TestMethod]
    public void AveragingComparisons_For_DifferentQuantities_Returns_One()
    {
        var firstMass = Mass.FromGrams(0.001);
        var secondMass = Mass.FromKilograms(0.001);

        Assert.AreEqual(-1, firstMass.CompareTo(secondMass));
        Assert.AreEqual(1, secondMass.CompareTo(firstMass));

        Assert.AreEqual(-1, (firstMass.CompareTo(secondMass) - secondMass.CompareTo(firstMass)) / 2);
        Assert.AreEqual(1, (secondMass.CompareTo(firstMass) - firstMass.CompareTo(secondMass)) / 2);
    }

Unless I'm greatly confused, expressing Equals in terms of CompareTo(..) == 0 seems logical. If averaging the comparisons works as the tests suggest, then for a correct hash it's only a matter of rounding to something larger than the maximum rounding error.

There are other issues, of course, when we get close to Min/Max values. If we could have those stable for the IComparable too, that would be great (e.g. Mass.MaxValue == Mass.MaxValue.ToUnit(Milligram)).
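Expressed in code, the suggestion amounts to deriving Equals from CompareTo so the two can never disagree (a sketch with illustrative names, not the actual UnitsNet types):

```csharp
using System;

// Sketch: a quantity whose Equals is defined via CompareTo, so that
// CompareTo(x) == 0 and Equals(x) agree by construction. The value is
// normalized to a base unit here to sidestep conversion asymmetry.
public readonly struct SimpleMass : IComparable<SimpleMass>, IEquatable<SimpleMass>
{
    public double Grams { get; }

    public SimpleMass(double grams) => Grams = grams;

    public int CompareTo(SimpleMass other) => Grams.CompareTo(other.Grams);

    // Equality defined in terms of comparison, as suggested above.
    public bool Equals(SimpleMass other) => CompareTo(other) == 0;

    public override int GetHashCode() => Grams.GetHashCode();
}
```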

@angularsen
Owner

angularsen commented May 16, 2020

I love that you keep digging at this, but honestly I have fallen off a bit here 😵

Too busy at work and family life to find time to really put my head into this problem space and I forget details in-between, so I currently don't have a good understanding of what exactly you are proposing and I can't really help identify potential problems with it.

It seems you are still in the process of brain dumping a bit and sorting out details for yourself, but if you feel you are nearing a solution here and would like some constructive feedback from me then I would really appreciate a concise summary to help me review it better 😀

My hunch is that we need to be pretty darn sure we aren't introducing subtle equality bugs by implementing this.

@mdro

mdro commented Nov 17, 2021

I'd like to revive the issue of "all properties equal" vs. "equal value", as I'm currently facing the same question in my own internal library. In my case, instead of a unit it is the precision at which the number was rounded which, similar to UnitsNet, controls ToString() and the default precision of calculations involving that number.

I agree with @tmilnthorp that Equals() must align with GetHashCode(): quantities could easily end up as (part of) the key of a HashSet or Dictionary, especially in the context of result caching or deduplication (LINQ .Distinct() etc.). Both have to require equal units, since the unit affects behaviour (ToString() and rounding errors in conversions).

Users need to understand in general that CompareTo() == 0 doesn't imply Equals() == true. The question is what we do with the == and != operators. I see several options:

  1. Map to Equals(): Bad idea, huge footgun due to the difference to other relational operators. A big breaking change on top of that.
  2. Map to CompareTo(): Less of a footgun, needing exact equality in manually written code (not premade algorithms in collections or from LINQ) should be quite rare. Crucially this would avoid a breaking change for code that uses the operators. Ofc it's still a breaking change in Equals() itself.
  3. No == or != operator, the other operators remain: Forces the user to consider what they need (equal units, exactly equal value, equal value within tolerance), but clunky. The tolerance issue could also be a reason to not provide the other relational operators either.
  4. Do not provide any relational operators on the quantity itself. Instead there is a property or method to obtain a proxy object that represents the value and potentially the comparison tolerance:
    // exact value
    var tm = Length.FromMeters(1000);
    var km = Length.FromKilometers(1);
    Assert(tm.Equals(km) == false);
    Assert(tm.CompareTo(km) == 0);
    // tm == km does not compile
    Assert(tm.Value == km);
    
    // with tolerance
    var foot = Length.FromFeet(1);
    Assert(foot.CompareWithPrecision(0.01) == Length.FromMeters(0.305));

  I think this would be the most elegant API design, but it would be a huge breaking change, requiring fixing every single statement with relational operators.
  5. Combine 4. with 2. to maintain compatibility for everything except Equals() itself, but mark all relational operators (or at least == and !=) on quantities themselves as deprecated.
  6. Combine 4. with 3. as a compromise between compatibility and safety.
  7. Like 5. but make == and != on measurements themselves throw if the units aren't equal. This would prevent people from getting CompareTo() == 0 behaviour when they meant Equals() while minimizing breaking changes. The price is that these breaking changes only manifest at runtime.

@lipchev
Collaborator

lipchev commented Nov 17, 2021

Could we ship an analyzer for equality comparison of floating-point numbers? Note that no warning is generated for comparing with 0 (zero).

@mdro

mdro commented Nov 18, 2021

Note, that no warning is generated for comparing with 0 (Zero).

Why would 0 be special here? For quantities other than temperatures, conversions of 0 would be exact, but you could still get rounding issues when subtracting two quantities > 0. You don't even need conversions for that; there are enough examples with bare doubles that should give exactly zero but don't.
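A classic bare-double example of that point, with no unit conversions involved:

```csharp
// (0.1 + 0.2) - 0.3 "should" be zero, but IEEE-754 rounding leaves a
// tiny residue, so the comparison with 0 fails.
double diff = (0.1 + 0.2) - 0.3;
Console.WriteLine(diff == 0); // False: diff is roughly 5.5e-17
```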

@angularsen
Owner

Closing this as duplicate of #1017 .
A relevant PR is about to be merged into v5 #1100 .
