Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

interval + inf returns invalid interval #535

Closed
lucaferranti opened this issue Jun 7, 2022 · 26 comments · Fixed by #558
Closed

interval + inf returns invalid interval #535

lucaferranti opened this issue Jun 7, 2022 · 26 comments · Fixed by #558
Assignees

Comments

@lucaferranti
Copy link
Member

both on master and 1.0-dev

julia> using IntervalArithmetic

julia> interval(1, 2) + Inf
[∞, ∞]

Not sure what it should return though, maybe Interval(prevfloat(Inf), Inf) ? 🤔

@lucaferranti
Copy link
Member Author

 >> infsup(1, 2) + Inf
warning: illegal interval boundaries: infimum = supremum = +/- infinity
warning: called from
    infsup at line 813 column 5
    plus at line 50 column 7

ans = [Empty]

fwiw octave behavior. I think octave converts each number to interval first so the warning originates from trying to construct infsup(Inf, Inf), which is invalid and hence returns an empty interval, the final result follows from that

@lucaferranti
Copy link
Member Author

in 1.0-dev this would be fairly simply to fix, just use the safe constructor in this line. (side note, I think the final type should be Interval{promote_type(T, S)}, but this is unrelated to this, so let us discuss elsewhere if needed)

@lucaferranti
Copy link
Member Author

similar easy fix in master by tweaking these lines

@cbilz
Copy link

cbilz commented Sep 26, 2022

The same issue exists for subtraction and multiplication (in master, I haven't checked 1.0-dev):

julia> interval(1, 2) - Inf
[-∞, -∞]

julia> interval(1, 2) * Inf
[∞, ∞]

@lbenet
Copy link
Member

lbenet commented Sep 27, 2022

Thanks a lot for the comment @cbilz. After a discussion with @lucaferranti, and going into the standard (section 10.2, the remark just after Eq. 4), we both agreed that the "correct answers" should be of the form Interval(-Inf, nextfloat(-Inf)) or Interval(prevfloat(Inf), Inf)), respectively. We checked and the same problem appears in 1.0-dev branch. We are discussing the best way to solve this.

@dpsanders
Copy link
Member

It seems that this is a problem with how Inf is converted to an Interval when promotion occurs?

@lbenet
Copy link
Member

lbenet commented Sep 27, 2022

I agree. It turns out to be related to how @round deals with Inf:

julia> IntervalArithmetic.round_expr(Inf, RoundDown)
Inf

@OlivierHnt
Copy link
Member

@lbenet I was looking into fixing this issue for the 1.0-dev branch, but I realized that I am not reaching the same conclusion as you. The interval arithmetic formalism can be quite subtle, so I am perhaps overlooking something here 😅.

For set based flavour, according to Section 10.2 of the standard, the set of intervals $\overline{\mathbb{I}\mathbb{R}}$ does not contain infinity. In particular, Inf is not part of it.
When we consider (1..Inf) + Inf we extend the operation $+ : \overline{\mathbb{I}\mathbb{R}} \times \overline{\mathbb{I}\mathbb{R}} \to \overline{\mathbb{I}\mathbb{R}}$ to allow $\pm \infty$ as operands in which case it is most natural that the result is also allowed to be $\pm \infty$. Then, Inf seems to be the best possible result of (1..Inf) + Inf.

However, returning Inf is not a viable option since we care about type-stability. So, we are in the situation where we restrict the extended $+$ to have its range in $\overline{\mathbb{I}\mathbb{R}}$ (i.e. the operands may be extended reals but not the result).

Since we have that $\emptyset + x = \emptyset$ for any $x \in \overline{\mathbb{I}\mathbb{R}}$, it is consistent to identify $\pm \infty$ with $\emptyset$. Then, emptyinterval() should be the result of (1..Inf) + Inf.

@lbenet
Copy link
Member

lbenet commented Feb 16, 2023

I was reading #558, which quotes your comment above... Sorry to reply (here) to it until now.

For set based flavor, according to Section 10.2 of the standard, the set of intervals $\overline{\mathbb{IR}}$ does not contain infinity. In particular, Inf is not part of it.

Fully agree with you. To avoid confusion, let me refer to that infinity as $\infty$. So, for the set based flavor (closed connected subset of $\mathbb{R}$) we require that the end points $\underline{x}$ and $\overline{x}$ of the interval $[\underline{x}, \overline{x}]$ both belong to $\mathbb{R},$ (i.e., $-\infty < \underline{x}, \overline{x} < \infty$).

When we consider (1..Inf) + Inf we extend the operation $+:\overline{\mathbb{IR}}\times\overline{\mathbb{IR}}\to\overline{\mathbb{IR}}$
to allow $\pm\infty$ as operands in which case it is most natural that the result is also allowed to be . Then, Inf seems to be the best possible result of (1..Inf) + Inf.

I'm not sure if I agree completely here with you. With that result you describe, the addition is extended to $\overline{\mathbb{IR^*}}$, the star meaning $\mathbb{R} \cup \{ \pm\infty \}$ (the curly braces are not appearing to denote the set!), which the set-based flavor should not be included. So the subtlety is not the type-stability issue, but the fact that we are being inconsistent with the set-based flavor definition. My point is: Inf as floating point is interpreted as $\infty$; when it is used in an interval (1 .. Inf) it should be interpreted as a real number (finite!) which is larger than prevfloat(Inf) (the largest finite number representable in the floating-point format). If we accept this interpretation, which I think is consistent with the set-based flavor, we have that (1..Inf) + Inf should return (prevfloat(Inf) .. Inf).

If, however, we consider that 1 .. Inf corresponds to the set $\{ 1 \le x \}$ (which is not bounded nor closed!), then I agree that the result should be the `Inf .. Inf, which effectively is the empty interval.

EDIT: Using \\{ and \\} in tex mode to get the curly braces!

@OlivierHnt
Copy link
Member

I may be misinterpreting Note 1 in Section 10.2 of the standard which is:

NOTE 1 — The definition implies −∞ and +∞ can be bounds of an interval, but are never members of it. In particular, (4) defines [−∞, +∞] to be the set of all real numbers satisfying −∞ ≤ x ≤ +∞, which is the whole real line R—not the whole extended real line R.

Wouldn't that imply that 1..Inf is the set $\{1 \le x \}$? Btw for the curly brackets you have to do two backslashes, so here I did \\{ 1 \le x \\}.

True, we do not really care about type unstability per se. The way I phrased my sentence was a little misleading sorry. My reasoning is only about the extension of the algebra.

@lbenet
Copy link
Member

lbenet commented Feb 16, 2023

Thanks for explaining how to get the curly braces; I've edited my comment above. And also thanks for quoting Note 1 of Sect. 10.2: it illustrates the kind of statements that make me dislike reading the standard 😸.

Isn't that clause contradictory in itself? I mean "The definition implies $−\infty$ and $+\infty$ can be bounds of an interval, but are never members of it" contradicts " $[−\infty, +\infty]$ to be the set of all real numbers satisfying $−\infty \le x \le +\infty$ ".

If "$−\infty$ and $+\infty$ are never members of it" (an interval), then $−\infty \le x \le +\infty$ should be replaced by $−\infty < x < +\infty$. On the other hand, if you accept the second statement with the $\le$, then those "values" are members of the interval.

In the reasoning I wrote above, the first sentence is our departure point. So, we may use the symbols $\pm\infty$ to denote the endpoints of an interval, but "are never members of it". So, they represent a real number larger than the largest representable floating point.

True, we do not really care about type unstability per se. The way I phrased my sentence was a little misleading sorry. My reasoning is only about the extension of the algebra.

I think I got you right: Interval(Inf,Inf) would then be (a "thin") Inf. 😄

@OlivierHnt
Copy link
Member

OlivierHnt commented Feb 16, 2023

I mean I do understand your point of view (and thanks for taking the time to discuss this).

Indeed, the clause is contradicting itself 😄. But, as you said, I also agree that $-\infty \le x \le +\infty$ is a typo and it should have been $-\infty < x < +\infty$. So then I am confused how you reached the conclusion that

So, they represent a real number larger than the largest representable floating point.

and then that 1..Inf is understood as $\{ 1 \le x \le M \}$ where M > prevfloat(Inf)?

I mean if $-\infty \le x \le +\infty$ is a typo, then what they wrote in the text is the true statement of the clause, no?

Specifically, when they write "−∞ ≤ x ≤ +∞, which is the whole real line R" one then reads "−∞ < x < +∞, which is the whole real line R". Therefore 1..Inf represents $\{ 1 \le x \}$.

@Kolaru
Copy link
Collaborator

Kolaru commented Feb 16, 2023

I think either way there is a unique interpretation because $\infty$ is not a real number, so the statement "the set of all real number that..." always excludes it (it is weird to have used $\leq$ still).

My breakdown of the discussions here is:

  • 1..Inf means the current computation can not exclude the interval contains a number greater that the largest representable float.
  • $\{1 \leq x\}$ is bounded by 1..Inf. 1..(prevfloat(Inf) + 1) is also bounded by (and only by) 1..Inf. So if you encounter 1..Inf, the real underlying set could be either. I think it is useful to see the relation between sets and interval in this very pragmatic way to avoid overthinking.
  • Inf..Inf is essentially nonsensical in this context, as +Inf doesn't make sense as a lower bound for the set-based flavor.
  • In the set-based flavor, a..b + Inf is technically tightly bounded by the empty set, since the result contains no real number. prevfloat(Inf)..Inf also bounds it, but in a slightly more informative way[1]. Decorations should probably be used in that kind of cases to track the fact there is a subtlety happening.
  • We could even error in that latter case. After all if somewhere you end up mixing interval with Inf, Inf probably comes from a non guaranteed computation that failed, and there is not much sense to put effort rectifying that. And I'm pretty sure the standard only prescribe something for intervals interacting with reals.

Ultimately I think the most natural choice would be to use the Cset flavor as a default (which allows intervals to contain $\infty$ and in which Inf..Inf is a tight interval, with more intuitive behaviors e.g. 1 / (Inf..Inf) == (0..0)).

We would still have to support the set-based flavor to be standard compliant though x_x.

[1] Funnily enough a..b + Inf = 12..32 would probably be standard compliant as well, because it indeed contains all the real numbers for which the equation is true.

@lbenet
Copy link
Member

lbenet commented Feb 16, 2023

Thanks @Kolaru for your clear explanation and pragmatical approach! (I was about to send my comment, but yours is clearer!)

@OlivierHnt
Copy link
Member

OlivierHnt commented Feb 16, 2023

Interesting I have not looked into the "Cset flavour". I focused on the "set-based flavour" because that is the main one on the 1.0-dev and we need to decide on a behaviour.

I do agree that returning prevfloat(Inf)..Inf seems more informative (kind of keeps track of the $+\infty$ part) but it does not convey the right idea which is, as we all agree, that we are dealing with an undefined operation since Inf is not part of $\overline{\mathbb{IR}}$.

I have a hard time conflating Inf with finiteness in the construction you suggest 😓.
I mean I am suggesting to consider that undefined behaviour result in the empty set; so ((1..2) + Inf) == emptyset(). By the way, we are already associating the empty set with undefined behaviour at different places in the library. Here is an example: asin(prevfloat(Inf)..Inf) == emptyset() 🙂.

I also agree about the decorations; in this context with a bare Interval, I find that returning the emptyset is even more of an appealing choice.

So in my opinion, either we throw an error or the empty set is the most suitable choice, since

  • it is consistent with the rest of the library
  • it is mathematically a consistent extension of the algebra
  • it is a clear warning sign (akin to a decoration in some sense)

@lbenet
Copy link
Member

lbenet commented Feb 16, 2023

I agree that it is uncomfortably hard to think Inf differently, when it is used as a floating point, and when it is used as an end-point of an interval (all end-points are floating point numbers). Yet, sticking to Note 1 in Sect. 10.2 (and agreeing that the $\le$ is a typo and that should be <, 😄 ), we (must!) have: that Inf ∈ (1..Inf) yields false, as we get.

Maybe I'm wrong, but I think these issues arise from the fact that there are lot's of caveats when $\mathbb{R}$ is modeled with $\mathbb{F}$, a floating point format; things like NaN and Inf which are included in $\mathbb{F}$ may make no sense in $\mathbb{R}$.

Regarding the consistency of the library, the example you chose above is not the best, IMHO, because the interval (prevfloat(Inf)..Inf) is outside the domain of asin, just as (2..3): the intersection of the interval (prevfloat(Inf)..Inf) (or 2..3) and the natural domain of asin is the empty set....

A different example is division:

julia> 1/(0..0) # empty set is returned; denominator must not be identically zero
∅

julia> 1/(0..1) # semi-infinite line
[1.0, ∞]

julia> 1/(-1..1)  # real line
[-∞, ∞]

Note that, in the latter two cases Inf appears as an end-point. These two cases illustrate my point: you can not have the value 0.0 in the denominator (division is undefined!), but because the denominator has other non-zero values, approaching zero (in the denominator) yields a very large result beyond the largest (finite) floating point number; hence we use Inf.

@OlivierHnt
Copy link
Member

I agree that it is uncomfortably hard to think Inf differently, when it is used as a floating point, and when it is used as an end-point of an interval (all end-points are floating point numbers). Yet, sticking to Note 1 in Sect. 10.2 (and agreeing that the $\le$ is a typo and that should be $&lt;$, 😄 ), we (must!) have: that Inf ∈ (1..Inf) yields false, as we get.

Absolutely.

Now concerning the examples. Let me try again to use the standard to help us out (not sure it will 😄) (cf. page 35 of the standard, so "set-based flavour"). Let $\mathbf{x}, \mathbf{y} \in \overline{\mathbb{IR}}$.

When $f$ is a binary operator $\bullet$ written in infix notation, this gives the usual definition of its natural interval extension as $\mathbf{x} \bullet \mathbf{y} = \text{hull}(\{ \mathbf{x} \bullet \mathbf{y} | x \in \mathbf{x}, y \in \mathbf{y} \text{ and } x \bullet y \text{ is defined} \})$.

[Example. With these definitions, the relevant natural interval extensions satisfy $\sqrt{[−1, 4]} = [0, 2]$ and $\sqrt{[−2, −1]} = \emptyset$; also $\mathbf{x} \times [0,0] = [0,0]$ for any nonempty $\mathbf{x} \in \overline{\mathbb{IR}}$, and $\mathbf{x}/[0,0] = \emptyset$, for any $\mathbf{x} \in \overline{\mathbb{IR}}$.]

The examples 1/(0..0), 1/(0..1) and 1/(-1..1) follow this (here $1$ is tacitly identified with the thin interval $[1,1]$; this is unambiguous), so we are good.
Now the important part is that this does not apply to the types of operations we are discussing here. Specifically, an operation such as (1..2) + Inf because Inf is not in $\overline{\mathbb{IR}}$. So we cannot conflate these operations.

So the examples to look at are more things like Inf/(0..1) == [∞, ∞] (same issue as here) and Inf/(-1..1) == [-∞, ∞] (which I also think should be the empty set).

The reasoning behind the example with asin is the following. Wouldn't we say that + (here I mean the addition between intervals) is a function with domain $\overline{\mathbb{IR}}^2$? If so, isn't Inf not in the domain of + the same way prevfloat(inf)..Inf is not in the domain of asin? The similarity is that both operations are undefined (in particular in the sense of the "set-based flavour", unlike the three examples with the division given in the previous comment).

@OlivierHnt
Copy link
Member

OlivierHnt commented Feb 16, 2023

Ah I think I found a potential tie-breaker: (-1..1)/Inf == [0, 0] or simpler (0..0) * Inf == [0, 0].
If this is imposed by the standard, then prevfloat(Inf)..Inf is a more consistent result than emptyinterval() for (1..2) + Inf.

@lbenet
Copy link
Member

lbenet commented Feb 17, 2023

We all are pleading for a clearer and unambiguous standard !! 😄

Good points, all above! Inf is not in $\overline{\mathbb{IR}}$, nor is it in $\mathbb{R}$, which I translate as "there is no natural interval extension of Inf". (This is consistent with the definition in Sect. 10.2 of the standard, Eq. (4), where I note that $\underline{x} &lt; +\infty$ and $\overline{x} &gt; -\infty$, and that by convention the empty set is represented as [Inf, -Inf].)

Playing in your favor: if it's not real nor has an interval extension, it simply makes no sense to ask (1..2) + Inf... So, throwing an error or returning the empty interval makes sense. The same for (0..1)*Inf...

Playing in the other direction: If I think of this question in terms of a function f(x) = (1..2) + x, and ask (numerically) about the limit $x\to\infty$, the answer is consistent with (prevfloat(Inf)..Inf). The "same" way of thinking applies to (1..1)/Inf == (0..0) and (0..0)*Inf == (0..0).

Another uncomfortable and inconsistent case: (0..1) * Inf; currently we return [NaN, ∞] which is definitely wrong 😞

@OlivierHnt
Copy link
Member

OlivierHnt commented Feb 17, 2023

Sorry I should have made this clear in my previous comment: $0 \times \infty$ is indeterminate so (0..0) * Inf == 0 is wrong, the result should be either to throw an error or to return emptyset() (I prefer the latter personally).

So I looked at the standard and they do not impose a behaviour because for set-based flavour they only care about operations where the inputs are both elements of $\overline{\mathbb{IR}}$.

Thinking about this in terms of limits also works in favour of returning emptyset() in my opinion. We have the following cases:

  • the limit is finite. Then we use the canonical injection of $\mathbb{R}$ into $\overline{\mathbb{IR}}$. E.g. that means (-1..1)/Inf == (0..0).

  • the limit is indeterminate (e.g. $0 \times \infty$). Then we throw an error or return the empty set (I prefer that we return an empty set). E.g. that means (0..0) * Inf == emptyset().

  • the limit is infinite $\pm \infty$. To solve this issue with floating points, people created the symbol inf. They did not return the largest representable float which is what you suggest by returning the largest possible interval prevfloat(Inf)..Inf.
    Yes it is pleasing to the eye because it "looks like" it is doing the right thing but, truly, it is not. And pretending that it does opens the door for correctness issues. For instance, with your suggestions, the result now is precision dependant and the lower bound of the output of (1..2) + Inf can be scaled... 😢. Combine this two things with a rigorous algorithm which depends on the lower bound of the input interval, and you end up with a completely wrong answer despite the fact that your algorithm is rigorous for any element of $\overline{\mathbb{IR}}$. So it fails because we have conflated infinity with the largest element of $\overline{\mathbb{IR}}$ (imagine debugging this in a code...).
    I believe the soundest option available is to return the empty set. Also, it is mathematically coherent to do so, which is not nothing!

EDIT: Concerning (0..1) * Inf, from my reasoning above the result ought to be emptyset().

@lbenet
Copy link
Member

lbenet commented Feb 18, 2023

I think an important take away is that Inf is not in $\mathbb{R}$ (trivial), nor is it in $\overline{\mathbb{IR}}$. So, Inf is simply not in the set's we consider for arithmetic operations and functions.

With this in mind, either we throw an error or return the empty set systematically (including (-1..1)/Inf), so we must check (the same way for operations involving the empty set), the presence of Inf. Regarding throwing an error or returning the empty set, the first one would be better in terms of debugging time; the second makes also sense to me (now!), and it's perhaps better in the sense that it does not interrupt running the code (the empty set will be propagated) and emits a warning.

@Kolaru @lucaferranti @dpsanders What do you think of all this discussion, and the final conclusion?

@OlivierHnt
Copy link
Member

Just a small bump to not forget about this, since this also impacts PR #558

@lbenet
Copy link
Member

lbenet commented Mar 21, 2023

Any comment @Kolaru @lucaferranti @dpsanders?

I've left a review of #558...

@Kolaru
Copy link
Collaborator

Kolaru commented Mar 27, 2023

I think there are different types of function domain we have to deal with:

  • The domain of real valued function. In this case when extending to interval valued functions, the space outside of the domain is simply ignore, possibly returning to $\emptyset$. Example: $\sqrt{[-2, -1]} = \emptyset$.
  • The domain of interval valued function. In this case value outside of what is allowed should error. Example: sqrt("I am a number, I swear"). This is not a mathematical problem, but a computer one.
  • Mixing of the two. Currently, the general strategy is to "promote" real values to intervals, and compute with them. If we promote Inf -> emptyinterval() (valid as per the standard), then returning the empty set is consistent. If we promote Inf -> (prevfloat(Inf)..Inf), in most case we have a kind of informative value, and it would be overall consistent as well. We could also, as mentioned above, take another strategy and disallow mixing with Inf, and raise an error.

Conceptually, I prefer the last one to error. However, it is impractical, because Inf is a Float64, so we can not use dispatch to deal with that :(. I don't think it is worth, especially at this stage to go through this additional complexity. If it appears that we need a strict mode, I believe we could simply disallow the real-interval mixing altogether, after all if you want truly guaranteed result, you need to make sure you only use intervals (or exact number like integers or rationals). That could be done using multiple dispatch.

My second choice would be to go with the "promotion" (I quote because I am not referring to Julia built-in promotion mechanism, just its idea) Inf -> emptyinterval(). But it is a bit irritating that it implies (-1..1)/Inf == emptyinterval(). It makes sense, it is consistent, but still, feel like we could do better.

So after thinking about it I am leaning towards the last possibility. Which I know is a bit of a stretch.

However, I want to reaffirm that the true solution to that is to use decorated intervals (that could be the default for 1.0?) or a different flavor of intervals.

Also, don't be fooled by the length of this message, I don't think this is a particularly central issue[1]. In a proper application, it is very unlikely to get them appear, and I don't think it would be fair to blame IA for the unhelpful or hard to debug results that would come out.

I'll review #558 today or tomorrow.

[1] I wanted to make my opinion clear and simple, which backfired.

@Kolaru
Copy link
Collaborator

Kolaru commented Jul 24, 2023

We noticed that #568 would help avoid all these problems too.

@OlivierHnt OlivierHnt added the 1.0 Planned for the major 1.0 release label Jul 24, 2023
@OlivierHnt
Copy link
Member

Resolved by #558.

@OlivierHnt OlivierHnt removed the 1.0 Planned for the major 1.0 release label Sep 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants