WIP/RFC/Scary: allow signed Normed numbers #143

timholy · 2019-11-27T13:51:24Z

This is a partial implementation of a speculative change, allowing signed variants of Normed. For example, S7f8 would represent the range -128.502 to 128.498; like N0f8 it is still scaled by 255, but also has 1 sign bit and 7 bits to store absolute values bigger than 1.

The motivation here is to support differences, related to #126, #41, and coming up with a good accumulator type in CheckedArithmetic.

Speculatively, we could use S7f8 for subtraction (and maybe addition) with N0f8. Pros: computing differences among pixels in images is a common operation, and it would be nice for this to be less of a gotcha than it is currently. Cons: this is against convention in the Julia ecosystem, and in opposition to #27. Thoughts? I'd especially like to hear from @vchuravy in addition to other recent contributors.

kimikage · 2019-11-27T14:51:13Z

Wow, this is a great challenge!
I agree with the aim and concept, but this is a dramatic change, you know. So, I cannot immediately determine whether the benefits of generalization exceed those of specialization.

From a practical point of view, since N0f8 cannot be converted to 8-bit signed Normed, I don't think it has much merit. For addition / subtraction, N8f8 can be used instead of S7f8, and Float32 is suitable for many other operations.
Of course, I'll do all I can for this challenge.

Edit:
Perhaps adding SNormed{<:Signed} instead of Normed{<:Signed} may alleviate my anxiety.

timholy · 2019-11-27T15:22:37Z

From a practical point of view, since N0f8 cannot be converted to 8-bit signed Normed, I don't think it has much merit... For addition / subtraction, N8f8 can be used instead of S7f8

Not entirely. I don't entirely subscribe to what follows, but for the purposes of argument let me make the extreme case. Consider

julia> 0x01 - 0x02
0xff

If magnitude matters, it's bad when smaller - small = big. This is much more serious a problem than 0xff + 0x01 == 0x00, because it's much more important to be able to accurately measure numbers close to 0 than it is to typemax(T). (We often use small integers on a 64-bit machine, but rarely do we use integers close to ±2^63.) A very common operation is to look at the difference between images, e.g., to detect changes in a scene, and the default arithmetic rules here make this sufficiently fraught that newbies are almost guaranteed to write buggy code. Somewhat along the lines of https://stackoverflow.com/questions/10168079/why-is-size-t-unsigned, one could declare that one might ban unsigned types from arithmetic altogether.

With this line of thinking, there are two options: (1) deliberately lose one bit of precision, and for an 8-bit image demote it to 7-bit and then use the extra bit as a sign bit (so that 8-bit images get loaded as S0f7); (2) for any arithmetic operation involving unsigned types, immediately go to the next wider signed type. In option (2), when doing any mathematics at all, we promote Nmfn (where m+n is 8, 16, or 32) to S(2m+n-1)fn so that we get a sign bit with no loss of precision; as a bonus we can also store larger values, but that's not really the main goal. Note that arithmetic with Spfq (where q + q is 7, 15, or 31) does not promote, consistent with the rest of Julia. This is just an unsigned-types-are-bad-news decision, not a decision to widen all arithmetic operations.

Again, I'm making the "extreme" case to show why this is worth considering. A more balanced view would point out two important negatives:

it's different from the rest of Julia
when applied to arrays, it causes the memory requirement to double

The latter is a problem, but the fact that it's limited to a doubling (and not, say, 8x) and that RAM is cheap means that I don't think the latter is fatal. I think the former is the more serious disadvantage of such a change.

If you ask me, what is my honest opinion, I am not sure. I support the fact that operations with UInt8 return UInt8. I am less sure that I support operations with N0f8 returning N0f8; to me it seems that no one chooses UInt8 without thinking about it, but for image processing I bet most users don't choose an element type at all, they take whatever arrives when you load the image from disk.

timholy · 2019-11-27T15:34:04Z

We should make sure that major contributors to/watchers of JuliaImages are aware of this discussion: CC @kmsquire, @rsrock, @SimonDanisch, @Evizero, @tejus-gupta, @annimesh2809, @zygmuntszpak, @RalphAS, @Tokazama, @juliohm, @bjarthur, @ianshmean, @jw3126, @tlnagy. Sorry for the spam (and sorry if I omitted someone who might want to know).

johnnychen94 · 2019-11-27T16:33:47Z

I'm skeptical of its practical usage.

I am less sure that I support operations with N0f8 returning N0f8

unless all operations are done under N0f8 without conversion, which is impossible as far as I can see, this feature is of less practical usage. A more common situation at present is:

# not returning N0f8 for N0f8 input
img_in = load(imgfile) # N0f8
img_temp = foo1(img_in) # N0f8->Float32
img_out = foo2(img_temp) # Float32
img_out = n0f8(img_out) # Float32->N0f8

Note that there're only two conversions.

But if it became:

# returning N0f8 for N0f8 input
img_in = load(imgfile) # N0f8
img_temp = foo1(img_in) # N0f8->Float32->N0f8
img_out = foo2(img_temp) # N0f8->Float32->N0f8

There're four now, where two of them are unnecessary. If I understand it right, what you're trying to do here is purely replacing Float32 with signed normed numbers. If it's about saving memory usage, Float16 might be a better choice (although the current performance is said poor).

I once opened an issue related to what I said here (preserving type): JuliaImages/ImageCore.jl#82.

vchuravy · 2019-11-27T17:26:33Z

It's been a while since I deeply thought about FixedPointNumbers and the difference between Fixed and Normed.

Cons: this is against convention in the Julia ecosystem, and in opposition to #27.

I see there are two orthogonal concerns.

Is a number normed or fixed point. This is a subtle concern that tripped me up in the past when trying to unify these two concepts. Both have fractional bits, but the meaning differs slightly.
Is there a sign bit.

From a representational perspective I don't think there is any strong reason to not have signed normed numbers, although looking at my comment history I once remarked that I had no longer a strong reason for having unsigned fixed point numbers.

The normed numbers always had an odd place in the ecosystem, since I see them primarily as storage numbers and not as computational numbers.
Then there is the question of semantics, which normed numbers mostly derive from what is needed in the Color ecosystem and as far as I can tell if
one does computation on them, one might want to do saturating computation or one needs to change representation. But the semantics we have for Normed is overflow/underflow and that makes the least sense in image calculations.

In conclusion I don't have strong opinions and only existential questions. It might make sense to divorce fixed and normed and make normed an explicit color representational/computational type instead of having to worry about general semantics.

timholy · 2019-11-27T20:27:09Z

From @johnnychen94 :

A more common situation at present is:

# not returning N0f8 for N0f8 input
img_in = load(imgfile) # N0f8
img_temp = foo1(img_in) # N0f8->Float32

and @vchuravy:

The normed numbers always had an odd place in the ecosystem, since I see them primarily as storage numbers and not as computational numbers.

Yep. Both of these express the notion that the current values are mostly storage and not computation. If you already think this way, there's little advantage to this proposal.

However, we do support computation with such values. Since I think I may not have been sufficiently clear, let me walk you through an interesting experiment. This is one that someone might do to compare two images, or look for frame-by-frame temporal differences in a movie:

julia> using ImageView, TestImages

julia> mri = testimage("mri");

julia> diff1 = mri[:,:,1:end-1] - mri[:,:,2:end];

julia> diff2 = float.(mri[:,:,1:end-1]) - float.(mri[:,:,2:end]);

julia> imshow(diff1); imshow(diff2)

It's worth doing this on your own, but to make life easy here are the first frames from each:

The second one is "correct," with mid-gray encoding no difference. Or you can use colorsigned from ImageCore:

(magenta is positive and green is negative)

Now, if you're used to calling float(img) before doing any computations, there is nothing to worry about. But new users often are bothered by this kind of overflow (e.g., @johnnychen94, you got bit by this in JuliaImages/ImageCore.jl#63).

To the specifics of this proposal: aside from defining & supporting the type (which, as @kimikage indicated, is a burden), the change that would "fix" these issues is a single line, which could be written:

-(x::N0f8, y::N0f8) = S7f8(Int16(x.i) - Int16(y.i), 0)

This is definitely a breaking change, but it's not a huge computational burden:

julia> minus1(x, y) = x - y   # this is essentially what we do now
minus1 (generic function with 1 method)

julia> minus2(x, y) = Int16(x) - Int16(y)    # this is what we could do if we had S7f8
minus2 (generic function with 1 method)

when passed UInt8s is just the difference in "padding" with a byte of all zeros:

julia> @btime minus1(x, y) setup=(x = rand(UInt8); y = rand(UInt8))
  1.240 ns (0 allocations: 0 bytes)
0x22

julia> @btime minus2(x, y) setup=(x = rand(UInt8); y = rand(UInt8))
  1.239 ns (0 allocations: 0 bytes)
187

So in contrast to what @johnnychen94 said above, there isn't much/any of a conversion overhead to worry about. (For comparison, float(x) is more expensive, clocking in at 2ns.)

Now, just about the only operation this would affect are + and -; as soon as you multiply by a AbstractFloat you convert to floats and the rest of the computations would not be affected.

So, to recap, the costs of this are:

more breaking changes
more types to support
a doubling in size of images if you add or subtract them

and the benefits are

adding and subtracting will more often work more in the way users expect them to.

That's pretty much the bottom line, I think.

johnnychen94 · 2019-11-27T21:39:45Z

I'm satisfied with the "breaking change" and it's more like bug fixes to me.

An alternative way to fix this issue is making Normed a pure storage type and promotes to float or Fixed for every relevant operation. That could really save a lot of effort.

kimikage · 2019-11-27T22:40:02Z

So in contrast to what @johnnychen94 said above, there isn't much/any of a conversion overhead to worry about. (For comparison, float(x) is more expensive, clocking in at 2ns.)

As I showed in the past benchmarks, the benchmarks of a single operation can easily mislead us.:confused:

kimikage · 2019-11-28T00:35:46Z

src/normed.jl

-function _convert(::Type{U}, x::Tf) where {T, f, U <: Normed{T,f}, Tf <: Union{Float32, Float64}}
-    if T == UInt128 && f == 53
-        0 <= x <= Tf(3.777893186295717e22) || throw_converterror(U, x)
+function _convert(::Type{N}, x::Tf) where {T <: Integer, f, N <: Normed{T,f}, Tf <: Union{Float32, Float64}}


I'm sorry but this method is currently optimized for non-negative numbers and cannot handle negative numbers. (pay attention to k.) Of course, this is a trivial issue, but I mean, "optimization is specialization".

Yeah, I hadn't checked any of this, this is WIP.

I don't care about that much in FixedPointNumbers. However, it is generally not good to ask other packages for such verification and modification. If this is essential to the design, we should change it aggressively, but I don't think so.

Not sure I understand. All I meant was that this was a draft I decided to let people know I was working on. I will fix this before we merge this if we decide to do so.

I think this is a simple problem. The current Normed is always unsigned, but your Normed is not always unsigned.

kimikage · 2019-11-28T00:52:27Z

src/normed.jl

@@ -241,7 +266,8 @@ Base.Rational{Ti}(x::Normed) where {Ti <: Integer} = convert(Ti, reinterpret(x))
 Base.Rational(x::Normed) = reinterpret(x)//rawone(x)

 # Traits
-abs(x::Normed) = x
+abs(x::Normed{<:Unsigned}) = x
+abs(x::T) where T<:Normed  = T(abs(x.i), 0)


Why does not abs(::Normed{<:Signed}) return Normed{<:Unsigned}? (Of course, this is not a question.)

Try this: abs(Int8(-5)). In general Julia doesn't change the number type, even to the point where abs(typemin(Int8)) returns a negative number!

That's right. Therefore, it can be concluded that -(x::N0f8, y::N0f8) = S7f8 (Int16(x.i) - Int16(y.i), 0) is not good! 😝

I agree, this is in conflict with the rest of Julia. I guess we could return Normed{<:Unsigned}, since the only other "always agrees with expectations" is to widen which also changes the type.

kimikage · 2019-11-29T03:34:31Z

It might be better to clarify what we need.

Extra bits
Sign bit (i.e. negative numbers)
Newbie-friendly promotion
Speed

It is premature to discuss the speed because the current Fixed and Normed are not fully optimized. (Edit: Moreover, CheckedArithmetic will provide a reasonable interface, but it is not always the fastest way. cf. #41 (comment))
If we just need the extra bits for subtractions, we can use N8f8 instead of S7f8, although it is never newbie-friendly.
With regards to the image processing, it is important that there are no negative RGB colors. Thus, we should (implicitly) apply the mapping.

Edit:
From @vchuravy:

I see there are two orthogonal concerns.

Is a number normed or fixed point. This is a subtle concern that tripped me up in the past when trying to unify these two concepts. Both have fractional bits, but the meaning differs slightly.

Is there a sign bit.

I also agree about that. So, I think there are two types of the type hierarchy:

Fraction-first

FixedPoint
- AbstractFixed
  - Fixed
  - UFixed
- AbstractNormed
  - Normed
  - SNormed

Sign-first

FixedPoint
- SignedFixedPoint
  - Fixed
  - SNormed
- UnsignedFixedPoint
  - UFixed
  - Normed

I think it is a good idea to add the signed Normed, but Normed{<:Signed} is not good, as mentioned above.

vchuravy · 2019-11-29T21:32:17Z

I also agree about that. So, I think there are two types of the type hierarchy:

Yeah I have a strong preference for the first one since it emphasises the semantic difference between Normed and Fixed.

kimikage · 2019-11-30T00:10:34Z

Since we can only choose one for now, I agree completely. However, I think choosing the former is not denying the latter. If we explicitly choose the former, we should be responsible for the implicit nature of the latter. Normed{<:Signed} breaks the nature of the latter.

timholy · 2019-11-30T10:00:40Z

With regards to the image processing, it is important that there are no negative RGB colors. Thus, we should (implicitly) apply the mapping.

Not sure I fully agree here. We allow folks to construct out-of-gamut colors routinely. This is important in, e.g., computing the mean color of an image patch, where the natural implementation of mean is sum(a)/length(a). It's nice to have a type that enforces in-gamut, but that's not essential. Similar considerations apply in https://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf (linked from the C++ link above).

Normed{<:Signed} is not good, as mentioned above.

I wasn't sure which place, can you elaborate? I don't think anything in the concept of "normed" enforces "unsignedness," so I don't see any problem with Normed{<:Signed}. You can always check for Signed in the type paramter if you need to differentiate (which I needed to do in a couple of places in this PR, though did not check in enough!).

kimikage · 2019-11-30T12:28:46Z

I'm sorry. I'm not good with words.

We allow folks to construct out-of-gamut colors routinely.

What I want to say is not about the gamut, but about the visualization. Since our LCDs cannot represent negative RGB colors, we need to project abstract numbers into concrete colors. It usually involves the type conversion. If you do type conversion after all, is there any reason not to convert the numbers to a more convenient type (e.g. Float32, Float16, Fixed) first? If you do not visualize the data, what is the difference between N8f8 and S7f8 in RAMs?

Although signed Normeds are useful as an intermediate storage type, I don't realize the need to implement those arithmetic.

timholy · 2019-11-30T12:50:27Z

Good! This focuses the discussion down. Really I think we have 3 options:

The current approach: implement math with Normed numbers, with the results also Normed. Advantage: consistent with the rest of Julia. Disadvantage: "wrong" results are returned for very simple operations.
Declare that Normed numbers are storage only. Math always returns floating-point equivalents, e.g., +(x::N0f8, y::N0f8) = float(x) + float(y). Advantages: (1) "correct" results are always returned, although roundoff will become more annoying, and (2) considerable simplification of this package and the type landscape. Disadvantages: (1) less consistent with the rest of Julia. (2) there's a 4x blowup of size for most operations. (Float16 is too slow to be a practical result type.) (3) For some algorithms, like median filtering and histogram equalization, there are particularly efficient implementations that rely on the discretization provided by 8-bit values.
Implement signed variants of Normed numbers. Advantages: "correct" results are usually returned (barring overflow), and roundoff won't happen since the underlying types are integral. Disadvantages: (1) less consistent with the rest of Julia, (2) more types we have to support, (3) like option 2, discretization becomes less of an advantage even by moving to 16-bit types (there are more bins you have to check).

timholy · 2019-11-30T13:13:38Z

In comparing 2 vs 3, perhaps the key question is whether the extra complexity is worth optimizing a small number of operations (basically, addition and subtraction). For reference,

julia> function mysum(acc, A)
           @inbounds @simd for a in A
               acc += a
           end
           return acc
       end
mysum (generic function with 1 method)

julia> using FixedPointNumbers

julia> A = collect(rand(N0f8, 100));

julia> using BenchmarkTools

julia> Base.:(+)(x::N56f8, y::N0f8) = N56f8(x.i + y.i, 0)

julia> @btime mysum(zero(N56f8), A)
  23.733 ns (1 allocation: 16 bytes)
50.255N56f8

julia> @btime mysum(0.0f0, A)
  23.544 ns (1 allocation: 16 bytes)
50.254906f0

julia> @btime mysum(zero(N56f8), A)
  22.852 ns (1 allocation: 16 bytes)
50.255N56f8

julia> @btime mysum(0.0f0, A)
  23.782 ns (1 allocation: 16 bytes)
50.254906f0

so at least on my machine the performance difference seems to be within the noise. (I am surprised, TBH.)

Can others try this benchmark and report back?

kimikage · 2019-11-30T15:39:53Z

These trends apply not only to x86_64 but also to ARM (with NEON).

julia> versioninfo()
Julia Version 1.0.3
Platform Info:
  OS: Linux (arm-linux-gnueabihf)
  CPU: ARMv7 Processor rev 4 (v7l)
  WORD_SIZE: 32
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, cortex-a53)

julia> @btime mysum(zero(N56f8), A)
  1.219 μs (2 allocations: 32 bytes)
47.776N56f8

julia> @btime mysum(0.0f0, A)
  1.698 μs (1 allocation: 8 bytes)
47.776474f0

Of course, the 32-bit accumulator is faster.

julia> Base.:(+)(x::N24f8, y::N0f8) = N24f8(x.i + y.i, 0)

julia> @btime mysum(zero(N24f8), A)
  757.864 ns (2 allocations: 16 bytes)
47.776N24f8

timholy · 2019-11-30T18:55:32Z

OK, so there are architectures where it makes a 2x difference. Is that (along with its domain of applicability, which is basically summation/differencing) enough to matter?

kimikage · 2019-11-30T23:37:54Z

I think this is a compiler issue rather than a CPU architecture or CPU performance issue. Of course, compiler-friendly types are better. However, is that really what you wanted to know with the benchmark?

Edit:
I mainly support the option 1, but I do not object to adding isolated signed Normed numbers.
I am strongly against the option 2. If you do that so, you should split FixedPointNumbers into FixedPointTypes and FixedPointArithmetic.

Edit2:
My other concern is that FixedPointNumbers is immature. The packages which cannot follow this change can use the current (old) version, but who will support backporting in the future? The generalization is not so simple. We do easily (innocently) anti-generalization (see: #139).

kimikage · 2019-12-01T01:29:13Z

There are two orthogonal concepts for accumulation.

Using Float for the accumulator
Converting each element to Float

julia> function mysum_raw(acc, A)
           @inbounds @simd for a in A
               acc += reinterpret(a)
           end
           return acc / FixedPointNumbers.rawone(eltype(A))
       end

This is just a conceptual code and not practical. This difference might be noticeable for large (more practical) arrays.

It is premature to discuss the speed because the current Fixed and Normed are not fully optimized.

timholy · 2019-12-01T10:37:18Z

I don't think most algorithms should dive into this level of detail about their element types. Your code above should be equivalent to mysum(zero(N56f8), A) (on a 64-bit machine) in all respects except the result type, and of course it can be converted to Float64 depending on what you do with it next. Relying on Julia's abstraction properties means we can write generic algorithms that don't have to be hand-tuned for each element type, without paying a performance penalty. But what we decide here will end up determining what that inner loop looks like.

kimikage · 2019-12-01T13:22:26Z

My conceptual code is never never never practically good. Is it related to this discussion? 😕
I don't want to explain the problems of your benchmark code in detail

timholy · 2019-12-01T15:19:10Z

I mainly support the option 1, but I do not object to adding isolated signed Normed numbers.
I am strongly against the option 2. If you do that so, you should split FixedPointNumbers into FixedPointTypes and FixedPointArithmetic.

Is it related to this discussion?

#143 (comment) is, in my mind (flaws included), a benchmark for whether 3 provides any advantage over 2. Given that you don't like option 2, and that at least on some architectures 3 does perform better than 2, it seems our main options are 1 and 3. Option 3 would allow one to fix #41, #126, JuliaImages/ImageCore.jl#63. Option 1 is the status quo.

Can we agree on 3 as a reasonable consensus solution?

kimikage · 2019-12-01T16:37:00Z

First of all, a wrong benchmark misleads us. Since sum is a special case of addition, it is not fair. Moreover, the optimizer still has room for improvement. Do you want to "redesign" every time the compiler changes?

I think the main problem of #41 (and JuliaImages/ImageCore.jl#63 ?) will be solved by @checked in the near future. I think #126 can be solved without Option 3.

What are the specific problems which Option 1 cannot solve?

timholy · 2019-12-01T18:17:15Z

will be solved by @checked in the near future

I don't doubt you already know everything I'm about to say, so I may not be understanding your true question. But let me try: @checked supports "normal" arithmetic with the extra feature that it throws an error if the results overflow. For any nontrivial image pair, @checked img1 - img2 would result in an error. Is that a solution? I guess it depends on your perspective. Would one always want @checked arithmetic? Doubtful, since image-processing is performance-sensitive. So what is @checked for? Mainly tests and debugging, I think. It only works on surface syntax; to go "deeper" we'd need a separate package, CheckedFixedPointNumbers that checks all operations by default. I am fairly certain that there isn't a good mechanism that (1) lets you have just one type, (2) works with precompilation, (3) allows you to switch between checked and unchecked arthmetic, and (4) doesn't also incur a performance penalty. And again, the only thing that buys you is an error when overflow is detected.

Let me try to get others to chime in by making the options here crystal clear:

we can have img1 - img2 work for arbitrary images but use "wraparound" behavior. This is the status quo.
we can have img1 - img2 throw a MethodError, forcing users to write float.(img1) - float.(img2). This will avoid the "wraparound" behavior. Implementing this option would simply delete the code supporting arithmetic from FixedPointNumbers.
we can have img1 - img2 promote eltypes automatically to something that doesn't typically wrap around. This comes in two flavors:
a. promote to float for all operations
b. promote to Normed{<:Signed} for at least addition and subtraction

My hope in submitting this PR was that option 3b would "thread the needle" and be an acceptable option to everyone. I think that's pretty clearly not the case. If I understand the voting so far, it looks like this (:+1: means approve, :-1: means disapprove, nothing means relatively neutral):

	Option 1: status quo	Option 2: force conversion	Option 3a: promotion to float	Option 3b: promotion to `Normed{<:Signed}`
@kimikage	👍		👎	👎
@johnnychen94	👎
@timholy	👍	👎	👎	👍
@ianshmean	👎			slight 👍
@tlnagy	👎	👎		👍

If I've misunderstood people's votes, please correct me. How do others vote?

kimikage · 2019-12-02T00:23:54Z

Your explanation is very easy to understand. 👍

Option 2:
force conversion: 👍
delete the code supporting arithmetic: 👎 (Edit: I think the MethodError is not user-friendly. We can communicate with the users by means of a error message.)

I don't know much about the costs of education, but generally it's not bad to be able to manage your own workspace yourself.

Whether the answer is correct or not is determined by the specifications, not by the absence of "overflow". FixedPointNumbers has no way of knowing the user's specifications. Wait for someone to create FormalMethods.jl. However, the higher the level, the closer to the users.

In the first place, where do Normed numbers come from in image processing? Do the users really create them? I think breaking or narrowing the entry pathway is a common tactics. I can't agree with this much, but this is better than Option 3.

timholy · 2019-12-02T01:31:12Z

To clarify, option 2 = "force conversion" is equivalent to "delete the code supporting arithmetic." The only way to force users to convert to some other type before doing arithmetic is for arithmetic on Normed types to throw an error. Option 2 is the "Normed types are for storage only" option. So your votes on Option 2 are in conflict with one another. If you don't want to delete the code supporting arithmetic, you have to decide what type arithmetic returns, leaving options 1, 3a, and 3b.

where do Normed numbers come from in image processing

img = load("mypicture.png"). They are what we return for any integer-encoded image (e.g., 8- and 16-bit grayscale and color images, where for color 8/16 means per channel). The majority of users never make a conscious decision about the element type they receive; nor have many read the documentation on FixedPointNumbers.

I've checked every package in JuliaRegistries/General that depends on FixedPointNumbers. The only one that isn't related to image processing or colors is https://github.com/JuliaAudio/SampledSignals.jl, which from their README would apparently benefit from Normed{<:Signed}.

johnnychen94 · 2019-12-02T01:31:27Z

Option 2: neutral 😑
I'm only unsatisfied with the status quo. Option 2 is what users of FixedPointNumbers.jl can do to solve this issue at present without touching this package. I can see the importance of Option 3(especially Option 3b), but since I'm not familiar with the implementation details and not care much about the sizes at present, really I have nothing much to comment on.

What are the specific problems which Option 1 cannot solve?

If it's on test stage, ReferenceTest.jl already bring us a tool to examine the numerical results on different types. The problem with Option 1 is that we can't guarantee the numerical accuracy without test.

Plug-and-play is a very common style when we're processing images. As a playground, we generally don't write tests. When wrap around behavior happens without detected, we would tend to conclude that "my processing pipeline should be modified" and it's actually quite severe mislead in many research scenarios.

Where do Normed numbers come from in image processing? Do the users really create them?

I only manually convert Float32 to Normed when I need to benchmark different algorithms -- take advantages of the inaccuracy of representation -- just to have a more stable benchmark result, e.g.,

julia> float_x = 0.1
0.1

julia> float_y = 0.103
0.103

julia> N0f8(float_x) - N0f8(float_y)
0.0N0f8

timholy · 2019-12-02T01:53:25Z

@johnnychen94, thanks for clarifying; I've updated your votes in the table accordingly.

One thing I should also make sure is understood: Python's scikit-image uses Option 1; Matlab uses Option 1 but uses saturating arithmetic. If we choose something different, we're blazing a different path. But we're already doing that with FixedPointNumbers anyway; other frameworks that support both "fixed point" (integer) images and floating-point images are content to live with the equation 255 == 1.0 (sometimes), whereas we are not. Moreover, both Matlab & Python have to make this decision in the context of general operations on arrays with a very limited palette of element types, whereas we're in the situation of having an image-specific element type and hence have greater freedom to set our own rules. Consequently, I don't think it's at all crazy to do something "better" than what they do, if indeed it is possible to do anything that is better.

Consequently I understand @johnnychen94's dissatisfaction with the current status quo, even if I also acknowledge that it's one of the most self-consistent options we have.

kimikage · 2019-12-02T03:00:52Z

@timholy:

To clarify, option 2 = "force conversion" is equivalent to "delete the code supporting arithmetic."

I understand and clearly disagree with Option 2.

The only way to force users to convert to some other type before doing arithmetic is for arithmetic on Normed types to throw an error.

I cannot completely agree with that. I just proposed another way. (The way is not pretty good, though.)

I've checked every package in JuliaRegistries/General that depends on FixedPointNumbers.

Great survey! However, isolated signed Normed and Normed{<:Signed} should be clearly distinguished. The former should not be detrimental to the users (except for the unnatural naming), but the latter is a breaking change, even though it may be a minor change for you. Again, "optimization is specialization".

@johnnychen94:

The problem with Option 1 is that we can't guarantee the numerical accuracy without test.

That's right! Therefore, I think testing is very important. A program with no specifications or tests is a poem. Although I do never hate poetry, it is not the basis for software design.
If the errors mislead us, we should consider throwing correct and useful errors first. I don't think this is an obligation of low-level packages like FixedPointNumbers. Of course, it is good to improve FixedPointNumbers for the better errors.

This is just a thought experiment. What about NormedFloat instead of Normed{<:Signed}? I can assure you that the fully-optimized NormedFloat is not inferior to Normed{<:Signed} in terms of speed and size. However, I don't think it is good. What is the advantage of Normed{<:Signed} over NormedFloat?

Edit:
BTW, I'm somewhat violent or rude, because it is a little stressful for me to express my (abstract) thoughts in English and I'm afraid that the discussion will go on while I am translating. Forgive me for being so rude. How much time do we have for this discussion?

timholy · 2019-12-02T12:50:11Z

This is just a thought experiment. What about NormedFloat instead of Normed{<:Signed}? I can assure you that the fully-optimized NormedFloat is not inferior to Normed{<:Signed} in terms of speed and size. However, I don't think it is good. What is the advantage of Normed{<:Signed} over NormedFloat?

What is NormedFloat?

How much time do we have for this discussion?

It can go on paralyzing development forever, probably 😉. Seriously, we need to reach some kind of consensus here. Why don't you propose what you think we should do, rather than just argue against what I am proposing? AFAICT the only solution currently is the status quo. I can live with that but I'm not happy with it. Perhaps you feel you have proposed a solution, but if so I didn't understand it.

EDIT: To be concrete, I was hoping to put in a grant on Dec. 17th to try to fund work on JuliaImages. It would be great to reach consensus well before that.

timholy · 2019-12-02T12:57:30Z

I understand and clearly disagree with Option 2.

I see you edited your post to clarify. Yes, it's fine to throw an informative message. I would have done it that way, I was being a bit casual by saying we'd just delete them. Delete the content, I should have specified.

IUUC, given a friendly error message you are favorable on Option 2, no? I am confused by the reply above, however. I'll amend your votes once I understand your views correctly.

kimikage · 2019-12-02T14:16:04Z

What is NormedFloat?

I'm sorry for the lack of explanation, but it is so simple that no explanation is needed. NormedFloat means Normed{T<:AbstractFloat, f}. (Of course, it is no longer a part of FixedPointNumbers. So, I did not call it Normed.) As a matter of fact, I have already proposed this concept.
We can use Normed{Float16, 8} instead of Normed{Int16, 8}.
Regarding the addition and subtraction of N0f8 numbers (i.e. "safe" operations), Normed{Float16,8} is empirically estimated to be ~2x slower than (unoptimized) S7f8. If the difference of speed is like that, I empirically assume that the bottlenecks are memory allocations and memory accesses, and that the practical speeds will be approximately the same. Of course, the sizes are the same.
Well, this is just a thought experiment. I don't think the promotion to Normed{<:AbstractFloat} is good, too.

kimikage · 2019-12-02T14:49:08Z

Perhaps you feel you have proposed a solution, but if so I didn't understand it.

I don't think it is nice, but narrowing the entry pathway is one solution to "force conversion" , isn't it?

julia> img = load("mypicture.png")
┌ Warning: Please specify the pixel type, e.g, `load(Float32, "mypicture.png")`. You are using `N0f8` now.
└ @ Foo
2×2 Array{RGB{N0f8},2} with eltype RGB{Normed{UInt8,8}}:
 RGB{N0f8}(0.867,0.157,0.169)  RGB{N0f8}(0.984,0.133,0.851)
 RGB{N0f8}(0.855,0.984,0.945)  RGB{N0f8}(0.553,0.549,0.133)

Again, I don't think this is nice. However, I'm not against this because this is the approach which is closer to the users than the low-level FixedPointNumbers.

kimikage · 2019-12-02T15:43:49Z

My biggest concern is that there are very few people participating in this discussion rather than what the final decision is.

If you are convinced that signed Normed is great, why not complete the implementation of signed Normed without breaking changes? If it is really great, everyone will extend the current Normed. And then, there should be a movement to promote to signed Normed.
Of course, it might be too late for the funding.

I do not intend to inhibit it maliciously. However, I think "we should do" #139 and supporting @checked first. Unfortunately, the development in parallel should cause the conflicts.

IanButterworth · 2019-12-02T15:58:02Z

Just FYI, I have been following this, but I’ve not felt like my thoughts are yet robust enough to contribute.

Perhaps I’d just say a few things:

silent wraparound keeps me up at night, so I do like the idea of dealing with this
I love the idea of finding a type promotion system that maintains minimal lossless precision throughout operations (3b), but figuring out the types of each possible combination sounds like a big undertaking, and it would be interesting to see what the benefits in performance and data integrity were over float.

I guess it seems like some more fleshed-out working case studies could help frame the discussion?

HTH!

timholy · 2019-12-02T15:59:24Z

NormedFloat

I can't see any advantage in NormedFloat: why have an implicit /255 in a type that can represent the result directly? The main purpose of these types is to harmonize the values of "integer-valued" and "floating-point valued" images, which are treated differently in most image-processing frameworks (e.g., https://scikit-image.org/docs/stable/user_guide/data_types.html). This package harmonizes them by, for example, providing a number type with values ranging from 0 to 1 that's representable in 8 bits. I don't see how NormedFloat contributes to that, since you can just do the division directly.

but narrowing the entry pathway is one solution to "force conversion"

We can't force conversion of a 1TB image. My lab uses those all the time, via memory-mapping.

However, I suppose we could change all the image-loading routines to return a MappedArray using of_eltype. That would lazily-convert to Float32 representation upon usage. That would make the rules of arithmetic in this package essentially irrelevant. Thoughts?

If you are convinced that signed Normed is great, why not complete the implementation of signed Normed without breaking changes?

Yes, we could create the type and then decide how to use it later. I just don't feel like creating types for the fun of it, I'd rather be sure there's a path towards them being useful, hence the proposal to change the rules of arithmetic.

Breaking changes aren't a disaster if they are managed carefully; we can bound all packages that use FixedPointNumbers to 0.x versions, and then the packages that want to keep up can install CompatHelper to learn about the new approach. Breaking changes are acceptable especially for a 1.0 release; we broke just about everything in the Julia 0.6->1.0 transition. We already have some breaking changes queued, though any change in arithmetic rules would be a much larger change.

So for me the point of this discussion was to determine whether we can find a better way. I am thinking we can't, so perhaps I should abandon this direction.

timholy · 2019-12-02T16:57:34Z

@kimikage, I may have been remiss in not directing you to this earlier, but if you haven't read it yet I highly recommend reading https://juliaimages.org/latest/arrays_colors/, particularly starting at https://juliaimages.org/latest/arrays_colors/#fixedpoint-1 and reading to the end of the page. It's worth comparing to that scikit-image page I linked above, particularly section titled "Input types" which, you'll note, forces one to change the value when you want to change the representation. I apologize for the lack of a Japanese translation, I am aware that it makes it much more difficult. I would not fare well if the only documentation were available in Japanese 🤷‍♂️ .

kimikage · 2019-12-02T22:22:58Z

In the first place, please don't put the blame on my lack of knowledge. It is my personal problem that I am stubborn, so you can blame just me any amount. However, my (or our) claim is that we should keep it consistent with the rest of Julia. No matter how much I learn, the Julia language will never change unless I take action.

tlnagy · 2019-12-02T22:35:15Z

However, I suppose we could change all the image-loading routines to return a MappedArray using of_eltype. That would lazily-convert to Float32 representation upon usage. That would make the rules of arithmetic in this package essentially irrelevant. Thoughts?

This change would fix the problem, but we would lose lossless precision doing so and it would increase the cost of accessing a view of the image since you would need to do this conversion every time. Do we have benchmarks for how much lazy conversions like this would affect common operations?

After reading through this thread, here are my thoughts:

Option 1: 👎 I know first-hand that this isn't very user-friendly for new users and is well discussed earlier in this thread. That said, it's gotten us this far, but it's problematic.
Option 2: 👎 This is very user-unfriendly even if we provide an explanation for what to do. Things should just work, even 3a is preferable to this.
Option 3a: 😐 Eager: Not realistic for large images* Lazily: See my comments above
Option 3b: 👍 I quite like this solution. I think that if the benchmarks for realistic image sets are good, then I would prefer this solution since it would balance the tradeoffs between the different options.

* I think I've had to resort to Option 3a (recasting image eltype to float) in my own code, which I'm not happy with (ref: https://github.com/tlnagy/SegmentationTools.jl/blob/36d6b98236cf62880e171059547fd55f9f11e9b4/src/utils.jl#L39).

Consequently, I don't think it's at all crazy to do something "better" than what they do, if indeed it is possible to do anything that is better.

I would go further and say that we would be doing the greater imaging community a disfavor if we didn't try and develop something better given all the problems with the status quo and that we are in a unique position to build it.

So for me the point of this discussion was to determine whether we can find a better way. I am thinking we can't, so perhaps I should abandon this direction.

I hope not! I like the bones here and the status quo could be better and changes like these are why I love the JuliaImages ecosystem :)

johnnychen94 · 2019-12-02T22:43:50Z

FYI, I used a lot of MappedArrays.of_eltype to do the lazy conversion in ImageDistances.jl, and the routine I take there is:

channelview
of_eltype: N0f8 -> Float32
real computation: e.g., Euclidean
of_eltype: Float32 -> N0f8
colorview

And in test stage, I use ReferenceTests.jl to compare the results of all possible combination of Bool, N0f8, Float32, Gray, RGB, Lab(6 types for Gray, 2 for RGB and 1 for Lab) just to make sure all input types get the "same" results. With regard to @checked, I think ReferenceTests is more powerful and applicable (e.g., it keeps a record in disk, it support custom distances such as L0 norm, which is useful to check the results of binarization algorithm)

This approach is definitely not good in performance, but at that moment (got bit by wraparound behavior) I just wanted to guarantee accuracy first.

This discussion is becoming too long to catch up with, so I hope consensus at some level can be reached soon, and I'd like to contribute to where I'm capable of.

kimikage · 2019-12-02T23:03:39Z

I can't see any advantage in NormedFloat

NormedFloat must not be good because it is a spoiling candidate or a touchstone. Please tell us the advantages of Normed{<:Signed}, not the disadvantages of NormedFloat. At least, NormedFloat is the same asNormed{<:Signed} in the following ways:

Implement signed variants of Normed numbers. Advantages: "correct" results are usually returned (barring overflow), and roundoff won't happen since the underlying types are integral. Disadvantages: (1) less consistent with the rest of Julia, (2) more types we have to support, (3) like option 2, discretization becomes less of an advantage even by moving to 16-bit types (there are more bins you have to check).

(Edit: Of course, Normed{Float16,8}.i is integral (in add/sub operations). It can handle 11-bit-depth pixels losslessly.)

Edit2:

We can't force conversion of a 1TB image. My lab uses those all the time, via memory-mapping.

So, you have a narrower gate, right? An important point in this approach is to prevent the (new) users from using N0f8 carelessly. Since actually you are going to use S7f8, there is no difference after passing the gate.

Breaking changes aren't a disaster if they are managed carefully;

Ideally. Please face the facts(#139). We are only interested in what we need. I hate potential minor bugs. They are more vicious than obvious malfunctions.

I don't care about that much in FixedPointNumbers. However, it is generally not good to ask other packages for such verification and modification. If this is essential to the design, we should change it aggressively, but I don't think so.

Please give me some time to study. My claims have been already written, so it's okay to conclude this discussion during my absence.

kimikage · 2019-12-06T00:05:14Z

I do not mean to be rude, but the supporters of "3b" had better clarify the promotion rules. (cf. #147 (comment))
Normed has many types other than N0f8 and N0f16, even though the other types are not often used in image processing.

Tokazama · 2019-12-06T02:16:30Z

Been following this but don't have the bandwidth right now to help think upon solutions too much. The only thing I really care about on this is that:

Type conversion doesn't became a nightmare. I'm sure a motivated individual could figure out a complex set of promotion rules given a bunch of types, but that's always obnoxious.
It doesn't affect speed in downstream plotting functions.

I think both of these have been addressed to some degree, so the only other thing I'd add is that I only use fixed point numbers as a storage type passed to graphics libraries. Most of the time it doesn't make sense for an MRI to be directly represented as a fixed point number.

Allow signed Normed numbers

08c7452

kimikage reviewed Nov 28, 2019

View reviewed changes

timholy mentioned this pull request Nov 30, 2019

Overflow checked arithmetics #41

Open

timholy mentioned this pull request Nov 30, 2019

WIP: add support for CheckedArithmetic #146

Draft

kimikage mentioned this pull request Dec 3, 2019

[RFC/Coffee Break] What about a new type NormedFloat? #147

Closed

This was referenced Dec 23, 2019

Commonize code between Fixed and Normed #151

Merged

Adding rounding functions for Fixed #153

Closed

kimikage mentioned this pull request Jan 15, 2020

[RFC] Promotion rules for overflow reduction #167

Closed

timholy mentioned this pull request Jul 6, 2020

Fix reductions in Statistics #183

Merged

kimikage mentioned this pull request Jul 11, 2020

[RFC] Orthogonality between signed/unsigned and Fixed/Normed #199

Open

kimikage marked this pull request as draft July 18, 2020 22:54

timholy mentioned this pull request Sep 9, 2020

Don't wrap round #227

Open

kimikage mentioned this pull request Sep 14, 2020

[RFC] Reserving aliases for unsigned Fixed and signed Normed #228

Open

timholy mentioned this pull request Aug 9, 2021

Support FixedPoint timholy/Ratios.jl#19

Merged

WIP/RFC/Scary: allow signed Normed numbers #143

Are you sure you want to change the base?

WIP/RFC/Scary: allow signed Normed numbers #143

Conversation

timholy commented Nov 27, 2019

kimikage commented Nov 27, 2019 • edited Loading

timholy commented Nov 27, 2019

timholy commented Nov 27, 2019

johnnychen94 commented Nov 27, 2019 • edited Loading

vchuravy commented Nov 27, 2019

timholy commented Nov 27, 2019

johnnychen94 commented Nov 27, 2019

kimikage commented Nov 27, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kimikage Nov 30, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kimikage commented Nov 29, 2019 • edited Loading

Fraction-first

Sign-first

vchuravy commented Nov 29, 2019

kimikage commented Nov 30, 2019

timholy commented Nov 30, 2019

kimikage commented Nov 30, 2019

timholy commented Nov 30, 2019 • edited Loading

timholy commented Nov 30, 2019

kimikage commented Nov 30, 2019 • edited Loading

timholy commented Nov 30, 2019

kimikage commented Nov 30, 2019 • edited Loading

kimikage commented Dec 1, 2019 • edited Loading

timholy commented Dec 1, 2019

kimikage commented Dec 1, 2019

timholy commented Dec 1, 2019

kimikage commented Dec 1, 2019

timholy commented Dec 1, 2019 • edited Loading

kimikage commented Dec 2, 2019 • edited Loading

timholy commented Dec 2, 2019 • edited Loading

johnnychen94 commented Dec 2, 2019 • edited Loading

timholy commented Dec 2, 2019

kimikage commented Dec 2, 2019 • edited Loading

timholy commented Dec 2, 2019 • edited Loading

timholy commented Dec 2, 2019 • edited Loading

kimikage commented Dec 2, 2019

kimikage commented Dec 2, 2019 • edited Loading

kimikage commented Dec 2, 2019 • edited Loading

IanButterworth commented Dec 2, 2019

timholy commented Dec 2, 2019

timholy commented Dec 2, 2019

kimikage commented Dec 2, 2019

tlnagy commented Dec 2, 2019 • edited Loading

johnnychen94 commented Dec 2, 2019 • edited Loading

kimikage commented Dec 2, 2019 • edited Loading

kimikage commented Dec 6, 2019 • edited Loading

Tokazama commented Dec 6, 2019

kimikage commented Nov 27, 2019 •

edited

Loading

johnnychen94 commented Nov 27, 2019 •

edited

Loading

kimikage commented Nov 27, 2019 •

edited

Loading

kimikage Nov 30, 2019 •

edited

Loading

kimikage commented Nov 29, 2019 •

edited

Loading

timholy commented Nov 30, 2019 •

edited

Loading

kimikage commented Nov 30, 2019 •

edited

Loading

kimikage commented Nov 30, 2019 •

edited

Loading

kimikage commented Dec 1, 2019 •

edited

Loading

timholy commented Dec 1, 2019 •

edited

Loading

kimikage commented Dec 2, 2019 •

edited

Loading

timholy commented Dec 2, 2019 •

edited

Loading

johnnychen94 commented Dec 2, 2019 •

edited

Loading

kimikage commented Dec 2, 2019 •

edited

Loading

timholy commented Dec 2, 2019 •

edited

Loading

timholy commented Dec 2, 2019 •

edited

Loading

kimikage commented Dec 2, 2019 •

edited

Loading

kimikage commented Dec 2, 2019 •

edited

Loading

tlnagy commented Dec 2, 2019 •

edited

Loading

johnnychen94 commented Dec 2, 2019 •

edited

Loading

kimikage commented Dec 2, 2019 •

edited

Loading

kimikage commented Dec 6, 2019 •

edited

Loading