Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redundant type information printed for arrays with Union eltype #26847

Closed
nalimilan opened this issue Apr 18, 2018 · 8 comments · Fixed by #48822
Closed

Redundant type information printed for arrays with Union eltype #26847

nalimilan opened this issue Apr 18, 2018 · 8 comments · Fixed by #48822
Labels
arrays [a, r, r, a, y, s] display and printing Aesthetics and correctness of printed representations of objects.

Comments

@nalimilan
Copy link
Member

The new typeinfo IOContext property (#24651) has no effect for arrays with Union element types, since we require the type of elements to be exactly equal to the element type of the array. This is particular annoying for Union{T,Missing} arrays, for example with CategoricalArrays. A simple reproducer with only Base types is:

# OK
julia> [Some(1), Some(2)]
2-element Array{Some{Int64},1}:
 1
 2

# Too verbose
julia> [Some(1), missing]
2-element Array{Union{Missing, Some{Int64}},1}:
 Some(1)
 missing

The problem may be more complex to fix than it seems, given that in some cases of Union element types, type information for each element may still be needed to unambiguously identify each entry. For example, here the f0 suffix adds information:

julia> Union{Float32,Float64}[1.0f0, 1.0]
2-element Array{Union{Float32, Float64},1}:
 1.0f0
 1.0  

By the way, the following behavior is probably a bug:

julia> Union{BigFloat,Float64}[big(1.0), 1.0]
2-element Array{Union{Float64, BigFloat},1}:
 1.0
 1.0

So the simplest approach would be to add a special-case for Missing and Nothing, or maybe for singleton types in general (assuming they will generally not have the same representation as another type).

Cc: @rfourquet

@nalimilan nalimilan added arrays [a, r, r, a, y, s] display and printing Aesthetics and correctness of printed representations of objects. labels Apr 18, 2018
@JeffBezanson
Copy link
Member

Am I right that we don't have any form of printing that shows big(1.0) differently from 1.0?

@JeffBezanson
Copy link
Member

Also, I don't think redundant type information is a big problem (compared to incorrect output, errors, etc.)

@JeffBezanson
Copy link
Member

Actually, this is kind of a bug:

julia> a = [Some(1), Some(2)];

julia> repr(a)
"Some{Int64}[1, 2]"

julia> Some{Int64}[1, 2]
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type Some{Int64}

I'm not sure whether that conversion should be defined.

@nalimilan
Copy link
Member Author

Am I right that we don't have any form of printing that shows big(1.0) differently from 1.0?

Yeah, right. I just mentioned that in passing in case that wasn't intended.

Also, I don't think redundant type information is a big problem (compared to incorrect output, errors, etc.)

That's still a serious issue in some cases. It makes CategoricalArray printing completely unreadable:

julia> CategoricalArray([1, 2, 3, missing])
4-element CategoricalArray{Union{Missing, Int64},1,UInt32,Int64,CategoricalValue{Int64,UInt32},Missing}:
 CategoricalValue{Int64,UInt32} 1
 CategoricalValue{Int64,UInt32} 2
 CategoricalValue{Int64,UInt32} 3
 missing                         

This can also affect the printing of any type with a verbose printing. Of course one can omit type information in all contexts, but that defeats the purpose of the typeinfo system.

I'm not sure whether that conversion should be defined.

IIRC I omitted it intentionally since there's an annoying special case for convert(Union{Some{Int}, Nothing}, nothing), which goes against the objective ofSome, i.e. distinguishing Some(nothing) from just nothing. Maybe we should always use a verbose printing for Array{Some}. Anyway I just took that to have a pure-Base example, it works with any type which implements a verbose and a short representation.

@nalimilan
Copy link
Member Author

Actually I've been able to fix this very easily in CategoricalArrays by adapting the check for typeinfo to only use T for Union{T, Missing}. That's fine in that case since the CategoricalValue type is only supposed to be stored in for CategoricalArray objects. But maybe we still need something more general, given that it's generally not very natural to have a special case for Missing or Nothing in the printing code for custom types which are not directly related to arrays.

@quinnj
Copy link
Member

quinnj commented Aug 7, 2019

It doesn't seem like there's anything actionable here? close?

@nalimilan
Copy link
Member Author

We can certainly do something about this if we want. The question is whether we want to or not.

Here's an simpler example of the verbose printing of Union{T,Missing} arrays:

julia> Float32[1.0]
1-element Array{Float32,1}:
 1.0

julia> Union{Missing,Float32}[1.0]
1-element Array{Union{Missing, Float32},1}:
 1.0f0

The useless f0 part is not the end of the world here, but it illustrates a more general problem which becomes much more serious for types with a longer typed output, like CategoricalValue above. Maybe it should just be fixed by special-casing Missing and Nothing in the show method for each type (e.g. Float32)? It can't be fixed in general in the array printing code since e.g. Some(nothing) and nothing must be printed differently in Union{Some{T}, Nothing} arrays.

@LilithHafner
Copy link
Member

The problem here is elements not taking full advantage of typeinfo in the case when get(io, :typeinfo, Any) is a close supertype of the type being printed. We can fix this for types in Base but in general, we need to know whether convert(T y) re-creates x where y is the interpretation of showing x without type information. If either T or typeof(x) is not defined in Base, imo this is not a problem we should attempt to solve.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] display and printing Aesthetics and correctness of printed representations of objects.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants