Tables.jl support #3104

trulsf · 2022-10-04T19:24:40Z

Adds solution tables that implement the Tables.jl interface for JuMP variable containers. Currently supports Array, DenseAxisArray, and SparseAxisArray.

Closes #3096

src/JuMP.jl

odow · 2022-10-14T03:22:00Z

So I took a look at this. Any reason that we can't just return a Vector{NamedTuple}? This satisfies the tables interface without us needing the dependency, and the code is pretty trivial:

using JuMP
model = Model()
@variable(model, x[1:2, 1:2])
@variable(model, y[1:2, 1:2], container = DenseAxisArray)
@variable(model, z[i=1:3, j=1:3; isodd(i+j)], container = SparseAxisArray)

function _row_iterator(x::Union{Array,Containers.DenseAxisArray})
    return zip(eachindex(x), Iterators.product(axes(x)...))
end

function _row_iterator(x::Containers.SparseAxisArray)
    return zip(eachindex(x.data), keys(x.data))
end

function table(x, name::Symbol, col_names::Symbol...)
    return table(identity, x, name, col_names...)
end

function table(
    f::Function, 
    x::Union{Array,Containers.DenseAxisArray,Containers.SparseAxisArray}, 
    name::Symbol, 
    col_names::Symbol...,
)
    C = (col_names..., name)
    return [NamedTuple{C}((args..., f(x[i]))) for (i, args) in _row_iterator(x)]
end

julia> for row in table(JuMP.name, x, :value, :A, :B)
           @show row
       end
row = (A = 1, B = 1, value = "x[1,1]")
row = (A = 2, B = 1, value = "x[2,1]")
row = (A = 1, B = 2, value = "x[1,2]")
row = (A = 2, B = 2, value = "x[2,2]")

julia> for row in table(JuMP.name, y, :value, :A, :B)
           @show row
       end
row = (A = 1, B = 1, value = "y[1,1]")
row = (A = 2, B = 1, value = "y[2,1]")
row = (A = 1, B = 2, value = "y[1,2]")
row = (A = 2, B = 2, value = "y[2,2]")

julia> for row in table(z, :value, :A, :B)
           @show row
       end
row = (A = 3, B = 2, value = z[3,2])
row = (A = 1, B = 2, value = z[1,2])
row = (A = 2, B = 1, value = z[2,1])
row = (A = 2, B = 3, value = z[2,3])

julia> import PrettyTables

julia> PrettyTables.pretty_table(table(z, :value, :A, :B))
┌───────┬───────┬─────────────┐
│     A │     B │       value │
│ Int64 │ Int64 │ VariableRef │
├───────┼───────┼─────────────┤
│     3 │     2 │      z[3,2] │
│     1 │     2 │      z[1,2] │
│     2 │     1 │      z[2,1] │
│     2 │     3 │      z[2,3] │
└───────┴───────┴─────────────┘

trulsf · 2022-10-14T10:13:52Z

I was not aware that a vector of NamedTuple implemented the Tables.jl interface, but it makes sense. I should read the docs more carefully next time :)
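
The claim can be checked directly against Tables.jl. The following is a minimal sketch, assuming the third-party Tables package is installed; the row data is made up for illustration and is not produced by JuMP.

```julia
import Tables  # third-party package; assumed to be installed

# A plain Vector of NamedTuples, built with no Tables.jl-specific code.
rows = [(i = 1, j = 1, value = 2.0), (i = 2, j = 1, value = 3.0)]

# Tables.jl recognises the vector as a row-oriented table.
@assert Tables.istable(rows)
@assert Tables.rowaccess(rows)

# Any Tables.jl consumer can then convert it, for example to a column table.
cols = Tables.columntable(rows)
@assert cols.value == [2.0, 3.0]
```

The producer of `rows` never has to load Tables.jl; only the consumer does.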

The only problem with your suggested implementation is that it returns a multidimensional array of NamedTuples rather than a vector for Array and DenseAxisArray, but that can be fixed with, e.g.:

return vec([NamedTuple{C}((args..., f(x[i]))) for (i, args) in _row_iterator(x)])

(Maybe more efficient to fix it in the iterator?)
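
To illustrate the shape issue in isolation, here is a Base-only sketch. It iterates `CartesianIndices` directly, which is an assumption made for the example and not the `_row_iterator` from the code above.

```julia
x = [10 20; 30 40]

# Iterating CartesianIndices of a matrix keeps the 2-D shape, so the
# comprehension produces a Matrix of NamedTuples.
rows = [(i = I[1], j = I[2], value = x[I]) for I in CartesianIndices(x)]
@assert rows isa Matrix

# `vec` reshapes it to a Vector in column-major order.
flat = vec(rows)
@assert flat isa Vector
@assert flat[2] == (i = 2, j = 1, value = 30)
```

Because `vec` only reshapes, it does not copy the rows, so the extra call should be cheap.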

I can update the PR with your code, or would it be more efficient for you to take this further if it is to be added to JuMP?

odow · 2022-10-14T10:24:15Z

You can steal my code and update the example :)


codecov · 2022-10-16T20:39:00Z

Codecov Report

Base: 97.64% // Head: 97.64% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (139ab15) compared to base (8b774fd).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #3104   +/-   ##
=======================================
  Coverage   97.64%   97.64%           
=======================================
  Files          32       33    +1     
  Lines        4330     4377   +47     
=======================================
+ Hits         4228     4274   +46     
- Misses        102      103    +1

Impacted Files                  Coverage Δ
src/Containers/Containers.jl    90.90% <ø> (ø)
src/Containers/tables.jl        100.00% <100.00%> (ø)
src/copy.jl                     96.15% <0.00%> (+0.43%) ⬆️


odow

@trulsf I tidied things up a little.

Questions:

Should this go in JuMP.table or JuMP.Containers.table? It seems like JuMP automatically exporting table could lead to clashes with other packages?
Is value_name, col_names the correct order? It seems to lead to a counter-intuitive ordering of the resulting named tuple. What if we just had names...?

We should also think about what documentation this needs:

julia> using JuMP, DataFrames

julia> x = Containers.@container([i = 1:2, j = [:A, :B]], (i, j))
2-dimensional DenseAxisArray{Tuple{Int64, Symbol},2,...} with index sets:
    Dimension 1, Base.OneTo(2)
    Dimension 2, [:A, :B]
And data, a 2×2 Matrix{Tuple{Int64, Symbol}}:
 (1, :A)  (1, :B)
 (2, :A)  (2, :B)

julia> DataFrames.DataFrame(table(x, :x, :i, :j))  # <- here's the ordering that feels wrong
4×3 DataFrame
 Row │ i      j       x       
     │ Int64  Symbol  Tuple…  
─────┼────────────────────────
   1 │     1  A       (1, :A)
   2 │     2  A       (2, :A)
   3 │     1  B       (1, :B)
   4 │     2  B       (2, :B)

odow · 2022-10-16T23:01:47Z

After merging, we should update the relevant discourse posts:

src/Containers/tables.jl

odow

I think I answered my two questions, and I added some docs. I think this will be quite useful.

src/Containers/tables.jl

mlubin · 2022-10-17T01:44:18Z

I'd like to take a look before we merge this. Please bug me if I don't review it quickly.

odow · 2022-10-17T01:48:23Z

Sure. I guess one other potential change is to call it Containers.rowtable to match https://tables.juliadata.org/stable/#Tables.rowtable and to reflect the fact that we're returning a one-dimensional object.

hellemo · 2022-10-17T18:09:21Z

Very nice!

Did you consider storing the names in the containers, which would simplify the use of tables further (as mentioned in #3096 and demonstrated in the last example), following up on #3088?

julia> using PrettyTables

julia> pretty_table(table(y)) # SparseVarArray
┌──────┬──────┬─────┐
│  car │ year │   y │
├──────┼──────┼─────┤
│  bmw │ 2001 │ 0.0 │
│ ford │ 2001 │ 0.0 │
│ ford │ 2000 │ 0.0 │
│  bmw │ 2002 │ 0.0 │
└──────┴──────┴─────┘

odow · 2022-10-17T19:46:19Z

> Did you consider storing the names in the containers

Not yet. That's a larger, more orthogonal change. We'd have to think about how names etc. are preserved during broadcasting etc. Otherwise someone would say "why does table(x) work, but table(value.(x)) not?"

trulsf · 2022-10-17T20:57:19Z

@odow Thanks for tidying up and fixing. Great learning experience to see how you approach and deal with the code. I fully agree with your decision to group names into one argument.

hellemo · 2022-10-17T21:36:25Z

> Not yet. That's a larger, more orthogonal change. We'd have to think about how names etc. are preserved during broadcasting etc. Otherwise someone would say "why does table(x) work, but table(value.(x)) not?"

I guess I imagined the scope of table to be narrower and more of an alternative way to get (all) the values. So I don't quite understand why you would want to do table(value.(x)) when table(x) is shorter and hopefully optimized for fetching all values at once. A follow-up question is of course your thoughts on how to make this possible/easy (assuming it is supported by solver APIs, which I haven't checked). Subsetting or slicing might complicate this approach a bit, I imagine.

I was kind of hoping to get rid of the call to tables where you have to specify column names (you already specified them once), and rather provide defaults if they are not specified to the macro on container creation, like e.g. DataFrames.

Just picking up on these in case they are important to consider at this point. (In terms of designing API for table and considering possibilities to optimize fetching of solution values)

odow · 2022-10-17T21:50:36Z

> So I don't quite understand why you would want to do table(value.(x)) when table(x) is shorter and hopefully optimized for fetching all values at once

The current design doesn't assume value is used. You can either go table(value, x, names...) or table(value.(x), names...) (the default map function is identity).

The problem with names is this:

julia> using JuMP.Containers

julia> Containers.@container(x[a=1:2, b=1:2], a + b, container = DenseAxisArray)
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
    Dimension 1, Base.OneTo(2)
    Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
 2  3
 3  4

julia> Containers.@container(y[i=1:2, j=1:2], i + j, container = DenseAxisArray)
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
    Dimension 1, Base.OneTo(2)
    Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
 2  3
 3  4

julia> x .+ y
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
    Dimension 1, Base.OneTo(2)
    Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
 4  6
 6  8

Currently, we consider two arrays to be alike if they have the same axes. The "name" associated with each axis is ignored.

> I was kind of hoping to get rid of the call to tables where you have to specify column names (you already specified them once), and rather provide defaults if they are not specified to the macro on container creation, like e.g. DataFrames.

A reasonable compromise might be Symbol("x$I") for the axes, and :y for the result, if no names are specified.
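
As a rough illustration of that naming convention, the sketch below builds the default names for an N-dimensional container. `default_names` is a hypothetical helper introduced only for this example; it is not part of JuMP.

```julia
# Hypothetical helper (not JuMP's implementation): default column names for an
# N-dimensional container, following the x1, x2, ..., y convention.
default_names(N::Integer) = (ntuple(i -> Symbol("x$i"), N)..., :y)

cols = default_names(2)
@assert cols == (:x1, :x2, :y)

# The names can be used directly as NamedTuple keys.
row = NamedTuple{cols}((1, 2, "x[1,2]"))
@assert row.x2 == 2 && row.y == "x[1,2]"
```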

odow · 2022-10-17T22:08:40Z

This is now:

julia> model = Model();

julia> @variable(model, x[i=1:2, j=i:2] >= 0, start = i+j);

julia> Containers.table(start_value, x, :i, :j, :start)
3-element Vector{NamedTuple{(:i, :j, :start), Tuple{Int64, Int64, Float64}}}:
 (i = 1, j = 2, start = 3.0)
 (i = 1, j = 1, start = 2.0)
 (i = 2, j = 2, start = 4.0)

julia> Containers.table(x)
3-element Vector{NamedTuple{(:x1, :x2, :y), Tuple{Int64, Int64, VariableRef}}}:
 (x1 = 1, x2 = 2, y = x[1,2])
 (x1 = 1, x2 = 1, y = x[1,1])
 (x1 = 2, x2 = 2, y = x[2,2])
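
The two call styles (mapping inside the call, or broadcasting beforehand) can be sketched without JuMP for plain arrays. `sketch_rowtable` below is a hypothetical Base-only stand-in written for this example; it is not the function added in this PR and handles only `AbstractArray` inputs.

```julia
# Hypothetical Base-only sketch for plain arrays; not JuMP's implementation.
function sketch_rowtable(f::Function, x::AbstractArray, names::Symbol...)
    length(names) == ndims(x) + 1 ||
        throw(ArgumentError("expected $(ndims(x) + 1) names"))
    # One row per element: the index components followed by f applied to the element.
    rows = [NamedTuple{names}((Tuple(I)..., f(x[I]))) for I in CartesianIndices(x)]
    return vec(rows)  # flatten to a one-dimensional row table
end

# The default map function is `identity`.
sketch_rowtable(x::AbstractArray, names::Symbol...) =
    sketch_rowtable(identity, x, names...)

x = [1.0 4.0; 9.0 16.0]
a = sketch_rowtable(sqrt, x, :i, :j, :root)   # map inside the call
b = sketch_rowtable(sqrt.(x), :i, :j, :root)  # map beforehand by broadcasting
@assert a == b
@assert a[2] == (i = 2, j = 1, root = 3.0)
```

Both styles give the same rows here; passing the function separately leaves room for a method specialised on that function.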

mlubin · 2022-10-18T00:20:51Z

Agreed with the rowtable naming. table is a bit confusing and seems to "underdeliver" by returning a Vector{NamedTuple} that you need additional packages to work with nicely.

Overall this is a nice little addition.

odow · 2022-10-18T00:32:41Z

Updated to rowtable. We can't add a method to Tables.rowtable because that would be piracy, and we return a slightly different notion of table:

julia> using Tables

julia> x = rand(2, 2)
2×2 Matrix{Float64}:
 0.262588  0.086919
 0.199496  0.542848

julia> Tables.rowtable(x)
ERROR: ArgumentError: a 'Matrix{Float64}' is not a table; see `?Tables.table` for ways to treat an AbstractVecOrMat as a table
Stacktrace:
 [1] rows(m::Matrix{Float64})
   @ Tables ~/.julia/packages/Tables/T7rHm/src/matrix.jl:3
 [2] rowtable(itr::Matrix{Float64})
   @ Tables ~/.julia/packages/Tables/T7rHm/src/namedtuples.jl:105
 [3] top-level scope
   @ REPL[4]:1

julia> Tables.table(x)
Tables.MatrixTable{Matrix{Float64}} with 2 rows, 2 columns, and schema:
 :Column1  Float64
 :Column2  Float64

julia> Tables.rowtable(Tables.table(x))
2-element Vector{NamedTuple{(:Column1, :Column2), Tuple{Float64, Float64}}}:
 (Column1 = 0.262588126998895, Column2 = 0.08691902246653505)
 (Column1 = 0.19949566550192044, Column2 = 0.5428477642995122)

I wonder if we should follow the ; header::Vector{Symbol} argument for the names. It'd be simpler.

help?> Tables.table
  Tables.table(m::AbstractVecOrMat; [header])

  Wrap an AbstractVecOrMat (Matrix, Vector, Adjoint, etc.) in a MatrixTable, which satisfies the
  Tables.jl interface. (An AbstractVector is treated as a 1-column matrix.) This allows accessing
  the matrix via Tables.rows and Tables.columns. An optional keyword argument iterator header can
  be passed which will be converted to a Vector{Symbol} to be used as the column names. Note that
  no copy of the AbstractVecOrMat is made.

julia> Tables.table(x, header = [:x1, :x2])
Tables.MatrixTable{Matrix{Float64}} with 2 rows, 2 columns, and schema:
 :x1  Float64
 :x2  Float64

julia> Tables.rowtable(Tables.table(x, header = [:x1, :x2]))
2-element Vector{NamedTuple{(:x1, :x2), Tuple{Float64, Float64}}}:
 (x1 = 0.262588126998895, x2 = 0.08691902246653505)
 (x1 = 0.19949566550192044, x2 = 0.5428477642995122)

hellemo · 2022-10-18T06:56:49Z

> The current design doesn't assume value is used. You can either go table(value, x, names...) or table(value.(x), names...) (the default map function is identity).

While I appreciate the elegance of this solution, doesn't this make it harder to specialize on a table method for value that is optimized for fetching all values at once? (or is there a problem with this idea more generally? - would appreciate some pointers on where to start looking into that)

> Currently, we consider two arrays to be alike if they have the same axes. The "name" associated with each axis is ignored.

You could still keep that behaviour, no? I mean, you already disregard the names initially. The suggested change would just be to keep them around for when they would be of use, such as when getting results via table.

> A reasonable compromise might be Symbol("x$I") for the axes, and :y for the result, if no names are specified.

Thanks.

odow · 2022-10-18T19:14:29Z

> While I appreciate the elegance of this solution, doesn't this make it harder to specialize on a table method for value that is optimized for fetching all values at once? (or is there a problem with this idea more generally? - would appreciate some pointers on where to start looking into that)

You could write a specific rowtable(::typeof(JuMP.value), x::SparseVarArray) method to make it faster.

I would expect that rowtable(value, x) takes about the same time as value.(x); it has to call value(x[i]) for every element. But I think the utility of rowtable is not speed, but that it simplifies converting into tables.
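
The dispatch pattern behind that suggestion can be sketched in plain Julia. Everything named below (`Batch`, `slow_value`, `collect_rows`) is a hypothetical stand-in invented for the example; none of it is JuMP or SparseVariables.jl API.

```julia
# Hypothetical stand-ins: `slow_value` plays the role of a per-element query,
# and `Batch` plays the role of a container that can answer for all elements at once.
struct Batch
    data::Vector{Float64}
end

slow_value(b::Batch, i::Int) = b.data[i]

# Generic method: call f once per element.
collect_rows(f, b::Batch) = [(index = i, value = f(b, i)) for i in eachindex(b.data)]

# Specialised method, selected only when f is exactly `slow_value`:
# read the whole data vector in one pass instead of element-by-element calls.
function collect_rows(::typeof(slow_value), b::Batch)
    vals = copy(b.data)
    return [(index = i, value = v) for (i, v) in enumerate(vals)]
end

b = Batch([1.5, 2.5])
@assert collect_rows(slow_value, b) == collect_rows((c, i) -> c.data[i], b)
```

Callers write the same `collect_rows(slow_value, b)` either way, so a faster method can be added later without changing the public call.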

hellemo · 2022-10-19T17:02:28Z

> You could write a specific rowtable(::typeof(JuMP.value), x::SparseVarArray) method to make it faster.

Good idea. Would also want that for Sparse and Dense I think, if that's possible (to speed up by retrieving all at once).

> I would expect that rowtable(value, x) takes about the same time as value.(x); it has to call value(x[i]) for every element. But I think the utility of rowtable is not speed, but that it simplifies converting into tables.

I would say both, but we can always revisit the performance later as long as the design permits that. Thanks.

odow · 2022-10-27T19:53:42Z

Thanks @trulsf, I think this is a nice addition.

trulsf · 2022-10-28T11:08:28Z

Thanks @odow for your work on incorporating this in JuMP. We are looking forward to using the new features!

odow · 2022-10-28T20:04:26Z

I'll make a new release: #3117

trulsf added 3 commits October 2, 2022 23:15

Initial work on Tables support for JuMP

4658c2e

Refactor dense tables, support AbstractVariableRef

08a86b2

Extend test coverage

df9c9cb

blegat reviewed Oct 5, 2022

Change to import of Tables package

6aab4f6

Support Tables interface by Vector of NamedTuples

0c409ad

odow added 2 commits October 17, 2022 09:41

Update tables.jl

526d47e

Update tables.jl

cfae782

odow marked this pull request as ready for review October 16, 2022 20:46

odow added 2 commits October 17, 2022 09:52

Fix tests

b7d63e6

Make tests deterministic

9bf2cf8

odow reviewed Oct 16, 2022

odow and others added 3 commits October 17, 2022 11:37

Move to containers and add docs

f82a062

Change name order

e3edcb2

Update containers.md

15c6c8c

odow reviewed Oct 16, 2022

Update src/Containers/tables.jl

228e89f

odow approved these changes Oct 16, 2022

odow reviewed Oct 16, 2022

Update src/Containers/tables.jl

8fc96ad

Support default names

064d855

Update tables.jl

5a372bc

s/table/rowtable

132c110

Switch to header::Vector{Symbol} for names

139ab15

trulsf mentioned this pull request Oct 19, 2022

Tables support for IndexedVarArray sintefore/SparseVariables.jl#30

Merged

This was referenced Oct 19, 2022

Plan for breaking release (v0.7) sintefore/SparseVariables.jl#28

Closed

Reenable support for JuMP.Containers.rowtable sintefore/SparseVariables.jl#35

Merged

odow merged commit d33e05c into jump-dev:master Oct 27, 2022
