-
-
Notifications
You must be signed in to change notification settings - Fork 398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tables.jl support #3104
Tables.jl support #3104
Conversation
So I took a look at this. Any reason that we can't just return a using JuMP
model = Model()
@variable(model, x[1:2, 1:2])
@variable(model, y[1:2, 1:2], container = DenseAxisArray)
@variable(model, z[i=1:3, j=1:3; isodd(i+j)], container = SparseAxisArray)
function _row_iterator(x::Union{Array,Containers.DenseAxisArray})
return zip(eachindex(x), Iterators.product(axes(x)...))
end
function _row_iterator(x::Containers.SparseAxisArray)
return zip(eachindex(x.data), keys(x.data))
end
function table(x, name::Symbol, col_names::Symbol...)
return table(identity, x, name, col_names...)
end
function table(
f::Function,
x::Union{Array,Containers.DenseAxisArray,Containers.SparseAxisArray},
name::Symbol,
col_names::Symbol...,
)
C = (col_names..., name)
return [NamedTuple{C}((args..., f(x[i]))) for (i, args) in _row_iterator(x)]
end
julia> for row in table(JuMP.name, x, :value, :A, :B)
@show row
end
row = (A = 1, B = 1, value = "x[1,1]")
row = (A = 2, B = 1, value = "x[2,1]")
row = (A = 1, B = 2, value = "x[1,2]")
row = (A = 2, B = 2, value = "x[2,2]")
julia> for row in table(JuMP.name, y, :value, :A, :B)
@show row
end
row = (A = 1, B = 1, value = "y[1,1]")
row = (A = 2, B = 1, value = "y[2,1]")
row = (A = 1, B = 2, value = "y[1,2]")
row = (A = 2, B = 2, value = "y[2,2]")
julia> for row in table(z, :value, :A, :B)
@show row
end
row = (A = 3, B = 2, value = z[3,2])
row = (A = 1, B = 2, value = z[1,2])
row = (A = 2, B = 1, value = z[2,1])
row = (A = 2, B = 3, value = z[2,3])
julia> import PrettyTables
julia> PrettyTables.pretty_table(table(z, :value, :A, :B))
┌───────┬───────┬─────────────┐
│ A │ B │ value │
│ Int64 │ Int64 │ VariableRef │
├───────┼───────┼─────────────┤
│ 3 │ 2 │ z[3,2] │
│ 1 │ 2 │ z[1,2] │
│ 2 │ 1 │ z[2,1] │
│ 2 │ 3 │ z[2,3] │
└───────┴───────┴─────────────┘ |
I was not aware that a vector of NamedTuple implemented the Tables.jl interface, but it makes sense. I should read the docs more carefully next time :) The only problem with your suggested implementation is that you return an array of NamedTuples instead of a a vector for Array and DenseAxisArray, but it is fixed by e.g. return vec([NamedTuple{C}((args..., f(x[i]))) for (i, args) in _row_iterator(x)]) (Maybe more efficient to fix it in the iterator?) I can update the PR with your code or is it more efficient that you take this further if it is to be added to JuMP? |
You can steal my code and update the example :)
…On Fri, 14 Oct 2022, 11:14 pm Truls Flatberg, ***@***.***> wrote:
I was not aware that a vector of NamedTuple implemented the Tables.jl
interface, but it makes sense. I should read the docs more carefully next
time :)
The only problem with your suggested implementation is that you return an
array of NamedTuples instead of a a vector for Array and DenseAxisArray,
but it is fixed by e.g.
return vec([NamedTuple{C}((args..., f(x[i]))) for (i, args) in _row_iterator(x)])
(Maybe more efficient to fix it in the iterator?)
I can update the PR with your code or is it more efficient that you take
this further if it is to be added to JuMP?
—
Reply to this email directly, view it on GitHub
<#3104 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB6MQJLP64VZC46PXKYL64TWDEW6XANCNFSM6AAAAAAQ436XPI>
.
You are receiving this because you commented.Message ID: <jump-dev/JuMP.
***@***.***>
|
Codecov ReportBase: 97.64% // Head: 97.64% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #3104 +/- ##
=======================================
Coverage 97.64% 97.64%
=======================================
Files 32 33 +1
Lines 4330 4377 +47
=======================================
+ Hits 4228 4274 +46
- Misses 102 103 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trulsf I tidied things up a little.
Questions:
- Should this go in
JuMP.table
orJuMP.Containers.table
? It seems like JuMP automatically exportingtable
could lead to clashes with other packages? - Is
value_name, col_names
the correct order? It seems to lead to a counter-intuitive ordering of the resulting named tuple. What if we just hadnames...
?
We should also think about what documentation this needs:
julia> using JuMP, DataFrames
julia> x = Containers.@container([i = 1:2, j = [:A, :B]], (i, j))
2-dimensional DenseAxisArray{Tuple{Int64, Symbol},2,...} with index sets:
Dimension 1, Base.OneTo(2)
Dimension 2, [:A, :B]
And data, a 2×2 Matrix{Tuple{Int64, Symbol}}:
(1, :A) (1, :B)
(2, :A) (2, :B)
julia> DataFrames.DataFrame(table(x, :x, :i, :j)) # <- here's the ordering that feels wrong
4×3 DataFrame
Row │ i j x
│ Int64 Symbol Tuple…
─────┼────────────────────────
1 │ 1 A (1, :A)
2 │ 2 A (2, :A)
3 │ 1 B (1, :B)
4 │ 2 B (2, :B)
After merging, we should update the relevant discourse posts: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I answered my two questions, and I added some docs. I think this will be quite useful.
I'd like to take a look before we merge this. Please bug me if I don't review it quickly. |
Sure. I guess one other potential change is to call it |
Very nice! Did you consider storing the names in the containers, which would simplify the use of tables further (as mentioned in #3096 and demonstrated in the last example), following up on #3088 ?
|
Not yet. That's a larger, more orthogonal change. We'd have to think about how names etc are preserved during broadcasting etc. Otherwise someone would say "why does |
@odow Thanks for tidying up and fixing. Great learning experience to see how you approach and deal with the code. I fully agree with your decision to group names into one argument. |
I guess I imagined the scope of I was kind of hoping to get rid of the call to tables where you have to specify column names (you already specified them once), and rather provide defaults if they are not specified to the macro on container creation, like e.g. DataFrames. Just picking up on these in case they are important to consider at this point. (In terms of designing API for table and considering possibilities to optimize fetching of solution values) |
The current design doesn't assume The problem with names is this: julia> using JuMP.Containers
julia> Containers.@container(x[a=1:2, b=1:2], a + b, container = DenseAxisArray)
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
Dimension 1, Base.OneTo(2)
Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
2 3
3 4
julia> Containers.@container(y[i=1:2, j=1:2], i + j, container = DenseAxisArray)
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
Dimension 1, Base.OneTo(2)
Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
2 3
3 4
julia> x .+ y
2-dimensional DenseAxisArray{Int64,2,...} with index sets:
Dimension 1, Base.OneTo(2)
Dimension 2, Base.OneTo(2)
And data, a 2×2 Matrix{Int64}:
4 6
6 8 Currently, we consider two arrays to be alike if the have the same axes. The "name" associated with each axis is ignored.
A reasonable compromise might be |
This is now: julia> model = Model();
julia> @variable(model, x[i=1:2, j=i:2] >= 0, start = i+j);
julia> Containers.table(start_value, x, :i, :j, :start)
3-element Vector{NamedTuple{(:i, :j, :start), Tuple{Int64, Int64, Float64}}}:
(i = 1, j = 2, start = 3.0)
(i = 1, j = 1, start = 2.0)
(i = 2, j = 2, start = 4.0)
julia> Containers.table(x)
3-element Vector{NamedTuple{(:x1, :x2, :y), Tuple{Int64, Int64, VariableRef}}}:
(x1 = 1, x2 = 2, y = x[1,2])
(x1 = 1, x2 = 1, y = x[1,1])
(x1 = 2, x2 = 2, y = x[2,2]) |
Agreed with the Overall this is a nice little addition. |
Updated to julia> using Tables
julia> x = rand(2, 2)
2×2 Matrix{Float64}:
0.262588 0.086919
0.199496 0.542848
julia> Tables.rowtable(x)
ERROR: ArgumentError: a 'Matrix{Float64}' is not a table; see `?Tables.table` for ways to treat an AbstractVecOrMat as a table
Stacktrace:
[1] rows(m::Matrix{Float64})
@ Tables ~/.julia/packages/Tables/T7rHm/src/matrix.jl:3
[2] rowtable(itr::Matrix{Float64})
@ Tables ~/.julia/packages/Tables/T7rHm/src/namedtuples.jl:105
[3] top-level scope
@ REPL[4]:1
julia> Tables.table(x)
Tables.MatrixTable{Matrix{Float64}} with 2 rows, 2 columns, and schema:
:Column1 Float64
:Column2 Float64
julia> Tables.rowtable(Tables.table(x))
2-element Vector{NamedTuple{(:Column1, :Column2), Tuple{Float64, Float64}}}:
(Column1 = 0.262588126998895, Column2 = 0.08691902246653505)
(Column1 = 0.19949566550192044, Column2 = 0.5428477642995122) I wonder if we should follow the help?> Tables.table
Tables.table(m::AbstractVecOrMat; [header])
Wrap an AbstractVecOrMat (Matrix, Vector, Adjoint, etc.) in a MatrixTable, which satisfies the
Tables.jl interface. (An AbstractVector is treated as a 1-column matrix.) This allows accessing
the matrix via Tables.rows and Tables.columns. An optional keyword argument iterator header can
be passed which will be converted to a Vector{Symbol} to be used as the column names. Note that
no copy of the AbstractVecOrMat is made.
julia> Tables.table(x, header = [:x1, :x2])
Tables.MatrixTable{Matrix{Float64}} with 2 rows, 2 columns, and schema:
:x1 Float64
:x2 Float64
julia> Tables.rowtable(Tables.table(x, header = [:x1, :x2]))
2-element Vector{NamedTuple{(:x1, :x2), Tuple{Float64, Float64}}}:
(x1 = 0.262588126998895, x2 = 0.08691902246653505)
(x1 = 0.19949566550192044, x2 = 0.5428477642995122) |
While I appreciate the elegance of this solution, doesn't this make it harder to specialize on a
You could still keep that behaviour, no? I mean, you already disregard the names initially. The suggested change would just be to keep them around for when they would be of use, such as when getting results via table.
Thanks. |
You could write a specific I would expect that |
Good idea. Would also want that for Sparse and Dense I think, if that's possible (to speed up by retrieving all at once).
I would say both, but we can always revisit the performance later as long as the design permits that. Thanks. |
Thanks @trulsf, I think this is a nice addition. |
Thanks @odow for your work on incorporating this in JuMP. We are looking forward to using the new features! |
I'll make a new release: #3117 |
Support for solution tables supporting the
Tables.jl
interface for JuMP variable containers. Currently has support forArray
,DenseAxisArray
,SparseAxisArray
.Closes #3096