-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symbol column indexing #377
Conversation
src/utilities.jl
Outdated
find_dupes_index(A::Vector{Symbol}) = | ||
@inbounds [i for i in eachindex(A) if A[i] ∈ A[1:i-1]] | ||
|
||
@memoize function gen_colnames(n::Integer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you seen a notable performance improvement when using @memoize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this function is quite slow. (or my algo is bad, is there an better algo?
Orig:
julia> @btime TimeSeries.gen_colnames(5)
899.902 ns (9 allocations: 384 bytes)
5-element Array{Symbol,1}:
:A
:B
:C
:D
:E
vs @memoize
julia> @btime TimeSeries.gen_colnames(5)
92.643 ns (2 allocations: 32 bytes)
5-element Array{Symbol,1}:
:A
:B
:C
:D
:E
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I'll take a look at the algorithm itself in more detail later, this is fine in the meantime I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm...
How does DataFrames.jl generate the default column names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it's just x
appended with sequential numbers, i.e. x1
, x2
, ...
test/combine.jl
Outdated
@test_throws ErrorException merge(cl, op, colnames=[:a, :b, :c]) | ||
|
||
for mode ∈ [:inner, :left, :right, :outer] | ||
@test merge(cl, ohlc[:High, :Low], mode, colnames=[:a, :b, :c]).colnames == [:a, :b, :c] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be the Julia standard 4 spaces for indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Codecov Report
@@ Coverage Diff @@
## master #377 +/- ##
=======================================
Coverage 99.11% 99.11%
=======================================
Files 9 9
Lines 340 340
=======================================
Hits 337 337
Misses 3 3
Continue to review full report at Codecov.
|
What... the coverage is 99.x% |
ready for review. |
LGTM |
But doc should also be updated accordingly |
Ah, sure. |
As API is changing (access column using Symbol instead of String) I think it's always safer to document changes and after (in an other PR) improve doc. |
src/timearray.jl
Outdated
fs = fieldnames(T) | ||
|
||
quote | ||
(c ∈ $fs) ? getfield(ta, c) : ta[c] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user tries to call a column values
, for example, this will be incredibly odd behavior. Either replace all field accesses within this package to use getfield
explicitly, or don't support getproperty
. (I'd vote for the latter.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I need more time work on this issue.
There are tons of statements like ta.values
in the test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, every use of ta.values
would need to be changed to getfield(ta, :values)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change them to our helper function vaules(ta)
, timestamp(ta)
, colnames(ta)
and meta(ta)
.
src/utilities.jl
Outdated
if n == 1 | ||
cnames[d] = Symbol(cnames[d], "_$n") | ||
else | ||
s = cnames[d] |> string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change the function call to a pipe? It just makes it harder to read.
src/utilities.jl
Outdated
findcol(ta::AbstractTimeSeries, s::Symbol) = findcol(colnames(ta), s) | ||
|
||
@inline function findcol(cols::Vector{Symbol}, s::Symbol) | ||
i = findfirst(isequal(s), cols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect indentation
Since we are moving toward the symbol indexing. Ref JuliaStats/TimeSeries.jl#377
* Deprecate the helper functions Since we are moving toward the symbol indexing. Ref JuliaStats/TimeSeries.jl#377 * Update NEWS.md * travis: add 1.0 to build matrix - update doc deploy script
test cases and doc updated. |
Rebased. good to go? |
rename(cl, "New Close") | ||
rename(cl, ["New Close"]) | ||
rename(ohlc, ["New Open", "New High", "New Low", "New Close"]) | ||
rename(cl, :Close′) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the '
here intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
′
(not '
) is used several times in https://github.com/JuliaStats/TimeSeries.jl/blob/master/src/broadcast.jl
I was surprised by the use of such a character also which can't be found easily on my AZERTY keyboard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those render exactly the same for me. Weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not naming simply :NewClose
(or :Close2
)?
What are the english name of ′
, '
and `
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the first one, the second is "apostrophe," and the third is "backtick" (assuming that's rendering for me properly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's :Close\prime<tab>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not naming simply
:NewClose
(or:Close2
)?
well, I personally love the unicode support in Julia.
I always write code (in my research project) like x = 1; x\prime<tab> = x + 42
or
x\vec<tab> ...
julia> x = 1
1
julia> x′ = x + 42
43
julia> function ϕ()
42
end
ϕ (generic function with 1 method)
julia> :Close₂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I was on Linux before when it rendered like an apostrophe for me, but on macOS they're distinct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ararslan maybe you should open an issue on your Linux distribution bug tracker because that's quite problematic to have 2 differents characters rendered the same.
## Fields getter functions | ||
|
||
There are four field getter functions exported. | ||
They are named as same as the field names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Field names are effectively an implementation detail, so I don't know that it's worth mentioning that. Good to document these functions though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the field name is revealed to user already here:
http://juliastats.github.io/TimeSeries.jl/latest/timearray.html
So I added that section.
src/combine.jl
Outdated
colnames = Symbol.(colnames) | ||
end | ||
|
||
meta = if _meta(ta1) == _meta(ta2) && meta isa Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meta === nothing
is typically preferred to meta isa Nothing
. In particular, ===
makes the compiler happy. (Though isa
is still better than ==
in this case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha, okay 5d1dcc7
src/combine.jl
Outdated
else | ||
meta = meta | ||
meta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/utilities.jl
Outdated
if n == 1 | ||
cnames[d] = Symbol(cnames[d], "_$n") | ||
else | ||
s = string(cnames[d]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me for the most part, just a few comments. Nice work here! |
One more thing to be discussed: julia> using TimeSeries
[ Info: Recompiling stale cache file /home/iblis/.julia/compiled/v1.0/TimeSeries/dmqCl.ji for TimeSeries [9e3dc215-6440-5c97-bce1-76c03772f85e]
WARNING: eval from module Memoize to TimeSeries:
Expr(:const, Symbol("##gen_colnames_memoized_cache") = Expr(:call, :IdDict))
** incremental compilation may be broken for this module **
WARNING: eval from module Memoize to TimeSeries:
Expr(:const, Symbol("##carry_char_memoized_cache") = Expr(:call, :IdDict))
** incremental compilation may be broken for this module ** |
I decided to drop Memoize.jl finally |
ref #262
getproperty
Base.propertynames