dropmissing! fails when one column in DataFrame is of type bitvector #3300

Open
KevDAll opened this issue Mar 13, 2023 · 1 comment

KevDAll commented Mar 13, 2023

DataFrames.jl Version 1.5.0.
Julia Version 1.6.1

When one or more columns in a DataFrame are of type BitVector, dropmissing! fails.

The issue appears to be caused at line 922 of dataframe.jl (_deleteat!_helper)

deleteat!(col, drop)

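The problem is that, for a BitVector column, dispatch selects deleteat!(B::BitVector, inds) from bitarray.jl. On Julia 1.6 that file has no counterpart to the Bool-mask method deleteat!(a::Vector, inds::AbstractVector{Bool}) from array.jl, so the entries of the mask are treated as indices (hence the BoundsError at index [false] in the stack trace).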

Version 1.7.0 of Base adds the signature deleteat!(B::BitVector, inds::AbstractVector{Bool}) in bitarray.jl, which fixes the issue.
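
The dispatch difference can be reproduced with Base alone, without DataFrames.jl. The following is a minimal sketch: the variable names are illustrative, and the mask stands in for the row mask that dropmissing! would compute for the example in this report. Which branch is taken depends on the Julia version, so no output is shown.

```julia
# Same mask applied to a Vector{Bool} and to a BitVector.
v = [false, false, false]                 # Vector{Bool}
b = falses(3)                             # BitVector
mask = BitVector([false, false, true])    # true marks the element to delete

# Vector: the mask is interpreted as a logical mask.
deleteat!(v, mask)

# BitVector: on Julia 1.6 the mask entries are treated as indices and a
# BoundsError is thrown; per this report, Julia 1.7.0 and later handle the mask.
result = try
    deleteat!(b, mask)
    :ok
catch err
    err isa BoundsError ? :bounds_error : rethrow()
end

println("Julia ", VERSION, ": Vector length = ", length(v), ", BitVector result = ", result)
```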

An alternative would be to update Project.toml to better indicate the compatibility with Base.
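
Until one of these is in place, a possible user-side workaround on Julia 1.6 is to convert BitVector columns to Vector{Bool} before calling dropmissing!. This is only a sketch under that assumption, not something proposed by the DataFrames.jl maintainers; the loop below is illustrative.

```julia
using DataFrames

x = DataFrame(:A => [1, 2, missing], :B => falses(3))

# Illustrative workaround: replace each BitVector column with a Vector{Bool},
# for which deleteat! accepts a Bool mask on Julia 1.6.
for name in names(x)
    col = x[!, name]
    if col isa BitVector
        x[!, name] = Vector{Bool}(col)
    end
end

dropmissing!(x)
```

The non-mutating dropmissing(x) may also avoid the problem, since it should not go through deleteat!, but that has not been checked here.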

Minimal Example:

Expected Behavior (occurs if B::Vector{Bool}):

using DataFrames
x = DataFrame(:A => [1, 2, missing], :B => [false, false, false])  # B is a Vector{Bool}
dropmissing!(x)
print(x)

2×2 DataFrame
 Row │ A      B
     │ Int64  Bool
─────┼──────────────
   1 │     1  false
   2 │     2  false

Observed Behavior (occurs if B::BitArray):

x = DataFrame(:A => [1, 2, missing], :B => falses(3))  # falses(3) returns a BitVector
dropmissing!(x)
print(x)

ERROR: BoundsError: attempt to access 3-element BitVector at index [false]
Stacktrace:
 [1] throw_boundserror(A::BitVector, I::Tuple{Bool})
   @ Base ./abstractarray.jl:651
 [2] checkbounds
   @ ./abstractarray.jl:616 [inlined]
 [3] deleteat!(B::BitVector, inds::BitVector)
   @ Base ./bitarray.jl:989
 [4] _deleteat!_helper(df::DataFrame, drop::BitVector)
   @ DataFrames ~/.julia/packages/DataFrames/LteEl/src/dataframe/dataframe.jl:922
 [5] deleteat!
   @ ~/.julia/packages/DataFrames/LteEl/src/dataframe/dataframe.jl:894 [inlined]
 [6] dropmissing!(df::DataFrame, cols::Colon; disallowmissing::Bool)
   @ DataFrames ~/.julia/packages/DataFrames/LteEl/src/abstractdataframe/abstractdataframe.jl:1081
 [7] dropmissing! (repeats 2 times)
   @ ~/.julia/packages/DataFrames/LteEl/src/abstractdataframe/abstractdataframe.jl:1079 [inlined]
 [8] top-level scope
   @ REPL[12]:1

bkamins commented Mar 13, 2023

The issue is that JuliaLang/julia#42144 was not backported to 1.6 LTS. Let us wait for the Julia maintainers to decide what to do about it (in general, you are hitting a bug in Julia, not in DataFrames.jl).

CC @KristofferC
