One more (better?) way to represent missing values #745

Closed
shashi opened this issue Dec 19, 2014 · 3 comments

Comments

@shashi

shashi commented Dec 19, 2014

Just making sure this is on file in discussions here. http://wizardmac.tumblr.com/post/104019606584/whats-wrong-with-statistics-in-julia-a-reply "...the current NA and proposed Nullable approaches to missing values are a weak foundation that I would be hesitant to build upon."

The idea is to accompany the data vector with a frequency weight vector of the same length. A weight of 0 marks the corresponding value as missing.
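A minimal sketch of the frequency-weight idea, written in Python purely for illustration (the thread itself concerns Julia). The function name `weighted_mean` and the sample data are hypothetical and do not come from the linked post or from any package discussed here. Entries with weight 0 are skipped entirely, so whatever placeholder sits in the data vector at that position never enters the computation.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[int]) -> float:
    """Mean of `values` where entry i counts `weights[i]` times.

    A weight of 0 drops the entry, which is how a missing value is encoded.
    (Hypothetical helper, for illustration only.)
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("all values are missing (total weight is 0)")
    # Skip zero-weight entries so a placeholder such as NaN cannot leak in.
    weighted_sum = sum(v * w for v, w in zip(values, weights) if w != 0)
    return weighted_sum / total_weight


values = [2.0, 4.0, float("nan"), 6.0]  # index 2 holds an arbitrary placeholder
weights = [1, 2, 0, 1]                  # weight 0 marks index 2 as missing
result = weighted_mean(values, weights)  # (2*1 + 4*2 + 6*1) / 4
```

One consequence of this design is that the same weight vector covers both repeated observations and missing ones, so a downstream function needs only one code path for both cases.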

@nalimilan
Member

As noted in JuliaLang/julia#9363 (comment), that vector already exists as a BitArray. I guess some tests should be done to see whether the performance cost of a BitArray over a plain array of Booleans or integers is significant.
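To make the trade-off concrete, here is a rough Python sketch of a bit-packed mask next to a one-byte-per-flag mask. It is not Julia's BitArray implementation; the class name `BitMask` and its layout are assumptions chosen for illustration. The packed form stores eight flags per byte, but every read or write needs an extra divide, shift, and mask, which is the kind of overhead the proposed tests would measure.

```python
class BitMask:
    """Bit-packed boolean mask, 8 flags per byte (hypothetical, for illustration)."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.bits = bytearray((n + 7) // 8)  # round up to whole bytes

    def _locate(self, i: int) -> tuple[int, int]:
        if not 0 <= i < self.n:
            raise IndexError(i)
        return divmod(i, 8)  # (byte index, bit offset within that byte)

    def __setitem__(self, i: int, flag: bool) -> None:
        byte, offset = self._locate(i)
        if flag:
            self.bits[byte] |= 1 << offset
        else:
            self.bits[byte] &= ~(1 << offset) & 0xFF

    def __getitem__(self, i: int) -> bool:
        byte, offset = self._locate(i)
        return bool((self.bits[byte] >> offset) & 1)


n = 10
packed = BitMask(n)     # bit-packed: 2 bytes of storage for 10 flags
plain = bytearray(n)    # plain: 1 byte per flag, direct indexing

for i in (2, 7, 9):     # mark the same positions as missing in both masks
    packed[i] = True
    plain[i] = 1

same = all(packed[i] == bool(plain[i]) for i in range(n))
```

Both masks answer the same queries; they differ only in storage size and in the work done per access.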

@johnmyleswhite
Contributor

I don't really think this package is the appropriate place to handle this problem, unless what we mean is that downstream functions should support frequency weights embedded as a column in a DataFrame (which I do think we should do).

@simonster
Contributor

DataArrays seems like the right place for this discussion. Let's discuss this further in JuliaStats/DataArrays.jl#133.


4 participants