One more (better?) way to represent missing values #745

shashi · 2014-12-19T05:26:21Z

Just making sure this is on file in the discussions here: http://wizardmac.tumblr.com/post/104019606584/whats-wrong-with-statistics-in-julia-a-reply The post argues that "...the current NA and proposed Nullable approaches to missing values are a weak foundation that I would be hesitant to build upon."

The idea is to accompany the data vector with a frequency weight vector, where a weight of 0 indicates that the corresponding value is missing.
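To make the proposal concrete, here is a minimal sketch in Python (the thread itself concerns Julia; Python is used only for illustration). The function name `weighted_mean` and the sample data are hypothetical and do not come from the linked post or any package. It assumes integer frequency weights, with a weight of 0 marking a missing entry.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[int]) -> float:
    """Mean in which values[i] counts weights[i] times; weight 0 means missing."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("every value is missing")
    # Skip zero-weight entries explicitly: the stored placeholder may be NaN,
    # and NaN * 0 is still NaN.
    weighted_sum = sum(v * w for v, w in zip(values, weights) if w != 0)
    return weighted_sum / total_weight


# The third entry is missing: its stored value is an arbitrary placeholder.
values = [2.0, 4.0, float("nan"), 6.0]
weights = [1, 1, 0, 2]
result = weighted_mean(values, weights)
```

With this layout the data vector keeps a plain element type, and missingness lives entirely in the weights.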

nalimilan · 2014-12-19T11:50:42Z

As noted in JuliaLang/julia#9363 (comment), that vector already exists as a BitArray. I guess some benchmarks should be run to see whether the performance cost of a BitArray compared with a plain array of booleans or integers is significant.
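The trade-off in question can be sketched as follows, again in Python for illustration only. `BitMask` is a hypothetical class written for this example, not Julia's BitArray implementation: it packs one flag per bit, which saves memory but adds a shift and a mask to every access, whereas a plain `bytearray` spends one byte per flag and indexes directly.

```python
class BitMask:
    """Hypothetical bit-packed missingness mask: one bit per element."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.bits = bytearray((n + 7) // 8)  # round up to whole bytes

    def _check(self, i: int) -> None:
        if not 0 <= i < self.n:
            raise IndexError("mask index out of range")

    def set_missing(self, i: int) -> None:
        self._check(i)
        self.bits[i >> 3] |= 1 << (i & 7)  # byte i // 8, bit i % 8

    def is_missing(self, i: int) -> bool:
        self._check(i)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


n = 20
packed = BitMask(n)       # 3 bytes of storage for 20 flags
plain = bytearray(n)      # 20 bytes of storage, one per flag

packed.set_missing(9)
plain[9] = 1
```

Whether the extra bit arithmetic matters in practice is exactly what a benchmark would have to show; this sketch only illustrates the two layouts.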

johnmyleswhite · 2014-12-19T15:00:53Z

I don't really think this package is the appropriate place to handle this problem, unless what we mean is that downstream functions should support frequency weights embedded as a column in a DataFrame (which I do think we should do).

simonster · 2014-12-21T21:00:00Z

DataArrays seems like the right place for this discussion. Let's discuss this further in JuliaStats/DataArrays.jl#133.

simonster mentioned this issue Dec 21, 2014

Revisiting representation of missing values JuliaStats/DataArrays.jl#133

Open

simonster closed this as completed Dec 21, 2014
