[histogram] squash end-inclusive histogram bins #172

Merged on Mar 13, 2019 (5 commits)

@williaster (Owner) commented Mar 13, 2019

🐛 Bug Fix

This PR updates the numeric bin computation in @data-ui/histogram to squash the last two bins together when both thresholds of the last bin equal the upper threshold of the second-to-last bin. Without this, the results can be quite confusing, as some Superset users pointed out.

For example, given the raw data [0, 0, 1, 2, 3, 9, 9, 10, 10, 10] (note that 10 is the max value) and binCount=10, we get 11 bins rather than 10.

With binCount=20, we get 21 resultant bins 🤔

This issue explains the behavior:

given n bins, n+1 bin thresholds are produced.

the last bin (bins[thresholds.length]) contains any value greater than or equal to the last threshold (thresholds[thresholds.length - 1]).

Specifically, this clarifies that

  1. we were rendering n+1 bins, not n
  2. the last bin in the binCount=10 example above has x0 == x1 == 10, equal to the upper bound of the second-to-last bin (x0 = 9 and x1 = 10).
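
The behavior described above can be reproduced with a minimal sketch. This is not the @data-ui/histogram or d3 code: `binWithInclusiveTail` is a hypothetical helper that assumes evenly spaced thresholds from the data min to the data max, and that max > min.

```javascript
// Hypothetical helper for illustration only; assumes max > min.
function binWithInclusiveTail(values, binCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const step = (max - min) / binCount;

  // n requested bins -> n + 1 thresholds, from min to max inclusive.
  const thresholds = Array.from({ length: binCount + 1 }, (_, i) =>
    i === binCount ? max : min + i * step
  );

  // One bin per threshold: bin i spans [t_i, t_{i+1}),
  // and the extra last bin is the degenerate range [max, max].
  const bins = thresholds.map((x0, i) => ({
    x0,
    x1: i < binCount ? thresholds[i + 1] : x0,
    count: 0,
  }));

  for (const v of values) {
    // Values equal to the last threshold fall into the extra bin.
    const index =
      v >= max ? binCount : Math.min(Math.floor((v - min) / step), binCount - 1);
    bins[index].count += 1;
  }
  return bins;
}

const data = [0, 0, 1, 2, 3, 9, 9, 10, 10, 10];
const bins = binWithInclusiveTail(data, 10);
console.log(bins.length); // 11 bins for binCount=10
console.log(bins[bins.length - 1]); // { x0: 10, x1: 10, count: 3 }
```

Under these assumptions the three values equal to 10 land in a zero-width bin of their own, which is the extra bar that confused users.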

With this squashing in place, we get more intuitive results: n bins, with the last bin including its upper bound.
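
As a rough sketch of the squashing step (not the code merged in this PR), the function below assumes each bin is a plain object with `x0`, `x1`, and `count` fields. The name `squashEndInclusiveBin` and the bin shape are assumptions made for illustration.

```javascript
// Hypothetical sketch: merge a degenerate last bin into the one before it.
function squashEndInclusiveBin(bins) {
  if (bins.length < 2) return bins;

  const last = bins[bins.length - 1];
  const prev = bins[bins.length - 2];

  // Squash only when the last bin has zero width and sits exactly
  // on the upper threshold of the second-to-last bin.
  const shouldSquash = last.x0 === last.x1 && last.x1 === prev.x1;
  if (!shouldSquash) return bins;

  // Keep the previous bin's range and add the last bin's count to it,
  // making the final bin inclusive of its upper bound.
  return [...bins.slice(0, -2), { ...prev, count: prev.count + last.count }];
}

const tail = [
  { x0: 8, x1: 9, count: 0 },
  { x0: 9, x1: 10, count: 2 },
  { x0: 10, x1: 10, count: 3 },
];
console.log(squashEndInclusiveBin(tail));
// [ { x0: 8, x1: 9, count: 0 }, { x0: 9, x1: 10, count: 5 } ]
```

Returning a new array rather than mutating the input keeps the sketch safe to call on bins that are shared elsewhere. Bins that do not match the degenerate pattern pass through unchanged.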

cc @kristw @conglei
