Histogram Blur / Smoothing #56
Mind blown after seeing this from Wikipedia: Gaussian filter. Makes me wonder if we could add a moving average function, then use that to implement Gaussian smoothing.
I did an experiment and found that moving averages do seem to converge on what looks like a Gaussian.
Also related to d3/d3-shape#43
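For illustration (an editor's sketch, not code from the thread): repeatedly applying a 3-tap moving average to an impulse makes the weights converge toward a bell shape, which is the central-limit-theorem intuition behind the "moving averages converge on a Gaussian" observation above.

```javascript
// Sketch: a clamped-edge 3-tap moving average, applied repeatedly to an
// impulse. The resulting impulse response approaches a Gaussian-like bell.
const movingAverage3 = (values) =>
  values.map((v, i) => {
    const prev = values[Math.max(i - 1, 0)];
    const next = values[Math.min(i + 1, values.length - 1)];
    return (prev + v + next) / 3;
  });

let impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0];
for (let k = 0; k < 3; k++) impulse = movingAverage3(impulse);
// Three passes yield the trinomial weights 1,3,6,7,6,3,1 (over 27),
// peaked at the original spike and falling off symmetrically.
```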
I keep coming across scenarios where a bit of smoothing would be useful, for example this one:
Proposed implementation (from Histogram Smoothing):

```js
const blur = (data, property) =>
  data.map((d, i) => {
    const previous = i === 0 ? i : i - 1;
    const next = i === data.length - 1 ? i : i + 1;
    const sum = data[previous][property] + d[property] + data[next][property];
    d[property] = sum / 3;
    return d;
  });
```
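A quick usage sketch with made-up bins (not from the thread). One caveat worth noting: because the `map` mutates `data` in place, each bin reads an already-smoothed left neighbor, so the result differs from a symmetric blur computed into a fresh array and the total count is no longer conserved.

```javascript
// The proposed blur, restated so this snippet runs standalone.
const blur = (data, property) =>
  data.map((d, i) => {
    const previous = i === 0 ? i : i - 1;
    const next = i === data.length - 1 ? i : i + 1;
    d[property] = (data[previous][property] + d[property] + data[next][property]) / 3;
    return d;
  });

const bins = [{count: 0}, {count: 9}, {count: 0}, {count: 0}];
blur(bins, "count");
// → counts ≈ [3, 4, 1.33, 0.44]. A fresh-array version would give
// [3, 3, 3, 0]; the in-place reads bias the middle bins upward on the left.
```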
Moving average: https://observablehq.com/@d3/moving-average
I also made a Gaussian Smoothing notebook.
Here's an application of that blur function.
I think we would also want optimized 2D or n-D blurs, like we have in https://github.com/d3/d3-contour/blob/master/src/blur.js (note the number of TODOs in the comments ;-)).
Yes, that would be very cool! One "problem" I notice with the Gaussian blur approach of repeated averaging is that the slope of the curves appears to tend toward zero the closer you get to the endpoints (see the red rectangle in the image). Is this a known problem with Gaussian blur in general? How do folks usually deal with it? To me, it seems to skew the data into showing something that's not there (the "flattening out" effect at the ends of the curve), which makes the dataviz misleading. The original data does not exhibit that behavior. I'm thinking to just remove the data points that could have been indirectly impacted by the edge case, i.e. chop off the points that could be affected at each end.
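A small numeric sketch of that edge effect (an editor's example, assuming clamped endpoints): blurring a perfectly linear ramp leaves the interior untouched, but the clamped endpoint is counted twice, which drags the ends toward their neighbors and flattens the slope there.

```javascript
// One pass of a clamped-edge 3-tap average, computed into a fresh array.
const smooth = (values) =>
  values.map((v, i) => {
    const prev = values[Math.max(i - 1, 0)];
    const next = values[Math.min(i + 1, values.length - 1)];
    return (prev + v + next) / 3;
  });

const ramp = [0, 1, 2, 3, 4, 5];
const once = smooth(ramp);
// Interior points are unchanged (the average of a line is the line), but
// the first point rises from 0 to 1/3 and the last falls from 5 to 14/3:
// the slope is flattened at the ends, just as described above.
```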
Here's a revised implementation, taking that end-flattening effect into account:

```js
export const blur = (data, property, numIterations) => {
  const n = data.length;
  for (let j = 0; j < numIterations; j++) {
    for (let i = 0; i < n; i++) {
      const previous = data[i === 0 ? i : i - 1];
      const current = data[i];
      const next = data[i === n - 1 ? i : i + 1];
      const sum = previous[property] + current[property] + next[property];
      current[property] = sum / 3;
    }
  }
  // Chop off the ends, as they may represent misleading "flattening".
  return data.slice(numIterations, data.length - numIterations);
};
```
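Usage sketch with made-up bins (restating the function without the `export` so the snippet runs standalone): with `numIterations = 2`, two bins are dropped from each end, so seven bins in yields three out.

```javascript
// The revised blur from above, minus the export, for a runnable example.
const blur = (data, property, numIterations) => {
  const n = data.length;
  for (let j = 0; j < numIterations; j++) {
    for (let i = 0; i < n; i++) {
      const previous = data[i === 0 ? i : i - 1];
      const current = data[i];
      const next = data[i === n - 1 ? i : i + 1];
      current[property] = (previous[property] + current[property] + next[property]) / 3;
    }
  }
  // Discard the bins that the clamped edges could have distorted.
  return data.slice(numIterations, data.length - numIterations);
};

const bins = [2, 4, 8, 16, 8, 4, 2].map((count) => ({count}));
const smoothed = blur(bins, "count", 2);
// 7 bins in, 3 out; the central peak is pulled down by the smoothing.
```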
Here's a fork which uses my gaussian kernel blurring; it doesn't seem to suffer from the problem you describe?
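For context, here's a minimal sketch of direct Gaussian-kernel smoothing (an editor's illustration, not the code in the linked fork). Truncating the kernel at the array bounds and renormalizing by the weight actually used is one common way to avoid the endpoint flattening, since clamped endpoints no longer receive extra weight.

```javascript
// Sketch: 1D Gaussian-kernel smoothing with edge renormalization.
const gaussianBlur = (values, sigma) => {
  const radius = Math.ceil(3 * sigma);
  const kernel = [];
  for (let k = -radius; k <= radius; k++)
    kernel.push(Math.exp(-(k * k) / (2 * sigma * sigma)));
  return values.map((_, i) => {
    let sum = 0, weight = 0;
    for (let k = -radius; k <= radius; k++) {
      const j = i + k;
      if (j < 0 || j >= values.length) continue; // drop out-of-range taps...
      sum += kernel[k + radius] * values[j];
      weight += kernel[k + radius]; // ...and renormalize by the weight used
    }
    return sum / weight;
  });
};
```

One sanity check on the renormalization: a constant signal passes through unchanged, i.e. the filter has unit DC gain even at the edges.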
Very nice! Indeed, the problem is not there. The problem must lie in the repeated iteration technique. |
This new notebook shows how we could use the same API for 1D and 2D blur. I've also streamlined the implementation a little to get more juice out of it (about 30% faster in my tests). I tried implementing Gaussian blur, but it doesn't compete at all in terms of speed.
And here is another fork, this time using the d3.blur proposal.
To close the loop here: the implementation was published as a separate package, https://github.com/Fil/array-blur.
It would be useful to have a "blur" function to approximate kernel density estimation. This function would accept the output from histogram, conceptually looking something like:

Original discussion in d3/d3-contour#7 (comment)
The implementation could be similar to blurX or blurY in d3-contour.
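Those are incremental box blurs. As a rough illustration of the underlying idea (an editor's sketch, not d3-contour's actual code), a 1D box blur of radius `r` can run in O(n) regardless of the radius by maintaining a sliding window sum, with out-of-range indices clamped to the edges:

```javascript
// Illustrative O(n) box blur using a running window sum, clamped edges.
const boxBlur = (values, radius) => {
  const n = values.length;
  const w = 2 * radius + 1;
  const out = new Array(n);
  // Initialize the window sum for index 0, clamping out-of-range indices.
  let sum = 0;
  for (let k = -radius; k <= radius; k++)
    sum += values[Math.min(Math.max(k, 0), n - 1)];
  for (let i = 0; i < n; i++) {
    out[i] = sum / w;
    // Slide the window: add the entering sample, drop the leaving one.
    sum += values[Math.min(i + radius + 1, n - 1)];
    sum -= values[Math.max(i - radius, 0)];
  }
  return out;
};
```

Composing three or more such passes approximates the Gaussian blur discussed above, at a fraction of the cost of a direct kernel convolution.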
This article may be relevant: Convolve n Square Pulses to Gaussian.