Histogram Blur / Smoothing #56

Closed
curran opened this issue Apr 12, 2017 · 15 comments · Fixed by #151

curran commented Apr 12, 2017

It would be useful to have a "blur" function to approximate kernel density estimation.

This function would accept the output from histogram, conceptually looking something like:

blur(histogram(data))

Original discussion in d3/d3-contour#7 (comment)

The implementation could be similar to blurX or blurY in d3-contour.

This article may be relevant: Convolve n Square Pulses to Gaussian.

curran commented Apr 12, 2017

Mind blown after seeing this:

Due to the central limit theorem, the Gaussian can be approximated by several runs of a very simple filter such as the moving average. The simple moving average corresponds to convolution with the constant B-spline ( a rectangular pulse ), and, for example, four iterations of a moving average yields a cubic B-spline as filter window which approximates the Gaussian quite well.

from Wikipedia: Gaussian filter.

Makes me wonder if we could add a moving average function, then use that to implement blur.
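
As a rough illustration of the idea quoted above, here is a minimal sketch that runs a three-point moving average several times over a unit impulse, so that the output is the effective filter kernel. The names `movingAverage` and `repeatedAverage` are hypothetical helpers made up for this example, not part of d3-array, and values outside the array are assumed to be zero.

```javascript
// Hypothetical helper: one pass of a centered moving average of width
// 2 * radius + 1. Values outside the array are treated as zero.
function movingAverage(values, radius) {
  const width = 2 * radius + 1;
  return values.map((_, i) => {
    let sum = 0;
    for (let k = -radius; k <= radius; k++) {
      const j = i + k;
      if (j >= 0 && j < values.length) sum += values[j];
    }
    return sum / width;
  });
}

// Hypothetical helper: apply the moving average `iterations` times.
function repeatedAverage(values, radius, iterations) {
  let result = values.slice();
  for (let n = 0; n < iterations; n++) result = movingAverage(result, radius);
  return result;
}

// Start from a unit impulse so the output is the effective filter kernel.
const impulse = new Array(21).fill(0);
impulse[10] = 1;

const kernel = repeatedAverage(impulse, 1, 4);
console.log(kernel.map((v) => v.toFixed(3)).join(" "));
```

With four passes the printed kernel is symmetric, peaks in the middle, and sums to one, which is the bell shape the quote describes.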

curran commented Apr 12, 2017

I did an experiment and found that moving averages do seem to converge on what looks like a Gaussian.

image
https://bl.ocks.org/curran/853fa00b8f0732fb2bee7fccfd7b4523

Also related to d3/d3-shape#43

curran commented Feb 7, 2018

I keep coming across scenarios where a bit of smoothing would be useful, for example this one:

image

CO2 Emissions Stacked Area Chart

curran changed the title from "blur?" to "Histogram Blur / Smoothing" Feb 7, 2018

curran commented Feb 7, 2018

Proposed implementation (from Histogram Smoothing):

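// Three-point moving average. Returns new objects instead of mutating the
// input, so every average reads only the original (unblurred) neighbors.
const blur = (data, property) => data.map((d, i) => {
  const previous = data[Math.max(i - 1, 0)];
  const next = data[Math.min(i + 1, data.length - 1)];
  const sum = previous[property] + d[property] + next[property];
  return { ...d, [property]: sum / 3 };
});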

Fil commented Apr 6, 2020

Moving average: https://observablehq.com/@d3/moving-average
(would be great to have it in d3-array)

Fil commented Apr 7, 2020

Also did a Gaussian Smoothing notebook
https://observablehq.com/@fil/gaussian-smoothing
The obvious advantage over a moving average is that it's smoother :) This shows up quite well on the electricity dataset, which has a 24h base period.
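
For readers who want to see the general technique, here is a minimal sketch of Gaussian kernel smoothing on a plain array. It is not the notebook's code: `gaussianSmooth` is a hypothetical helper, and truncating the kernel at three standard deviations and renormalizing the weights near the edges are assumptions made for this example. `sigma` is assumed to be positive.

```javascript
// Hypothetical helper, not the notebook's code: smooth `values` with a
// Gaussian kernel of standard deviation `sigma` (in array-index units).
function gaussianSmooth(values, sigma) {
  const radius = Math.ceil(3 * sigma); // truncate the kernel at 3 sigma
  return values.map((_, i) => {
    let weightedSum = 0;
    let weightTotal = 0;
    for (let k = -radius; k <= radius; k++) {
      const j = i + k;
      if (j < 0 || j >= values.length) continue; // skip out-of-range samples
      const w = Math.exp(-(k * k) / (2 * sigma * sigma));
      weightedSum += w * values[j];
      weightTotal += w;
    }
    // Dividing by the weights actually used renormalizes the kernel at the edges.
    return weightedSum / weightTotal;
  });
}

const noisy = [2, 8, 3, 9, 4, 10, 5, 11, 6, 12];
const smoothed = gaussianSmooth(noisy, 1.5);
console.log(smoothed.map((v) => v.toFixed(2)).join(" "));
```

Each output value is a weighted average of its neighbors, so a constant series is returned unchanged and the zig-zag in `noisy` is damped.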

curran commented Jun 8, 2020

Here's an application of that blur function.

image

https://vizhub.com/curran/14bdac88ebf747c2b8a9a919bbe0a831

Fil commented Jun 11, 2020

I think we would also want optimized 2D or n-D blurs, like we have in https://github.com/d3/d3-contour/blob/master/src/blur.js (note the number of TODOs in the comments ;-) ).

curran commented Jun 11, 2020

Yes that would be very cool!

One "problem" I notice with the Gaussian blur approach of repeated averaging is that the slope of the curves appears to tend towards zero the closer you get to the endpoints (see the red rectangle in the image).

image

Is this a known problem with Gaussian blur in general? How do folks usually deal with this?

To me, it seems to skew the data into showing something that's not there (the "flattening out" effect at the end of the curve), which makes the dataviz misleading. The original data does not seem to exhibit that behavior.

image

I'm thinking of just removing the data points that could have been indirectly impacted by the edge case, i.e. chopping off numIterations values from the beginning and end of the time series.
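
One way to see the flattening in isolation is to smooth a straight line, which an ideal smoother would leave unchanged. The sketch below assumes the same edge handling as the proposed blur (the first and last values stand in for their missing neighbors); `averageClamped` is a hypothetical helper written for this example.

```javascript
// Hypothetical helper: one three-point averaging pass with clamped indices,
// i.e. the first and last values stand in for their missing neighbors.
function averageClamped(values) {
  const n = values.length;
  return values.map((v, i) => {
    const previous = values[Math.max(i - 1, 0)];
    const next = values[Math.min(i + 1, n - 1)];
    return (previous + v + next) / 3;
  });
}

// A straight line with slope 1: ideal smoothing would leave it unchanged.
let ramp = Array.from({ length: 12 }, (_, i) => i);
const iterations = 3;
for (let j = 0; j < iterations; j++) ramp = averageClamped(ramp);

// Slopes between neighboring points after smoothing.
const slopes = ramp.slice(1).map((v, i) => v - ramp[i]);
console.log(slopes.map((s) => s.toFixed(2)).join(" "));
```

In this example the slope stays at 1 in the middle and shrinks toward both ends, and only the first and last `iterations` points move away from the original line, which is consistent with trimming that many values from each end.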

curran commented Jun 11, 2020

Here's a revised implementation, taking that end flattening effect into account:

export const blur = (data, property, numIterations) => {
  const n = data.length;
  let values = data.map((d) => d[property]);
  for (let j = 0; j < numIterations; j++) {
    // Build each pass from the previous pass only, so the window never
    // mixes already-averaged and not-yet-averaged values.
    values = values.map((current, i) => {
      const previous = values[i === 0 ? i : i - 1];
      const next = values[i === n - 1 ? i : i + 1];
      return (previous + current + next) / 3;
    });
  }
  // Chop off the ends, as they may represent misleading "flattening".
  return data
    .map((d, i) => ({ ...d, [property]: values[i] }))
    .slice(numIterations, n - numIterations);
};

Fil commented Jun 11, 2020

Here's a fork which uses my gaussian kernel blurring; it doesn't seem to suffer from the problem you describe?

curran commented Jun 11, 2020

Very nice! Indeed, the problem is not there. The problem must lie in the repeated iteration technique.

Fil commented Jun 12, 2020

This new notebook shows how we could use the same API for 1D and 2D blur:
https://observablehq.com/@fil/moving-average-blur
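
To make the "same API for 1D and 2D" idea concrete, here is a minimal sketch of a separable box blur over a flat, row-major array. It is not the notebook's implementation and is not optimized: `boxBlur` and `averageRun` are hypothetical names, and the signature (optional `width` and `height`, clamped edges, in-place update) is an assumption made for this example.

```javascript
// Hypothetical helper: one clamped moving-average pass over a strided run of
// `count` values in `data`, beginning at `start` and stepping by `stride`.
function averageRun(data, start, count, stride, radius) {
  const source = [];
  for (let i = 0; i < count; i++) source.push(data[start + i * stride]);
  for (let i = 0; i < count; i++) {
    let sum = 0;
    for (let k = -radius; k <= radius; k++) {
      sum += source[Math.min(Math.max(i + k, 0), count - 1)];
    }
    data[start + i * stride] = sum / (2 * radius + 1);
  }
}

// Hypothetical signature: blurs a flat row-major array in place. With the
// default height of 1 the same function acts as a 1D blur.
function boxBlur(data, radius, width = data.length, height = 1) {
  for (let y = 0; y < height; y++) averageRun(data, y * width, width, 1, radius);
  if (height > 1) {
    for (let x = 0; x < width; x++) averageRun(data, x, height, width, radius);
  }
  return data;
}

// 1D: a single spike spreads over its neighbors.
const line = boxBlur([0, 0, 3, 0, 0], 1);

// 2D: a 3x3 grid with a spike in the center.
const grid = boxBlur([0, 0, 0, 0, 9, 0, 0, 0, 0], 1, 3, 3);

console.log(line, grid);
```

The 2D case reuses the 1D pass twice, once along rows and once along columns, which is why a single function can serve both.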

I've also straightened the implementation a little to get more juice out of it (about 30% faster in my tests). I've tried to implement Gaussian blur, but it does not compete at all in terms of speed.

Fil added a commit that referenced this issue Jun 13, 2020
Fil mentioned this issue Jun 13, 2020

Fil commented Jun 13, 2020

And here is another fork, this time using the d3.blur proposal.

curran commented Apr 12, 2021

To close the loop here, the implementation was published as a separate package: https://github.com/Fil/array-blur

Fil added a commit that referenced this issue Jun 30, 2022
fixes #56
Fil added a commit that referenced this issue Jun 30, 2022
fixes #56
Fil closed this as completed in #151 Jul 2, 2022
Fil added a commit that referenced this issue Jul 2, 2022
* d3.blur

fixes #56
Co-authored-by: Mike Bostock <[email protected]>