Something went wrong with noise. #202

brettviren · 2023-03-22T20:56:43Z

Discovered by Mike Wang (with a DNN no less!) and reported here:

https://cdcvs.fnal.gov/redmine/issues/27898

Some regression related to noise crept in. Haiwang finds it most plainly visible as uncharacteristically high or low values at the very beginning and end of the mean waveform. Between v0.20.0 and 0.23.0:

Looking at the first ticks across a few versions (also from Haiwang):

Something between 0.21 and 0.23 went "bad". PR #172 is a suspect.

HaiwangYu · 2023-03-23T00:06:03Z

a note with more info:
2023-03-22 noise-debug-v2.pdf

brettviren · 2023-03-23T12:46:25Z

Zooming in on the left side also shows the problem:

https://www.phy.bnl.gov/~bviren/tmp/wctsim/osd/bad-noise/bad-noise.html

Time is on the X-axis. A ridge of high samples runs down the channels, followed by a broader valley of low samples.

Note, each channel group samples noise from different, arbitrary wire lengths and is not meant to be a correct model for any given detector. The data is generated via:

https://github.com/WireCell/wire-cell-toolkit/blob/master/gen/test/test-noise-roundtrip.jsonnet

HaiwangYu · 2023-03-23T13:23:23Z

Yes, I saw that too on all 3 planes of the PDSP simulation.

brettviren · 2023-03-23T19:19:07Z

Well, it seems this bug is due, at least in part, to a rather embarrassing blunder on my part!

During the refactoring of the "recycling randoms" I went through some iterations of changing the internal API. After that was done, I did not correctly update how that API is called when we create a recycling random generator.

https://github.com/WireCell/wire-cell-toolkit/blame/master/gen/src/AddNoise.cxx#L56

As a consequence, the "percentage replacement" value was being used to set the mean of the normal distribution. That mean should have been set to zero but was set to 0.04.

Changing the code to call make_recycling() with proper mean=0 and sigma=1 arguments removes the weird high/low ends of the mean waveforms. Because 0.04 is close to 0.0, this bug was not so obvious.

That is a real bug but even given a non-zero Gaussian mean, I do not understand the cause of the high/low ends of the mean waveform. Something else is likely lurking. I'll check a bit more.
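To illustrate how this kind of argument mix-up can be guarded against, here is a minimal Python sketch of a "recycling" normal generator. It is not the toolkit's C++ code: the class name `RecyclingNormals`, its parameters and the pool-refresh strategy are all hypothetical. The only point is that keyword-only arguments prevent a replacement fraction from being silently taken as the mean.

```python
import numpy as np

class RecyclingNormals:
    """Hypothetical stand-in for a "recycling" normal generator (not WCT code)."""

    def __init__(self, rng, capacity, *, mean=0.0, sigma=1.0, replace=0.04):
        # Keyword-only parameters: a replacement fraction passed positionally
        # can no longer be mistaken for the mean.
        self.rng, self.mean, self.sigma = rng, mean, sigma
        self.nreplace = int(round(replace * capacity))
        self.pool = rng.normal(mean, sigma, capacity)

    def __call__(self, size):
        # Refresh a small fraction of the pool, then reuse the rest.
        idx = self.rng.integers(0, self.pool.size, self.nreplace)
        self.pool[idx] = self.rng.normal(self.mean, self.sigma, self.nreplace)
        # Read `size` values from a random offset, wrapping around the pool.
        start = self.rng.integers(0, self.pool.size)
        return np.take(self.pool, np.arange(start, start + size), mode="wrap")

rng = np.random.default_rng(42)
normals = RecyclingNormals(rng, 10000, mean=0.0, sigma=1.0, replace=0.04)
sample = normals(5000)
print(sample.mean(), sample.std())
```

With this signature, a call such as `RecyclingNormals(rng, 10000, 0.04)` raises a `TypeError` instead of quietly producing normals with mean 0.04.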

brettviren · 2023-03-23T19:33:10Z

Hmm, the generated normals are consumed to construct a complex spectrum modulated by the Gaussian sigmas from the input mean spectrum. Using a non-zero mean for the "normal" distribution certainly causes the result to be non-Rayleigh so some distortion must happen. Not sure why this exact shape, but I think that's enough to believe there is no deeper problem.
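A toy model may help show why a non-zero mean can surface at the ends of the mean waveform. The sketch below is an assumption-laden stand-in, not the toolkit's sampler: the spectrum shape, sizes and the helper `sample_waves` are made up. In this model the expected spectrum becomes `mean * sigma * (1 + 1j)` rather than zero, and the inverse transform of such a smooth spectrum is concentrated near tick 0, wrapping around to the last ticks.

```python
import numpy as np

def sample_waves(sigma, nwaves, mean=0.0, seed=0):
    """Toy sampler: draw real and imaginary parts per frequency bin from
    N(mean, 1), scale by the per-bin sigma, and inverse-FFT to waveforms."""
    rng = np.random.default_rng(seed)
    shape = (nwaves, sigma.size)
    spec = sigma * (rng.normal(mean, 1.0, shape) + 1j * rng.normal(mean, 1.0, shape))
    return np.fft.irfft(spec, axis=1)

nticks = 512
freqs = np.fft.rfftfreq(nticks)
sigma = freqs * np.exp(-freqs / 0.05)  # made-up smooth spectrum, zero at DC

for mean in (0.0, 0.04, 1.0):
    mean_wave = sample_waves(sigma, 2000, mean).mean(axis=0)
    print(f"mean={mean}: first tick {mean_wave[0]:+.2e}, "
          f"mid-waveform max |value| {np.abs(mean_wave[200:300]).max():.2e}")
```

In this toy, the first tick of the mean waveform stands out from the middle of the waveform once the mean is non-zero, and the effect scales with the mean. Whether this fully accounts for the exact shape seen in the real plots is not established here.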

brettviren · 2023-03-23T20:08:03Z

Here is the case when I make the mean of the "normal" distribution ridiculously large (1.0).

Note the exaggerated ends of the mean waveforms below each frame.

Here is after using proper mean of 0 for the normal distribution:

The inbound commit provides check_noise_roundtrip.py, which makes these plots and prints some stats, and test-noise-roundtrip.bats, which makes these plots and fails if the stats go too far out of spec again.

HaiwangYu · 2023-03-23T20:52:47Z

@brettviren, thanks for figuring this out so quickly. I will try it out too.

A small comment: for your test plot, is it possible to move the coordinate axes a bit away from the content (2D image and 1D waveforms), so that the edge ticks/channels can be viewed more easily?

A question: what does this test plot look like if you use mean=0.04 instead of 1? Will the test fail in that case?

HaiwangYu · 2023-03-24T16:16:47Z

Hi @brettviren, it seems the spectra with this fix went up a bit compared to 0.20.0 and 0.17.1.

workarea:
https://www.phy.bnl.gov/~yuhw/wct-ci/gen/

gen:

wire-cell -l stdout -L debug -c check_pdsp_noise.jsonnet -V output=frames-9485256.tar.bz2

compare mean spec

wirecell-plot comp1d -n spec frames-0.17.1.tar.bz2 frames-9485256.tar.bz2 spec-u.pdf --chmin 0 --chmax 800 --interactive

brettviren · 2023-03-24T16:53:50Z

Thanks @HaiwangYu. As mentioned in our chat, definitely there should be some change in the spectrum but my naive understanding is that the spectrum would become slightly lower after the bug fix. I'll look into this more!

HaiwangYu · 2023-03-24T19:01:38Z

Hi @brettviren, I checked the following variations of 9485256 and the higher spectra remain. So I think something other than Normals::make_recycling changed between 0.23.0 and 9485256.

v2: re-run
v3: mean 0 -> 0.04
v4: revert to original call

However, if I add the correct call to Normals::make_recycling in 0.23.0, I do get matching mean spectra and waveforms:

wirecell-plot comp1d -n spec frames-0.17.1.tar.bz2 frames-0.23.0-v3.tar.bz2 spec-u.pdf --chmin 0 --chmax 800 --interactive
wirecell-plot comp1d -n wave frames-0.17.1.tar.bz2 frames-0.23.0-v3.tar.bz2 tmp.pdf --chmin 0 --chmax 800 --interactive

HaiwangYu · 2023-03-24T19:17:17Z

I checked that they should be loading the same noise spectra too.

brettviren · 2023-03-24T19:33:10Z

@HaiwangYu thanks, I'll look at these types of plots next.

I checked kind of the same thing using the "roundtrip" test which applies this chain:

spec->sampling->volts->adc->dac->volts->fitting->spec

With bug fix, mean=0, output spec matches input spec up to distortions which I believe are due to inescapable ADC quantization error.

With bug added back, mean=0.04, the output spectra are slightly higher than input, beyond what quantization adds.

Adding "superbug" with mean=1.0, the fit spectra go very high. E.g., an input spectrum peak at 0.16V gives an output of about 0.24V.
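The direction of these shifts can be reproduced in a stripped-down round trip. The sketch below is only a toy under stated assumptions: it skips the ADC/DAC steps entirely, uses a made-up spectrum, and estimates the per-bin sigma from the mean squared amplitude rather than by the fit used in the real test. In this model the measured-to-input ratio is expected to be `sqrt(1 + mean**2)`, since each of the real and imaginary parts then has second moment `1 + mean**2`.

```python
import numpy as np

def measured_over_true(mean, nwaves=2000, nticks=512, seed=0):
    """Toy round trip (no ADC step): sample spectra, go to waveforms and back,
    then estimate the per-bin sigma and compare it with the input."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(nticks)
    sigma = freqs * np.exp(-freqs / 0.05)          # made-up input spectrum
    shape = (nwaves, sigma.size)
    spec = sigma * (rng.normal(mean, 1.0, shape) + 1j * rng.normal(mean, 1.0, shape))
    waves = np.fft.irfft(spec, n=nticks, axis=1)
    back = np.fft.rfft(waves, axis=1)
    # For zero-mean normals E|X|^2 = 2 sigma^2; skip the DC and Nyquist bins.
    est = np.sqrt(np.mean(np.abs(back[:, 1:-1]) ** 2, axis=0) / 2.0)
    return float(np.mean(est / sigma[1:-1]))

for mean in (0.0, 0.04, 1.0):
    print(mean, round(measured_over_true(mean), 3))
```

The toy gives a tiny inflation for mean=0.04 and a large one for mean=1.0, the same ordering as reported above. It does not attempt to reproduce the quoted 0.16V to 0.24V numbers.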

brettviren · 2023-03-24T21:19:23Z

I see something a little different using comp1d on the "roundtrip" noise. The actual spectra are hand-made to mimic official PDSP noise but the shapes should not be exactly the same. Also, the "roundtrip" test does not care about canonical "planes". The channel range I plot is simply the set of channels getting the maximum noise in the test.

Note, I just pushed a small change to the comp1d plotter. Actual changes are only cosmetic.

I'm using EmpiricalNoiseModel as you are. I have also looked at using GroupNoiseModel and observe the same "recycling bug".

I note two differences in comparing the "bug" vs "no bug" pair with "roundtrip" noise relative to your pair:

Noise from the "roundtrip" test has a closer match between "bug" and "no bug".
It also has a strong DC component that is missing from your noise. This is after subtracting the median. If instead I use the new --baseline=mean I get a zero-valued DC component. Perhaps this is related to integer roundoff? I note, your sample also has non-zero DC.

So, what's still going on?

Something about using the concocted noise spectra in the "roundtrip" test instead of official PDSP noise spectra?
Something about mixing signal + noise (your test) vs pure noise (my test)?

I'll audit the code between 0.23 and now.

brettviren · 2023-03-24T22:09:11Z

@HaiwangYu I guess something may be off about your tests? Here is what I see with latest HEAD with your check_pdsp_noise.jsonnet from your web site:

HaiwangYu · 2023-03-24T22:29:34Z

Hi @brettviren, I compared 0.20.0 and 9485256 for the u plane [0, 800), and 9485256 has higher spectra. Did you see a different behavior?

brettviren · 2023-03-24T22:34:49Z

Okay. I'm confused. You wrote 0.23 before.

HaiwangYu · 2023-03-24T22:36:13Z

If I check the same channels with the same configuration, I get the same results as you.

brettviren · 2023-03-24T22:37:05Z

Okay, I will look at 0.20 + bug fix compared to HEAD.

HaiwangYu · 2023-03-24T22:38:02Z

> Okay. I'm confused. You wrote 0.23 before.

0.23 has the same spectra as 0.20. 9485256 is also higher than 0.23.

brettviren · 2023-03-24T22:39:12Z

Ah, got it. Then I'll look at 0.23! 😄

brettviren · 2023-03-24T23:00:02Z

@HaiwangYu I don't know what I'm missing. Except for the DC frequency bin and the first/last time bins, I get good agreement between 0.23 vs HEAD and "0.04 bug" vs "no bug".

HaiwangYu · 2023-03-25T00:46:59Z

@brettviren, I think I found one change in PR175 may be the reason:

wire-cell-toolkit/gen/src/Digitizer.cxx

Line 94 in c84c88b

return round(relvoltage * adcmaxval);

I used 0.17.1 as the ref: before IDFT, noise refactoring
pr175: higher than ref
pr175 without round: same as ref

I guess this is because if no explicit round called here, some downstream code would implicitly use floor.

More in:
2023-03-22 noise-debug-v2-digitizer.pdf

brettviren · 2023-03-25T14:54:11Z

Excellent work, @HaiwangYu !

I wish I had given more info in my commit message as I'm having a hard time remembering what exactly motivated adding round(). This particular commit is in the middle of a series related to BlobDepoFill which is not directly related to Digitizer. I must have found some problem as a side effect of the BlobDepoFill work.

I'd expect using round() would shift the ADC waveforms upward by 1/2 count on average compared to floor(). We won't see this shift in the wave plots above due to the re-baselining comp1d applies. The spec plot shift is in the right direction for this explanation but it is not obvious that the size of the shift is consistent with 1/2 count.

I'll spend some time to better understand the implication of round() compared to floor().
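The expected half-count shift is easy to check numerically. This is a minimal sketch with made-up numbers (a baseline of 900.3 counts and a noise RMS of 3 counts), not output from the toolkit. It uses `numpy.rint`, which rounds ties to even, unlike C's `round()`, but ties essentially never occur for continuous inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up ADC-scale samples: a baseline of 900.3 counts plus Gaussian noise
# with an RMS of 3 counts.
adc_float = 900.3 + rng.normal(0.0, 3.0, 100000)

floored = np.floor(adc_float)
rounded = np.rint(adc_float)   # round-half-even; ties have zero measure here

shift = rounded.mean() - floored.mean()
print(f"mean(round) - mean(floor) = {shift:.3f} counts")
print(f"std(floor) = {floored.std():.3f}, std(round) = {rounded.std():.3f}")
```

When the noise spans several counts, as here, the two quantizations differ by close to half a count in the mean while their spreads are nearly identical. The low-noise regime, where the noise is a fraction of a count, is where the two can behave differently.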

brettviren · 2023-03-25T18:53:09Z

Check `round` vs `floor` at different noise levels

I use the "roundtrip" test configured with a parameterized noise spectrum that roughly models real spectra. For every group of 256 channels out of a total of 2560 channels I give a progressively larger fraction of the parameterized spectrum. E.g., the first group gets 0.1x total noise, the second 0.2x, etc.

The nobug label means that the original bug that spawned this issue is fixed.

Measured mean spectra and waveform vs amount of noise

I show the measured mean spectra and waveforms for the three groups with least noise (0.1x, 0.2x and 0.3x) for the cases of directly truncating floating-point ADC values to integers (marked as floor) and after first applying round(). Starting with the 3rd group, the higher noise groups have no substantial differences between floor and round.

AC-coupled mean spectra

I simply set the zero-frequency bin to 0 so baseline shifts do not dominate the plot of the spectra. No other transformation is done (besides taking mean across the group of channels).
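As a sketch of the transformation just described, assuming a frame stored as a `(nchannels, nticks)` array and a full complex FFT (the actual plotting code may differ), the AC-coupled mean spectrum of one channel group can be computed as follows. The helper name `mean_ac_spectrum` and the test frame are made up for illustration.

```python
import numpy as np

def mean_ac_spectrum(frame):
    """Mean amplitude spectrum over channels with the zero-frequency bin zeroed.
    `frame` is assumed to have shape (nchannels, nticks)."""
    spec = np.abs(np.fft.fft(frame, axis=1))
    spec[:, 0] = 0.0            # drop DC so baseline shifts do not dominate
    return spec.mean(axis=0)    # average across the channel group

rng = np.random.default_rng(0)
frame = 900.0 + rng.normal(0.0, 3.0, (256, 1024))  # one made-up group of 256 channels
spec = mean_ac_spectrum(frame)
print(spec[:4])
```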

AC-coupled mean waveforms

The corresponding AC-coupled waveforms are:

DC-coupled mean waveforms

To show the baseline shift directly, the untransformed mean waveforms are:

Conclusions

As expected, round() imparts a baseline shift of 1/2 ADC count compared to floor.
I think that the large distortion in group 1 spectrum for floor is due to more extreme quantization error than round. I think this also explains why floor gives much smaller variance in the mean waveform compared to round.
PDSP mean waveform has std=0.25 compared to group 3's std=0.04 or group 10's std=0.12. Already group 3 has no floor vs round spectra difference besides baseline shift. This leaves me with some trouble to understand why floor vs round on PDSP spectrum makes a difference.

What do?

This really comes down to what is the correct vs desired model for ADC quantization. Since the answer for each may not be the same and may be a decision for each experiment, I will make the application of round() a configurable parameter.

But we should still have a sensible default. Unless people have opinions, I will make floor the default.

brettviren · 2023-03-25T20:13:55Z

Check original bug, double sized bug and no bug

Here I look at three cases:

nobug we properly use Normal distribution
smlbug we use Gaussian with mean=0.04 (original bug)
bigbug we use Gaussian with mean=0.8

This uses the same "roundtrip" test as above.

Here the focus is just on the case of floor quantization and the highest noise ("group 10").

The three mean AC-coupled waveforms:

And the three mean AC-coupled spectra:

Neither bug nor "double-bug" appears to distort the spectrum despite the damage at the ends of the waveforms. The bug fix removes that damage.

brettviren · 2023-03-26T18:47:14Z

For posterity, here is a simple demonstration about the choice between round and floor in the Digitizer.

As a function of input voltage it shows the ADC as untruncated float, floor or round at either end of the fullscale range.

It is clear how floor is a maximally biased choice.

However, what is not shown is that we also add a baseline to the voltage. This baseline could be chosen to shift the voltage by 1/2 ADC step such that floor would not actually be biased. Getting this actually correct requires some careful study. Note, in the plot, baseline=0.
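The bias argument, including the half-step baseline remark, can be sketched with a toy digitizer. The function below is an assumption for illustration and not the toolkit's Digitizer: the full-scale range, resolution and the `baseline` and `rounding` parameters are made up.

```python
import numpy as np

def digitize(volts, fullscale=(0.0, 2.0), resolution=12, baseline=0.0, rounding=False):
    """Toy digitizer (an assumption, not the WCT Digitizer): map volts in the
    full-scale range to [0, 2**resolution - 1] and quantize."""
    adcmax = (1 << resolution) - 1
    rel = (np.asarray(volts) + baseline - fullscale[0]) / (fullscale[1] - fullscale[0])
    adc = np.clip(rel, 0.0, 1.0) * adcmax
    return np.rint(adc) if rounding else np.floor(adc)

rng = np.random.default_rng(0)
volts = rng.uniform(0.1, 1.9, 100000)
half_step = 0.5 * (2.0 - 0.0) / ((1 << 12) - 1)   # half an ADC step, in volts

exact = volts / 2.0 * 4095                        # untruncated float ADC value
bias_floor = np.mean(digitize(volts) - exact)
bias_round = np.mean(digitize(volts, rounding=True) - exact)
bias_shifted = np.mean(digitize(volts, baseline=half_step) - exact)
print(bias_floor, bias_round, bias_shifted)
```

In this toy, plain floor sits about half a count below the untruncated value on average, while both round and floor-with-half-step-baseline are close to unbiased. A real configuration adds its own baseline, so the net bias depends on how that baseline falls relative to the ADC steps.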

HaiwangYu · 2023-03-27T00:51:04Z

Hi @brettviren, I think I made a mistake in the original comp1d that caused the large spectra discrepancy I saw.

The following code subtracts the median first and then converts the frame to 'int16' at the end, which drops the fractional part of each sample. This causes quite some bias and, furthermore, biases the frames from Digitizer::round and Digitizer::floor differently.

frame = numpy.array((frame.T - numpy.median(frame, axis=1)).T, dtype=dtype)

After fixing this, or using your new comp1d, the spectra are consistent between Digitizer::round and Digitizer::floor except for the DC component.

I am so sorry for making this mistake.

Meanwhile, it is great that we noticed this Digitizer::floor to Digitizer::round change. It may be just a simple baseline shift in most cases, but it is still good to thoroughly understand its potential impact.
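A tiny example shows what the median-subtract-then-cast pattern quoted in this comment does to the samples. Note that a NumPy cast from float to an integer dtype truncates toward zero, so small positive and negative excursions both collapse to 0. The floating-point variant is one possible remedy and an assumption here, not necessarily the change that was made to comp1d.

```python
import numpy as np

def rebaseline_cast(frame, dtype="int16"):
    # The pattern quoted in the comment: subtract the per-channel median, then cast.
    return np.array((frame.T - np.median(frame, axis=1)).T, dtype=dtype)

def rebaseline_float(frame):
    # One possible fix (an assumption about the remedy): stay in floating point.
    return frame - np.median(frame, axis=1, keepdims=True)

frame = np.array([[10, 11, 12, 13]])   # one channel; its median is 11.5
print(rebaseline_cast(frame))          # fractional parts are dropped
print(rebaseline_float(frame))
```

Because the median of integer samples can land on a half-integer or an integer depending on the data, frames quantized with round and with floor can be affected differently by the cast.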

brettviren · 2023-03-27T11:38:19Z

Ah, great, thanks for checking this, @HaiwangYu! I think that closes the last concern.

BTW, I had also looked at that original median subtraction and casting with some interest but could not see anything wrong with it.

HaiwangYu · 2023-03-27T15:57:36Z

Hi @brettviren, I tried to overlay the input spectra and it seems the older version has a better consistency. My plotting script:

import bz2
import json
import math

def specs_from_file(spectra_file, planes=None, wirelen=7500, scale_factor=1.0e9):
    '''
    Return (freqs, amps) for the first spectrum matching planes and wirelen.
    default unit is megavolt
    MV -> mV scale_factor=1.0e9
    '''
    with bz2.open(spectra_file, 'rb') as fp:
        wire_specs = json.loads(fp.read())
    for wire_spec in wire_specs:
        if planes is not None and wire_spec['plane'] not in planes:
            continue
        # keep only spectra whose wire length is within 10 of the requested one
        if abs(wire_spec['wirelen'] - wirelen) > 10:
            continue
        print("const {:.3e} plane {}, wirelen {:.1f}".format(wire_spec['const'], wire_spec['plane'], wire_spec['wirelen']))
        freqs = [x*1000 for x in wire_spec['freqs']]
        # add the white-noise constant in quadrature, then rescale
        amps = [math.sqrt(x**2 + wire_spec['const']**2)*scale_factor for x in wire_spec['amps']]
        return (freqs, amps)

and ADC -> mV scale is 4095./1400.

brettviren · 2023-03-28T15:46:58Z

Hi @HaiwangYu

In the refactoring I missed the fact that the EmpiricalNoiseModel returns a (half) spectrum with more than the number of requested samples (equal to the "fft best length").

The IncoherentAddNoise was then using only the requested spectrum size portion and that effectively stretched the spectrum toward higher frequencies. The fix now has IncoherentAddNoise being more tolerant. It will perform the sampling on the overly large spectrum and then truncate the resulting waveform to fit the requested size.
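The stretching effect can be sketched with a deterministic toy. All sizes and the spectrum shape below are made up, and a fixed-phase spectrum is used instead of random sampling to keep the example reproducible. It compares interpreting the leading part of an oversized spectrum as if it were a requested-size spectrum against transforming at full size and truncating the waveform.

```python
import numpy as np

nfft, nreq = 1024, 600                 # made-up "fft best length" and requested size
freqs = np.fft.rfftfreq(nfft)          # cycles per tick
amp = np.exp(-0.5 * ((freqs - 0.1) / 0.01) ** 2)   # toy spectrum peaking at 0.1

def peak_freq(wave):
    """Frequency (cycles per tick) of the largest spectral amplitude."""
    spec = np.abs(np.fft.rfft(wave))
    return np.fft.rfftfreq(wave.size)[np.argmax(spec)]

# Problem pattern: keep only the first nreq//2 + 1 bins and treat them as an
# nreq-sample spectrum, so bin k now means k/nreq instead of k/nfft.
stretched = np.fft.irfft(amp[: nreq // 2 + 1], n=nreq)

# Tolerant pattern: transform at the full size, then truncate the waveform.
truncated = np.fft.irfft(amp, n=nfft)[:nreq]

print(peak_freq(stretched), peak_freq(truncated))
```

In this toy the first pattern moves the peak up in frequency by roughly the ratio `nfft / nreq`, while the second keeps it near the intended 0.1 cycles per tick.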

I also found and fixed some problems with the handling of the white-noise "constant" that EmpiricalNoiseModel supports.

There is still a 3% increase (0.106 -> 0.109, see last plot below) in the RMS of final waveforms compared to rel 0.20.0 and 0.21.0. I think this is small enough that we need not delay making a release with all these fixes. Let me know if you disagree.

See below for some diagnostic plots comparing current HEAD with past releases. First the spectrum with 2 zooms at peak and bump and then the waveforms.

HaiwangYu · 2023-03-28T19:47:56Z

Thanks @brettviren. Seems all noise related tests are OK for me. I will do some more tests and make the release.
2023-03-28 noise-debug-v2-spctra-shift.pdf

brettviren closed this as completed in 9485256 Mar 23, 2023

HaiwangYu reopened this Mar 24, 2023

brettviren closed this as completed in 56aae07 Mar 25, 2023

HaiwangYu reopened this Mar 27, 2023

brettviren closed this as completed in 22154a5 Mar 28, 2023

brettviren added a commit that referenced this issue Mar 29, 2023

Checkpoint before merge latest from issue #202 fixes

ede5dc8

brettviren mentioned this issue Apr 28, 2023

DC noise component may not be correct. #218

Open

2 tasks
