-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Something went wrong with noise. #202
Comments
a note with more info: |
Zooming in on the left side also shows the problem: https://www.phy.bnl.gov/~bviren/tmp/wctsim/osd/bad-noise/bad-noise.html Time is on the X-axis. A ridge of high samples go down the channels. Then there is a broader valley of low samples. Note, each channel group samples noise from different, arbitrary wire lengths and is not meant to be a correct model for any given detector. The data is generated via: https://github.com/WireCell/wire-cell-toolkit/blob/master/gen/test/test-noise-roundtrip.jsonnet |
Well, it seems this bug is due, at least in part, to a rather embarrassing blunder on my part! During the refactoring of the "recycling randoms" I went through a some iteration on changing the internal API. After that was done, I did not correctly update how that API is called when we create a recycling random generator. https://github.com/WireCell/wire-cell-toolkit/blame/master/gen/src/AddNoise.cxx#L56 As a consequence, the "percentage replacement" value was being used to set the mean of the normal distribution. That mean should have been set to zero but was set to 0.04. Changing the code to call That is a real bug but even given a non-zero Gaussian mean, I do not understand the cause of the high/low ends of the mean waveform. Something else is likely lurking. I'll check a bit more. |
Hmm, the generated normals are consumed to construct a complex spectrum modulated by the Gaussian sigmas from the input mean spectrum. Using a non-zero mean for the "normal" distribution certainly causes the result to be non-Rayleigh so some distortion must happen. Not sure why this exact shape, but I think that's enough to believe there is no deeper problem. |
Here is the case when I make the mean of the "normal" distribution ridiculously large (1.0). Note the exaggerated ends of the mean waveforms below each frame. Here is after using proper mean of 0 for the normal distribution: The commit which is inbound provides |
@brettviren, thanks for figuring this out so quickly. I will try it out too. A small comment: For your test plot, is that possible to move the coordinate axises a bit away from the content (2D image and 1D waveforms)? So that the edge ticks/channels can be viewed easier. A question: what this test plot looks like if you use mean=0.04 instead of 1? Will the the test fail in that case? |
Hi @brettviren, seems the spectra with this fix went up a bit compared to 0.20.0 and 0.17.1. workarea: gen:
compare mean spec
|
Thanks @HaiwangYu. As mentioned in our chat, definitely there should be some change in the spectrum but my naive understanding is that the spectrum would become slightly lower after the bug fix. I'll look into this more! |
Hi @brettviren, I checked the following variations of 9485256, the higher spectra remains. So I think something else than
While if I add the correct function call of
|
@HaiwangYu thanks, I'll look at these type of plots next. I checked kind of the same thing using the "roundtrip" test which applies this chain:
With bug fix, mean=0, output spec matches input spec up to distortions which I believe are due to inescapable ADC quantization error. With bug added back, mean=0.04, the output spectra are slightly higher than input, beyond what quantization adds. Adding "superbug" with mean=1.0, the fit spectra go very high. Eg, input spectrum peak at 0.16V gives output 0f about 0.24V. |
I see something a little different using Note, I just pushed a small change to the I'm using I note two differences in comparing "bug" vs "no bug" pair with "roundtrip" noise related to your pair:
So, what's still going on?
I'll audit the code between 0.23 and now. |
@HaiwangYu I guess something may be off about your tests? Here is what I see with latest HEAD with your |
Hi @brettviren, I compared 0.20.0 and 9485256, for u plane [0, 800). And 9485256 has a higher spectra. Did you see a different behavior? |
Okay. I'm confused. You wrote 0.23 before. |
Okay, I will look at 0.20 + bug fix compared to HEAD. |
0.23 has the same spectra as 0.20. 9485256 is also higher than 0.23. |
Ah, got it. Then I'll look at 0.23! 😄 |
@HaiwangYu I don't know what I'm missing. Except for the DC frequency bin and the first/last time bins, I get good agreement between 0.23 vs HEAD and "0.04 bug" vs "no bug". |
@brettviren, I think I found one change in PR175 may be the reason: wire-cell-toolkit/gen/src/Digitizer.cxx Line 94 in c84c88b
I guess this is because if no explicit |
Excellent work, @HaiwangYu ! I wish I had given more info in my commit message as I'm having a hard time remembering what exactly motivated adding I'd expect using I'll spend some time to better understand the implication of |
Check
|
Check original bug, double sized bug and no bugHere I look at three cases:
This uses the same "roundtrip" test as above. Here the focus is just on the case of The three mean AC-coupled waveforms: And the three mean AC-coupled spectra: Neither bug nor "double-bug" appears to distort the spectrum despite the damage at the ends of the waveforms. The bug fix removes that damage. |
For posterity, here is a simple demonstration about the choice between As a function of input voltage it shows the ADC as untruncated It is clear how However, what is not shown is that we also add a |
Hi @brettviren, I think I made a mistake in the original The following code subtracts the median first then convert the frame to 'int16' by call
After fixing this, or using your new I am so sorry for making this mistake. Meanwhile, it is great that we noticed this |
Ah, great, thanks for checking this, @HaiwangYu! I think that closes the last concern. BTW, I had also looked at that original median subtraction and casting with some interest but could not see anything wrong with it. |
Hi @brettviren, I tried to overlay the input spectra and it seems the older version has a better consistency. My plotting script: def specs_from_file(spectra_file, planes=None, wirelen=7500, scale_factor=1.0e9):
'''
default unit is megavolt
MV -> mV scale_factor=1.0e9
'''
wire_specs = json.loads(bz2.BZ2File(spectra_file, 'r').read())
for i, wire_spec in enumerate(wire_specs) :
if planes is not None and wire_spec['plane'] not in planes:
continue
if abs(wire_spec['wirelen']-wirelen) > 10 :
continue
print("const {:.3e} plane {}, wirelen {:.1f}".format(wire_spec['const'],wire_spec['plane'],wire_spec['wirelen']))
freqs = [x*1000 for x in wire_spec['freqs']]
amps = [math.sqrt(x**2+(wire_spec['const'])**2)*scale_factor for x in wire_spec['amps']]
return (freqs,amps) and ADC -> mV scale is |
Hi @HaiwangYu In the refactoring I missed the fact that the The I also found and fixed some problems with the handling of the white-noise "constant" that There is still a 3% increase (0.106 -> 0.109, see last plot below) in the RMS of final waveforms compared to rel 0.20.0 and 0.21.0. I think this is small enough that we need not delay making a release with all these fixes. Let me know if you disagree. See below for some diagnostic plots comparing current |
Thanks @brettviren. Seems all noise related tests are OK for me. I will do some more tests and make the release. |
Discovered by Mike Wang (with a DNN no less!) and reported here:
https://cdcvs.fnal.gov/redmine/issues/27898
Some regression related to noise crept in. Haiwang finds it most plain as uncharacteristic high or low values at the very begin and end of the mean waveform. Between v 0.20.0 and 0.23.0:
Looking at the first ticks across a few versions (also from Haiwang):
Something between 0.21 and 0.23 went "bad". PR #172 is a suspect.
The text was updated successfully, but these errors were encountered: