
Repairing unit tests for source separation #240

Merged 1 commit into mir-evaluation:master from sourcesep-test-fixture-fix on Feb 28, 2017

Conversation

bmcfee
Collaborator

@bmcfee bmcfee commented Feb 27, 2017

This PR should resolve #239 by relaxing the pass condition on frame-wise source separation metrics.

Again, this is a result of instability introduced by solve/lstsq having different implementations depending on the underlying linear algebra backend (BLAS/MKL/OpenBLAS, etc.).
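[Editor's note] The backend-dependent instability described above can be illustrated without any linear-algebra library at all: different backends may evaluate the same reduction in a different order, and floating-point addition is not associative, so two "correct" results can disagree by far more than a tight atol like 1e-12. A minimal stdlib-only sketch (not mir_eval code):

```python
import math

# The same sum evaluated in two orders, mimicking how different BLAS
# backends may reorder operations inside solve/lstsq.
values = [1e16, 1.0, -1e16, 1.0]

forward = sum(values)             # 1e16 absorbs the first 1.0 -> 1.0
backward = sum(reversed(values))  # -1e16 absorbs both 1.0 terms -> 0.0

# Both are legitimate IEEE-754 results, yet they differ by 1.0, so a
# strict tolerance fails and only a relaxed one treats them as equal.
assert forward != backward
assert not math.isclose(forward, backward, abs_tol=1e-12)
assert math.isclose(forward, backward, abs_tol=2.0)
```

This is why a regression fixture computed on one machine can fail on another (e.g. Travis) even though both answers are "right".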

@bmcfee bmcfee added the bug label Feb 27, 2017
@bmcfee bmcfee added this to the 0.5 milestone Feb 27, 2017
@craffel
Collaborator

craffel commented Feb 27, 2017

Appears to still fail :(

@bmcfee
Collaborator Author

bmcfee commented Feb 27, 2017

Dang. I rolled the threshold up until it passed locally; I guess Travis is a different story. Trying again at 1e-2.

@bmcfee
Collaborator Author

bmcfee commented Feb 27, 2017

Apparently even 1e-2 isn't enough. Maybe it's worth removing the offending fixtures instead?

@craffel
Collaborator

craffel commented Feb 27, 2017

It looks like the two remaining failing tests are still being called with atol 1e-12.

@bmcfee
Collaborator Author

bmcfee commented Feb 27, 2017

Aha -- the problem is actually that some of the failures are not framewise metrics.

At this point, I recommend reverting back to strict test thresholds, but dropping examples 00 and 04 from the fixtures. Were these generated specifically to test any particular failure modes, or are they just random clips?
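[Editor's note] The proposal above, restore the strict threshold but drop the two unstable fixtures, could be sketched roughly like this (fixture filenames and the `A_TOL` name are assumptions for illustration, not the actual mir_eval test layout):

```python
# Strict threshold, restored instead of being loosened.
A_TOL = 1e-12

# Hypothetical regression fixture files, e.g. one per test clip.
fixtures = ['test_separation_%02d.json' % i for i in range(10)]

# Examples 00 and 04 are numerically unstable across BLAS backends,
# so they are dropped rather than relaxing the tolerance for everyone.
UNSTABLE = {'test_separation_00.json', 'test_separation_04.json'}
fixtures = [f for f in fixtures if f not in UNSTABLE]

print(len(fixtures))  # 8 fixtures remain under the strict tolerance
```

The design trade-off: a loose tolerance weakens every test, while dropping the two pathological clips keeps the remaining fixtures as sharp regression checks.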

@faroit

faroit commented Feb 27, 2017

Were these generated specifically to test any particular failure modes, or are they just random clips?

I think those (failing) scores are quite low, which seems to increase the probability of numerical instabilities. However, they are not edge cases where one of the estimated targets is zero or identical to the reference.

@bmcfee
Collaborator Author

bmcfee commented Feb 28, 2017

@faroit so would we lose any case coverage by dropping these examples?

@faroit

faroit commented Feb 28, 2017

short answer: no

longer answer:

04 is a nice example of one target (bass) having a really bad estimate, particularly in the first frames. But it's not the only fixture with bass separation, so it would still be okay to drop it.

[screenshot (2017-02-28): plot of the poor bass estimate in the first frames of example 04]

It would be interesting to see whether increasing the metric window (currently set to 120) could average out the precision error.
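[Editor's note] The intuition behind the larger-window question can be sketched without mir_eval (this is an illustrative toy, not the framewise metric implementation): a framewise metric pools samples within each window, so a larger window averages more samples per frame and small per-sample perturbations tend to cancel, tightening the per-frame estimates.

```python
import random

random.seed(0)
# Toy "signal": per-sample noise standing in for numerical perturbations.
signal = [random.gauss(0.0, 1.0) for _ in range(12000)]

def framewise_means(x, window):
    """Mean of each complete non-overlapping window of length `window`."""
    return [sum(x[i:i + window]) / window
            for i in range(0, len(x) - window + 1, window)]

def spread(frames):
    """Worst-case disagreement across frames."""
    return max(frames) - min(frames)

# With more samples per frame, the per-frame estimates cluster more tightly.
small_window = spread(framewise_means(signal, 120))
large_window = spread(framewise_means(signal, 1200))
assert large_window < small_window
```

Of course, a bigger window also changes what the metric measures (coarser time resolution), so this alone wouldn't necessarily be the right fix.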

@bmcfee bmcfee force-pushed the sourcesep-test-fixture-fix branch from e77c407 to a083db7 Compare February 28, 2017 14:16
@bmcfee bmcfee changed the title added a weak test condition for frame-wise source separation Repairing unit tests for source separation Feb 28, 2017
@bmcfee
Collaborator Author

bmcfee commented Feb 28, 2017

Okay, tests pass now that we killed the two offending cases.

@bmcfee
Collaborator Author

bmcfee commented Feb 28, 2017

@craffel if you're okay with dropping these fixtures, this guy's good to go.

@craffel craffel merged commit 89f8ef1 into mir-evaluation:master Feb 28, 2017
@craffel
Collaborator

craffel commented Feb 28, 2017

Merged, thank you sir.

@faroit

faroit commented Mar 1, 2017

thanks for taking care of this!

@bmcfee bmcfee deleted the sourcesep-test-fixture-fix branch March 1, 2017 13:37
Successfully merging this pull request may close these issues.

Fix source separation unit tests