Adding framewise evaluation for bss_eval_images #212

ecmjohnson · 2016-07-26T08:11:33Z

Framewise Images Evaluation

The last of the series: A function has been added in the separation module (bss_eval_images_framewise) that is identical in functionality to bss_eval_sources_framewise, but calls bss_eval_images in place of bss_eval_sources. This is in response to discussions in #68 and would also be useful for #192.

As with bss_eval_sources_framewise, if the window and hop parameters would result in fewer than 2 windows being computed, the function simply returns the result of bss_eval_images over the full signals.
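
A minimal sketch of the fallback condition described above. It assumes the window count is derived from the signal length, window, and hop as shown; `count_windows` is a hypothetical helper for illustration, not a function in mir_eval, and the sample rate and durations are arbitrary.

```python
import numpy as np


def count_windows(nsampl, window, hop):
    """Number of full windows of length `window` reached by stepping `hop` samples."""
    return int(np.floor((nsampl - window + hop) / hop))


# 10 s of audio at 44.1 kHz with 30 s windows: no full window fits
nwin = count_windows(nsampl=10 * 44100, window=30 * 44100, hop=15 * 44100)

# Fewer than 2 windows: a framewise function would fall back to the
# non-framewise evaluation over the whole signal
use_fallback = nwin < 2
```

With these numbers `use_fallback` is true, so the whole-signal evaluation would be used.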

Testing

The required unit and regression testing has been added. The unit tests reuse the functions added for bss_eval_images and bss_eval_sources_framewise with slight modifications to accommodate the differences. Regression test data has been added in the test/data/separation/output0*.json files.


craffel · 2016-07-26T15:36:50Z

mir_eval/separation.py

+    )
+    # if fewer than 2 windows would be evaluated, return the images result
+    if nwin < 2:
+        return bss_eval_images(reference_sources,


Wouldn't this result in the function returning arrays of shape (nsrc,)? It seems for consistency that we should reshape so that it returns arrays of shape (nsrc, 1). But I'm open to discussion about it!

I think that's a good idea for both the framewise functions.
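
The reshape suggested above could look roughly like the following. This is only a sketch: the `sdr` values are arbitrary placeholders standing in for the output of a non-framewise call, and the variable names are illustrative.

```python
import numpy as np

# Placeholder per-source metric with shape (nsrc,), as a non-framewise call returns
sdr = np.array([5.2, 7.1, 3.4])

# Add a trailing frame axis so the fallback has the framewise layout (nsrc, nframes)
sdr_framewise = sdr[:, np.newaxis]
```

Here `sdr_framewise` has shape (3, 1), so callers can index the frame axis regardless of which code path produced the result.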

craffel · 2016-07-26T15:48:46Z

Thank you again! Of course, looks good overall; had one comment about the "not enough frames" case (which may apply to sources framewise) and another about collapsing some of your test code, but otherwise seems ready!

faroit · 2016-07-28T10:48:07Z

@carlthome @ErikJohnsonCU @craffel @aliutkus

To continue the discussion on #207 concerning the permutation indices: we actually think it doesn't make sense to provide a framewise permutation, but on the other hand it doesn't make sense to provide an aggregated measure (like the median) either. So we decided to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

For SISEC (which is framewise images bss), permutation is not computed, so I would vote for changing the default values to False for all bss scores in mir_eval.

faroit · 2016-07-28T11:13:40Z

mir_eval/separation.py

+    reference_sources : np.ndarray, shape=(nsrc, nsampl)
+        matrix containing true sources (must have the same shape as
+        estimated_sources)
+    estimated_sources : np.ndarray, shape=(nsrc, nsampl)


this should be np.ndarray, shape=(nsrc, nsampl, nchan)

True! I'll correct that.
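
For illustration, this is how numpy's `atleast_3d` (mentioned in the commit titles of this PR) promotes a 2-D source matrix to the (nsrc, nsampl, nchan) layout. The array sizes are arbitrary, and this is a sketch of the numpy behaviour, not a copy of the mir_eval code.

```python
import numpy as np

# Two mono sources of 1000 samples each: shape (nsrc, nsampl)
mono = np.zeros((2, 1000))

# atleast_3d appends a channel axis: shape (nsrc, nsampl, 1)
mono_images = np.atleast_3d(mono)

# Input that already has a channel axis is left unchanged
stereo = np.zeros((2, 1000, 2))
stereo_images = np.atleast_3d(stereo)
```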


craffel · 2016-07-28T15:27:26Z

So we decided to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

Ok, being explicit in the docstring sounds right to me.

For SISEC (which is framewise images bss), permutation is not computed, so I would vote for changing the default values to False for all bss scores in mir_eval.

So is the consensus that permutation=True is wrong in general, or just for framewise? Is there any precedent for permutation=True or permutation=False for non-framewise anywhere else? I hesitate to change this default because it will break backwards compatibility, but if there's consensus that it's wrong in general, that is ok.

faroit · 2016-07-28T15:43:28Z

So is the consensus that permutation=True is wrong in general, or just for framewise? Is there any precedent for permutation=True or permutation=False for non-framewise anywhere else? I hesitate to change this default because it will break backwards compatibility, but if there's consensus that it's wrong in general, that is ok.

No, it's not wrong in general; it's just not useful for framewise computation. I suggested changing the default to False only because very few people actually need the permutation. But yes, we should not break compatibility here, so maybe let's leave it as it is.

craffel · 2016-07-29T16:54:03Z

No, it's not wrong in general; it's just not useful for framewise computation. I suggested changing the default to False only because very few people actually need the permutation. But yes, we should not break compatibility here, so maybe let's leave it as it is.

We could also make it default False for framewise functions and True for the others, but this would be a little messy/confusing.

carlthome · 2016-07-29T16:56:54Z

How about computing a global SIR sorting in the framewise functions as well (e.g. concatenating the frames before determining the permutation ndarray)? That might make more sense. Or perhaps it's just an unnecessary limitation.

faroit · 2016-07-29T17:21:36Z

How about computing a global SIR sorting in the framewise functions as well (e.g. concatenating the frames before determining the permutation ndarray)?

That would indeed make sense, but it would also make the evaluation much slower. In fact, for very long items consisting of many sources, as in DSD100 or MedleyDB, people use framewise evaluation to make it computationally efficient...

carlthome · 2016-07-29T17:34:11Z

For computational performance, how about doing a histogram first and using that for calculating the global sorting? I.e. summing the frames to one (or a few) and then use that to determine the global permutation? Assuming the interference is evenly distributed across an estimated signal, the sum shouldn't mess that up, I think.

Basically, just something like

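from enum import Enum
import numpy as np

# 'global' is a Python keyword, so the members are upper-cased
Sort = Enum('Sort', 'LOCAL HISTOGRAM GLOBAL')
sort = ...  # Parameter
# frames: sequence of arrays, each of shape (nsrc, window)
if sort is Sort.HISTOGRAM:
    frame = np.average(frames, axis=0)  # average across frames
elif sort is Sort.GLOBAL:
    frame = np.concatenate(frames, axis=-1)  # join frames along time
# SIR sort estimated sources...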

Or maybe this is just silly... Especially if people use framewise evaluation across all songs in DSD100, in which case separation results could be really different due to varying instrumentation etc.

faroit · 2016-08-03T09:23:20Z

@carlthome, I like this idea, but I would let the users decide on this and just output framewise values for now.

ecmjohnson · 2016-08-03T12:50:28Z

We could also make it default False for framewise functions and True for the others, but this would be a little messy/confusing.

@craffel This is what is currently being done. If the default were True for framewise functions and the permutation changed during the evaluation without the user being aware, this would give grossly incorrect results. And if the default were False for non-framewise functions, it would break backwards compatibility...

ecmjohnson · 2016-08-03T12:59:49Z

So we decided to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

@craffel @faroit @carlthome How does the following sound as a smart docstring comment?

Please be aware that this function does not compute permutations on the possible relations between reference_sources and estimated_sources due to the dangers of a changing permutation. Therefore (by default), it assumes that reference_sources[i] corresponds to estimated_sources[i]. To enable computing permutations please set compute_permutation to be True and check that the returned perm is identical for all windows.

craffel · 2016-08-03T16:56:18Z

Sounds reasonable to me!

carlthome · 2016-08-03T20:43:20Z

I wouldn't presume that people don't want to SIR sort per frame though. I'd only add a little to the compute_permutation parameter doc, like:

Note: Permutations are computed independently for every frame. For reference_sources[i] to properly correspond to estimated_sources[i], verify that the permutation result is the same for every frame.

The parameter default is obvious from the function signature, and should not be included in the documentation.

ecmjohnson · 2016-08-04T08:27:34Z

I wouldn't presume that people don't want to SIR sort per frame though.

But if a different permutation were computed for one frame, that would artificially inflate the results by assuming a different correspondence between the references and estimates than in previous frames. Changing the compute_permutation parameter to default True could lead to a "blind" user (not reading the code/documentation) accepting the measures when they are incorrect.

A user can still pass compute_permutation=True and sort every frame, but it requires them to accept that they will need to verify there is no change in permutation across all frames.
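
The verification step mentioned above could be sketched as follows. This assumes the returned `perm` is an integer array of shape (nsrc, nframes); `permutation_is_constant` is a hypothetical helper, not part of mir_eval, and the example arrays are made up for illustration.

```python
import numpy as np


def permutation_is_constant(perm):
    """Return True if every frame (column) of `perm` equals the first frame."""
    perm = np.asarray(perm)
    # Broadcast the first column against all columns and compare elementwise
    return bool(np.all(perm == perm[:, :1]))


# Same source assignment in all three frames
stable = np.array([[0, 0, 0],
                   [1, 1, 1]])

# The two sources swap in the middle frame
flipped = np.array([[0, 1, 0],
                    [1, 0, 1]])
```

`permutation_is_constant(stable)` holds while `permutation_is_constant(flipped)` does not, which is the situation where the framewise scores would mix up references and estimates.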

carlthome · 2016-08-04T08:54:00Z

Sorry, I was unclear. I absolutely think the default value should be False. I just don't think the docstring should assume there's no use case for framewise permutations.

ecmjohnson · 2016-08-04T12:39:35Z

@carlthome I totally agree with you. There is definitely a use case for framewise permutations and the docstring should make it clear how to safely compute them.

I've changed the comment slightly to emphasize that not computing permutations is only by default and that it can be safely changed if the user checks the output.

faroit · 2016-08-04T12:48:04Z

LGTM, @craffel do we need another code review? Otherwise I think this can be merged now.

craffel · 2016-08-05T15:30:57Z

LGTM, thank you all again!

ecmjohnson added 10 commits July 26, 2016 09:49

Added function for evaluating images framewise

17b606a

Adding evaluation of images framewise

5f80a8b

Added default parameters for bss_eval_images_framewise

b0b8544

Using images as fallback for invalid win/hop params; also, changed to…

1bbf2e5

… use the atleast_3d from numpy

Images framewise is now being unit tested

b63f2d8

Reordered computing scores and comparing them to prevent downtime issues

7aaead2

Added regression testing; pep8 fix

225c3de

Updated handling of regressions tests

318c8d2

Added image framewise data to output0*.json files

a410e4d

Updating the test cases failing due to numerical precision differences

b59109f

craffel reviewed Jul 26, 2016

ecmjohnson added 4 commits July 28, 2016 09:27

Consolidated a test case; added error checks on metric; moved a function

4d2be07

Ensured return type of framewise functions matches docstring

8fe9de5

Added a squeeze to deal with single dimension in testing

509377a

pep8 fix

2cdd9fb

faroit mentioned this pull request Jul 28, 2016

Bss eval images #207

Merged

faroit reviewed Jul 28, 2016

faroit mentioned this pull request Jul 28, 2016

Adding evaluation using mir_eval faroit/dsdtools#9

Closed

ecmjohnson added 4 commits July 28, 2016 13:29

Corrected shape in images framewise docstring

afbdb25

Added a note to the images framewise function

786b345

Clarified the compute_permutation comment in docstring and added it t…

927835d

…o bss_eval_sources

Added reference in bss_eval_sources and bss_eval_images functions

0bd9c67

faroit mentioned this pull request Jul 29, 2016

Framewise separation evaluation where some frames are silent #213

Closed

Updated framewise function docstrings with comment on permutations

c53febf

craffel merged commit a4acbfa into mir-evaluation:master Aug 5, 2016

bmcfee modified the milestone: 0.4 Aug 5, 2016

bmcfee mentioned this pull request Feb 24, 2017

Fix source separation unit tests #239

Closed
