Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding framewise evaluation for bss_eval_images #212

Merged
merged 19 commits into from
Aug 5, 2016

Conversation

ecmjohnson
Copy link
Contributor

Framewise Images Evaluation

The last of the series: A function has been added in the separation module (bss_eval_images_framewise) that is identical in functionality to bss_eval_sources_framewise, but calls bss_eval_images in place of bss_eval_sources. This is in response to discussions in #68 and would also be useful for #192.

As was done with bss_eval_sources_framewise, if the window and hop parameters will result in fewer than 2 windows being computed the function will simply return the result of bss_eval_images over the signals.

Testing

The required unit and regression testing has been added. The unit tests reuse the functions added for bss_eval_images and bss_eval_sources_framewise with slight modifications to accommodate the differences. Regression test data has been added in the test/data/separation/output0*.json files.

)
# if fewer than 2 windows would be evaluated, return the images result
if nwin < 2:
return bss_eval_images(reference_sources,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this result in the function returning arrays of shape (nsrc,)? It seems for consistency that we should reshape so that it returns arrays of shape (nsrc, 1). But I'm open to discussion about it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good idea for both the framewise functions.

@craffel
Copy link
Collaborator

craffel commented Jul 26, 2016

Thank you again! Of course, looks good overall; had one comment about the "not enough frames" case (which may apply to sources framewise) and another about collapsing some of your test code, but otherwise seems ready!

@faroit faroit mentioned this pull request Jul 28, 2016
@faroit
Copy link

faroit commented Jul 28, 2016

@carlthome @ErikJohnsonCU @craffel @aliutkus

To continue the discussion on #207 concerning the permutation indices: we actually think it doesn't make sense to provide a framewise permutation, on the other hand it doesn't make sense to provide an aggregated measure (like median) as well.. So we decided, to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

For SISEC (which is framewise images bss), permutation is not computed, so I would vote for changing the default values to False for all bss scores in mir_eval.

reference_sources : np.ndarray, shape=(nsrc, nsampl)
matrix containing true sources (must have the same shape as
estimated_sources)
estimated_sources : np.ndarray, shape=(nsrc, nsampl)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be np.ndarray, shape=(nsrc, nsampl, nchan)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True! I'll correct that.

@craffel
Copy link
Collaborator

craffel commented Jul 28, 2016

So we decided, to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

Ok, being explicit in the docstring sounds right to me.

For SISEC (which is framewise images bss), permutation is not computed, so I would vote for changing the default values to False for all bss scores in mir_eval.

So is the consensus that permutation=True is wrong in general, or just for framewise? Is there any precedent for permutation=True or permutation=False for non-framewise anywhere else? I hesitate to change this default because it will break backwards compatibility, but if there's consensus that it's wrong in general, that is ok.

@faroit
Copy link

faroit commented Jul 28, 2016

So is the consensus that permutation=True is wrong in general, or just for framewise? Is there any precedent for permutation=True or permutation=False for non-framewise anywhere else? I hesitate to change this default because it will break backwards compatibility, but if there's consensus that it's wrong in general, that is ok.

no it's not wrong in general, it's just not useful for framewise computation. Changing the default to false is just because of very few people do actually need the permutation. But yes, we should not break compatibility here, so maybe lets leave it as it is

@craffel
Copy link
Collaborator

craffel commented Jul 29, 2016

no it's not wrong in general, it's just not useful for framewise computation. Changing the default to false is just because of very few people do actually need the permutation. But yes, we should not break compatibility here, so maybe lets leave it as it is

We could also make it default False for framewise functions and True for the others, but this would be a little messy/confusing.

@carlthome
Copy link
Contributor

carlthome commented Jul 29, 2016

How about computing a global SIR sorting in the framewise functions as well (e.g. concatenating the frames before determining the permutation ndarray)? That might make more sense. Or perhaps it's just an unnecessary limitation.

@faroit
Copy link

faroit commented Jul 29, 2016

How about computing a global SIR sorting in the framewise functions as well (e.g. concatenating the frames before determining the permutation ndarray)?

that would indeed make sense but this would also make the evaluation much slower. In fact, for very long items, consisting of many sources like in DSD100 or MedleyDB, people use framewise to make it computational efficient...

@carlthome
Copy link
Contributor

carlthome commented Jul 29, 2016

For computational performance, how about doing a histogram first and using that for calculating the global sorting? I.e. summing the frames to one (or a few) and then use that to determine the global permutation? Assuming the interference is evenly distributed across an estimated signal, the sum shouldn't mess that up, I think.

Basically, just something like

sorts = Enum('local histogram global')
sort = ...  # Parameter
if sort == sorts.histogram:
    frame = np.average(frames)
elif sort == sorts.global:
    frame = np.concatenate(frames)
# SIR sort estimated sources...

Or maybe this is just silly... Especially if people use framewise evaluation across all songs in DSD100, in which case separation results could be really different due to varying instrumentation etc.

@faroit
Copy link

faroit commented Aug 3, 2016

@carlthome, I like this idea, but I would let the users decide on this and just output framewise values for now.

@ecmjohnson
Copy link
Contributor Author

ecmjohnson commented Aug 3, 2016

We could also make it default False for framewise functions and True for the others, but this would be a little messy/confusing.

@craffel This is what is currently being done. If the default was True for framewise functions and the permutation changed during the evaluation without the user being aware this would result in grossly incorrect results. And if the default was False for non-framewise functions it would break backwards compatibility...

@ecmjohnson
Copy link
Contributor Author

ecmjohnson commented Aug 3, 2016

So we decided, to output the permutations, but we should add a smart docstring for this to make sure users are aware of this problem.

@craffel @faroit @carlthome How does the following sound as a smart docstring comment?

Please be aware that this function does not compute permutations on the possible relations between reference_sources and estimated_sources due to the dangers of a changing permutation. Therefore (by default), it assumes that reference_sources[i] corresponds to estimated_sources[i]. To enable computing permutations please set compute_permutation to be True and check that the returned perm is identical for all windows.

@craffel
Copy link
Collaborator

craffel commented Aug 3, 2016

Sounds reasonable to me!

@carlthome
Copy link
Contributor

carlthome commented Aug 3, 2016

I wouldn't presume that people don't want to SIR sort per frame though. I'd only add a little to the compute_permutation parameter doc, like:

Note: Permutations are computed independently for every frame. For reference_sources[i] to properly correspond to estimated_sources[i], verify that the permutation result is the same for every frame.

The parameter default is obvious from the function signature, and should not be included in the documentation.

@ecmjohnson
Copy link
Contributor Author

I wouldn't presume that people don't want to SIR sort per frame though.

But if a different permutation were computed for one frame, that would artificially buff the results by assuming a different correlation between the references and estimates than previous frames. Changing the compute_permutation parameter to default True could lead to a "blind" user (not reading the code/documentation) accepting the measures when they are incorrect.

A user can still pass compute_permutation = True and sort every frame, but it requires them accepting they will need to verify no change in permutation across all frames.

@carlthome
Copy link
Contributor

Sorry, I was unclear. I absolutely think the default value should be False. I just don't think the docstring should assume there's no use case for framewise permutations.

@ecmjohnson
Copy link
Contributor Author

ecmjohnson commented Aug 4, 2016

@carlthome I totally agree with you. There is definitely a use case for framewise permutations and the docstring should make it clear how to safely compute them.

I've changed the comment slightly to emphasize that not computing permutations is only by default and that it can be safely changed if the user checks the output.

@faroit
Copy link

faroit commented Aug 4, 2016

LGTM, @craffel do we need another code review? Otherwise I think this can be merged now.

@craffel
Copy link
Collaborator

craffel commented Aug 5, 2016

LGTM, thank you all again!

@craffel craffel merged commit a4acbfa into mir-evaluation:master Aug 5, 2016
@bmcfee bmcfee modified the milestone: 0.4 Aug 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants