Add transcription eval with velocity #283
Conversation
This submodule follows the conventions of :mod:`mir_eval.transcription` and additionally requires velocities to be provided as MIDI velocities in the range [0, 127].
validation code does not seem to enforce the upper bound?
True - it's a bit odd, because the only reason to specify MIDI velocities is so that this line makes sense:

`velocity_range = max(1, max_velocity - min_velocity)`

It's OK if `max_velocity - min_velocity == 0`, because that just means all the velocities in the reference annotation are the same, but if `0 < max_velocity - min_velocity < 1` then it's a bit odd to clip it to 1. Instead of picking some epsilon, I thought it would be simpler to just specify that velocity should be MIDI velocity. It actually doesn't matter whether the velocity range is [0, 127] or [0, N] for some large N, since the velocity range will just get rescaled to [0, 1] anyway. So there's not really any reason to check for < 127, if that makes sense.
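For concreteness, here is a minimal sketch of the rescaling being described, assuming the reference velocities arrive as a NumPy array; the function name and details are illustrative, not the submodule's actual code.

```python
import numpy as np

def normalize_reference_velocities(ref_velocities):
    """Illustrative sketch: rescale reference velocities into [0, 1]."""
    min_velocity = ref_velocities.min()
    max_velocity = ref_velocities.max()
    # Clip the range to at least 1 so a constant-velocity reference
    # (max_velocity - min_velocity == 0) does not cause division by zero.
    velocity_range = max(1, max_velocity - min_velocity)
    return (ref_velocities - min_velocity) / velocity_range
```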
That seems reasonable. I think the docstring should be amended to state the actual assumptions of the code though.
What other assumptions do you think should be listed?
If it doesn't require that velocity is < 127, why not just say that it has to be a non-negative number?
Because if `0 < max_velocity - min_velocity < 1`, then it's a bit odd to clip it to 1. So "non-negative" is not really sufficient to describe the intended range. I can't really think of a helpful concrete way to describe it, short of something like "velocity should be non-negative, but make sure the range is pretty big; you should probably just use MIDI velocities, but if you use some other large velocity range you'll probably be OK as long as max - min is greater than 1".
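As a concrete illustration of the point above (hypothetical values, not from the submodule's tests): if velocities happen to live on a [0, 1] scale, the `max(1, ...)` clip silently changes the normalization.

```python
import numpy as np

# Hypothetical velocities on a [0, 1] scale: the true range is 0.7,
# but max(1, ...) clips it up to 1, so the normalized values never reach 1.
ref_velocities = np.array([0.2, 0.5, 0.9])
velocity_range = max(1, ref_velocities.max() - ref_velocities.min())  # -> 1, not 0.7
normalized = (ref_velocities - ref_velocities.min()) / velocity_range
print(normalized)  # approximately [0.  0.3  0.7]
```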
Style-wise, LGTM! There's a bit of an ambiguity (IMO) in the docstring relating to how the velocity data is validated, but that's the only thing that stuck out to me.
I didn't thoroughly pick over the tests.
Any tips for what was unclear? I avoided copying the math and description from the paper, but I can expand however you see fit.

Just tested this code on the data from https://arxiv.org/abs/1710.11153 and got the same mean F-score, so should be good to go from a "matches previous implementation" standpoint.
This pull request implements the velocity-aware transcription evaluation procedure described in Onsets and Frames: Dual-Objective Piano Transcription. Details on this metric and evaluation can be found therein and in the docstring for the new `transcription_velocity` submodule.

I also fixed `generate_data.py` to sort the results of `glob`, and changed `.coveragerc` to show the missing lines in test logs.

I'd like to get an LGTM from @bmcfee or @justinsalamon for `mir_eval` conventions and reasonableness, and an LGTM from @cghawthorne or @ekelsen that this eval matches what was used in that paper.
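For reference, a minimal sketch of the kind of glob-sorting change described above; the pattern and path here are placeholders, not the ones actually used in `generate_data.py`.

```python
import glob

# Sorting makes the file order deterministic across platforms and filesystems,
# so regenerated test data doesn't depend on whatever order glob happens to return.
input_files = sorted(glob.glob('tests/data/transcription_velocity/*.txt'))
for fname in input_files:
    print(fname)  # generate_data.py would process each file here
```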