display submodule #196
Conversation
I'm not crazy about this name because you can also use labels in
Ok.
Sounds good. Maybe we can include a small example of how to set an axis' color cycler? I bet many people have not done this. Or, whatever.
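For reference, a minimal sketch of setting a per-axes color cycle in matplotlib (this uses the `cycler` package that matplotlib depends on; the colors are arbitrary):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, just for this sketch
import matplotlib.pyplot as plt
from cycler import cycler

fig, ax = plt.subplots()
# Each new artist added to this axes draws the next color from the cycle.
ax.set_prop_cycle(cycler(color=['#1b9e77', '#d95f02', '#7570b3']))
for i in range(3):
    ax.plot([0, 1], [i, i + 1])
```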
What's |
Yeah, I think if we provide examples showing the output of each, it will be obvious enough.
I figure most people will just do this via |
Here's an example of hierarchical segment plots, comparing two annotations from SALAMI:

```python
plt.figure(figsize=(12, 8))

ax = plt.subplot(2, 1, 1)
mir_eval.display.labeled_intervals(int2, lab2, ax=ax, alpha=0.8, label='small')
mir_eval.display.labeled_intervals(int1, lab1, ax=ax, alpha=0.8, label='functions')
plt.legend(loc='best')
plt.title('Annotator 1')

ax = plt.subplot(2, 1, 2)
mir_eval.display.labeled_intervals(int4, lab4, ax=ax, alpha=0.8, label='small')
mir_eval.display.labeled_intervals(int3, lab3, ax=ax, alpha=0.8, label='functions')
plt.title('Annotator 2')
plt.legend(loc='best')
```

I think I generally prefer this to the analogous MSAF-style plot:

```python
plt.figure(figsize=(12, 8))

ax = plt.subplot(2, 1, 1)
mir_eval.display.segments(int1, lab1, base=0.5, ax=ax, text=True)
mir_eval.display.segments(int2, lab2, height=0.5, ax=ax, text=True)
plt.title('Annotator 1')

ax = plt.subplot(2, 1, 2)
mir_eval.display.segments(int3, lab3, base=0.5, ax=ax, text=True)
mir_eval.display.segments(int4, lab4, height=0.5, ax=ax, text=True)
plt.title('Annotator 2')
```

The first one is a little easier to read (IMO) for a few reasons:
Anyone have thoughts for how the segment plot can be improved? @urinieto?
The only thing I like a bit more in the "analogous MSAF-style" is that it's visually easier for me to see which subsegments in the lower plot correspond to larger segments in the upper plot. The vertical separation in the first one makes this a little less obvious to me. This could be solved maybe by drawing dashed lines in the lower plot corresponding to boundaries in the upper plot, or something. Playing devil's advocate, now:
This is the worst part, IMO, of the "analogous MSAF-style", but it could pretty easily be solved if you dive back into automatic color cycling.
Well, so does vertical position - I don't think it's hard to differentiate levels in the "analogous MSAF-style".
These could be y labels.
I think if there was no color re-use it would be about as easy either way, assuming there were not too many colors.
Really? I find it pretty easy, but maybe that's because I know what to expect. We could add ticks and grid lines, but it would probably get cluttered quickly.
Not really; you'd eventually have to rely on texture and alpha channels. Even then, the biggest complaint that I've gotten from people is that colors don't mean anything across hierarchy levels, but they really want them to. I think it's best to avoid that whole hornet's nest if we can.
For two levels, no, but for deeper structures it can get messy.
Do you mean tick marks on the y axis? It's possible, sure.
I disagree, but that's anecdotal. Opinions from others would be useful here.
Yeah, I have to keep changing focus from top to bottom plot. It's not that bad.
Fair enough.
Wouldn't lower just mean finer structure regardless of the number of levels? Or is that not how hierarchical structure is defined?
Yep.
Pinging @justinsalamon re: pitch and notes. I see in your prototype that you used rectangles for notes, and the y axis was set to a midi (numeric) scale. I think this can be plugged directly into
I think we can do a little better if we define a FuncFormatter that can reinterpret midi numbers as pitches (with octave and cent deviation) or frequencies (Hz). I'd prefer this to raw numbers or explicit ticker strings (like librosa currently does) because it will behave nicely with interactive plotting/zooming/etc. For contours, we've talked about plotting on a cent axis. I think it makes the most sense to just map to continuous/fractional midi numbers via
It would also be nice to plot contours directly in frequency space for easy overlaying on spectrograms. Ultimately, this should be easy to do with a flag. The only nontrivial part of pitch contour plotting then will be voicing detection and grouping of samples into fragments. Am I missing anything here?
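A minimal sketch of the FuncFormatter idea; the `midi_to_note` helper here is hypothetical (not mir_eval code) and rounds to the nearest semitone rather than showing cent deviations:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, just for this sketch
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def midi_to_note(midi, pos=None):
    # Round fractional MIDI numbers to the nearest semitone for display.
    m = int(round(midi))
    return '{}{}'.format(NOTES[m % 12], m // 12 - 1)

fig, ax = plt.subplots()
# Because the formatter is a function of the tick value, labels stay
# correct under interactive zooming and panning.
ax.yaxis.set_major_formatter(FuncFormatter(midi_to_note))
```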
hmmm, maybe? MIDI notes can but don't have to be quantized. If I understand correctly, labeled intervals builds the y-axis labels from the data labels? This might not be a good option for non-quantized pitch data?
👍
For melody this shouldn't be a problem since there's only (at most) one active pitch per timestamp. Voicing can be indicated by a change of style I suppose (color? line type? alpha?). Grouping samples into contours becomes a tough cookie for multi-f0, especially since the mir_eval format (correct me @rabitt if I'm wrong) doesn't specify which samples belong to which contour. Maybe apply basic TF-clustering as a best guess?
Yeah, fair point. We might have to refactor a bit to separate the patch logic from the vertical positioning. The bigger issue though is that we don't have a great way to specify the height of rectangles: should they span an entire semitone? That doesn't seem quite right. Maybe it would be better to just dump these into a line plot instead?
I was going to just not plot the unvoiced segments, but I suppose you might want those anyway. As a dirty hack you could simply
ugh... yeah, I forgot about multi-f0 plotting here. This seems like a problem that needs to be resolved at the multipitch API first, and display later. I really dislike the idea of clustering tf values in the display module to sort that out.
Quick proof of concept for pitch contour plotting:

```python
ref_times, ref_freqs = mir_eval.io.load_labeled_events('/home/bmcfee/git/mir_eval/tests/data/melody/ref00.txt')
est_times, est_freqs = mir_eval.io.load_labeled_events('/home/bmcfee/git/mir_eval/tests/data/melody/est00.txt')

plt.figure(figsize=(12, 4))
mir_eval.display.pitch(ref_times, ref_freqs, unvoiced=True, midi=True,
                       linewidth=4, alpha=0.5, label='Reference')
mir_eval.display.pitch(est_times, est_freqs, unvoiced=True, midi=True,
                       linewidth=1, alpha=1, label='Estimate')
plt.legend(loc='best')
plt.tight_layout()
```

It allows you to toggle plotting of unvoiced regions (which are depicted with the same style, but half the alpha of the voiced regions). A region is considered "voiced" if all of its frequencies are nonzero (and freq_to_voicing says so); this avoids underflow in the log scaling (MIDI units) conversion. Styles are inherited from the axis line cycler, and the same style is applied to all fragments within a call to
This is a matter of taste. My personal preference would be for each note to span a semitone (in height), since it matches the convention used in MIDI editors in DAWs so I find it intuitive. While it's true that a note doesn't inherently "span" a pitch range, I think the most common use case for this visualization will be for notes quantized on a semitone grid, for which this styling would be most helpful I think.
Yeah, you definitely want to be able to display unvoiced pitch estimates. Not sure what your hack example achieves?
Even if you sort it for the JAMS API, once you pass it to mir_eval it will be converted to the MIREX format, which doesn't specify grouping. So as long as mir_eval supports the MIREX format, it's not clear to me how this can be solved on an API level?
This looks almost right to me:
What do you mean by this? Why do you need to infer if a region is voiced beyond what's already indicated by the raw data?
Style choices are left as an exercise to the reader. Also note: this is using mpl 2.0.0b1, which may have some kinks to work out.
A region can be "unvoiced" with either negative or 0 frequency. If it's negative, it gets mapped to positive, and voicing is set to False. If it's 0, well, it stays 0. When you go to take a log for midi conversion, explosions happen. In general, I think you just don't want to show 0-frequency values at all, since they distort the axes and convey no information. Am I wrong here?
Yes, that's correct. A 0 Hz estimate means "I detected no pitch here" and there's no reason to display it. A negative value means "I detected pitch here but I don't think it's part of the melody (but it might be if I'm wrong)", which is what we want to display as "unvoiced".
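For concreteness, the convention above can be sketched as a small helper (the name `split_voicing` is my own, not a mir_eval function):

```python
import numpy as np

def split_voicing(freqs):
    # mir_eval melody convention: f > 0 is voiced; f < 0 is an unvoiced
    # estimate at |f|; f == 0 means "no pitch detected" (nothing to draw).
    freqs = np.asarray(freqs, dtype=float)
    return np.abs(freqs), freqs > 0, freqs < 0
```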
Nice!
And with the piano roll:

```python
ref_times, ref_pitches = mir_eval.io.load_valued_intervals('/home/bmcfee/git/mir_eval/tests/data/transcription/ref03.txt')
est_times, est_pitches = mir_eval.io.load_valued_intervals('/home/bmcfee/git/mir_eval/tests/data/transcription/est03.txt')

plt.figure(figsize=(8, 4))
mir_eval.display.piano_roll(ref_times, ref_pitches, label='Reference', alpha=0.5)
mir_eval.display.piano_roll(est_times, est_pitches, label='Estimate', alpha=0.5, facecolor='r')

# ref_midis (MIDI numbers corresponding to ref_pitches) is assumed to be
# defined elsewhere in this session
plt.ylim(ref_midis.min(), ref_midis.max())
plt.grid('on')
plt.legend(loc='best')
mir_eval.display.ticker_notes()
plt.tight_layout()
```

after some zooming in interactive mode:
Nice! I'm confused about the y-axis gridlines: shouldn't notes be centered on the pitch values? It seems like right now they sit "on top" of the actual pitch value? Another side issue is the density of the y-grid. For a piano roll I'd like to always have a y-tick (and label) for every semitone, independently of the zoom level of the plot. Otherwise it gets a little tricky to determine the pitch of some notes without counting with your fingers on the screen :)
They're baseline-aligned to the bars; the ticks/grid lines go below the relevant note, but the text labels are where they should be. I don't want to have them run through the rects because that would put the patch positions at non-integer values, and generally make life a pain. (If you don't believe me, try manipulating a bar plot some time.)
Yup, just added minor ticks at all semitones for the piano roll plotter. Labels are still dynamically generated; if you want them in specific locations, that's your business as a discerning user with impeccable taste.
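The semitone-tick behavior can be sketched with a stock matplotlib locator (this is an assumption about the approach, not mir_eval's exact code):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, just for this sketch
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

fig, ax = plt.subplots()
ax.set_ylim(48, 72)
# One minor tick per integer (semitone) on a MIDI-valued y axis,
# independent of the zoom level.
ax.yaxis.set_minor_locator(MultipleLocator(1))
```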
It might be easier to implement, but it's hella-confusing. The note above the D#6 label is a D#6? eek!
hella you talkin about? That's the same thing: the label "C1" is parallel to the notes at pitch C1. EDIT derp, my bad. The |
Not sure I follow. In the logic screenshot above, the label C1 is vertically aligned with notes representing a C1. In the piano roll figure, the labels are vertically aligned with the gridlines, which aren't aligned with any of the notes. EDIT: this might be one of those things that's more easily hashed out in person with a whiteboard EDIT: ahh... yes, now it makes sense 👍
Okay folks. I feel like the bulk of this, not counting source separation, is basically done from a prototyping perspective. Here's a notebook that you can play with; feedback is welcome. At this point, it's worth thinking about how we should test things. Matplotlib-style image comparison regression tests? They're kind of a pain to set up, but I'm not sure what else we can do.
Thanks @bmcfee. I am currently moving everything out of my house into a big box so won't get a chance to look at this for a day or two. One thing that popped up and I don't think was resolved:
(and the ensuing discussion). I think the only sane thing to do is to not group at all, and just plot multi-f0 as non-connected dots. That's how they're evaluated, anyways (right?). I definitely do not think TF clustering is something we want to reckon with, unless there is a very simple approach which does the right thing the vast majority of the time.
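The no-grouping approach can be sketched like this, with made-up data in the mir_eval multipitch shape (a time array plus a list of per-frame frequency arrays):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, just for this sketch
import matplotlib.pyplot as plt

# Hypothetical example data: three frames, variable number of active pitches.
times = [0.00, 0.01, 0.02]
freqs = [[220.0, 330.0], [221.0, 331.0], []]  # [] = no active pitches

fig, ax = plt.subplots()
for t, fs in zip(times, freqs):
    # Unconnected dots: one scatter per frame, no contour grouping.
    ax.scatter([t] * len(fs), fs, s=9, color='k')
```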
Seems like a nice idea, but it seems a little outside of mir_eval's style as written above. Inferring beat position from (static) tempo doesn't seem precise enough to me -- I'd rather have a way to visualize known/estimated beat/downbeat positions. We already have event viz for beats/downbeats, but they display as lines instead of tickers. I think this is reasonable because you can also read off the time of the beats instead of just metrical position, which should be relatively obvious if you plot downbeats as well.
Ok, spent some time playing with this. It's great. I had a lot of fun visualizing some
In retrospect I think this is a bad convention.
shouldn't it be
That should be outside the scope of mir_eval; there's an open PR/discussion on matplotlib to do more intelligent automatic range limiting.
It's in the docstring. If you put a legend on a hierarchy plot,
What would you expect it to do?
I still think we shouldn't do that; instead, accept an input data structure that indexes tf values by contour id.
I was going for consistency with the separation metrics, but sure.
I tried a whole lot to work that out, but nothing seemed to do what I wanted. Maybe it's possible if you render each spectrogram as an RGBA image first, and then average in RGBA space.
Haha, I could see why you'd say that. Fair enough.
From
It's the same issue as above:
Yeah, or provide a kwarg which does that, or something. But we don't need to worry about that now.
I think they all specify e.g.
It seems (from the point of view of an outsider who has no idea what's going on) that, under some threshold, alpha is set to 0. Is there any easy way of modifying that threshold, so that more of the spectrogram blobs get made transparent?
Can you give me an example that demonstrates this behavior? I'm not sure what you mean by "new figure" (as opposed to new axes); when I do:

```python
fig = plt.gcf()
mir_eval.display.events(events, figure=fig)
```

it works as expected. Are you doing something differently?
I think what's happening here is that |
Not really. The colormaps are generated by linear interpolation; what you're describing would require a nonlinear interpolation, and exposing that in a clean way to the user is non-trivial. I also looked into making the maximum alpha value tuneable. This is easy in mpl 2.0 via |
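The tunable-max-alpha idea can be sketched with a hand-built colormap (the `alpha_cmap` helper here is my own, not the mir_eval implementation):

```python
from matplotlib.colors import LinearSegmentedColormap

def alpha_cmap(color, max_alpha=0.8):
    # Fade from fully transparent to `color`, capping alpha at `max_alpha`
    # so overlapping sources with comparable amplitude don't mask each other.
    r, g, b = color
    return LinearSegmentedColormap.from_list(
        'alpha_cmap', [(r, g, b, 0.0), (r, g, b, max_alpha)])

cmap = alpha_cmap((1.0, 0.0, 0.0), max_alpha=0.6)
```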
Yeah, it sucks. Basically, there's no way to attach legend entries (labels) to colormeshes, so we have to make proxy artists and label those instead. To make that work, we need to first construct a legend, and there's no way to do that without drawing the legend. (There might be a way to do it by making 0px-by-0px patches attached to the axes, but that seems ... perilous.) At any rate, it doesn't bother me too much since the separation plots are completely uninterpretable without a legend, whereas the remaining plots may or may not be.
That makes sense. The
Ok.
Can we call |
Apparently not. You can hide the legend that way, but then drawing it again doesn't work. At best, it would have to be something like:

```python
ax = mir_eval.display.separation(some garbage)
ax.get_legend().set_visible(True)
```

instead of
Ok, got it working with 0x0 rect patches. It's a brutal hack, but it works how you describe, and as an added bonus, makes it easier to manipulate the legend options. Apart from the transparency thing in separation, I think I've hit all of @craffel's points.
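The zero-size proxy-patch trick can be sketched in generic matplotlib terms (this is an illustration of the technique, not mir_eval's exact code):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, just for this sketch
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

fig, ax = plt.subplots()
ax.pcolormesh([[0.0, 1.0], [1.0, 0.0]])
# A QuadMesh can't carry a legend entry directly, so attach an invisible
# zero-size proxy patch to the axes and put that in the legend instead.
proxy = Rectangle((0, 0), 0, 0, facecolor='b', label='Source 1')
ax.add_patch(proxy)
ax.legend(handles=[proxy])
```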
Okay, added a maximum alpha value for source separation display. It doesn't achieve everything we'd want, but it does make it so that sources with comparable amplitudes (near max) don't mask each other out.
Nice. This and the legend look nice. Final nitpick: Setting
Review comment on the diff at `test_separation_label()` (`@@ -346,6 +346,7 @@`), on the added line `plt.legend()`:
It's done that way for consistency with |
Sure, I'm just used to matplotlib fitting the axis limits to the content,
but we can do something different. Unless there's anything left for you to
do, feel free to merge!
Oh, I misunderstood -- I thought you were talking about the API. I fixed the limit detection logic to do the right thing, but the API is designed for consistency across functions.
@craffel btw, this should be ready to go. If you want to run that last example through one more time, please do so before I merge.
Merged, thank you again!
👍 💯
This implements #28
TODO: