Implement multi-f0 frame level eval metrics #186

Closed
wants to merge 20 commits into from
8 changes: 8 additions & 0 deletions docs/index.rst
@@ -129,6 +129,14 @@ The following subsections document each submodule.
:show-inheritance:
:member-order: bysource

:mod:`mir_eval.multipitch`
Collaborator:
Hopefully multipitch is an intentional reference...

--------------------------
.. automodule:: mir_eval.multipitch
:members:
:undoc-members:
:show-inheritance:
:member-order: bysource

:mod:`mir_eval.onset`
---------------------
.. automodule:: mir_eval.onset
62 changes: 62 additions & 0 deletions evaluators/multipitch_eval.py
@@ -0,0 +1,62 @@
#!/usr/bin/env python
'''
Utility script for computing all multipitch metrics.

Usage:

./multipitch_eval.py REFERENCE.TXT ESTIMATED.TXT
'''

from __future__ import print_function
import argparse
import sys
import os
import eval_utilities

import mir_eval


def process_arguments():
    '''Argparse function to get the program parameters'''

    parser = argparse.ArgumentParser(description='mir_eval multipitch '
                                                 'detection evaluation')

    parser.add_argument('-o',
                        dest='output_file',
                        default=None,
                        type=str,
                        action='store',
                        help='Store results in json format')

    parser.add_argument('reference_file',
                        action='store',
                        help='path to the reference annotation file')

    parser.add_argument('estimated_file',
                        action='store',
                        help='path to the estimated annotation file')

    return vars(parser.parse_args(sys.argv[1:]))


if __name__ == '__main__':
    # Get the parameters
    parameters = process_arguments()

    # Load in data
    ref_times, ref_freqs = mir_eval.io.load_ragged_time_series(
        parameters['reference_file'])
    est_times, est_freqs = mir_eval.io.load_ragged_time_series(
        parameters['estimated_file'])

    # Compute all the scores
    scores = mir_eval.multipitch.evaluate(
        ref_times, ref_freqs, est_times, est_freqs)
    print("{} vs. {}".format(os.path.basename(parameters['reference_file']),
                             os.path.basename(parameters['estimated_file'])))
    eval_utilities.print_evaluation(scores)

    if parameters['output_file']:
        print('Saving results to: ', parameters['output_file'])
        eval_utilities.save_results(scores, parameters['output_file'])
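For orientation, frame-level multi-f0 scoring compares, at each time frame, the set of estimated pitches against the set of reference pitches. The toy sketch below computes precision and recall with a greedy match; it is an illustrative stand-in, not mir_eval.multipitch's actual algorithm, and the half-semitone tolerance is an assumed convention.

```python
import math

def frame_scores(ref_freqs, est_freqs, tol_semitones=0.5):
    """Toy frame-level multi-f0 precision/recall (illustrative only).

    ref_freqs, est_freqs: lists (one entry per frame) of sequences of Hz
    values. An estimate matches a reference pitch if the two are within
    tol_semitones of each other; each reference may be claimed at most once.
    """
    n_ref = n_est = n_match = 0
    for ref, est in zip(ref_freqs, est_freqs):
        n_ref += len(ref)
        n_est += len(est)
        matched = set()
        for e in est:
            # nearest still-unmatched reference pitch, in semitones
            best = None
            for j, r in enumerate(ref):
                if j in matched:
                    continue
                d = abs(12 * math.log2(e / r))
                if best is None or d < best[0]:
                    best = (d, j)
            if best is not None and best[0] <= tol_semitones:
                matched.add(best[1])
                n_match += 1
    precision = n_match / n_est if n_est else 0.0
    recall = n_match / n_ref if n_ref else 0.0
    return precision, recall

# Two frames; the second estimate frame contains one spurious pitch (330 Hz)
ref = [[220.0, 440.0], [220.0]]
est = [[221.0, 440.0], [220.0, 330.0]]
print(frame_scores(ref, est))  # (0.75, 1.0)
```

mir_eval reports several more metrics (accuracy, chroma variants, error scores); this sketch covers only the precision/recall idea.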
1 change: 1 addition & 0 deletions mir_eval/__init__.py
@@ -11,6 +11,7 @@
from . import util
from . import sonify
from . import melody
from . import multipitch
from . import pattern
from . import tempo
from . import hierarchy
92 changes: 92 additions & 0 deletions mir_eval/io.py
@@ -477,3 +477,95 @@ def load_key(filename, delimiter=r'\s+'):
warnings.warn(error.args[0])

return key_string


def load_ragged_time_series(filename, dtype=float, delimiter=r'\s+',
                            header=False):
    r"""Utility function for loading in data from a delimited time series
    annotation file with a variable number of columns.
    Assumes that column 0 contains time stamps and columns 1 through n contain
    values. n may vary from time stamp to time stamp.
Collaborator:
A small (3-line) example input here might be helpful.
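A hypothetical 3-line ragged input of the sort the reviewer asks for, together with the core splitting logic applied to it, might look like this (a stdlib-only sketch of the parsing idea, not the function itself; the data values are made up):

```python
import re

# Hypothetical ragged multi-f0 file: a time stamp, then 0..n frequency values
raw = """0.00 440.0 660.0
0.01 441.0
0.02
"""

splitter = re.compile(r'\s+')
times, values = [], []
for line in raw.splitlines():
    data = splitter.split(line.strip())
    times.append(float(data[0]))
    # remaining columns (possibly none) are the values for this time stamp
    values.append([float(v) for v in data[1:] if v])
print(times)   # [0.0, 0.01, 0.02]
print(values)  # [[440.0, 660.0], [441.0], []]
```

Note that the third line carries a time stamp with no values, which is exactly the "ragged" case the function must handle.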


    Examples
    --------
    >>> # Load a ragged list of tab-delimited multi-f0 midi notes
    >>> times, vals = load_ragged_time_series('multif0.txt', dtype=int,
    ...                                       delimiter='\t')
    >>> # Load a ragged list of space-delimited multi-f0 values with a header
    >>> times, vals = load_ragged_time_series('labeled_events.csv',
    ...                                       header=True)

    Parameters
    ----------
    filename : str
        Path to the annotation file
    dtype : function
        Data type to apply to values columns.
    delimiter : str
        Separator regular expression.
        By default, lines will be split by any amount of whitespace.
    header : bool
        Indicates whether a header row is present or not.
        By default, assumes no header is present.

    Returns
    -------
    times : np.ndarray
        Array of timestamps (float).
    values : list of np.ndarray
        List of arrays of corresponding values.

    """
    # Initialize empty lists
    times = []
    values = []

    # Create re object for splitting lines
    splitter = re.compile(delimiter)

    # Keep track of whether we create our own file handle
    own_fh = False
    # If the filename input is a string, need to open it
    if isinstance(filename, six.string_types):
        # Remember that we need to close it later
        own_fh = True
        # Open the file for reading
        input_file = open(filename, 'r')
Collaborator:

this could leak file descriptors if there's an error anywhere below (before the call to close).

For this sort of thing, it's better to make a wrapper function that returns the file descriptor, and then use a context manager: with my_open(buf_or_str, 'r') as fdisc: ....

See here for an example of this in jams.

Collaborator:

I think @rabitt copied that from the other loaders, so if we change that we should change all loaders. The logic is copied from numpy: https://github.com/numpy/numpy/blob/master/numpy/lib/npyio.py#L477
The difference being that we potentially raise an exception and they don't, I suppose.

Collaborator:

> The difference being that we potentially raise an exception and they don't, I suppose.

Yup, that's exactly the point. I guess the descriptor could close when it goes out of scope, but it still seems ugly to me.

> copied that from the other loaders

all the more reason to abstract this logic out and do it right. :)

Collaborator:

... note, if this problem exists elsewhere, I have no problem with leaving the present code as is and fixing it in a separate PR.

Contributor Author:

Ok, I'll leave this for a future pr.

Contributor Author:

Opened Issue #191
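The context-manager wrapper suggested in this thread (deferred to Issue #191) could be sketched roughly as follows. `my_open` is the hypothetical name from the review comment, and this is not mir_eval's actual API; it simply ensures that only handles we open ourselves get closed, even when the body raises.

```python
import contextlib
import io

@contextlib.contextmanager
def my_open(filename_or_handle, mode='r'):
    """Yield a file handle for a path or an already-open file-like object.

    Hypothetical sketch of the reviewer's suggestion (mir_eval would use
    six.string_types here for Python 2 compatibility). Handles we open are
    closed on exit even if the body raises, so no descriptors leak; handles
    supplied by the caller are left open.
    """
    if isinstance(filename_or_handle, str):
        handle = open(filename_or_handle, mode)
        try:
            yield handle
        finally:
            handle.close()
    elif hasattr(filename_or_handle, 'read'):
        # The caller owns this handle; don't close it
        yield filename_or_handle
    else:
        raise ValueError('filename must be a string or file handle')

# Works with an in-memory buffer as well as a path
with my_open(io.StringIO('0.0 440.0\n')) as fdesc:
    print(fdesc.readline().strip())  # 0.0 440.0
```

With this in place, the string/handle branching and the `own_fh` bookkeeping in the loader body collapse into a single `with` statement.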

    # If the provided object has a read attribute, we can use it as a file
    # handle
    elif hasattr(filename, 'read'):
        input_file = filename
    # Raise error otherwise
    else:
        raise ValueError('filename must be a string or file handle')
    if header:
        start_row = 1
        # Skip the header row so it is not parsed as data
        next(input_file)
    else:
        start_row = 0
    for row, line in enumerate(input_file, start_row):
        # Split each line using the supplied delimiter
        data = splitter.split(line.strip())
        try:
            converted_time = float(data[0])
        except (TypeError, ValueError) as exe:
            six.raise_from(ValueError("Couldn't convert value {} using {} "
                                      "found at {}:{:d}:\n\t{}".format(
                                          data[0], float.__name__,
                                          filename, row, line)), exe)
        times.append(converted_time)

        # Cast values to a numpy array. Time stamps with no values are cast
        # to an empty array.
        try:
            converted_value = np.array(data[1:], dtype=dtype)
        except (TypeError, ValueError) as exe:
            six.raise_from(ValueError("Couldn't convert value {} using {} "
                                      "found at {}:{:d}:\n\t{}".format(
                                          data[1:], dtype.__name__,
                                          filename, row, line)), exe)
        values.append(converted_value)

    # Close the file handle if we opened it
    if own_fh:
        input_file.close()

    return np.array(times), values