EvaluationBase Class

Abstract Base Class for objective evaluations

class spiegelib.evaluation.EvaluationBase(targets, estimations)

Bases: abc.ABC

Example Input:

targets = [target_1, target_2]
estimations = [[prediction_1_for_target_1, prediction_2_for_target_1],
               [prediction_1_for_target_2, prediction_2_for_target_2]]

Here, prediction_1 and prediction_2 represent results from two different methods or sources. For example, all prediction_1 audio might come from a genetic algorithm and all prediction_2 audio from a deep learning approach.

Parameters
  • targets (list) – a list of objects to use as the ground truth for evaluation

  • estimations (list) – a 2D list of objects. Should contain a list of objects representing estimations for each target. The position of an object within each inner list is used to distinguish between different sources. For example, if you are comparing two different synthesizer programming methods, make sure the results from each method occupy the same position in every list.

abstract evaluate_target(target, predictions)

Abstract method. Subclasses must implement this method to evaluate a single target and the predictions made for that target. Called automatically by evaluate().

Parameters
  • target (list) – Audio to use as ground truth in evaluation

  • predictions (list) – list of AudioBuffer objects to evaluate against the target audio file.

Returns

A list of dictionaries with stats for each prediction evaluation

Return type

list

evaluate()

Run evaluation. Calls evaluate_target on all targets and creates a dictionary of metrics stored in the scores attribute.

Saves the result for each prediction of a target in a dictionary keyed by the prediction’s position in the estimation list passed in at construction, using the key ‘source_#’, where # is the position index.
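As a rough illustration of this layout, here is a standalone NumPy sketch, not the actual spiegelib implementation. The metric, the helper names, and the ‘target_#’ outer keys are assumptions made for the example; only the ‘source_#’ keying is described above.

```python
import numpy as np

def mean_abs_error(a, b):
    # Mean absolute error, Mean(ABS(A - B))
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def evaluate(targets, estimations):
    # Mimics the scores structure described above: one entry per target,
    # with each prediction keyed as 'source_#' by its list position.
    scores = {}
    for i, (target, predictions) in enumerate(zip(targets, estimations)):
        target_scores = {}
        for j, prediction in enumerate(predictions):
            target_scores['source_%d' % j] = {
                'mean_abs_error': mean_abs_error(target, prediction)
            }
        scores['target_%d' % i] = target_scores  # outer key is hypothetical
    return scores

targets = [[1.0, 2.0], [3.0, 4.0]]
estimations = [[[1.0, 2.0], [2.0, 2.0]],   # two sources for target 0
               [[3.0, 5.0], [3.0, 4.0]]]   # two sources for target 1
scores = evaluate(targets, estimations)
```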

get_scores()

Returns

scores calculated during evaluation

Return type

dict

get_stats()

Returns

stats that summarize the scores for each source using mean, median, standard deviation, minimum, and maximum.

Return type

dict
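The summary described for get_stats() can be sketched with NumPy. This is a standalone illustration of the five statistics named above, not the library's source, and the returned dictionary keys are an assumption:

```python
import numpy as np

def summarize(values):
    # Summary statistics in the spirit of get_stats(): mean, median,
    # standard deviation, minimum, and maximum of a list of scores.
    v = np.asarray(values, dtype=float)
    return {
        'mean': float(np.mean(v)),
        'median': float(np.median(v)),
        'std': float(np.std(v)),
        'min': float(np.min(v)),
        'max': float(np.max(v)),
    }

stats = summarize([1.0, 2.0, 3.0, 4.0])
```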

save_scores_json(path)

Save scores to a JSON file

Parameters

path (str) – location to save JSON file

save_stats_json(path)

Save score statistics as a JSON file

Parameters

path (str) – location to save JSON file

plot_hist(sources, metric, bins=None, clip_upper=None, **kwargs)

Plot a histogram of results of evaluation. Uses Matplotlib.

Parameters
  • sources (list) – Which audio sources to include in the histogram. [0] would use the first prediction source passed in during construction, [1] would use the second, etc.

  • metric (str) – Which metric to use for creating the histogram. Depends on which were used during evaluation.

  • bins (int or sequence or str, optional) – passed into the Matplotlib hist method. Indicates the number of bins to use, or, if it is a sequence, defines the bin edges. With NumPy 1.11 or newer, you can alternatively provide a string describing a binning strategy, such as ‘auto’, ‘sturges’, ‘fd’, ‘doane’, ‘scott’, ‘rice’, or ‘sqrt’.

  • clip_upper (number, optional) – Set an upper range for input values. This can be used to force any values above a certain range into the rightmost histogram bin.

  • kwargs – Keyword arguments to be passed into hist

verify_input_list(input_list)

Base method for verifying an input list. Override to implement verification. For example, override and call verify_audio_input_list() on input_list to verify that AudioBuffer objects are being passed in.

static verify_audio_input_list(input_list)

Static method for verifying that input lists contain AudioBuffer objects

static euclidian_distance(A, B)

Calculates the Euclidean distance between two arrays.

Parameters
  • A (np.ndarray) – First array (Ground truth)

  • B (np.ndarray) – Second array (Prediction)

Returns

Euclidean distance

Return type

float

static manhattan_distance(A, B)

Calculates the Manhattan distance between two arrays.

Parameters
  • A (np.ndarray) – First array (Ground truth)

  • B (np.ndarray) – Second array (Prediction)

Returns

Manhattan distance

Return type

float

static mean_abs_error(A, B)

Calculates mean absolute error between two arrays. Mean(ABS(A-B)).

Parameters
  • A (np.ndarray) – First array (Ground truth)

  • B (np.ndarray) – Second array (Prediction)

Returns

Mean absolute error

Return type

float

static mean_squared_error(A, B)

Calculates mean squared error between two arrays. Mean(Square(A-B)).

Parameters
  • A (np.ndarray) – First array (Ground truth)

  • B (np.ndarray) – Second array (Prediction)

Returns

Mean squared error

Return type

float
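The four static metrics above can be sketched directly from their documented formulas. These are straightforward NumPy reimplementations for illustration, not the library's source:

```python
import numpy as np

def euclidian_distance(a, b):
    # Euclidean (L2) distance: norm of the element-wise difference
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def manhattan_distance(a, b):
    # Manhattan (L1) distance: sum of absolute element-wise differences
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b))))

def mean_abs_error(a, b):
    # Mean(ABS(A - B))
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def mean_squared_error(a, b):
    # Mean(Square(A - B))
    return float(np.mean(np.square(np.asarray(a) - np.asarray(b))))

A = np.array([0.0, 0.0, 0.0])  # ground truth
B = np.array([1.0, 2.0, 2.0])  # prediction
```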