API Reference
Core functionality
-
birdvoxclassify.core.apply_hierarchical_consistency(formatted_pred_dict, taxonomy, level_threshold_dict=None, detection_threshold=0.5)
Obtain the best predicted candidate class for a prediction at all taxonomic levels, enforcing “top-down” hierarchical consistency. Starting from the “coarsest” taxonomic level, if the most probable class is considered “present” (its estimated probability is greater than a threshold), it is taken as the best candidate for that level, and only taxonomic children of this class are considered when choosing candidates at “finer” taxonomic levels. If the most probable class is not considered “present” (its estimated probability is below the same threshold), the “other” class is chosen as the best candidate, and its probability is set to the complement of the probability of the most probable “consistent” class.
Parameters: - formatted_pred_dict : dict
Formatted dictionary of predictions.
- taxonomy : dict or None [default: None]
Taxonomy JSON object used to apply hierarchical consistency. If None, then hierarchical_consistency must be False.
- level_threshold_dict : dict or None [default: None]
Optional dictionary of detection thresholds for each taxonomic level.
- detection_threshold : float [default: 0.5]
Detection threshold applied uniformly to all classes at all levels. If level_threshold_dict is provided, this is ignored.
Returns: - best_candidates_dict : dict
Formatted dictionary specifying the best candidate for each taxonomic level.
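The top-down selection described above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the function name top_down_candidates, the toy level names, the use of child_ids to decide which classes are "consistent", and the handling of levels below an "other" choice are all assumptions made for the example.

```python
def top_down_candidates(pred, levels, threshold=0.5):
    """Pick one candidate per level, coarsest level first (illustrative)."""
    best = {}
    allowed = None  # None: every class at the current level is allowed
    for level in levels:
        # Keep only the classes consistent with the choice one level up.
        consistent = {
            ref_id: node
            for ref_id, node in pred[level].items()
            if allowed is None or ref_id in allowed
        }
        top_id, top_prob = None, 0.0
        for ref_id, node in consistent.items():
            if top_id is None or node["probability"] > top_prob:
                top_id, top_prob = ref_id, node["probability"]
        if top_id is not None and top_prob > threshold:
            best[level] = {"id": top_id, "probability": top_prob}
            allowed = set(consistent[top_id]["child_ids"])
        else:
            # Assumption: "other" gets the complement of the top consistent
            # probability, and nothing finer is consistent with "other".
            best[level] = {"id": "other", "probability": 1.0 - top_prob}
            allowed = set()
    return best


# Toy predictions (hypothetical IDs and probabilities).
pred = {
    "coarse": {
        "1": {"probability": 0.9, "child_ids": ["1.1", "1.2"]},
        "2": {"probability": 0.1, "child_ids": ["2.1"]},
    },
    "medium": {
        "1.1": {"probability": 0.25, "child_ids": ["1.1.1"]},
        "1.2": {"probability": 0.125, "child_ids": []},
        "2.1": {"probability": 0.8, "child_ids": []},
    },
    "fine": {
        "1.1.1": {"probability": 0.7, "child_ids": []},
    },
}
result = top_down_candidates(pred, ["coarse", "medium", "fine"])
```

In this toy input, class 2.1 has the highest probability at the medium level but is ignored because it is not a child of the coarse-level winner, so the medium level falls back to "other".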
-
birdvoxclassify.core.batch_generator(filepath_list, batch_size=512)
Returns a generator that, from a list of filepaths, yields batches of PCEN images and the corresponding filenames.
Parameters: - filepath_list : list[str]
(Non-empty) list of filepaths to audio files for which to generate batches of PCEN images and the corresponding filenames
- batch_size : int [default: 512]
Size of yielded batches
Yields: - batch : np.ndarray [shape: (batch_size, top_freq_id, n_hops, 1)]
PCEN batch
- batch_filepaths : list[str]
List of filepaths corresponding to the clips in the batch
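The batching pattern can be sketched as a plain generator over the filepath list. This is only a sketch of the chunking step: the name iter_batches is hypothetical, the audio loading and PCEN computation done by the real function are left out, and the shorter final batch is an assumption.

```python
def iter_batches(filepath_list, batch_size=512):
    """Yield consecutive slices of at most batch_size filepaths."""
    if not filepath_list:
        raise ValueError("filepath_list must be non-empty")
    for start in range(0, len(filepath_list), batch_size):
        # The last slice may hold fewer than batch_size items (assumption).
        yield filepath_list[start:start + batch_size]


paths = ["clip_{}.wav".format(i) for i in range(5)]
batches = list(iter_batches(paths, batch_size=2))
```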
-
birdvoxclassify.core.compute_pcen(audio, sr, input_format=True)
Computes PCEN (per-channel energy normalization) for the given audio clip.
Parameters: - audio : np.ndarray [shape: (N,)]
Audio array
- sr : int
Sample rate
- input_format : bool [default: True]
If True, adds an additional channel dimension (of size 1) and ensures that a fixed number of PCEN frames (corresponding to get_pcen_settings()['n_hops']) is returned. If the number of frames is greater, the center frames are returned. If the number of frames is less, the output is padded with empty frames.
Returns: - pcen : np.ndarray [shape: (top_freq_id, n_hops, 1) or (top_freq_id, num_frames)]
Per-channel energy normalization processed Mel spectrogram. If input_format=True, will be in shape (top_freq_id, n_hops, 1). Otherwise it will be in shape (top_freq_id, num_frames), where num_frames is the number of PCEN frames for the entire audio clip.
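The fixed-length behavior of input_format=True can be illustrated on a small 2-D list of frequency rows by time frames. This is a sketch under assumptions, not the library's code: the helper name fix_num_frames is hypothetical, padding is assumed to be zeros appended at the end, the crop offset is assumed to round down, and the extra channel dimension is not shown.

```python
def fix_num_frames(spec, n_hops):
    """Center-crop or zero-pad each row of spec to exactly n_hops frames."""
    num_frames = len(spec[0])
    if num_frames >= n_hops:
        # Keep the n_hops frames in the middle of the clip.
        start = (num_frames - n_hops) // 2
        return [list(row[start:start + n_hops]) for row in spec]
    # Too short: append empty (zero) frames at the end (assumption).
    pad = [0.0] * (n_hops - num_frames)
    return [list(row) + pad for row in spec]


long_spec = [[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]
short_spec = [[1, 2]]
cropped = fix_num_frames(long_spec, 4)
padded = fix_num_frames(short_spec, 4)
```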
-
birdvoxclassify.core.format_pred(pred_list, taxonomy)
Formats a list of predictions for a single audio clip into a more human-readable JSON object using the given taxonomy object.
The output will be in the following format:
{
    <prediction level> : {
        <taxonomy id> : {
            "probability": <float>,
            "common_name": <str>,
            "scientific_name": <str>,
            "taxonomy_level_names": <str>,
            "taxonomy_level_aliases": <dict of aliases>,
            "child_ids": <list of children IDs>
        },
        ...
    },
    ...
}
Parameters: - pred_list : list[np.ndarray [shape (1, num_labels) or (num_labels,)] ]
List of predictions at the taxonomical levels predicted by the model for a single example. num_labels may be different for each of the different levels of the taxonomy.
- taxonomy : dict
Taxonomy JSON object
Returns: - formatted_pred_dict : dict
Prediction dictionary object
-
birdvoxclassify.core.format_pred_batch(batch_pred_list, taxonomy)
Formats a list of predictions for a batch of audio clips into a more human-readable JSON object using the given taxonomy object. The output will be in the form of a list of JSON objects in the format returned by format_pred.
Parameters: - batch_pred_list : list[np.ndarray [shape (batch_size, num_labels)] ]
List of predictions at the taxonomical levels predicted by the model for a batch of examples. num_labels may be different for each of the different levels of the taxonomy.
- taxonomy : dict
Taxonomy JSON object
Returns: - pred_dict_list : list[dict]
List of JSON dictionary objects
-
birdvoxclassify.core.get_batch_best_candidates(batch_pred_list=None, batch_formatted_pred_list=None, taxonomy=None, hierarchical_consistency=True)
Obtain the best candidate classes for each prediction in a batch.
Parameters: - batch_pred_list : list or None [default: None]
List of batch predictions. If not provided, batch_formatted_pred_list must be provided.
- batch_formatted_pred_list : list or None [default: None]
List of formatted batch predictions. If not provided, batch_pred_list must be provided.
- taxonomy : dict or None [default: None]
Taxonomy JSON object used to apply hierarchical consistency. If None, then hierarchical_consistency must be False.
- hierarchical_consistency : bool [default: True]
If True, apply hierarchical consistency to predictions.
Returns: - batch_best_candidates_list : list
List of formatted dictionaries specifying the best candidates for each taxonomic level.
-
birdvoxclassify.core.get_best_candidates(pred_list=None, formatted_pred_dict=None, taxonomy=None, hierarchical_consistency=True)
Obtain the best predicted candidate class for a prediction at all taxonomic levels. The output will be in the following format:
{
    <prediction level> : {
        "probability": <float>,
        "common_name": <str>,
        "scientific_name": <str>,
        "taxonomy_level_names": <str>,
        "taxonomy_level_aliases": <dict of aliases>,
        "child_ids": <list of children IDs>
    },
    ...
}
Parameters: - pred_list : list[np.ndarray [shape (1, num_labels) or (num_labels,)] ] or None [default: None]
List of predictions at the taxonomical levels predicted by the model for a single example. If provided, taxonomy must also be provided. If not provided, formatted_pred_dict must be provided.
- formatted_pred_dict : dict or None [default: None]
Formatted dictionary of predictions. If not provided, pred_list must be provided.
- taxonomy : dict or None [default: None]
Taxonomy JSON object used to apply hierarchical consistency. If None, then hierarchical_consistency must be False.
- hierarchical_consistency : bool [default: True]
If True, apply hierarchical consistency to predictions.
Returns: - best_candidates_dict : dict
Formatted dictionary specifying the best candidate for each taxonomic level.
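For contrast with the hierarchically consistent case, a selection that ignores the taxonomy could simply take the most probable class at each level independently. The sketch below assumes that this is roughly what hierarchical_consistency=False amounts to, which the reference does not confirm. The helper name pick_best_per_level and the toy class names are hypothetical, and only two of the documented fields are filled in.

```python
def pick_best_per_level(formatted_pred_dict):
    """Return the highest-probability entry at each level, independently."""
    best = {}
    for level, classes in formatted_pred_dict.items():
        ref_id = max(classes, key=lambda k: classes[k]["probability"])
        best[level] = dict(classes[ref_id])  # copy the winning entry
    return best


# Toy formatted predictions (placeholder names and probabilities).
formatted = {
    "coarse": {
        "1": {"probability": 0.8, "common_name": "class A"},
        "2": {"probability": 0.2, "common_name": "class B"},
    },
    "fine": {
        "1.1": {"probability": 0.3, "common_name": "class C"},
        "2.1": {"probability": 0.6, "common_name": "class D"},
    },
}
best = pick_best_per_level(formatted)
```

Here the fine-level winner (ID 2.1) is not a child of the coarse-level winner (ID 1), which is the kind of mismatch that hierarchical consistency is meant to rule out.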
-
birdvoxclassify.core.get_model_path(model_name)
Returns path to the bird species classification model of the given name.
Parameters: - model_name : str
Name of classifier model. Should be in format <model id>_<taxonomy version>-<taxonomy md5sum>. v0.3.1 UPDATE: model names with taxonomy md5 checksum 2e7e1bbd434a35b3961e315cfe3832fc or beb9234f0e13a34c7ac41db72e85addd are not available in this version but are restored in v0.3.1 for backwards compatibility. They will no longer be supported starting with v0.4. Please use model names with taxonomy md5 checksums 3c6d869456b2705ea5805b6b7d08f870 and 2f6efd9017669ef5198e48d8ec7dce4c (respectively) instead.
Returns: - model_path : str
Path to classifier model weights. Should be in format <BirdVoxClassify dir>/resources/models/<model id>_<taxonomy version>-<taxonomy md5sum>.h5
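To make the naming convention concrete, the sketch below splits a model name into its three parts and builds a path of the documented shape. The helpers split_model_name and sketch_model_path are hypothetical and not part of the library. The splitting rule assumes that the md5sum contains no hyphen and the taxonomy version contains no underscore, and base_dir stands in for the BirdVoxClassify directory.

```python
import os


def split_model_name(model_name):
    """Split '<model id>_<taxonomy version>-<taxonomy md5sum>' into parts."""
    prefix, md5sum = model_name.rsplit("-", 1)
    model_id, taxonomy_version = prefix.rsplit("_", 1)
    return model_id, taxonomy_version, md5sum


def sketch_model_path(base_dir, model_name):
    """Build <base_dir>/resources/models/<model name>.h5."""
    return os.path.join(base_dir, "resources", "models", model_name + ".h5")


# Default model name taken from the process_file signature.
name = "birdvoxclassify-taxonet_tv1hierarchical-3c6d869456b2705ea5805b6b7d08f870"
parts = split_model_name(name)
path = sketch_model_path("base_dir", name)
```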
-
birdvoxclassify.core.get_output_path(filepath, suffix, output_dir)
Returns output path to file containing bird species classification predictions for a given audio clip file.
Parameters: - filepath : str
Path to audio file to be processed
- suffix : str
String to append to filename (including extension)
- output_dir : str or None
Path to directory where file will be saved. If None, will use directory of given filepath.
Returns: - output_path : str
Path to output file
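One plausible reading of these parameters is sketched below with os.path. It assumes that the audio file's own extension is dropped and that the suffix (which carries the new extension) is appended directly to the base filename. The reference does not spell out how the two are joined, so the helper sketch_output_path is hypothetical.

```python
import os


def sketch_output_path(filepath, suffix, output_dir=None):
    """Derive an output path from an audio path (illustrative only)."""
    # Assumption: strip the audio extension before appending the suffix.
    base = os.path.splitext(os.path.basename(filepath))[0]
    if output_dir is None:
        # Fall back to the directory of the input file.
        output_dir = os.path.dirname(filepath)
    return os.path.join(output_dir, base + suffix)


audio_path = os.path.join("data", "clip.wav")
same_dir = sketch_output_path(audio_path, "_pred.json")
other_dir = sketch_output_path(audio_path, "_pred.json", output_dir="out")
```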
-
birdvoxclassify.core.get_pcen_settings()
Returns dictionary of Mel spectrogram and PCEN parameters for preparing the input to the bird species classification models.
Returns: - pcen_settings : dict[str, *]
Dictionary of Mel spectrogram and PCEN parameters
-
birdvoxclassify.core.get_taxonomy_node(ref_id, taxonomy)
Gets node in taxonomy corresponding to the given reference ID (e.g. 1.4.1).
Parameters: - ref_id : str
Taxonomy reference ID
- taxonomy : dict
Taxonomy JSON object
Returns: - node : dict[str, *]
Taxonomy node, containing information about the entity corresponding to the given taxonomy reference ID
-
birdvoxclassify.core.get_taxonomy_path(model_name)
Get the path to the taxonomy corresponding to the model of the given name.
Specifically, with a model name of the format <model id>_<taxonomy version>-<taxonomy md5sum>, the path to the taxonomy file <BirdVoxClassify dir>/resources/taxonomy/<taxonomy version>.json is returned. The MD5 checksum of this file is compared to <taxonomy md5sum> to ensure that the content of the taxonomy file matches the format of the output that the model is expected to produce.
Parameters: - model_name : str
Name of model. Should be in format <model id>_<taxonomy version>-<taxonomy md5sum>. v0.3.1 UPDATE: model names with taxonomy md5 checksums 2e7e1bbd434a35b3961e315cfe3832fc or beb9234f0e13a34c7ac41db72e85addd are not available in this version but are restored in v0.3.1 for backwards compatibility. They will no longer be supported starting with v0.4. Please use model names with taxonomy md5 checksums 3c6d869456b2705ea5805b6b7d08f870 and 2f6efd9017669ef5198e48d8ec7dce4c (respectively) instead.
Returns: - taxonomy_path : str
Path to taxonomy file, which should be in format <BirdVoxClassify dir>/resources/taxonomy/<taxonomy version>.json
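The checksum comparison mentioned above can be sketched with the standard hashlib module. The helper names md5_of_file and taxonomy_matches are hypothetical, and the demo file is a throwaway stand-in rather than a real taxonomy. How the library reacts to a mismatch is not covered here.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def taxonomy_matches(path, expected_md5):
    """Check a file's MD5 digest against the md5sum from a model name."""
    return md5_of_file(path) == expected_md5


# Demo on a temporary file with known content.
content = b'{"demo": true}'
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_taxonomy.json")
    with open(demo_path, "wb") as f:
        f.write(content)
    ok = taxonomy_matches(demo_path, hashlib.md5(content).hexdigest())
    bad = taxonomy_matches(demo_path, "0" * 32)
```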
-
birdvoxclassify.core.load_classifier(model_name)
Loads bird species classification model of the given name.
Parameters: - model_name : str
Name of classifier model. Should be in format <model id>_<taxonomy version>-<taxonomy md5sum>. v0.3.1 UPDATE: model names with taxonomy md5 checksum 2e7e1bbd434a35b3961e315cfe3832fc or beb9234f0e13a34c7ac41db72e85addd are not available in this version but are restored in v0.3.1 for backwards compatibility. They will no longer be supported starting with v0.4. Please use model names with taxonomy md5 checksums 3c6d869456b2705ea5805b6b7d08f870 and 2f6efd9017669ef5198e48d8ec7dce4c (respectively) instead.
Returns: - classifier : keras.models.Model
Bird species classification model
-
birdvoxclassify.core.load_taxonomy(taxonomy_path)
Loads taxonomy JSON file as an OrderedDict to ensure consistent ordering.
Parameters: - taxonomy_path : str
Path to taxonomy file.
Returns: - taxonomy : OrderedDict
Taxonomy object
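Loading JSON into an OrderedDict is possible with the standard json module through its object_pairs_hook argument. The sketch below shows that pattern on a throwaway file. The helper load_json_ordered is hypothetical, and whether the library uses exactly this call is an assumption.

```python
import json
import os
import tempfile
from collections import OrderedDict


def load_json_ordered(path):
    """Load a JSON file, keeping key order at every nesting level."""
    with open(path, "r") as f:
        return json.load(f, object_pairs_hook=OrderedDict)


# Demo on a temporary file with deliberately unsorted keys.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json")
    with open(demo_path, "w") as f:
        f.write('{"b": 1, "a": {"z": 0, "y": 1}}')
    loaded = load_json_ordered(demo_path)
```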
-
birdvoxclassify.core.predict(pcen, classifier, logger_level=20)
Performs bird species classification on PCEN arrays using the given model.
Parameters: - pcen : np.ndarray [shape (n_mels, n_hops, 1) or (batch_size, n_mels, n_hops, 1)]
PCEN array for a single clip or a batch of clips
- classifier : keras.models.Model
Bird species classification model object
- logger_level : int [default: logging.INFO]
Logger level
Returns: - pred_list : list[np.ndarray [shape (batch_size or 1, num_labels)] ]
List of predictions at the taxonomical levels predicted by the model. num_labels may be different for each of the different levels of the taxonomy. If a single example is given (i.e. there is no batch dimension in the input PCEN), batch_size = 1.
-
birdvoxclassify.core.process_file(filepaths, output_dir=None, output_summary_path=None, classifier=None, taxonomy=None, batch_size=512, suffix='', select_best_candidates=False, hierarchical_consistency=True, logger_level=20, model_name='birdvoxclassify-taxonet_tv1hierarchical-3c6d869456b2705ea5805b6b7d08f870')
Runs bird species classification model on one or more audio clips.
Parameters: - filepaths : list or str
Filepath or list of filepaths of audio files for which to run prediction
- output_dir : str or None [default: None]
Output directory used for outputting per-file prediction JSON files. If None, no per-file prediction JSON files are produced.
- output_summary_path : str or None [default: None]
Output path for summary prediction JSON file for all processed audio files. If None, no summary prediction file is produced.
- classifier : keras.models.Model or None [default: None]
Bird species classification model object. If None, the model corresponding to model_name is loaded.
- taxonomy : dict or None [default: None]
Taxonomy JSON object. If None, the taxonomy corresponding to model_name is loaded.
- batch_size : int [default: 512]
Batch size for predictions
- suffix : str [default: ""]
String to append to filename
- select_best_candidates : bool [default: False]
If True, best candidates will be provided in output dictionary instead of all classes and their probabilities.
- hierarchical_consistency : bool [default: True]
If True and if select_best_candidates is True, apply hierarchical consistency when selecting best candidates.
- logger_level : int [default: logging.INFO]
Logger level
- model_name : str [default: birdvoxclassify.DEFAULT_MODEL_NAME]
Name of classifier model. Should be in format <model id>_<taxonomy version>-<taxonomy md5sum>. v0.3.1 UPDATE: model names with taxonomy md5sum 2e7e1bbd434a35b3961e315cfe3832fc or beb9234f0e13a34c7ac41db72e85addd are not available in this version but are restored in v0.3.1 for backwards compatibility. They will no longer be supported starting with v0.4. Please use model names with taxonomy md5 checksums 3c6d869456b2705ea5805b6b7d08f870 and 2f6efd9017669ef5198e48d8ec7dce4c (respectively) instead.
Returns: - output_dict : dict[str, dict]
Output dictionary mapping audio filename to prediction dictionary. If select_best_candidates is False, the dictionary is in the format produced by format_pred. Otherwise, the dictionary is in the format produced by get_best_candidates.