Audio Feature Extraction

The following classes extract features from audio. Audio feature extraction transforms raw audio data into a representation that more closely reflects human perception. Such a representation allows for a more meaningful comparison of audio files, and can serve both as input to machine learning approaches and as a basis for evaluating results.

All audio features in SpiegeLib currently use librosa. The FeaturesBase class is an abstract base class that defines the interface for feature extraction within SpiegeLib. All feature extraction classes inherit from FeaturesBase and wrap functions from librosa.
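The abstract-base-class pattern described above can be sketched in a few lines. This is only an illustration of the idea, not SpiegeLib's actual code: the names FeatureExtractorSketch, RMSSketch, get_features, frame_size and hop_size are hypothetical, and a pure-Python RMS calculation stands in for the librosa function that a real subclass would wrap.

```python
import math
from abc import ABC, abstractmethod


class FeatureExtractorSketch(ABC):
    """Hypothetical stand-in for an abstract feature extraction interface."""

    def __init__(self, frame_size=4, hop_size=2):
        # Framing parameters shared by all extractors (assumed for this sketch)
        self.frame_size = frame_size
        self.hop_size = hop_size

    @abstractmethod
    def get_features(self, audio):
        """Return a list of feature values, one per frame of audio."""

    def __call__(self, audio):
        # Calling the object runs the subclass-specific extraction
        return self.get_features(audio)


class RMSSketch(FeatureExtractorSketch):
    """Concrete extractor: root-mean-square energy per frame.

    A real subclass would delegate to a librosa function here; this
    sketch computes the value directly so it has no dependencies.
    """

    def get_features(self, audio):
        last_start = len(audio) - self.frame_size
        frames = [
            audio[start:start + self.frame_size]
            for start in range(0, last_start + 1, self.hop_size)
        ]
        return [
            math.sqrt(sum(x * x for x in frame) / len(frame))
            for frame in frames
        ]


# Eight samples alternating between +1 and -1
audio = [1.0, -1.0] * 4
rms = RMSSketch()(audio)
```

Because every extractor shares one interface, code that consumes features can accept any subclass without knowing which underlying function it wraps.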

Data Scaling

Data scaling is an important step in pre-processing data before it is passed to machine learning algorithms. All classes that inherit from FeaturesBase have a scaler attribute, which holds a data scaler object that can be used to normalize or standardize feature extraction results. These scalers are inspired by those implemented in sklearn, but can handle datasets with more dimensions. They are also simplified and designed to fit into feature extraction pipelines.
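To make the idea of standardizing multi-dimensional feature data concrete, here is a minimal sketch using NumPy. It is not SpiegeLib's scaler: the class name StandardScalerSketch, its fit, transform and fit_transform methods, and the (examples, features, frames) data layout are all assumptions made for illustration.

```python
import numpy as np


class StandardScalerSketch:
    """Hypothetical scaler: zero mean and unit variance along one axis."""

    def __init__(self):
        self.mean = None
        self.std = None

    def fit(self, data, axis=0):
        # Compute statistics along `axis`; keepdims lets them broadcast
        # back against data with any number of dimensions.
        data = np.asarray(data, dtype=float)
        self.mean = data.mean(axis=axis, keepdims=True)
        std = data.std(axis=axis, keepdims=True)
        # Avoid division by zero for constant entries
        self.std = np.where(std == 0.0, 1.0, std)
        return self

    def transform(self, data):
        if self.mean is None:
            raise RuntimeError("fit must be called before transform")
        return (np.asarray(data, dtype=float) - self.mean) / self.std

    def fit_transform(self, data, axis=0):
        return self.fit(data, axis=axis).transform(data)


# Synthetic stand-in for extracted features:
# 8 audio examples x 13 features x 20 time frames
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=3.0, size=(8, 13, 20))

scaler = StandardScalerSketch()
scaled = scaler.fit_transform(features)
```

Fitting along the first axis keeps a separate mean and standard deviation for every feature and time frame, so three-dimensional feature data needs no reshaping before it is scaled.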

Some examples demonstrating how scalers are integrated with feature extraction are provided in the StandardScaler