Data Augment
Augment
- class openspeech.data.audio.augment.JoiningAugment
Augments data by concatenating audio signals.
- Inputs:
signal: np.ndarray [shape=(n,)] audio time series
- Returns: signal
signal: concatenated signal
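As a rough illustration of the idea, joining can be sketched with plain NumPy. The helper name join_signals and the choice to pass the signals as a tuple are assumptions made for this example, not the documented interface of JoiningAugment.

```python
import numpy as np


def join_signals(signals):
    """Concatenate 1-D audio time series end to end (hypothetical helper)."""
    return np.concatenate(list(signals))


# Two toy "audio" signals of different lengths.
first = np.ones(3, dtype=np.float32)
second = np.zeros(2, dtype=np.float32)

joined = join_signals((first, second))
```

The result has length len(first) + len(second), so any transcripts attached to the two utterances would need to be joined in the same order.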
- class openspeech.data.audio.augment.NoiseInjector(noise_dataset_dir: str, sample_rate: int = 16000, noise_level: float = 0.7)
Provides noise injection for noise augmentation.
- The noise augmentation process is as follows:
1. Randomly sample noise_size audio files from the dataset.
2. Extract noise from audio_paths.
3. Add the noise to the input signal.
- Parameters
- Inputs: signal
signal: signal from audio file
- Returns: signal
signal: noise added signal
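The final step of the process (adding noise to the signal) might look roughly like the sketch below. It assumes the noise has already been loaded as a NumPy array, that noise_level acts as an upper bound on a randomly drawn scale factor, and that shorter noise clips are tiled to cover the signal. The function inject_noise is hypothetical and does not reproduce NoiseInjector's actual implementation.

```python
import numpy as np


def inject_noise(signal, noise, noise_level=0.7, rng=None):
    """Add a randomly scaled, randomly positioned noise segment to a signal.

    Hypothetical helper: the scaling rule is an assumption for illustration.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Repeat the noise if it is shorter than the signal.
    if len(noise) < len(signal):
        repeats = int(np.ceil(len(signal) / len(noise)))
        noise = np.tile(noise, repeats)

    # Pick a random window of the noise with the same length as the signal.
    start = int(rng.integers(0, len(noise) - len(signal) + 1))
    segment = noise[start:start + len(signal)]

    # Assumption: draw the scale uniformly from [0, noise_level).
    scale = rng.uniform(0.0, noise_level)
    return signal + scale * segment


rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.normal(size=4000)  # stand-in for a clip from noise_dataset_dir
noisy = inject_noise(clean, noise, noise_level=0.7, rng=rng)
silent_case = inject_noise(clean, np.zeros(4000), rng=rng)
```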
-
class
openspeech.data.audio.augment.
SpecAugment
(freq_mask_para: int = 18, time_mask_num: int = 10, freq_mask_num: int = 2)[source]¶ Provides Spec Augment. A simple data augmentation method for speech recognition. This concept proposed in https://arxiv.org/abs/1904.08779
- Parameters
- Inputs: feature_vector
feature_vector (torch.FloatTensor): feature vector from audio file.
- Returns: feature_vector
feature_vector: masked feature vector.
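The masking idea can be sketched as follows. This is an illustration under several assumptions, not the class's own code: it uses a NumPy array instead of a torch.FloatTensor to keep the example dependency-light, assumes the feature is laid out as (time, frequency), fills masked regions with zeros, and introduces a hypothetical time_mask_para (maximum time-mask width) that is not part of the documented signature.

```python
import numpy as np


def spec_augment(feature, freq_mask_para=18, time_mask_num=10,
                 freq_mask_num=2, rng=None):
    """Apply random time and frequency masks to a (time, freq) feature.

    Hypothetical sketch; returns a masked copy and leaves the input intact.
    """
    if rng is None:
        rng = np.random.default_rng()

    masked = feature.copy()
    time_len, freq_len = masked.shape

    # Assumption: cap each time mask at 5% of the utterance length.
    time_mask_para = max(1, time_len // 20)

    # Time masking: zero out random spans of consecutive frames.
    for _ in range(time_mask_num):
        width = int(rng.integers(0, time_mask_para + 1))
        start = int(rng.integers(0, time_len - width + 1))
        masked[start:start + width, :] = 0

    # Frequency masking: zero out random bands of consecutive bins.
    for _ in range(freq_mask_num):
        width = int(rng.integers(0, min(freq_mask_para, freq_len) + 1))
        start = int(rng.integers(0, freq_len - width + 1))
        masked[:, start:start + width] = 0

    return masked


feature = np.ones((200, 80), dtype=np.float32)
augmented = spec_augment(feature, rng=np.random.default_rng(0))
```

Because the masks are drawn independently, they may overlap, so the number of masked frames or bins can be smaller than the sum of the individual mask widths.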
-
class
openspeech.data.audio.augment.
TimeStretchAugment
(min_rate: float = 0.7, max_rate: float = 1.4)[source]¶ Time-stretch an audio series by a fixed rate.
- Inputs:
signal: np.ndarray [shape=(n,)] audio time series
- Returns: y_stretch
y_stretch: np.ndarray [shape=(round(n/rate),)] audio time series stretched by the specified rate
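To make the relationship between the rate and the output length concrete, here is a small sketch. It assumes the rate is drawn uniformly from [min_rate, max_rate], which the signature suggests but the text above does not state. It also swaps in naive linear-interpolation resampling via np.interp as a stand-in: unlike a proper time-stretch, this also shifts pitch, so it only illustrates the shape contract. The function naive_time_stretch is hypothetical.

```python
import numpy as np


def naive_time_stretch(signal, rate):
    """Resample a 1-D signal to round(n / rate) samples.

    Stand-in for a real time-stretch: linear interpolation changes pitch
    as well as duration, so this only demonstrates the output shape.
    """
    n_out = int(round(len(signal) / rate))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, num=n_out)
    return np.interp(new_positions, old_positions, signal)


rng = np.random.default_rng(0)
min_rate, max_rate = 0.7, 1.4

# Assumption: one rate is sampled per call from [min_rate, max_rate).
rate = rng.uniform(min_rate, max_rate)

signal = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
stretched = naive_time_stretch(signal, rate)
halved = naive_time_stretch(np.zeros(100), 2.0)
```

A rate above 1 shortens the signal (faster speech) and a rate below 1 lengthens it (slower speech).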