Openspeech’s configurations¶
This page describes all configurations in Openspeech.
audio¶
mfcc¶
name: Name of dataset.
sample_rate: Sampling rate of audio.
frame_length: Frame length for spectrogram.
frame_shift: Length of hop between STFT windows.
del_silence: Flag indicating whether to delete silence.
num_mels: The number of MFC coefficients to retain.
apply_spec_augment: Flag indicating whether to apply spec augment.
apply_noise_augment: Flag indicating whether to apply noise augment.
apply_time_stretch_augment: Flag indicating whether to apply time stretch augment.
apply_joining_augment: Flag indicating whether to apply audio joining augment.
melspectrogram¶
name: Name of dataset.
sample_rate: Sampling rate of audio.
frame_length: Frame length for spectrogram.
frame_shift: Length of hop between STFT windows.
del_silence: Flag indicating whether to delete silence.
num_mels: The number of MFC coefficients to retain.
apply_spec_augment: Flag indicating whether to apply spec augment.
apply_noise_augment: Flag indicating whether to apply noise augment.
apply_time_stretch_augment: Flag indicating whether to apply time stretch augment.
apply_joining_augment: Flag indicating whether to apply audio joining augment.
fbank¶
name: Name of dataset.
sample_rate: Sampling rate of audio.
frame_length: Frame length for spectrogram.
frame_shift: Length of hop between STFT windows.
del_silence: Flag indicating whether to delete silence.
num_mels: The number of MFC coefficients to retain.
apply_spec_augment: Flag indicating whether to apply spec augment.
apply_noise_augment: Flag indicating whether to apply noise augment.
apply_time_stretch_augment: Flag indicating whether to apply time stretch augment.
apply_joining_augment: Flag indicating whether to apply audio joining augment.
spectrogram¶
name: Name of dataset.
sample_rate: Sampling rate of audio.
frame_length: Frame length for spectrogram.
frame_shift: Length of hop between STFT windows.
del_silence: Flag indicating whether to delete silence.
num_mels: Spectrogram is independent of mel, but uses the ‘num_mels’ variable to unify feature size variables.
apply_spec_augment: Flag indicating whether to apply spec augment.
apply_noise_augment: Flag indicating whether to apply noise augment.
apply_time_stretch_augment: Flag indicating whether to apply time stretch augment.
apply_joining_augment: Flag indicating whether to apply audio joining augment.
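The frame_length and frame_shift settings have to be turned into sample counts before an STFT can be computed. The following is a minimal sketch of that conversion, assuming both values are given in milliseconds (this page does not state the unit). The helper names are hypothetical and are not part of Openspeech.

```python
from typing import Tuple


def frame_params_to_samples(
    sample_rate: int, frame_length_ms: float, frame_shift_ms: float
) -> Tuple[int, int]:
    """Convert frame length and shift to sample counts.

    Assumption: both values are in milliseconds.
    """
    window_size = int(round(sample_rate * frame_length_ms / 1000))
    hop_length = int(round(sample_rate * frame_shift_ms / 1000))
    return window_size, hop_length


def num_frames(num_samples: int, window_size: int, hop_length: int) -> int:
    """Number of full frames that fit in the signal when no padding is used."""
    if num_samples < window_size:
        return 0
    return 1 + (num_samples - window_size) // hop_length


# One second of 16 kHz audio with a 20 ms window and a 10 ms hop.
window_size, hop_length = frame_params_to_samples(16000, 20, 10)
frames = num_frames(16000, window_size, hop_length)
```

Under these assumptions, a larger frame_shift yields fewer frames per second, which shortens the input sequence seen by the model.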
augment¶
default¶
apply_spec_augment: Flag indicating whether to apply spec augment.
apply_noise_augment: Flag indicating whether to apply noise augment. Noise augment requires noise_dataset_dir, which should contain audio files.
apply_joining_augment: Flag indicating whether to apply joining augment. If true, a new audio file is created by randomly connecting two audio files.
apply_time_stretch_augment: Flag indicating whether to apply time stretch augment.
freq_mask_para: Hyperparameter that limits the length of each frequency mask.
freq_mask_num: Number of frequency-masked areas to make.
time_mask_num: Number of time-masked areas to make.
noise_dataset_dir: Path of the directory containing noise audio files.
noise_level: Noise adjustment level.
time_stretch_min_rate: Minimum rate of audio time stretch.
time_stretch_max_rate: Maximum rate of audio time stretch.
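To illustrate how freq_mask_para, freq_mask_num and time_mask_num interact, here is a minimal pure-Python sketch of SpecAugment-style masking. It is not Openspeech's implementation. The max_time_mask argument is a hypothetical parameter introduced here because this page lists no limit for time-mask length, and masked cells are set to 0.0 as an assumption.

```python
import random
from typing import List


def spec_augment(
    spec: List[List[float]],
    freq_mask_para: int,
    freq_mask_num: int,
    time_mask_num: int,
    max_time_mask: int,  # hypothetical: upper bound on each time mask's length
    rng: random.Random,
) -> List[List[float]]:
    """Return a masked copy of `spec`, laid out as [frame][frequency bin]."""
    out = [row[:] for row in spec]
    total_frames = len(out)
    total_bins = len(out[0]) if out else 0

    # Frequency masks: zero a band of random width (at most freq_mask_para bins)
    # across every frame.
    for _ in range(freq_mask_num):
        width = rng.randint(0, min(freq_mask_para, total_bins))
        start = rng.randint(0, total_bins - width)
        for row in out:
            for k in range(start, start + width):
                row[k] = 0.0

    # Time masks: zero a run of consecutive frames (at most max_time_mask frames).
    for _ in range(time_mask_num):
        length = rng.randint(0, min(max_time_mask, total_frames))
        start = rng.randint(0, total_frames - length)
        for i in range(start, start + length):
            out[i] = [0.0] * total_bins

    return out


features = [[1.0] * 8 for _ in range(10)]  # 10 frames, 8 frequency bins
masked = spec_augment(features, freq_mask_para=3, freq_mask_num=2,
                      time_mask_num=2, max_time_mask=2, rng=random.Random(0))
```

The input is copied rather than modified in place, so the same utterance can be augmented differently on each epoch.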
dataset¶
kspon¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
test_dataset_path: Path of evaluation dataset.
manifest_file_path: Path of manifest file.
test_manifest_dir: Path of the directory containing test manifest files.
preprocess_mode: KsponSpeech preprocess mode {phonetic, spelling}.
libri¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
dataset_download: Flag indicating whether to download the dataset.
manifest_file_path: Path of manifest file.
aishell¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
dataset_download: Flag indicating whether to download the dataset.
manifest_file_path: Path of manifest file.
ksponspeech¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
test_dataset_path: Path of evaluation dataset.
manifest_file_path: Path of manifest file.
test_manifest_dir: Path of the directory containing test manifest files.
preprocess_mode: KsponSpeech preprocess mode {phonetic, spelling}.
librispeech¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
dataset_download: Flag indicating whether to download the dataset.
manifest_file_path: Path of manifest file.
lm¶
dataset: Select dataset for training (librispeech, ksponspeech, aishell, lm).
dataset_path: Path of dataset.
valid_ratio: Ratio of validation data.
test_ratio: Ratio of test data.
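As an illustration of what valid_ratio and test_ratio control, the sketch below partitions a text corpus into train, validation and test portions. The function name and the contiguous, unshuffled split are assumptions made for this example and may differ from what Openspeech does.

```python
from typing import List, Sequence, Tuple


def split_corpus(
    lines: Sequence[str], valid_ratio: float, test_ratio: float
) -> Tuple[List[str], List[str], List[str]]:
    """Split `lines` into (train, valid, test) by the given ratios."""
    if valid_ratio < 0 or test_ratio < 0 or valid_ratio + test_ratio >= 1.0:
        raise ValueError("ratios must be non-negative and sum to less than 1.0")

    num_valid = int(len(lines) * valid_ratio)
    num_test = int(len(lines) * test_ratio)
    num_train = len(lines) - num_valid - num_test

    train = list(lines[:num_train])
    valid = list(lines[num_train:num_train + num_valid])
    test = list(lines[num_train + num_valid:])
    return train, valid, test


corpus = ["sentence %d" % i for i in range(8)]
train, valid, test = split_corpus(corpus, valid_ratio=0.25, test_ratio=0.25)
```

Whatever is left after the validation and test portions are taken becomes training data, so the two ratios must sum to less than 1.0.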
model¶
listen_attend_spell¶
model_name: Model name.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
hidden_state_dim: The hidden state dimension of encoder.
encoder_dropout_p: The dropout probability of encoder.
encoder_bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
max_length: Max decoding length.
num_attention_heads: The number of attention heads.
decoder_dropout_p: The dropout probability of decoder.
decoder_attn_mechanism: The attention mechanism for decoder.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
listen_attend_spell_with_location_aware¶
model_name: Model name.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
hidden_state_dim: The hidden state dimension of encoder.
encoder_dropout_p: The dropout probability of encoder.
encoder_bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
max_length: Max decoding length.
num_attention_heads: The number of attention heads.
decoder_dropout_p: The dropout probability of decoder.
decoder_attn_mechanism: The attention mechanism for decoder.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
listen_attend_spell_with_multi_head¶
model_name: Model name.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
hidden_state_dim: The hidden state dimension of encoder.
encoder_dropout_p: The dropout probability of encoder.
encoder_bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
max_length: Max decoding length.
num_attention_heads: The number of attention heads.
decoder_dropout_p: The dropout probability of decoder.
decoder_attn_mechanism: The attention mechanism for decoder.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
joint_ctc_listen_attend_spell¶
model_name: Model name.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
hidden_state_dim: The hidden state dimension of encoder.
encoder_dropout_p: The dropout probability of encoder.
encoder_bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
max_length: Max decoding length.
num_attention_heads: The number of attention heads.
decoder_dropout_p: The dropout probability of decoder.
decoder_attn_mechanism: The attention mechanism for decoder.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
deep_cnn_with_joint_ctc_listen_attend_spell¶
model_name: Model name.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
hidden_state_dim: The hidden state dimension of encoder.
encoder_dropout_p: The dropout probability of encoder.
encoder_bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
extractor: The CNN feature extractor.
activation: Type of activation function.
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
max_length: Max decoding length.
num_attention_heads: The number of attention heads.
decoder_dropout_p: The dropout probability of decoder.
decoder_attn_mechanism: The attention mechanism for decoder.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
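The teacher_forcing_ratio setting that appears in the decoder-based models above can be read as the probability of feeding the ground-truth token to the decoder instead of the model's own previous prediction. The sketch below makes that choice independently at every step, which is an assumption (some implementations decide once per batch), and the function name is hypothetical.

```python
import random
from typing import List


def decoder_inputs(
    targets: List[int],
    predictions: List[int],
    teacher_forcing_ratio: float,
    rng: random.Random,
) -> List[int]:
    """Pick, per step, the ground-truth token or the model's prediction.

    With probability `teacher_forcing_ratio` the ground-truth token is used.
    """
    return [
        target if rng.random() < teacher_forcing_ratio else predicted
        for target, predicted in zip(targets, predictions)
    ]


gold = [5, 6, 7, 8]
guessed = [5, 9, 7, 1]
always_forced = decoder_inputs(gold, guessed, 1.0, random.Random(0))
never_forced = decoder_inputs(gold, guessed, 0.0, random.Random(0))
```

A ratio of 1.0 always uses the ground truth and a ratio of 0.0 always uses the model's predictions.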
deepspeech2¶
model_name: Model name.
rnn_type: Type of rnn cell (rnn, lstm, gru).
num_rnn_layers: The number of rnn layers.
rnn_hidden_dim: Hidden state dimension of RNN.
dropout_p: The dropout probability of model.
bidirectional: If True, the encoder is bidirectional.
activation: Type of activation function.
optimizer: Optimizer for training.
lstm_lm¶
model_name: Model name.
num_layers: The number of encoder layers.
hidden_state_dim: The hidden state dimension of encoder.
dropout_p: The dropout probability of encoder.
rnn_type: Type of rnn cell (rnn, lstm, gru).
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
optimizer: Optimizer for training.
rnn_transducer¶
model_name: Model name.
encoder_hidden_state_dim: Dimension of encoder.
decoder_hidden_state_dim: Dimension of decoder.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
encoder_dropout_p: The dropout probability of encoder.
decoder_dropout_p: The dropout probability of decoder.
bidirectional: If True, the encoder is bidirectional.
rnn_type: Type of rnn cell (rnn, lstm, gru).
output_dim: Dimension of outputs.
optimizer: Optimizer for training.
transformer_lm¶
model_name: Model name.
num_layers: The number of encoder layers.
d_model: The dimension of model.
d_ff: The dimension of feed forward network.
num_attention_heads: The number of attention heads.
dropout_p: The dropout probability of encoder.
max_length: Max decoding length.
optimizer: Optimizer for training.
transformer¶
model_name: Model name.
d_model: Dimension of model.
d_ff: Dimension of feed forward network.
num_attention_heads: The number of attention heads.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
encoder_dropout_p: The dropout probability of encoder.
decoder_dropout_p: The dropout probability of decoder.
ffnet_style: Style of feed forward network (ff, conv).
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
optimizer: Optimizer for training.
joint_ctc_transformer¶
model_name: Model name.
extractor: The CNN feature extractor.
d_model: Dimension of model.
d_ff: Dimension of feed forward network.
num_attention_heads: The number of attention heads.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
encoder_dropout_p: The dropout probability of encoder.
decoder_dropout_p: The dropout probability of decoder.
ffnet_style: Style of feed forward network (ff, conv).
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
optimizer: Optimizer for training.
transformer_with_ctc¶
model_name: Model name.
d_model: Dimension of model.
d_ff: Dimension of feed forward network.
num_attention_heads: The number of attention heads.
num_encoder_layers: The number of encoder layers.
encoder_dropout_p: The dropout probability of encoder.
ffnet_style: Style of feed forward network (ff, conv).
optimizer: Optimizer for training.
vgg_transformer¶
model_name: Model name.
extractor: The CNN feature extractor.
d_model: Dimension of model.
d_ff: Dimension of feed forward network.
num_attention_heads: The number of attention heads.
num_encoder_layers: The number of encoder layers.
num_decoder_layers: The number of decoder layers.
encoder_dropout_p: The dropout probability of encoder.
decoder_dropout_p: The dropout probability of decoder.
ffnet_style: Style of feed forward network (ff, conv).
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
joint_ctc_attention: Flag indicating whether to use joint CTC attention.
optimizer: Optimizer for training.
conformer¶
model_name: Model name.
encoder_dim: Dimension of encoder.
num_encoder_layers: The number of encoder layers.
num_attention_heads: The number of attention heads.
feed_forward_expansion_factor: The expansion factor of feed forward module.
conv_expansion_factor: The expansion factor of convolution module.
input_dropout_p: The dropout probability of inputs.
feed_forward_dropout_p: The dropout probability of feed forward module.
attention_dropout_p: The dropout probability of attention module.
conv_dropout_p: The dropout probability of convolution module.
conv_kernel_size: The kernel size of convolution.
half_step_residual: Flag indicating whether to use half step residual.
optimizer: Optimizer for training.
conformer_lstm¶
model_name: Model name.
encoder_dim: Dimension of encoder.
num_encoder_layers: The number of encoder layers.
num_attention_heads: The number of attention heads.
feed_forward_expansion_factor: The expansion factor of feed forward module.
conv_expansion_factor: The expansion factor of convolution module.
input_dropout_p: The dropout probability of inputs.
feed_forward_dropout_p: The dropout probability of feed forward module.
attention_dropout_p: The dropout probability of attention module.
conv_dropout_p: The dropout probability of convolution module.
conv_kernel_size: The kernel size of convolution.
half_step_residual: Flag indicating whether to use half step residual.
num_decoder_layers: The number of decoder layers.
decoder_dropout_p: The dropout probability of decoder.
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
rnn_type: Type of rnn cell (rnn, lstm, gru).
decoder_attn_mechanism: The attention mechanism for decoder.
optimizer: Optimizer for training.
conformer_transducer¶
model_name: Model name.
encoder_dim: Dimension of encoder.
num_encoder_layers: The number of encoder layers.
num_attention_heads: The number of attention heads.
feed_forward_expansion_factor: The expansion factor of feed forward module.
conv_expansion_factor: The expansion factor of convolution module.
input_dropout_p: The dropout probability of inputs.
feed_forward_dropout_p: The dropout probability of feed forward module.
attention_dropout_p: The dropout probability of attention module.
conv_dropout_p: The dropout probability of convolution module.
conv_kernel_size: The kernel size of convolution.
half_step_residual: Flag indicating whether to use half step residual.
num_decoder_layers: The number of decoder layers.
decoder_dropout_p: The dropout probability of decoder.
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
rnn_type: Type of rnn cell (rnn, lstm, gru).
decoder_hidden_state_dim: Hidden state dimension of decoder.
decoder_output_dim: Output dimension of decoder.
optimizer: Optimizer for training.
joint_ctc_conformer_lstm¶
model_name: Model name.
encoder_dim: Dimension of encoder.
num_encoder_layers: The number of encoder layers.
num_attention_heads: The number of attention heads.
feed_forward_expansion_factor: The expansion factor of feed forward module.
conv_expansion_factor: The expansion factor of convolution module.
input_dropout_p: The dropout probability of inputs.
feed_forward_dropout_p: The dropout probability of feed forward module.
attention_dropout_p: The dropout probability of attention module.
conv_dropout_p: The dropout probability of convolution module.
conv_kernel_size: The kernel size of convolution.
half_step_residual: Flag indicating whether to use half step residual.
num_decoder_layers: The number of decoder layers.
decoder_dropout_p: The dropout probability of decoder.
num_decoder_attention_heads: The number of decoder attention heads.
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
rnn_type: Type of rnn cell (rnn, lstm, gru).
decoder_attn_mechanism: The attention mechanism for decoder.
optimizer: Optimizer for training.
transformer_transducer¶
model_name: Model name.
encoder_dim: Dimension of encoder.
d_ff: Dimension of feed forward network.
num_audio_layers: Number of audio layers.
num_label_layers: Number of label layers.
num_attention_heads: Number of attention heads.
audio_dropout_p: Dropout probability of audio layer.
label_dropout_p: Dropout probability of label layer.
decoder_hidden_state_dim: Hidden state dimension of decoder.
decoder_output_dim: Dimension of model output.
conv_kernel_size: Kernel size of convolution layer.
max_positional_length: Max length of positional encoding.
optimizer: Optimizer for training.
quartznet5x5¶
model_name: Model name.
num_blocks: Number of quartznet blocks.
num_sub_blocks: Number of quartznet sub blocks.
in_channels: Input channels of jasper blocks.
out_channels: Output channels of jasper block’s convolution.
kernel_size: Kernel size of jasper block’s convolution.
dilation: Dilation of jasper block’s convolution.
dropout_p: Dropout probability.
optimizer: Optimizer for training.
quartznet10x5¶
model_name: Model name.
num_blocks: Number of quartznet blocks.
num_sub_blocks: Number of quartznet sub blocks.
in_channels: Input channels of jasper blocks.
out_channels: Output channels of jasper block’s convolution.
kernel_size: Kernel size of jasper block’s convolution.
dilation: Dilation of jasper block’s convolution.
dropout_p: Dropout probability.
optimizer: Optimizer for training.
quartznet15x5¶
model_name: Model name.
num_blocks: Number of quartznet blocks.
num_sub_blocks: Number of quartznet sub blocks.
in_channels: Input channels of jasper blocks.
out_channels: Output channels of jasper block’s convolution.
kernel_size: Kernel size of jasper block’s convolution.
dilation: Dilation of jasper block’s convolution.
dropout_p: Dropout probability.
optimizer: Optimizer for training.
contextnet¶
model_name: Model name.
model_size: Model size.
input_dim: Dimension of input vector.
num_encoder_layers: The number of convolution layers.
kernel_size: Value of convolution kernel size.
num_channels: The number of channels in the convolution filter.
encoder_dim: Dimension of encoder output vector.
optimizer: Optimizer for training.
contextnet_lstm¶
model_name: Model name.
model_size: Model size.
input_dim: Dimension of input vector.
num_encoder_layers: The number of convolution layers.
num_decoder_layers: The number of decoder layers.
kernel_size: Value of convolution kernel size.
num_channels: The number of channels in the convolution filter.
encoder_dim: Dimension of encoder output vector.
num_attention_heads: The number of attention heads.
attention_dropout_p: The dropout probability of attention module.
decoder_dropout_p: The dropout probability of decoder.
max_length: Max decoding length.
teacher_forcing_ratio: The ratio of teacher forcing.
rnn_type: Type of rnn cell (rnn, lstm, gru).
decoder_attn_mechanism: The attention mechanism for decoder.
optimizer: Optimizer for training.
contextnet_transducer¶
model_name: Model name.
model_size: Model size.
input_dim: Dimension of input vector.
num_encoder_layers: The number of convolution layers.
num_decoder_layers: The number of rnn layers.
kernel_size: Value of convolution kernel size.
num_channels: The number of channels in the convolution filter.
hidden_dim: The number of features in the decoder hidden state.
encoder_dim: Dimension of encoder output vector.
decoder_output_dim: Dimension of decoder output vector.
dropout: Dropout probability of decoder.
rnn_type: Type of rnn cell.
optimizer: Optimizer for training.
jasper5x3¶
model_name: Model name.
num_blocks: Number of jasper blocks.
num_sub_blocks: Number of jasper sub blocks.
in_channels: Input channels of jasper blocks.
out_channels: Output channels of jasper block’s convolution.
kernel_size: Kernel size of jasper block’s convolution.
dilation: Dilation of jasper block’s convolution.
dropout_p: Dropout probability.
optimizer: Optimizer for training.
jasper10x5¶
model_name: Model name.
num_blocks: Number of jasper blocks.
num_sub_blocks: Number of jasper sub blocks.
in_channels: Input channels of jasper blocks.
out_channels: Output channels of jasper block’s convolution.
kernel_size: Kernel size of jasper block’s convolution.
dilation: Dilation of jasper block’s convolution.
dropout_p: Dropout probability.
optimizer: Optimizer for training.
criterion¶
label_smoothed_cross_entropy¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
smoothing: Ratio of smoothing loss (confidence = 1.0 - smoothing).
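The relation confidence = 1.0 - smoothing can be illustrated by building the smoothed target distribution for a single label. This sketch assumes the smoothing mass is spread evenly over the non-target classes. Other variants spread it over all classes, and this page does not say which one Openspeech uses. The function name is hypothetical.

```python
from typing import List


def smoothed_targets(target: int, num_classes: int, smoothing: float) -> List[float]:
    """Build a label-smoothed target distribution for one class index.

    Assumption: the smoothing mass is shared equally by the non-target classes.
    """
    confidence = 1.0 - smoothing
    off_value = smoothing / (num_classes - 1)
    distribution = [off_value] * num_classes
    distribution[target] = confidence
    return distribution


dist = smoothed_targets(target=2, num_classes=5, smoothing=0.1)
```

With smoothing set to 0.0 the distribution reduces to the usual one-hot target.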
joint_ctc_cross_entropy¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
ctc_weight: Weight of ctc loss for training.
cross_entropy_weight: Weight of cross entropy loss for training.
smoothing: Ratio of smoothing loss (confidence = 1.0 - smoothing).
zero_infinity: Whether to zero infinite losses and the associated gradients.
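A minimal sketch of how ctc_weight and cross_entropy_weight might combine two already-computed scalar losses, assuming a plain weighted sum. The function name is hypothetical, and the zero_infinity handling here only mimics the idea on a scalar value; it does not touch gradients.

```python
import math


def joint_ctc_cross_entropy(
    ctc_loss: float,
    cross_entropy_loss: float,
    ctc_weight: float,
    cross_entropy_weight: float,
    zero_infinity: bool = True,
) -> float:
    """Weighted sum of a CTC loss and a cross-entropy loss (assumed form)."""
    # An infinite CTC loss can occur when the input is too short for the target.
    if zero_infinity and math.isinf(ctc_loss):
        ctc_loss = 0.0
    return ctc_weight * ctc_loss + cross_entropy_weight * cross_entropy_loss


total = joint_ctc_cross_entropy(2.0, 1.0, ctc_weight=0.3, cross_entropy_weight=0.7)
```

A common convention is to choose weights that sum to 1.0 so the combined loss stays on the same scale as its parts, though this page does not require it.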
perplexity¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
transducer¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
gather: Reduce memory consumption.
ctc¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
zero_infinity: Whether to zero infinite losses and the associated gradients.
cross_entropy¶
criterion_name: Criterion name for training.
reduction: Reduction method of criterion.
lr_scheduler¶
reduce_lr_on_plateau¶
lr: Learning rate.
scheduler_name: Name of learning rate scheduler.
lr_patience: Number of epochs with no improvement after which learning rate will be reduced.
lr_factor: Factor by which the learning rate will be reduced. new_lr = lr * factor.
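The interaction between lr_patience and lr_factor can be sketched with a small stand-alone class. This is an illustration, not Openspeech's scheduler. It assumes the convention used by PyTorch's ReduceLROnPlateau, where the rate is reduced once the number of epochs without improvement exceeds the patience. The class name is hypothetical.

```python
class PlateauSchedule:
    """Reduce the learning rate when the validation loss stops improving."""

    def __init__(self, lr: float, lr_patience: int, lr_factor: float) -> None:
        self.lr = lr
        self.lr_patience = lr_patience
        self.lr_factor = lr_factor
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the current rate."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            # Tolerate lr_patience bad epochs, then reduce on the next one.
            if self.bad_epochs > self.lr_patience:
                self.lr = self.lr * self.lr_factor  # new_lr = lr * factor
                self.bad_epochs = 0
        return self.lr


schedule = PlateauSchedule(lr=1.0, lr_patience=1, lr_factor=0.5)
history = [schedule.step(loss) for loss in (1.0, 1.0, 1.0)]
```

In this run the first epoch sets the best loss, the second is tolerated, and the third triggers the reduction.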
warmup¶
lr: Learning rate.
scheduler_name: Name of learning rate scheduler.
peak_lr: Maximum learning rate.
init_lr: Initial learning rate.
warmup_steps: Warm up the learning rate linearly for the first N updates.
total_steps: Total training steps.
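A linear warmup from init_lr to peak_lr over warmup_steps can be written in a few lines. This sketch holds the rate at peak_lr once warmup ends, which is an assumption; the behaviour after warmup is not described on this page. The function name is hypothetical.

```python
def warmup_lr(step: int, init_lr: float, peak_lr: float, warmup_steps: int) -> float:
    """Learning rate at `step`, rising linearly from init_lr to peak_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return peak_lr  # assumption: hold at the peak after warmup
    return init_lr + (peak_lr - init_lr) * step / warmup_steps


rates = [warmup_lr(s, init_lr=0.0, peak_lr=1.0, warmup_steps=10) for s in (0, 5, 10, 20)]
```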
warmup_reduce_lr_on_plateau¶
lr: Learning rate.
scheduler_name: Name of learning rate scheduler.
lr_patience: Number of epochs with no improvement after which learning rate will be reduced.
lr_factor: Factor by which the learning rate will be reduced. new_lr = lr * factor.
peak_lr: Maximum learning rate.
init_lr: Initial learning rate.
warmup_steps: Warm up the learning rate linearly for the first N updates.
tri_stage¶
lr: Learning rate.
scheduler_name: Name of learning rate scheduler.
init_lr: Initial learning rate.
init_lr_scale: Initial learning rate scale.
final_lr_scale: Final learning rate scale.
phase_ratio: Automatically sets warmup/hold/decay steps to the ratio specified here from max_updates. The ratios must add up to 1.0.
total_steps: Total training steps.
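The tri-stage schedule splits training into warmup, hold and decay phases according to phase_ratio. The sketch below is one plausible reading of these settings, not Openspeech's code: it assumes the rate starts at lr * init_lr_scale, rises linearly to lr, holds, and then decays exponentially to lr * final_lr_scale. The function name is hypothetical, and final_lr_scale must be greater than zero.

```python
import math
from typing import Sequence


def tri_stage_lr(
    step: int,
    total_steps: int,
    lr: float,
    init_lr_scale: float,
    final_lr_scale: float,
    phase_ratio: Sequence[float],
) -> float:
    """Learning rate at `step` for a warmup/hold/decay schedule (assumed form)."""
    if abs(sum(phase_ratio) - 1.0) > 1e-6:
        raise ValueError("phase_ratio must add up to 1.0")

    warmup_steps = int(total_steps * phase_ratio[0])
    hold_steps = int(total_steps * phase_ratio[1])
    decay_steps = int(total_steps * phase_ratio[2])
    init_lr = lr * init_lr_scale
    final_lr = lr * final_lr_scale

    if step < warmup_steps:  # stage 1: linear warmup
        return init_lr + (lr - init_lr) * step / warmup_steps
    step -= warmup_steps
    if step < hold_steps:  # stage 2: hold at the peak
        return lr
    step -= hold_steps
    if step < decay_steps:  # stage 3: exponential decay towards final_lr
        return lr * math.exp(math.log(final_lr_scale) * step / decay_steps)
    return final_lr


ratios = (0.25, 0.25, 0.5)
start = tri_stage_lr(0, 100, 1.0, 0.01, 0.01, ratios)
held = tri_stage_lr(30, 100, 1.0, 0.01, 0.01, ratios)
midway = tri_stage_lr(75, 100, 1.0, 0.01, 0.01, ratios)
end = tri_stage_lr(100, 100, 1.0, 0.01, 0.01, ratios)
```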
transformer¶
lr: Learning rate.
scheduler_name: Name of learning rate scheduler.
peak_lr: Maximum learning rate.
final_lr: Final learning rate.
final_lr_scale: Final learning rate scale.
warmup_steps: Warm up the learning rate linearly for the first N updates.
decay_steps: Steps in decay stages.
trainer¶
cpu¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
gpu¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
auto_select_gpus: If enabled and gpus is an integer, pick available gpus automatically.
tpu¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
use_tpu: If True, trains on TPU.
tpu_cores: Number of TPU cores.
gpu-fp16¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
auto_select_gpus: If enabled and gpus is an integer, pick available gpus automatically.
precision: Double precision (64), full precision (32) or half precision (16). Can be used on CPU, GPU or TPUs.
amp_backend: The mixed precision backend to use (“native” or “apex”).
tpu-fp16¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
use_tpu: If True, trains on TPU.
tpu_cores: Number of TPU cores.
precision: Double precision (64), full precision (32) or half precision (16). Can be used on CPU, GPU or TPUs.
amp_backend: The mixed precision backend to use (“native” or “apex”).
cpu-fp64¶
seed: Seed for training.
accelerator: Previously known as distributed_backend (dp, ddp, ddp2, etc.).
accumulate_grad_batches: Accumulates grads every k batches or as set up in the dict.
num_workers: The number of CPU cores.
batch_size: Size of batch.
check_val_every_n_epoch: Run validation every n training epochs.
gradient_clip_val: Gradient clipping value. 0 means don’t clip.
logger: Training logger {wandb, tensorboard}.
max_epochs: Stop training once this number of epochs is reached.
auto_scale_batch_size: If set to True, will initially run a batch size finder trying to find the largest batch size that fits into memory.
name: Trainer name.
device: Training device.
use_cuda: If True, trains on GPU.
precision: Double precision (64), full precision (32) or half precision (16). Can be used on CPU, GPU or TPUs.
amp_backend: The mixed precision backend to use (“native” or “apex”).
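Because accumulate_grad_batches sums gradients over several batches before each optimizer step, it multiplies the number of samples that contribute to one update. The helper below is a hypothetical illustration of that arithmetic. The num_devices argument is an assumption added for multi-device training and is not a setting listed on this page.

```python
def effective_batch_size(
    batch_size: int, accumulate_grad_batches: int, num_devices: int = 1
) -> int:
    """Samples contributing to one optimizer step.

    `num_devices` is a hypothetical argument for data-parallel training.
    """
    return batch_size * accumulate_grad_batches * num_devices


samples_per_update = effective_batch_size(batch_size=32, accumulate_grad_batches=4)
```

This is useful when the desired batch size does not fit in memory: a smaller batch_size combined with a larger accumulate_grad_batches gives the same number of samples per update.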
tokenizer¶
libri_subword¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
vocab_size: Size of vocabulary.
vocab_path: Path of vocabulary file.
libri_character¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
vocab_path: Path of vocabulary file.
aishell_character¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
vocab_path: Path of vocabulary file.
kspon_subword¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
sp_model_path: Path of sentencepiece model.
vocab_size: Size of vocabulary.
kspon_grapheme¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
vocab_path: Path of vocabulary file.
kspon_character¶
sos_token: Start of sentence token.
eos_token: End of sentence token.
pad_token: Pad token.
blank_token: Blank token (for CTC training).
encoding: Encoding of vocab.
unit: Unit of vocabulary.
vocab_path: Path of vocabulary file.