figs
README.md
config.yml
save_conformer_from_weights.py
test_conformer.py
test_subword_conformer.py
tflite_conformer.py
tflite_subword_conformer.py
train_conformer.py
train_ga_conformer.py
train_ga_subword_conformer.py
train_subword_conformer.py


Conformer: Convolution-augmented Transformer for Speech Recognition

Reference: https://arxiv.org/abs/2005.08100

Example Model YAML Config

speech_config:
  sample_rate: 16000
  frame_ms: 25
  stride_ms: 10
  feature_type: log_mel_spectrogram
  num_feature_bins: 80
  preemphasis: 0.97
  normalize_signal: True
  normalize_feature: True
  normalize_per_feature: False

decoder_config:
  vocabulary: null
  target_vocab_size: 1024
  max_subword_length: 4
  blank_at_zero: True
  beam_width: 5
  norm_score: True

model_config:
  name: conformer
  subsampling:
    type: conv2
    kernel_size: 3
    strides: 2
    filters: 144
  positional_encoding: sinusoid_concat
  dmodel: 144
  num_blocks: 16
  head_size: 36
  num_heads: 4
  mha_type: relmha
  kernel_size: 32
  fc_factor: 0.5
  dropout: 0.1
  embed_dim: 320
  embed_dropout: 0.0
  num_rnns: 1
  rnn_units: 320
  rnn_type: lstm
  layer_norm: True
  joint_dim: 320

learning_config:
  augmentations:
    after:
      time_masking:
        num_masks: 10
        mask_factor: 100
        p_upperbound: 0.2
      freq_masking:
        num_masks: 1
        mask_factor: 27

  dataset_config:
    train_paths: ...
    eval_paths: ...
    test_paths: ...
    tfrecords_dir: ...

  optimizer_config:
    warmup_steps: 10000
    beta1: 0.9
    beta2: 0.98
    epsilon: 1e-9

  running_config:
    batch_size: 4
    num_epochs: 22
    outdir: ...
    log_interval_steps: 400
    save_interval_steps: 400
    eval_interval_steps: 1000
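
A few quantities implied by the config above can be derived by hand, and the sketch below does so as a quick sanity check. The dictionary copies a handful of keys from the config; the helper name `derive_shapes` is hypothetical, and the 4x time reduction assumes that `conv2` subsampling stacks two convolutions that each use the configured stride, which is not confirmed here.

```python
# Values copied from the example config above (only the keys needed here).
config = {
    "speech_config": {"sample_rate": 16000, "frame_ms": 25, "stride_ms": 10},
    "model_config": {
        "dmodel": 144,
        "head_size": 36,
        "num_heads": 4,
        "subsampling": {"type": "conv2", "strides": 2},
    },
}


def derive_shapes(cfg):
    """Hypothetical helper: derive frame sizes and check attention dimensions."""
    speech = cfg["speech_config"]
    model = cfg["model_config"]

    # Window and hop sizes in samples, from milliseconds.
    frame_length = speech["sample_rate"] * speech["frame_ms"] // 1000
    frame_step = speech["sample_rate"] * speech["stride_ms"] // 1000

    # Total width of multi-head attention: head_size * num_heads.
    attention_dim = model["head_size"] * model["num_heads"]

    # Assumption: "conv2" means two stacked convolutions, each with this stride.
    time_reduction = model["subsampling"]["strides"] ** 2

    return {
        "frame_length": frame_length,
        "frame_step": frame_step,
        "attention_dim": attention_dim,
        "attention_matches_dmodel": attention_dim == model["dmodel"],
        "time_reduction": time_reduction,
        "encoder_frame_ms": speech["stride_ms"] * time_reduction,
    }


shapes = derive_shapes(config)
for key, value in shapes.items():
    print(f"{key}: {value}")
```

Under these assumptions, each encoder output frame covers 40 ms of audio, and `head_size * num_heads` equals `dmodel`.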

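The `optimizer_config` above only lists `warmup_steps` and the Adam constants; it does not name the learning-rate schedule. The sketch below assumes the warmup schedule from the Transformer paper, which the referenced Conformer paper also describes. The function name `transformer_lr` is hypothetical, and any extra scaling factor the training script may apply is not modeled.

```python
def transformer_lr(step, dmodel=144, warmup_steps=10000):
    """Assumed schedule: linear warmup, then inverse square-root decay."""
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return dmodel ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# The rate rises until warmup_steps and decays afterwards.
for step in (1, 5000, 10000, 40000):
    print(f"step {step:>6}: lr = {transformer_lr(step):.3e}")
```

With `dmodel: 144` and `warmup_steps: 10000`, this schedule peaks at step 10000 with a rate of 1/1200, roughly 8.3e-4.
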
Usage

Training, see python examples/conformer/train_conformer.py --help

Testing, see python examples/conformer/test_conformer.py --help

TFLite Conversion, see python examples/conformer/tflite_conformer.py --help

Conformer Subwords - Results on LibriSpeech

Summary

Number of subwords: 1031
Maximum length of a subword: 4
Subwords corpus: all training sets, dev sets and test-clean
Number of parameters: 10,341,639
Positional Encoding Type: sinusoid concatenation

Pretrained model and config: go to drive

Transducer Loss

Error Rates

Test-clean	WER (%)	CER (%)
Greedy	6.4476862	2.51828337
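
For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how WER and CER are commonly computed from reference and hypothesis transcripts using edit distance. The function names and the toy transcripts are illustrative only; whether the test script in this directory aggregates errors in exactly this way, or counts spaces as characters for CER, is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(pairs, tokenize):
    """Total edits divided by total reference tokens, as a percentage."""
    errors = 0
    total = 0
    for ref, hyp in pairs:
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        errors += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    return 100.0 * errors / total


# Toy data, not taken from LibriSpeech.
pairs = [("the cat sat", "the cat sat"), ("hello world", "hello word")]
wer = error_rate(pairs, str.split)  # tokens are words
cer = error_rate(pairs, list)       # tokens are characters, spaces included
print(f"WER: {wer:.2f}%  CER: {cer:.2f}%")
```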
