| | |
| | | - pcm_path, `e.g.`: asr_example.pcm |
| | | - audio bytes stream, `e.g.`: bytes data from a microphone |
| | | - audio sample points, `e.g.`: `audio, rate = soundfile.read("asr_example_zh.wav")`; the data type is numpy.ndarray or torch.Tensor |
| | | - wav.scp, kaldi style wav list (`wav_id \t wav_path`), `e.g.`: |
| | | ```text |
| | | asr_example1 ./audios/asr_example1.wav |
| | | asr_example2 ./audios/asr_example2.wav |
| | | ``` |
| | |
| | | - `max_epoch`: number of training epochs |
| | | - `lr`: learning rate |
| | | |
| | | - Training data formats: |
| | | ```sh |
| | | cat ./example_data/text |
| | | BAC009S0002W0122 而 对 楼 市 成 交 抑 制 作 用 最 大 的 限 购 |
| | | BAC009S0002W0123 也 成 为 地 方 政 府 的 眼 中 钉 |
| | | english_example_1 hello world |
| | | english_example_2 go swim 去 游 泳 |
| | | |
| | | cat ./example_data/wav.scp |
| | | BAC009S0002W0122 /mnt/data/wav/train/S0002/BAC009S0002W0122.wav |
| | | BAC009S0002W0123 /mnt/data/wav/train/S0002/BAC009S0002W0123.wav |
| | | english_example_1 /mnt/data/wav/train/S0002/english_example_1.wav |
| | | english_example_2 /mnt/data/wav/train/S0002/english_example_2.wav |
| | | ``` |
| | | |
| | | - Then you can run the pipeline to finetune with: |
| | | ```shell |
| | | python finetune.py |
| | |
| | | --njob 64 \ |
| | | --checkpoint_dir "./checkpoint" \ |
| | | --checkpoint_name "valid.cer_ctc.ave.pb" |
| | | ``` |
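
Before fine-tuning, it can help to confirm that every utterance ID in `text` has a matching entry in `wav.scp`. The following is a minimal sketch of such a check, assuming both files follow the Kaldi-style layout shown above (an ID, whitespace, then the transcript or audio path). The helper names `read_kaldi_table` and `check_pairing` are hypothetical and are not part of the toolkit.

```python
from pathlib import Path


def read_kaldi_table(path):
    """Parse a Kaldi-style file into {utterance_id: rest_of_line}.

    Assumes the first whitespace-separated field on each line is the ID.
    """
    table = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        utt_id = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        table[utt_id] = value
    return table


def check_pairing(text_path, scp_path):
    """Return (IDs with a transcript but no audio, IDs with audio but no transcript)."""
    text = read_kaldi_table(text_path)
    scp = read_kaldi_table(scp_path)
    missing_audio = sorted(set(text) - set(scp))
    missing_text = sorted(set(scp) - set(text))
    return missing_audio, missing_text
```

For the example data above, `check_pairing("./example_data/text", "./example_data/wav.scp")` would return two empty lists if the files are consistent. Any ID listed in either result points to an utterance that is worth fixing or dropping before training.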