From 53a753755bd235ad1fe341ba4199e0b8e197a505 Mon Sep 17 00:00:00 2001
From: 嘉渊 <wangjiaming.wjm@alibaba-inc.com>
Date: Wed, 24 May 2023 11:44:05 +0800
Subject: [PATCH] update repo

---
 docs/academic_recipe/asr_recipe.md | 10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/docs/academic_recipe/asr_recipe.md b/docs/academic_recipe/asr_recipe.md
index 327fb7b..7f727e7 100644
--- a/docs/academic_recipe/asr_recipe.md
+++ b/docs/academic_recipe/asr_recipe.md
@@ -83,7 +83,6 @@
 
 ### Stage 2: Dictionary Preparation
 This stage processes the dictionary, which is used as a mapping between label characters and integer indices during ASR training. The processed dictionary file is saved as `$feats_dir/data/$lang_toekn_list/$token_type/tokens.txt`. An example of `tokens.txt` is as follows:
-* `tokens.txt`
 ```
 <blank>
 <s>
@@ -95,10 +94,10 @@
 榫�
 <unk>
 ```
-* `<blank>`: indicates the blank token for CTC
-* `<s>`: indicates the start-of-sentence token
-* `</s>`: indicates the end-of-sentence token
-* `<unk>`: indicates the out-of-vocabulary token
+* `<blank>`: indicates the blank token for CTC, must be in the first line
+* `<s>`: indicates the start-of-sentence token, must be in the second line
+* `</s>`: indicates the end-of-sentence token, must be in the third line
+* `<unk>`: indicates the out-of-vocabulary token, must be in the last line
 
 ### Stage 3: LM Training
 
@@ -146,7 +145,6 @@
 * Performance
 
 We adopt `CER` to verify the performance. The results are in `$exp_dir/exp/$model_dir/$decoding_yaml_name/$average_model_name/$dset`, namely `text.cer` and `text.cer.txt`. `text.cer` saves the comparison between the recognized text and the reference text while `text.cer.txt` saves the final `CER` results. The following is an example of `text.cer`:
-* `text.cer`
 ```
 ...
 BAC009S0764W0213(nwords=11,cor=11,ins=0,del=0,sub=0) corr=100.00%,cer=0.00%
-- 
Gitblit v1.9.1