From cec84de3b676b5fdcd6c2f5dc30fe4b3571ed574 Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Wed, 15 Feb 2023 20:09:21 +0800
Subject: [PATCH] Merge branch 'main' of github.com:alibaba-damo-academy/FunASR add

---
 docs/modelscope_usages.md |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/docs/modelscope_usages.md b/docs/modelscope_usages.md
new file mode 100644
index 0000000..84c8e1d
--- /dev/null
+++ b/docs/modelscope_usages.md
@@ -0,0 +1,53 @@
+# ModelScope Usage
+ModelScope is an open-source model-as-a-service platform supported by Alibaba. It provides flexible and convenient model applications for users in academia and industry. For specific usages and open-source models, please refer to [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition). In the speech domain, we provide autoregressive and non-autoregressive speech recognition models, speech pre-training models, punctuation prediction models, and others.
+
+## Overall Introduction
+We provide usage examples for different models under `egs_modelscope`. These examples support running inference directly with our provided models, as well as finetuning with our provided models as the pre-trained initial models. Below, we use the model in the `egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch` directory as an example. The directory includes `infer.py`, `finetune.py` and `infer_after_finetune.py`, whose functions are as follows:
+- `infer.py`: perform inference on the specified dataset with our provided model
+- `finetune.py`: use our provided model as the initial model for finetuning
+- `infer_after_finetune.py`: perform inference on the specified dataset with the finetuned model
+
+## Inference
+We provide `infer.py` for inference. With this file, users can perform inference on the specified dataset with our provided model and obtain the corresponding recognition results. If the transcript is given, the `CER` will be calculated at the same time. Before performing inference, users can set the following parameters to modify the inference configuration:
+* `data_dir`: the dataset directory. The directory should contain the wav list file `wav.scp` and the transcript file `text` (optional). For the format of these two files, please refer to the instructions in [Quick Start](./get_started.md). If the `text` file exists, the CER will be calculated; otherwise this step is skipped.
+* `output_dir`: the directory for saving the inference results
+* `batch_size`: the batch size during inference
+* `ctc_weight`: some models contain a CTC module; users can set this parameter to specify the weight of the CTC module during inference
+
+In addition to setting parameters directly in `infer.py`, users can also modify the inference configuration by editing the `decoding.yaml` file in the model download directory.
+
+## Finetuning
+We provide `finetune.py` for finetuning. With this file, users can finetune on the specified dataset, using our provided model as the initial model, to achieve better performance in the specified domain. Before finetuning, users can set the following parameters to modify the finetuning configuration:
+* `data_path`: the dataset directory. This directory should contain the `train` directory for the training set and the `dev` directory for the validation set. Each directory needs to contain the wav list file `wav.scp` and the transcript file `text`
+* `output_dir`: the directory for saving the finetuning results
+* `dataset_type`: set to `small` for a small dataset; set to `large` for a dataset larger than 1000 hours
+* `batch_bins`: the batch size. If `dataset_type` is set to `small`, the unit of `batch_bins` is the number of fbank feature frames; if `dataset_type` is set to `large`, the unit of `batch_bins` is milliseconds
+* `max_epoch`: the maximum number of training epochs
+
+The following parameters can also be set. However, if there is no special requirement, users can ignore them and use the default values we provide:
+* `accum_grad`: the number of gradient accumulation steps
+* `keep_nbest_models`: select the `keep_nbest_models` best-performing models and average the parameters
+  of these models to get a better model
+* `optim`: the optimizer
+* `lr`: the learning rate
+* `scheduler`: the learning rate adjustment strategy
+* `scheduler_conf`: the parameters of the learning rate adjustment strategy
+* `specaug`: the spectral augmentation
+* `specaug_conf`: the parameters of the spectral augmentation
+
+In addition to setting parameters directly in `finetune.py`, users can also modify the finetuning configuration by editing the `finetune.yaml` file in the model download directory.
+
+## Inference after Finetuning
+We provide `infer_after_finetune.py` for inference with a model finetuned by users. With this file, users can perform inference on the specified dataset with the finetuned model and obtain the corresponding recognition results. If the transcript is given, the `CER` will be calculated at the same time. Before performing inference, users can set the following parameters to modify the inference configuration:
+* `data_dir`: the dataset directory. The directory should contain the wav list file `wav.scp` and the transcript file `text` (optional). If the `text` file exists, the CER will be calculated; otherwise this step is skipped.
+* `output_dir`: the directory for saving the inference results
+* `batch_size`: the batch size during inference
+* `ctc_weight`: some models contain a CTC module; users can set this parameter to specify the weight of the CTC module during inference
+* `decoding_model_name`: the name of the model used for inference
+
+The following parameters can also be set. However, if there is no special requirement, users can ignore them and use the default values we provide:
+* `modelscope_model_name`: the name of the initial model used for finetuning
+* `required_files`: the files required for inference when using the ModelScope interface
+
+## Announcements
+Some models may have additional model-specific parameters for finetuning and inference. The usage of these parameters is described in the `README.md` file in the corresponding directory.
\ No newline at end of file
-- 
Gitblit v1.9.1
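
Note on the CER mentioned in the added document: the patch says the CER is computed whenever a `text` transcript file is present, but does not show how. The following is a minimal sketch of a corpus-level character error rate, not the scoring code used by FunASR. It assumes each line of a transcript file has the form `<utt_id> <transcript>` (the document defers the actual format to Quick Start), and the helper names `parse_transcripts`, `edit_distance` and `corpus_cer` are hypothetical.

```python
from typing import Dict, Iterable


def parse_transcripts(lines: Iterable[str]) -> Dict[str, str]:
    """Parse lines of the assumed form '<utt_id> <transcript>' into a dict."""
    transcripts: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split(maxsplit=1)
        # An utterance with an empty transcript has only the id on its line.
        transcripts[parts[0]] = parts[1] if len(parts) > 1 else ""
    return transcripts


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_cer(refs: Dict[str, str], hyps: Dict[str, str]) -> float:
    """Total character edits divided by total reference characters."""
    errors = 0
    total = 0
    for utt_id, ref in refs.items():
        # Spaces are dropped so that only characters are scored (an assumption).
        ref_chars = ref.replace(" ", "")
        hyp_chars = hyps.get(utt_id, "").replace(" ", "")
        errors += edit_distance(ref_chars, hyp_chars)
        total += len(ref_chars)
    return errors / total if total else 0.0


if __name__ == "__main__":
    refs = parse_transcripts(["utt1 今天天气很好", "utt2 你好"])
    hyps = parse_transcripts(["utt1 今天天气真好", "utt2 你好"])
    print(f"CER = {corpus_cer(refs, hyps):.3f}")
```

In this sketch, utterances missing from the hypotheses count as fully deleted, and the rate is pooled over the whole corpus rather than averaged per utterance. Both are design choices made for the illustration and may differ from what `infer.py` reports.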