From 5e2825f0ba4eab635a064598f57793d9fc83ab0d Mon Sep 17 00:00:00 2001
From: speech_asr <wangjiaming.wjm@alibaba-inc.com>
Date: 星期三, 15 二月 2023 15:49:20 +0800
Subject: [PATCH] update docs

---
 docs_cn/modelscope_usages.md |    4 ++--
 docs/modelscope_usages.md    |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/docs/modelscope_usages.md b/docs/modelscope_usages.md
new file mode 100644
index 0000000..d42416a
--- /dev/null
+++ b/docs/modelscope_usages.md
@@ -0,0 +1,52 @@
+# 蹇�熶娇鐢∕odelScope
+ModelScope is an open-source model-as-service platform supported by Alibaba, which provides flexible and convenient model applications for users in academia and industry. For specific usages and open source models, please refer to [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition). In the domain of speech, we provide autoregressive/non-autoregressive speech recognition, speech pre-training, punctuation prediction and other models, which are convenient for users.
+
+## Overall Introduction
+We provide the usages of different models under the `egs_modelscope`, which supports directly employing our provided models for inference, as well as finetuning the models we provided as pre-trained initial models. Next, we will introduce the model provided in the `egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch` directory, including `infer.py`, `finetune.py` and `infer_after_finetune .py`. The corresponding functions are as follows:
+- `infer.py`: perform inference on the specified dataset based on our provided model
+- `finetune.py`: employ our provided model as the initial model for fintuning
+- `infer_after_finetune.py`: perform inference on the specified dataset based on the finetuned model
+
+## Inference
+We provide `infer.py` to achieve the inference. Based on this file, users can preform inference on the specified dataset based on our provided model and obtain the corresponding recognition results. If the transcript is given, the `CER` will be calculated at the same time. Before performing inference, users can specify the following parameters to modify the inference configuration:
+* `data_dir`锛氭暟鎹泦鐩綍銆傜洰褰曚笅搴旇鍖呮嫭闊抽鍒楄〃鏂囦欢`wav.scp`鍜屾妱鏈枃浠禶text`(鍙��)锛屽叿浣撴牸寮忓彲浠ュ弬瑙乕蹇�熷紑濮媇(./get_started.md)涓殑璇存槑銆傚鏋渀text`鏂囦欢瀛樺湪锛屽垯浼氱浉搴旂殑璁$畻CER锛屽惁鍒欎細璺宠繃銆�
+* `output_dir`锛氭帹鐞嗙粨鏋滀繚瀛樼洰褰�
+* `batch_size`锛氭帹鐞嗘椂鐨刡atch澶у皬
+* `ctc_weight`锛氶儴鍒嗘ā鍨嬪寘鍚獵TC妯″潡锛屽彲浠ヨ缃鍙傛暟鏉ユ寚瀹氭帹鐞嗘椂锛孋TC妯″潡鐨勬潈閲�
+
+闄や簡鐩存帴鍦╜infer.py`涓缃弬鏁板锛岀敤鎴蜂篃鍙互閫氳繃鎵嬪姩淇敼妯″瀷涓嬭浇鐩綍涓嬬殑`decoding.yaml`鏂囦欢涓殑鍙傛暟鏉ヤ慨鏀规帹鐞嗛厤缃��
+
+## 妯″瀷寰皟
+鎴戜滑鎻愪緵浜哷finetune.py`鏉ュ疄鐜版ā鍨嬪井璋冦�傚熀浜庢鏂囦欢锛岀敤鎴峰彲浠ュ熀浜庢垜浠彁渚涚殑妯″瀷浣滀负鍒濆妯″瀷锛屽湪鎸囧畾鐨勬暟鎹泦涓婅繘琛屽井璋冿紝浠庤�屽湪鐗瑰緛棰嗗煙鍙栧緱鏇村ソ鐨勬�ц兘銆傚湪寰皟寮�濮嬪墠锛岀敤鎴峰彲浠ユ寚瀹氬涓嬪弬鏁版潵淇敼寰皟閰嶇疆锛�
+* `data_path`锛氭暟鎹洰褰曘�傝鐩綍涓嬪簲璇ュ寘鎷瓨鏀捐缁冮泦鏁版嵁鐨刞train`鐩綍鍜屽瓨鏀鹃獙璇侀泦鏁版嵁鐨刞dev`鐩綍銆傛瘡涓洰褰曚腑闇�瑕佸寘鎷煶棰戝垪琛ㄦ枃浠禶wav.scp`鍜屾妱鏈枃浠禶text`
+* `output_dir`锛氬井璋冪粨鏋滀繚瀛樼洰褰�
+* `dataset_type`锛氬浜庡皬鏁版嵁闆嗭紝璁剧疆涓篳small`锛涘綋鏁版嵁閲忓ぇ浜�1000灏忔椂鏃讹紝璁剧疆涓篳large`
+* `batch_bins`锛歜atch size锛屽鏋渄ataset_type璁剧疆涓篳small`锛宐atch_bins鍗曚綅涓篺bank鐗瑰緛甯ф暟锛涘鏋渄ataset_type=`large`锛宐atch_bins鍗曚綅涓烘绉�
+* `max_epoch`锛氭渶澶х殑璁粌杞暟
+
+浠ヤ笅鍙傛暟涔熷彲浠ヨ繘琛岃缃�備絾鏄鏋滄病鏈夌壒鍒殑闇�姹傦紝鍙互蹇界暐锛岀洿鎺ヤ娇鐢ㄦ垜浠粰瀹氱殑榛樿鍊硷細
+* `accum_grad`锛氭搴︾疮绉�
+* `keep_nbest_models`锛氶�夋嫨鎬ц兘鏈�濂界殑`keep_nbest_models`涓ā鍨嬬殑鍙傛暟杩涜骞冲潎锛屽緱鍒版�ц兘鏇村ソ鐨勬ā鍨�
+* `optim`锛氳缃井璋冩椂鐨勪紭鍖栧櫒
+* `lr`锛氳缃井璋冩椂鐨勫涔犵巼
+* `scheduler`锛氳缃涔犵巼璋冩暣绛栫暐
+* `scheduler_conf`锛氬涔犵巼璋冩暣绛栫暐鐨勭浉鍏冲弬鏁�
+* `specaug`锛氳缃氨澧炲箍
+* `specaug_conf`锛氳氨澧炲箍鐨勭浉鍏冲弬鏁�
+
+闄や簡鐩存帴鍦╜finetune.py`涓缃弬鏁板锛岀敤鎴蜂篃鍙互閫氳繃鎵嬪姩淇敼妯″瀷涓嬭浇鐩綍涓嬬殑`finetune.yaml`鏂囦欢涓殑鍙傛暟鏉ヤ慨鏀瑰井璋冮厤缃��
+
+## 鍩轰簬寰皟鍚庣殑妯″瀷鎺ㄧ悊
+鎴戜滑鎻愪緵浜哷infer_after_finetune.py`鏉ュ疄鐜板熀浜庣敤鎴疯嚜宸卞井璋冨緱鍒扮殑妯″瀷杩涜鎺ㄧ悊銆傚熀浜庢鏂囦欢锛岀敤鎴峰彲浠ュ熀浜庡井璋冨悗鐨勬ā鍨嬶紝瀵规寚瀹氱殑鏁版嵁闆嗚繘琛屾帹鐞嗭紝寰楀埌鐩稿簲鐨勮瘑鍒粨鏋溿�傚鏋滃悓鏃剁粰瀹氫簡鎶勬湰锛屽垯浼氬悓鏃惰绠桟ER銆傚湪寮�濮嬫帹鐞嗗墠锛岀敤鎴峰彲浠ユ寚瀹氬涓嬪弬鏁版潵淇敼鎺ㄧ悊閰嶇疆锛�
+* `data_dir`锛氭暟鎹泦鐩綍銆傜洰褰曚笅搴旇鍖呮嫭闊抽鍒楄〃鏂囦欢`wav.scp`鍜屾妱鏈枃浠禶text`(鍙��)銆傚鏋渀text`鏂囦欢瀛樺湪锛屽垯浼氱浉搴旂殑璁$畻CER锛屽惁鍒欎細璺宠繃銆�
+* `output_dir`锛氭帹鐞嗙粨鏋滀繚瀛樼洰褰�
+* `batch_size`锛氭帹鐞嗘椂鐨刡atch澶у皬
+* `ctc_weight`锛氶儴鍒嗘ā鍨嬪寘鍚獵TC妯″潡锛屽彲浠ヨ缃鍙傛暟鏉ユ寚瀹氭帹鐞嗘椂锛孋TC妯″潡鐨勬潈閲�
+* `decoding_model_name`锛氭寚瀹氱敤浜庢帹鐞嗙殑妯″瀷鍚�
+
+浠ヤ笅鍙傛暟涔熷彲浠ヨ繘琛岃缃�備絾鏄鏋滄病鏈夌壒鍒殑闇�姹傦紝鍙互蹇界暐锛岀洿鎺ヤ娇鐢ㄦ垜浠粰瀹氱殑榛樿鍊硷細
+* `modelscope_model_name`锛氬井璋冩椂浣跨敤鐨勫垵濮嬫ā鍨�
+* `required_files`锛氫娇鐢╩odelscope鎺ュ彛杩涜鎺ㄧ悊鏃堕渶瑕佺敤鍒扮殑鏂囦欢
+
+## 娉ㄦ剰浜嬮」
+閮ㄥ垎妯″瀷鍙兘鍦ㄥ井璋冦�佹帹鐞嗘椂瀛樺湪涓�浜涚壒鏈夌殑鍙傛暟锛岃繖閮ㄥ垎鍙傛暟鍙互鍦ㄥ搴旂洰褰曠殑README.md鏂囦欢涓壘鍒板叿浣撶敤娉曘��
\ No newline at end of file
diff --git a/docs_cn/modelscope_usages.md b/docs_cn/modelscope_usages.md
index 6e1420a..43d0c0f 100644
--- a/docs_cn/modelscope_usages.md
+++ b/docs_cn/modelscope_usages.md
@@ -2,13 +2,13 @@
 ModelScope鏄樋閲屽反宸存帹鍑虹殑寮�婧愭ā鍨嬪嵆鏈嶅姟鍏变韩骞冲彴锛屼负骞垮ぇ瀛︽湳鐣岀敤鎴峰拰宸ヤ笟鐣岀敤鎴锋彁渚涚伒娲汇�佷究鎹风殑妯″瀷搴旂敤鏀寔銆傚叿浣撶殑浣跨敤鏂规硶鍜屽紑婧愭ā鍨嬪彲浠ュ弬瑙乕ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition) 銆傚湪璇煶鏂瑰悜锛屾垜浠彁渚涗簡鑷洖褰�/闈炶嚜鍥炲綊璇煶璇嗗埆锛岃闊抽璁粌锛屾爣鐐归娴嬬瓑妯″瀷锛岀敤鎴峰彲浠ユ柟渚夸娇鐢ㄣ��
 
 ## 鏁翠綋浠嬬粛
-鎴戜滑鍦╡gs_modelscope鐩綍涓嬫彁渚涗簡鐩稿叧妯″瀷鐨勪娇鐢紝鏀寔鐩存帴鐢ㄦ垜浠彁渚涚殑妯″瀷杩涜鎺ㄧ悊锛屽悓鏃朵篃鏀寔灏嗘垜浠彁渚涚殑妯″瀷浣滀负棰勮缁冨ソ鐨勬ā鍨嬩綔涓哄垵濮嬫ā鍨嬭繘琛屽井璋冦�備笅闈紝鎴戜滑灏嗕互egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch鐩綍涓彁渚涚殑妯″瀷鏉ヨ繘琛屼粙缁嶏紝鍖呮嫭`infer.py`锛宍finetune.py`鍜宍infer_after_finetune.py`锛屽搴旂殑鍔熻兘濡備笅锛�
+鎴戜滑鍦╜egs_modelscope` 鐩綍涓嬫彁渚涗簡涓嶅悓妯″瀷鐨勪娇鐢ㄦ柟娉曪紝鏀寔鐩存帴鐢ㄦ垜浠彁渚涚殑妯″瀷杩涜鎺ㄧ悊锛屽悓鏃朵篃鏀寔灏嗘垜浠彁渚涚殑妯″瀷浣滀负棰勮缁冨ソ鐨勫垵濮嬫ā鍨嬭繘琛屽井璋冦�備笅闈紝鎴戜滑灏嗕互`egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch`鐩綍涓彁渚涚殑妯″瀷鏉ヨ繘琛屼粙缁嶏紝鍖呮嫭`infer.py`锛宍finetune.py`鍜宍infer_after_finetune.py`锛屽搴旂殑鍔熻兘濡備笅锛�
 - `infer.py`: 鍩轰簬鎴戜滑鎻愪緵鐨勬ā鍨嬶紝瀵规寚瀹氱殑鏁版嵁闆嗚繘琛屾帹鐞�
 - `finetune.py`: 灏嗘垜浠彁渚涚殑妯″瀷浣滀负鍒濆妯″瀷杩涜寰皟
 - `infer_after_finetune.py`: 鍩轰簬寰皟寰楀埌鐨勬ā鍨嬶紝瀵规寚瀹氱殑鏁版嵁闆嗚繘琛屾帹鐞�
 
 ## 妯″瀷鎺ㄧ悊
-鎴戜滑鎻愪緵浜哷infer.py`鏉ュ疄鐜版ā鍨嬫帹鐞嗐�傚熀浜庢鏂囦欢锛岀敤鎴峰彲浠ュ熀浜庢垜浠彁渚涚殑妯″瀷锛屽鎸囧畾鐨勬暟鎹泦杩涜鎺ㄧ悊锛屽緱鍒扮浉搴旂殑璇嗗埆缁撴灉銆傚鏋滃悓鏃剁粰瀹氫簡鎶勬湰锛屽垯浼氬悓鏃惰绠桟ER銆傚湪寮�濮嬫帹鐞嗗墠锛岀敤鎴峰彲浠ユ寚瀹氬涓嬪弬鏁版潵淇敼鎺ㄧ悊閰嶇疆锛�
+鎴戜滑鎻愪緵浜哷infer.py`鏉ュ疄鐜版ā鍨嬫帹鐞嗐�傚熀浜庢鏂囦欢锛岀敤鎴峰彲浠ュ熀浜庢垜浠彁渚涚殑妯″瀷锛屽鎸囧畾鐨勬暟鎹泦杩涜鎺ㄧ悊锛屽緱鍒扮浉搴旂殑璇嗗埆缁撴灉銆傚鏋滅粰瀹氫簡鎶勬湰锛屽垯浼氬悓鏃惰绠梎CER`銆傚湪寮�濮嬫帹鐞嗗墠锛岀敤鎴峰彲浠ユ寚瀹氬涓嬪弬鏁版潵淇敼鎺ㄧ悊閰嶇疆锛�
 * `data_dir`锛氭暟鎹泦鐩綍銆傜洰褰曚笅搴旇鍖呮嫭闊抽鍒楄〃鏂囦欢`wav.scp`鍜屾妱鏈枃浠禶text`(鍙��)锛屽叿浣撴牸寮忓彲浠ュ弬瑙乕蹇�熷紑濮媇(./get_started.md)涓殑璇存槑銆傚鏋渀text`鏂囦欢瀛樺湪锛屽垯浼氱浉搴旂殑璁$畻CER锛屽惁鍒欎細璺宠繃銆�
 * `output_dir`锛氭帹鐞嗙粨鏋滀繚瀛樼洰褰�
 * `batch_size`锛氭帹鐞嗘椂鐨刡atch澶у皬

--
Gitblit v1.9.1