From 5ebac59f9e7631677d93036c2e9e8f373577eb97 Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Thu, 21 Mar 2024 16:26:49 +0800
Subject: [PATCH] tutorial

---
 /dev/null                                                    |    1 
 docs/tutorial/README_zh.md                                   |  186 +++++++++++++++++++++++++++++++
 examples/industrial_data_pretraining/paraformer/README_zh.md |  161 +++++++++++++++++++++-----
 examples/README_zh.md                                        |    1 
 README_zh.md                                                 |    5 
 README.md                                                    |    2 
 6 files changed, 317 insertions(+), 39 deletions(-)

diff --git a/README.md b/README.md
index e7ff1b1..95529c6 100644
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@
 ```
 
 ## Model Zoo
-FunASR has open-sourced a large number of pre-trained models on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models, for more models please refer to the [Model Zoo]().
+FunASR has open-sourced a large number of pre-trained models on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models, for more models please refer to the [Model Zoo](./model_zoo).
 
 (Note: ⭐ represents the ModelScope model zoo, 🤗 represents the Huggingface model zoo, 🍀 represents the OpenAI model zoo)
 
diff --git a/README_zh.md b/README_zh.md
index 46af926..eeedc04 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -69,7 +69,7 @@
 
 ## Model Zoo
 
-FunASR has open-sourced a large number of models pre-trained on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models; for more models please refer to the [Model Zoo]().
+FunASR has open-sourced a large number of models pre-trained on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models; for more models please refer to the [Model Zoo](./model_zoo).
 
 (Note: ⭐ represents the ModelScope model zoo, 🤗 represents the Huggingface model zoo, 🍀 represents the OpenAI model zoo)
 
@@ -208,7 +208,8 @@
 res = model.generate(input=(wav_file, text_file), data_type=("sound", "text"))
 print(res)
 ```
-For more detailed usage ([examples](https://github.com/alibaba-damo-academy/FunASR/tree/main/examples/industrial_data_pretraining))
+For more details, see the [tutorial](docs/tutorial/README_zh.md);
+for more, see the [examples](https://github.com/alibaba-damo-academy/FunASR/tree/main/examples/industrial_data_pretraining)
 
 ## Export ONNX
 ### Export from the command line
diff --git a/docs/runtime b/docs/runtime
deleted file mode 120000
index 3d1f990..0000000
--- a/docs/runtime
+++ /dev/null
@@ -1 +0,0 @@
-../runtime
\ No newline at end of file
diff --git a/docs/tutorial/README_zh.md b/docs/tutorial/README_zh.md
new file mode 100644
index 0000000..97bce71
--- /dev/null
+++ b/docs/tutorial/README_zh.md
@@ -0,0 +1,186 @@
+(Simplified Chinese|[English](./README.md))
+
+FunASR has open-sourced a large number of models pre-trained on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](../../MODEL_LICENSE). Below are some representative models; for more models please refer to the [Model Zoo](../../model_zoo).
+
+
+## Inference
+
+### Quick Start
+#### [Paraformer Model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)
+```python
+from funasr import AutoModel
+
+# the ModelScope model id; the model is downloaded automatically on first use
+model = AutoModel(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
+
+res = model.generate(input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_cn_en.wav")
+print(res)
+```
+
+### Detailed Usage
+```python
+model = AutoModel(model=[str], device=[str], ncpu=[int], output_dir=[str], batch_size=[int], **kwargs)
+```
+#### AutoModel Definition
+- `model`(str): model name in the [Model Zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `device`(str): `cuda:0` (default, gpu0), use GPU for inference. If set to `cpu`, inference runs on CPU
+- `ncpu`(int): `4` (default), number of threads used for intra-op parallelism on CPU
+- `output_dir`(str): `None` (default); if set, the output path for the results
+- `batch_size`(int): `1` (default), batch size during decoding
+- `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, e.g., the maximum segment length of the vad model: `max_single_segment_time=6000` (ms).
+#### AutoModel Inference
+```python
+res = model.generate(input=[str], output_dir=[str])
+```
+- `input`: the input to decode, which can be:
+  - a wav file path, e.g.: asr_example.wav
+  - a pcm file path, e.g.: asr_example.pcm; in this case the audio sample rate `fs` needs to be specified (default 16000)
+  - an audio byte stream, e.g.: byte data from a microphone
+  - wav.scp, a kaldi-style wav list (`wav_id \t wav_path`), e.g.:
+  ```text
+  asr_example1  ./audios/asr_example1.wav
+  asr_example2  ./audios/asr_example2.wav
+  ```
+  With `wav.scp` input, `output_dir` must be set to save the output results
+  - audio samples, e.g.: `audio, rate = soundfile.read("asr_example_zh.wav")`, with data type numpy.ndarray. Batch input is supported, as a list:
+  ```[audio_sample1, audio_sample2, ..., audio_sampleN]```
+  - fbank input, batch supported. Shape is [batch, frames, dim], type torch.Tensor
+- `output_dir`: `None` (default); if set, the output path for the results
+- `**kwargs`(dict): inference parameters related to the model, e.g., `beam_size=10`, `decoding_ctc_weight=0.1`.
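The kaldi-style `wav.scp` convention above is easy to handle in plain Python. The sketch below (the file name and paths are the illustrative values from the example above, not FunASR code) parses the list into `(wav_id, wav_path)` pairs that could then be fed to `model.generate` one path at a time:

```python
from pathlib import Path

def parse_wav_scp(scp_path):
    """Parse a kaldi-style wav.scp into (wav_id, wav_path) pairs."""
    entries = []
    for line in Path(scp_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # each line is "wav_id <whitespace> wav_path"
        wav_id, wav_path = line.split(maxsplit=1)
        entries.append((wav_id, wav_path))
    return entries

# write the tiny example list from above, then parse it back
Path("wav.scp").write_text(
    "asr_example1  ./audios/asr_example1.wav\n"
    "asr_example2  ./audios/asr_example2.wav\n",
    encoding="utf-8",
)
print(parse_wav_scp("wav.scp"))
```

Remember that with `wav.scp` input FunASR requires `output_dir` to be set, since results for many utterances cannot be returned meaningfully on stdout alone.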
+
+### ONNX and Libtorch Export
+
+```python
+res = model.export(type="onnx", quantize=True)
+```
+- `type`(str): `onnx` (default), export in onnx format; `torch` exports in libtorch format.
+- `quantize`(bool): `False` (default), whether to quantize.
+
+
+## Finetune
+
+### Quick Start
+```shell
+cd examples/industrial_data_pretraining/paraformer
+bash finetune.sh
+# "log_file: ./outputs/log.txt"
+```
+For the complete script, refer to [finetune.sh](../../examples/industrial_data_pretraining/paraformer/finetune.sh)
+
+### Detailed Parameters
+
+```shell
+funasr/bin/train.py \
+++model="${model_name_or_model_dir}" \
+++model_revision="${model_revision}" \
+++train_data_set_list="${train_data}" \
+++valid_data_set_list="${val_data}" \
+++dataset_conf.batch_size=20000 \
+++dataset_conf.batch_type="token" \
+++dataset_conf.num_workers=4 \
+++train_conf.max_epoch=50 \
+++train_conf.log_interval=1 \
+++train_conf.resume=false \
+++train_conf.validate_interval=2000 \
+++train_conf.save_checkpoint_interval=2000 \
+++train_conf.keep_nbest_models=20 \
+++optim_conf.lr=0.0002 \
+++output_dir="${output_dir}" &> ${log_file}
+```
+
+- `model` (str): model name (the ID in the model zoo), in which case the script automatically downloads the model locally; or the path of a model already downloaded to local disk.
+- `model_revision` (str): when `model` is a model name, the version of the model to download.
+- `train_data_set_list` (str): training data path, jsonl format by default; for details see the ([examples](../../data/list)).
+- `valid_data_set_list` (str): validation data path, jsonl format by default; for details see the ([examples](../../data/list)).
+- `dataset_conf.batch_type` (str): `example` (default), the batch type. `example` means each batch contains a fixed number (batch_size) of samples; `length` or `token` means dynamic batching, where the total length or token count of a batch equals batch_size.
+- `dataset_conf.batch_size` (int): used together with `batch_type`. When `batch_type=example`, it is the number of samples per batch; when `batch_type=length`, it is the total length of the samples in a batch, in fbank frames (1 frame = 10 ms) or in number of text tokens.
+- `train_conf.max_epoch` (int): total number of training epochs.
+- `train_conf.log_interval` (int): number of steps between log prints.
+- `train_conf.resume` (bool): whether to enable resuming training from a checkpoint.
+- `train_conf.validate_interval` (int): interval in steps for running validation during training.
+- `train_conf.save_checkpoint_interval` (int): interval in steps for saving the model during training.
+- `train_conf.keep_nbest_models` (int): keep at most this many model checkpoints, ranked by validation-set acc from high to low.
+- `optim_conf.lr` (float): learning rate.
+- `output_dir` (str): path for saving the model.
+- `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, e.g., to filter out audio longer than 20 s: `dataset_conf.max_token_length=2000`, in fbank frames (1 frame = 10 ms) or in number of text tokens.
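The interaction between `batch_type` and `batch_size` can be made concrete with a small sketch. This is not FunASR code, only a minimal illustration of `length`/`token`-style dynamic batching: utterances are accumulated until the batch's total frame/token count would exceed `batch_size`:

```python
def dynamic_batches(lengths, batch_size):
    """Group utterance lengths into batches whose summed length <= batch_size.

    Mimics batch_type="length"/"token": the total frame or token count of
    each batch stays within batch_size (a single over-long utterance still
    gets its own batch).
    """
    batches, current, total = [], [], 0
    for n in lengths:
        if current and total + n > batch_size:
            batches.append(current)
            current, total = [], 0
        current.append(n)
        total += n
    if current:
        batches.append(current)
    return batches

# five utterances with these fbank frame counts, batch_size=20000 frames
print(dynamic_batches([6000, 9000, 7000, 12000, 3000], 20000))
# -> [[6000, 9000], [7000, 12000], [3000]]
```

With `batch_type=example` the grouping would instead simply take a fixed number of utterances per batch regardless of their lengths.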
+
+#### Multi-GPU Training
+##### Single-machine multi-GPU training
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 1 --nproc_per_node ${gpu_num} \
+../../../funasr/bin/train.py ${train_args}
+```
+--nnodes is the total number of participating nodes; --nproc_per_node is the number of processes run on each node
+
+##### Multi-machine multi-GPU training
+
+On the master node, assuming the IP is 192.168.1.1, the port is 12345, and 2 GPUs are in use, run the following command:
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 2 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+../../../funasr/bin/train.py ${train_args}
+```
+On the worker node (assuming the IP is 192.168.1.2), make sure the MASTER_ADDR and MASTER_PORT environment variables match those set on the master node, then run the same command:
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 2 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+../../../funasr/bin/train.py ${train_args}
+```
+
+--nnodes is the total number of participating nodes; --nproc_per_node is the number of processes run on each node
+
+#### Prepare Data
+
+`train_text.txt`
+
+The left column is the unique data ID, which must correspond one-to-one with the `ID` in `train_wav.scp`;
+the right column is the transcript of the audio file, formatted as follows:
+
+```bash
+ID0012W0013 当客户风险承受能力评估依据发生变化时
+ID0012W0014 所有只要处理 data 不管你是做 machine learning 做 deep learning
+ID0012W0015 he tried to think how it could be
+```
+
+
+`train_wav.scp`
+
+The left column is the unique data ID, which must correspond one-to-one with the `ID` in `train_text.txt`;
+the right column is the path of the audio file, formatted as follows
+
+```bash
+BAC009S0764W0121 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/BAC009S0764W0121.wav
+BAC009S0916W0489 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/BAC009S0916W0489.wav
+ID0012W0015 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_cn_en.wav
+```
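Joining the two files by ID is mechanical. A hedged sketch follows; the jsonl field names (`key`, `source`, `target`) are only illustrative assumptions, and the exact schema should be taken from the jsonl examples referenced above:

```python
import json

def pair_scp_and_text(scp_lines, text_lines):
    """Join train_wav.scp and train_text.txt lines on their shared IDs.

    Field names ("key", "source", "target") are illustrative; check the
    jsonl examples shipped with FunASR for the exact schema.
    """
    wavs = dict(line.split(maxsplit=1) for line in scp_lines if line.strip())
    texts = dict(line.split(maxsplit=1) for line in text_lines if line.strip())
    # every ID must appear in both files
    missing = set(wavs) ^ set(texts)
    if missing:
        raise ValueError(f"IDs not present in both files: {sorted(missing)}")
    return [{"key": k, "source": wavs[k], "target": texts[k]} for k in sorted(wavs)]

records = pair_scp_and_text(
    ["ID0012W0015 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_cn_en.wav"],
    ["ID0012W0015 he tried to think how it could be"],
)
print(json.dumps(records[0], ensure_ascii=False))
```

The symmetric-difference check catches the most common data-preparation mistake: an ID present in one file but not the other.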
+
+
+#### View Training Logs
+
+##### View the experiment log
+```shell
+tail log.txt
+[2024-03-21 15:55:52,137][root][INFO] - train, rank: 3, epoch: 0/50, step: 6990/1, total step: 6990, (loss_avg_rank: 0.327), (loss_avg_epoch: 0.409), (ppl_avg_epoch: 1.506), (acc_avg_epoch: 0.795), (lr: 1.165e-04), [('loss_att', 0.259), ('acc', 0.825), ('loss_pre', 0.04), ('loss', 0.299), ('batch_size', 40)], {'data_load': '0.000', 'forward_time': '0.315', 'backward_time': '0.555', 'optim_time': '0.076', 'total_time': '0.947'}, GPU, memory: usage: 3.830 GB, peak: 18.357 GB, cache: 20.910 GB, cache_peak: 20.910 GB
+[2024-03-21 15:55:52,139][root][INFO] - train, rank: 1, epoch: 0/50, step: 6990/1, total step: 6990, (loss_avg_rank: 0.334), (loss_avg_epoch: 0.409), (ppl_avg_epoch: 1.506), (acc_avg_epoch: 0.795), (lr: 1.165e-04), [('loss_att', 0.285), ('acc', 0.823), ('loss_pre', 0.046), ('loss', 0.331), ('batch_size', 36)], {'data_load': '0.000', 'forward_time': '0.334', 'backward_time': '0.536', 'optim_time': '0.077', 'total_time': '0.948'}, GPU, memory: usage: 3.943 GB, peak: 18.291 GB, cache: 19.619 GB, cache_peak: 19.619 GB
+```
+Metric explanations:
+- `rank`: gpu id.
+- `epoch`,`step`,`total step`: the current epoch, step, and total step.
+- `loss_avg_rank`: average loss over all gpus at the current step.
+- `loss/ppl/acc_avg_epoch`: the average loss/ppl/acc over the current epoch, up to the current step. The last step at the end of an epoch gives the overall average for the epoch; acc is the recommended metric.
+- `lr`: learning rate at the current step.
+- `[('loss_att', 0.259), ('acc', 0.825), ('loss_pre', 0.04), ('loss', 0.299), ('batch_size', 40)]`: the detailed data of the current gpu id.
+- `total_time`: total time taken by a single step.
+- `GPU, memory`: respectively: model used/peak GPU memory, and model+cache used/peak GPU memory.
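For quick monitoring without tensorboard, the parenthesized `(name: value)` pairs in the sample log lines above can be scraped with a regular expression; a minimal sketch (the field names follow the sample log shown above):

```python
import re

# a shortened copy of the sample log line shown above
LOG_LINE = (
    "[2024-03-21 15:55:52,137][root][INFO] - train, rank: 3, epoch: 0/50, "
    "step: 6990/1, total step: 6990, (loss_avg_rank: 0.327), "
    "(loss_avg_epoch: 0.409), (ppl_avg_epoch: 1.506), (acc_avg_epoch: 0.795), "
    "(lr: 1.165e-04)"
)

def parse_metrics(line):
    """Extract the parenthesized (name: value) pairs as a dict of floats."""
    return {
        name: float(value)
        for name, value in re.findall(r"\((\w+): ([0-9.e+-]+)\)", line)
    }

metrics = parse_metrics(LOG_LINE)
print(metrics["acc_avg_epoch"])  # 0.795
```

Tailing `log.txt` through such a parser gives a cheap per-step view of `acc_avg_epoch`, the metric recommended above for model selection.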
+
+##### Tensorboard visualization
+```bash
+tensorboard --logdir /xxxx/FunASR/examples/industrial_data_pretraining/paraformer/outputs/log/tensorboard
+```
+Open in a browser: http://localhost:6006/
diff --git a/examples/README_zh.md b/examples/README_zh.md
new file mode 120000
index 0000000..a2059ae
--- /dev/null
+++ b/examples/README_zh.md
@@ -0,0 +1 @@
+../docs/tutorial/README_zh.md
\ No newline at end of file
diff --git a/examples/industrial_data_pretraining/paraformer/README_zh.md b/examples/industrial_data_pretraining/paraformer/README_zh.md
index 38a4455..97bce71 100644
--- a/examples/industrial_data_pretraining/paraformer/README_zh.md
+++ b/examples/industrial_data_pretraining/paraformer/README_zh.md
@@ -1,9 +1,7 @@
 (Simplified Chinese|[English](./README.md))
 
-# 璇煶璇嗗埆
+FunASR寮�婧愪簡澶ч噺鍦ㄥ伐涓氭暟鎹笂棰勮缁冩ā鍨嬶紝鎮ㄥ彲浠ュ湪[妯″瀷璁稿彲鍗忚](../../MODEL_LICENSE)涓嬭嚜鐢变娇鐢ㄣ�佸鍒躲�佷慨鏀瑰拰鍒嗕韩FunASR妯″瀷锛屼笅闈㈠垪涓句唬琛ㄦ�х殑妯″瀷锛屾洿澶氭ā鍨嬭鍙傝�� [妯″瀷浠撳簱](../../model_zoo)銆�
 
-> **娉ㄦ剰**:
-> pipeline 鏀寔 [modelscope妯″瀷浠撳簱](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope) 涓殑鎵�鏈夋ā鍨嬭繘琛屾帹鐞嗗拰寰皟銆傝繖閲屾垜浠互鍏稿瀷妯″瀷浣滀负绀轰緥鏉ユ紨绀轰娇鐢ㄦ柟娉曘��
 
 ## Inference
 
@@ -14,18 +12,25 @@
 
 model = AutoModel(model="/Users/zhifu/Downloads/modelscope_models/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
 
-res = model(input="/Users/zhifu/Downloads/modelscope_models/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav")
+res = model.generate(input="/Users/zhifu/Downloads/modelscope_models/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav")
 print(res)
 ```
 
-### API Description
+### Detailed Usage
+```python
+model = AutoModel(model=[str], device=[str], ncpu=[int], output_dir=[str], batch_size=[int], **kwargs)
+```
 #### AutoModel Definition
-- `model`: model name in the [Model Zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
-- `device`: `cuda` (default), use GPU for inference. If set to `cpu`, inference runs on CPU
-- `ncpu`: `None` (default), number of threads used for intra-op parallelism on CPU
-- `output_dir`: `None` (default); if set, the output path for the results
-- `batch_size`: `1` (default), batch size during decoding
+- `model`(str): model name in the [Model Zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `device`(str): `cuda:0` (default, gpu0), use GPU for inference. If set to `cpu`, inference runs on CPU
+- `ncpu`(int): `4` (default), number of threads used for intra-op parallelism on CPU
+- `output_dir`(str): `None` (default); if set, the output path for the results
+- `batch_size`(int): `1` (default), batch size during decoding
+- `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, e.g., the maximum segment length of the vad model: `max_single_segment_time=6000` (ms).
 #### AutoModel Inference
+```python
+res = model.generate(input=[str], output_dir=[str])
+```
 - `input`: the input to decode, which can be:
   - a wav file path, e.g.: asr_example.wav
   - a pcm file path, e.g.: asr_example.pcm; in this case the audio sample rate `fs` needs to be specified (default 16000)
@@ -40,56 +45,142 @@
   ```[audio_sample1, audio_sample2, ..., audio_sampleN]```
   - fbank input, batch supported. Shape is [batch, frames, dim], type torch.Tensor
 - `output_dir`: `None` (default); if set, the output path for the results
+- `**kwargs`(dict): inference parameters related to the model, e.g., `beam_size=10`, `decoding_ctc_weight=0.1`.
+
+### ONNX and Libtorch Export
+
+```python
+res = model.export(type="onnx", quantize=True)
+```
+- `type`(str): `onnx` (default), export in onnx format; `torch` exports in libtorch format.
+- `quantize`(bool): `False` (default), whether to quantize.
 
 
 ## Finetune
+
+### Quick Start
+```shell
+cd examples/industrial_data_pretraining/paraformer
+bash finetune.sh
+# "log_file: ./outputs/log.txt"
+```
+For the complete script, refer to [finetune.sh](../../examples/industrial_data_pretraining/paraformer/finetune.sh)
+
+### Detailed Parameters
+
+```shell
+funasr/bin/train.py \
+++model="${model_name_or_model_dir}" \
+++model_revision="${model_revision}" \
+++train_data_set_list="${train_data}" \
+++valid_data_set_list="${val_data}" \
+++dataset_conf.batch_size=20000 \
+++dataset_conf.batch_type="token" \
+++dataset_conf.num_workers=4 \
+++train_conf.max_epoch=50 \
+++train_conf.log_interval=1 \
+++train_conf.resume=false \
+++train_conf.validate_interval=2000 \
+++train_conf.save_checkpoint_interval=2000 \
+++train_conf.keep_nbest_models=20 \
+++optim_conf.lr=0.0002 \
+++output_dir="${output_dir}" &> ${log_file}
+```
+
+- `model` (str): model name (the ID in the model zoo), in which case the script automatically downloads the model locally; or the path of a model already downloaded to local disk.
+- `model_revision` (str): when `model` is a model name, the version of the model to download.
+- `train_data_set_list` (str): training data path, jsonl format by default; for details see the ([examples](../../data/list)).
+- `valid_data_set_list` (str): validation data path, jsonl format by default; for details see the ([examples](../../data/list)).
+- `dataset_conf.batch_type` (str): `example` (default), the batch type. `example` means each batch contains a fixed number (batch_size) of samples; `length` or `token` means dynamic batching, where the total length or token count of a batch equals batch_size.
+- `dataset_conf.batch_size` (int): used together with `batch_type`. When `batch_type=example`, it is the number of samples per batch; when `batch_type=length`, it is the total length of the samples in a batch, in fbank frames (1 frame = 10 ms) or in number of text tokens.
+- `train_conf.max_epoch` (int): total number of training epochs.
+- `train_conf.log_interval` (int): number of steps between log prints.
+- `train_conf.resume` (bool): whether to enable resuming training from a checkpoint.
+- `train_conf.validate_interval` (int): interval in steps for running validation during training.
+- `train_conf.save_checkpoint_interval` (int): interval in steps for saving the model during training.
+- `train_conf.keep_nbest_models` (int): keep at most this many model checkpoints, ranked by validation-set acc from high to low.
+- `optim_conf.lr` (float): learning rate.
+- `output_dir` (str): path for saving the model.
+- `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, e.g., to filter out audio longer than 20 s: `dataset_conf.max_token_length=2000`, in fbank frames (1 frame = 10 ms) or in number of text tokens.
+
+#### Multi-GPU Training
+##### Single-machine multi-GPU training
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 1 --nproc_per_node ${gpu_num} \
+../../../funasr/bin/train.py ${train_args}
+```
+--nnodes is the total number of participating nodes; --nproc_per_node is the number of processes run on each node
+
+##### Multi-machine multi-GPU training
+
+On the master node, assuming the IP is 192.168.1.1, the port is 12345, and 2 GPUs are in use, run the following command:
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 2 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+../../../funasr/bin/train.py ${train_args}
+```
+On the worker node (assuming the IP is 192.168.1.2), make sure the MASTER_ADDR and MASTER_PORT environment variables match those set on the master node, then run the same command:
+```shell
+export CUDA_VISIBLE_DEVICES="0,1"
+gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
+
+torchrun --nnodes 2 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+../../../funasr/bin/train.py ${train_args}
+```
+
+--nnodes is the total number of participating nodes; --nproc_per_node is the number of processes run on each node
 
 #### Prepare Data
 
 `train_text.txt`
 
 The left column is the unique data ID, which must correspond one-to-one with the `ID` in `train_wav.scp`;
-the right column is the transcript of the audio file
+the right column is the transcript of the audio file, formatted as follows:
 
 ```bash
 ID0012W0013 当客户风险承受能力评估依据发生变化时
-ID0012W0014 杨涛不得不将工厂关掉
+ID0012W0014 所有只要处理 data 不管你是做 machine learning 做 deep learning
+ID0012W0015 he tried to think how it could be
 ```
 
 
 `train_wav.scp`
 
 The left column is the unique data ID, which must correspond one-to-one with the `ID` in `train_text.txt`;
-the right column is the absolute path of the audio file
+the right column is the path of the audio file, formatted as follows
 
 ```bash
-ID0012W0013 /Users/zhifu/funasr_github/test_local/aishell2_dev_ios/wav/D0012/ID0012W0013.wav
-ID0012W0014 /Users/zhifu/funasr_github/test_local/aishell2_dev_ios/wav/D0012/ID0012W0014.wav
+BAC009S0764W0121 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/BAC009S0764W0121.wav
+BAC009S0916W0489 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/BAC009S0916W0489.wav
+ID0012W0015 https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_cn_en.wav
 ```
 
-#### Training
 
-```bash
-cd examples/industrial_data_pretraining/paraformer
-sh finetune_from_local.sh
+#### View Training Logs
+
+##### View the experiment log
+```shell
+tail log.txt
+[2024-03-21 15:55:52,137][root][INFO] - train, rank: 3, epoch: 0/50, step: 6990/1, total step: 6990, (loss_avg_rank: 0.327), (loss_avg_epoch: 0.409), (ppl_avg_epoch: 1.506), (acc_avg_epoch: 0.795), (lr: 1.165e-04), [('loss_att', 0.259), ('acc', 0.825), ('loss_pre', 0.04), ('loss', 0.299), ('batch_size', 40)], {'data_load': '0.000', 'forward_time': '0.315', 'backward_time': '0.555', 'optim_time': '0.076', 'total_time': '0.947'}, GPU, memory: usage: 3.830 GB, peak: 18.357 GB, cache: 20.910 GB, cache_peak: 20.910 GB
+[2024-03-21 15:55:52,139][root][INFO] - train, rank: 1, epoch: 0/50, step: 6990/1, total step: 6990, (loss_avg_rank: 0.334), (loss_avg_epoch: 0.409), (ppl_avg_epoch: 1.506), (acc_avg_epoch: 0.795), (lr: 1.165e-04), [('loss_att', 0.285), ('acc', 0.823), ('loss_pre', 0.046), ('loss', 0.331), ('batch_size', 36)], {'data_load': '0.000', 'forward_time': '0.334', 'backward_time': '0.536', 'optim_time': '0.077', 'total_time': '0.948'}, GPU, memory: usage: 3.943 GB, peak: 18.291 GB, cache: 19.619 GB, cache_peak: 19.619 GB
 ```
+Metric explanations:
+- `rank`: gpu id.
+- `epoch`,`step`,`total step`: the current epoch, step, and total step.
+- `loss_avg_rank`: average loss over all gpus at the current step.
+- `loss/ppl/acc_avg_epoch`: the average loss/ppl/acc over the current epoch, up to the current step. The last step at the end of an epoch gives the overall average for the epoch; acc is the recommended metric.
+- `lr`: learning rate at the current step.
+- `[('loss_att', 0.259), ('acc', 0.825), ('loss_pre', 0.04), ('loss', 0.299), ('batch_size', 40)]`: the detailed data of the current gpu id.
+- `total_time`: total time taken by a single step.
+- `GPU, memory`: respectively: model used/peak GPU memory, and model+cache used/peak GPU memory.
 
-**View the training log**
-
+##### Tensorboard visualization
 ```bash
 tensorboard --logdir /xxxx/FunASR/examples/industrial_data_pretraining/paraformer/outputs/log/tensorboard
 ```
-
-
-## Export onnx
-
-```python
-from funasr import AutoModel
-wav_file = "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav"
-
-model = AutoModel(model="iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
-                  model_revision="v2.0.4")
-
-res = model.export(input=wav_file, type="onnx", quantize=False)
-print(res)
-```
\ No newline at end of file
+Open in a browser: http://localhost:6006/

--
Gitblit v1.9.1