From 790043364026835b0d834b165b1a65f7323cb6f1 Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Wed, 16 Oct 2024 14:31:31 +0800
Subject: [PATCH] funasr tables

---
 docs/tutorial/README_zh.md |   85 ++++++++++++++++++++++++++++++++++++------
 1 files changed, 72 insertions(+), 13 deletions(-)
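
Note on the registration section added by this patch: the `@tables.register("model_classes", "MinMo_S2T")` decorator together with the `model: MinMo_S2T` config entry suggests a name-to-class lookup. The sketch below illustrates that general pattern in plain Python. It is not FunASR's implementation; `Registry`, `ToyModel`, and `build_from_config` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict


class Registry:
    """Hypothetical name -> class lookup table, grouped by category."""

    def __init__(self):
        self._tables = defaultdict(dict)

    def register(self, category, name):
        """Return a decorator that stores the class under (category, name)."""
        def decorator(cls):
            if name in self._tables[category]:
                raise KeyError(f"{name!r} is already registered in {category!r}")
            self._tables[category][name] = cls
            return cls  # hand the class back unchanged
        return decorator

    def get(self, category, name):
        return self._tables[category][name]


tables = Registry()  # stand-in for illustration, not funasr.register.tables


@tables.register("model_classes", "ToyModel")
class ToyModel:
    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


def build_from_config(config):
    """Instantiate the class named by config["model"] with config["model_conf"]."""
    cls = tables.get("model_classes", config["model"])
    return cls(**config.get("model_conf", {}))


model = build_from_config({"model": "ToyModel", "model_conf": {"hidden_size": 16}})
print(type(model).__name__, model.hidden_size)
```

Because the decorator returns the class unchanged, a registered class can still be imported and instantiated directly; the registry only adds lookup by name.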

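Note on the `train_conf` options documented in this patch: `avg_keep_nbest_models_type`, `keep_nbest_models`, and `avg_nbest_model` describe ranking checkpoints by validation acc or loss and averaging the best n. A minimal sketch of that idea follows, assuming each checkpoint is a plain dict with scalar parameters. `select_nbest`, `average_params`, and the toy data are hypothetical, and the actual FunASR trainer may differ.

```python
def select_nbest(checkpoints, n, metric="acc"):
    """Pick the n best checkpoints: highest acc first, or lowest loss first."""
    if metric not in ("acc", "loss"):
        raise ValueError(f"unsupported metric: {metric!r}")
    ranked = sorted(checkpoints, key=lambda c: c[metric], reverse=(metric == "acc"))
    return ranked[:n]


def average_params(checkpoints):
    """Element-wise mean of the parameter values across checkpoints."""
    names = checkpoints[0]["params"].keys()
    return {
        name: sum(c["params"][name] for c in checkpoints) / len(checkpoints)
        for name in names
    }


# Toy checkpoints with a single scalar weight each (hypothetical data).
checkpoints = [
    {"step": 2000, "acc": 0.80, "loss": 0.90, "params": {"w": 1.0}},
    {"step": 4000, "acc": 0.90, "loss": 0.50, "params": {"w": 2.0}},
    {"step": 6000, "acc": 0.85, "loss": 0.40, "params": {"w": 4.0}},
]

best_by_acc = select_nbest(checkpoints, n=2, metric="acc")
best_by_loss = select_nbest(checkpoints, n=2, metric="loss")
print([c["step"] for c in best_by_acc])
print(average_params(best_by_acc))
```

In this sketch, keeping only the n best checkpoints and averaging the n best are two separate steps, which mirrors the two separate options in the parameter list.
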
diff --git a/docs/tutorial/README_zh.md b/docs/tutorial/README_zh.md
index fa85290..85c1950 100644
--- a/docs/tutorial/README_zh.md
+++ b/docs/tutorial/README_zh.md
@@ -7,6 +7,7 @@
  <a href="#模型推理"> Model Inference </a>   
 ｜<a href="#模型训练与测试"> Model Training and Testing </a>
 ｜<a href="#模型导出与测试"> Model Export and Testing </a>
+｜<a href="#新模型注册教程"> New Model Registration Tutorial </a>
 </h4>
 </div>
 
@@ -131,7 +132,7 @@
 
 model = AutoModel(model="fsmn-vad")
 
-wav_file = f"{model.model_path}/example/asr_example.wav"
+wav_file = f"{model.model_path}/example/vad_example.wav"
 res = model.generate(input=wav_file)
 print(res)
 ```
@@ -225,7 +226,7 @@
 ++train_conf.validate_interval=2000 \
 ++train_conf.save_checkpoint_interval=2000 \
 ++train_conf.keep_nbest_models=20 \
-++train_conf.avg_nbest_model=5 \
+++train_conf.avg_nbest_model=10 \
 ++optim_conf.lr=0.0002 \
 ++output_dir="${output_dir}" &> ${log_file}
 ```
@@ -235,13 +236,17 @@
 - `valid_data_set_list` (str): path to the validation data, in jsonl format by default; see the [example](https://github.com/alibaba-damo-academy/FunASR/blob/main/data/list).
 - `dataset_conf.batch_type` (str): `example` (default), the batch type. `example` builds each batch from a fixed number of `batch_size` samples; `length` or `token` batches dynamically, so that the total length or token count of the batch equals `batch_size`.
 - `dataset_conf.batch_size` (int): used together with `batch_type`. When `batch_type=example`, it is the number of samples; when `batch_type=length`, it is the total sample length, in fbank frames (1 frame = 10 ms) or text tokens.
-- `train_conf.max_epoch` (int): total number of training epochs.
-- `train_conf.log_interval` (int): number of steps between log messages.
-- `train_conf.resume` (int): whether to resume training from a checkpoint.
-- `train_conf.validate_interval` (int): number of steps between validation runs during training.
-- `train_conf.save_checkpoint_interval` (int): number of steps between checkpoint saves during training.
-- `train_conf.keep_nbest_models` (int): maximum number of model checkpoints to keep, ranked by validation-set acc from highest to lowest.
-- `train_conf.avg_nbest_model` (int): average the n models with the highest acc.
+- `train_conf.max_epoch` (int): `100` (default), total number of training epochs.
+- `train_conf.log_interval` (int): `50` (default), number of steps between log messages.
+- `train_conf.resume` (bool): `True` (default), whether to resume training from a checkpoint.
+- `train_conf.validate_interval` (int): `5000` (default), number of steps between validation runs during training.
+- `train_conf.save_checkpoint_interval` (int): `5000` (default), number of steps between checkpoint saves during training.
+- `train_conf.avg_keep_nbest_models_type` (str): `acc` (default), rank the n-best models by acc (higher is better). `loss` ranks the n-best models by loss (lower is better).
+- `train_conf.keep_nbest_models` (int): `500` (default), maximum number of model checkpoints to keep. Together with `avg_keep_nbest_models_type`, the best n models by validation-set acc/loss are kept and the rest are deleted to save storage space.
+- `train_conf.avg_nbest_model` (int): `10` (default), number of best models to average. Together with `avg_keep_nbest_models_type`, the best n models by validation-set acc/loss are averaged.
+- `train_conf.accum_grad` (int): `1` (default), gradient accumulation steps.
+- `train_conf.grad_clip` (float): `10.0` (default), gradient clipping threshold.
+- `train_conf.use_fp16` (bool): `False` (default), enable fp16 training to speed up training.
 - `optim_conf.lr` (float): learning rate.
 - `output_dir` (str): path where the model is saved.
 - `**kwargs` (dict): any parameter in `config.yaml` can be specified directly here. For example, to filter out audio longer than 20 s: `dataset_conf.max_token_length=2000`, measured in audio fbank frames (1 frame = 10 ms) or text tokens.
@@ -264,7 +269,7 @@
 export CUDA_VISIBLE_DEVICES="0,1"
 gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
 
-torchrun --nnodes 2 --node_rank 0 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+torchrun --nnodes 2 --node_rank 0 --nproc_per_node ${gpu_num} --master_addr 192.168.1.1 --master_port 12345 \
 ../../../funasr/bin/train.py ${train_args}
 ```
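 On the worker node (assume its IP is 192.168.1.2), make sure the MASTER_ADDR and MASTER_PORT environment variables match those set on the master node, then run the same command with `--node_rank 1`: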
@@ -272,7 +277,7 @@
 export CUDA_VISIBLE_DEVICES="0,1"
 gpu_num=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
 
-torchrun --nnodes 2 --node_rank 1 --nproc_per_node ${gpu_num} --master_addr=192.168.1.1 --master_port=12345 \
+torchrun --nnodes 2 --node_rank 1 --nproc_per_node ${gpu_num} --master_addr 192.168.1.1 --master_port 12345 \
 ../../../funasr/bin/train.py ${train_args}
 ```
 
@@ -355,7 +360,7 @@
 
 #### With configuration.json
 
-Assume the trained model is at ./model_dir. If configuration.json was generated in taht directory, simply replace the model name in the [model inference method above](https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/README_zh.md#%E6%A8%A1%E5%9E%8B%E6%8E%A8%E7%90%86) with the model path.
+Assume the trained model is at ./model_dir. If configuration.json was generated in that directory, simply replace the model name in the [model inference method above](https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/README_zh.md#%E6%A8%A1%E5%9E%8B%E6%8E%A8%E7%90%86) with the model path.
 
 For example:
 
@@ -429,4 +434,58 @@
 print(result)
 ```
 
-鏇村渚嬪瓙璇峰弬鑰� [鏍蜂緥](https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/onnxruntime)
\ No newline at end of file
+鏇村渚嬪瓙璇峰弬鑰� [鏍蜂緥](https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/onnxruntime)
+
+<a name="新模型注册教程"></a>
+## New Model Registration Tutorial
+
+
+### Viewing the Registry
+
+```python
+from funasr.register import tables
+
+tables.print()
+```
+
+鏀寔鏌ョ湅鎸囧畾绫诲瀷鐨勬敞鍐岃〃锛歚tables.print("model")`
+
+
+### Registering a New Model
+
+```python
+from funasr.register import tables
+from torch import nn
+
+@tables.register("model_classes", "MinMo_S2T")
+class MinMo_S2T(nn.Module):
+  def __init__(self, *args, **kwargs):
+    super().__init__()
+    ...  # build the model's submodules here
+
+  def forward(self, **kwargs):
+    ...  # compute the forward pass
+
+  def inference(
+      self,
+      data_in,
+      data_lengths=None,
+      key: list = None,
+      tokenizer=None,
+      frontend=None,
+      **kwargs,
+  ):
+    ...  # decoding logic
+
+```
+
+Then specify the newly registered model in config.yaml:
+
+```yaml
+model: MinMo_S2T
+model_conf:
+  ...
+```
+
+
+[More detailed tutorial documentation](https://github.com/alibaba-damo-academy/FunASR/docs/tutorial/Tables_zh.md)
\ No newline at end of file

--
Gitblit v1.9.1