From 8c7b7e5feb68fda1fc4ddd627bad0f915358149e Mon Sep 17 00:00:00 2001
From: Zhanzhao (Deo) Liang <liangzhanzhao1985@gmail.com>
Date: Wed, 25 Dec 2024 16:40:29 +0800
Subject: [PATCH] fix export_meta import of sense voice (#2334)
---
docs/tutorial/Tables_zh.md | 143 +++++++++++++++++++++++++++--------------------
1 files changed, 83 insertions(+), 60 deletions(-)
diff --git a/docs/tutorial/Tables_zh.md b/docs/tutorial/Tables_zh.md
index 9f616cf..e9360e0 100644
--- a/docs/tutorial/Tables_zh.md
+++ b/docs/tutorial/Tables_zh.md
@@ -1,39 +1,21 @@
-# FunASR-1.x.x Model Registration Tutorial
+# FunASR-1.x.x New Model Registration Tutorial
-The 1.0 release was designed to **make model integration simpler**; its core features are the registry and AutoModel:
+(简体中文|[English](./Tables.md))
+
+funasr-1.x.x was designed to **make model integration simpler**; its core features are the registry and AutoModel:
* The registry lets developers plug in models like building blocks and supports multiple tasks;
-
-* The redesigned AutoModel interface unifies the inference and training interfaces of modelscope, huggingface, and funasr, and lets you choose which hub to download from;
-
-* Supports model export, demo-level service deployment, and industrial-grade high-concurrency service deployment;
-
-* Unifies inference and training scripts for academic and industrial models;
-
-
+* The redesigned AutoModel interface unifies the inference and training interfaces of modelscope, huggingface, and funasr, and lets you choose which hub to download from;
+
+* Supports model export, demo-level service deployment, and industrial-grade high-concurrency service deployment;
+
+* Unifies inference and training scripts for academic and industrial models;
+
# Quick Start
## Usage with AutoModel
-
-### Paraformer model
-
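-Takes speech of arbitrary duration as input and outputs the corresponding text, with punctuation and sentence segmentation, character-level timestamps, and speaker identity.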
-
-```python
-from funasr import AutoModel
-
-model = AutoModel(model="paraformer-zh",
- vad_model="fsmn-vad",
- vad_kwargs={"max_single_segment_time": 60000},
- punc_model="ct-punc",
- # spk_model="cam++"
- )
-wav_file = f"{model.model_path}/example/asr_example.wav"
-res = model.generate(input=wav_file, batch_size_s=300, batch_size_threshold_s=60, hotword='魔搭')
-print(res)
-```
### SenseVoiceSmall model
@@ -69,19 +51,19 @@
```
* `model`(str): a model name from the [model zoo](https://github.com/alibaba-damo-academy/FunASR/tree/main/model_zoo), or a path to a model on local disk
-
+
* `device`(str): `cuda:0` (default, gpu0); runs inference on the specified GPU. If set to `cpu`, inference runs on the CPU
-
+
* `ncpu`(int): `4` (default); the number of threads used for intra-op parallelism on the CPU
-
+
* `output_dir`(str): `None` (default); if set, the path where results are written
-
+
* `batch_size`(int): `1` (default); the batch size during decoding, in number of samples
-
+
* `hub`(str): `ms` (default) downloads the model from modelscope; `hf` downloads it from huggingface.
-
+
* `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, for example the maximum segment length of the vad model, `max_single_segment_time=6000` (milliseconds).
-
+
#### AutoModel inference
@@ -90,13 +72,13 @@
```
* * wav file path, e.g. asr\_example.wav
-
+
* pcm file path, e.g. asr\_example.pcm; in this case the audio sampling rate fs must be specified (default 16000)
-
+
* audio byte stream, e.g. byte data from a microphone
-
+
* wav.scp, a kaldi-style wav list (`wav_id \t wav_path`), e.g.:
-
+
```plaintext
asr_example1 ./audios/asr_example1.wav
@@ -107,23 +89,27 @@
With this kind of input
* audio samples, e.g. `audio, rate = soundfile.read("asr_example_zh.wav")`, of type numpy.ndarray. Batch input is supported as a list: `[audio_sample1, audio_sample2, ..., audio_sampleN]`
-
+
* fbank input, with batching supported. The shape is \[batch, frames, dim\] and the type is torch.Tensor, e.g.
-
+
* `output_dir`: None (default); if set, the path where results are written
-
+
* `**kwargs`(dict): model-specific inference parameters, e.g. `beam_size=10`, `decoding_ctc_weight=0.1`.
-
+
Detailed documentation: [https://github.com/modelscope/FunASR/blob/main/examples/README\_zh.md](https://github.com/modelscope/FunASR/blob/main/examples/README_zh.md)
# Registry in Detail
+Taking the SenseVoiceSmall model as an example, this section explains how to register a new model. Model links:
+
+**modelscope:** [https://www.modelscope.cn/models/iic/SenseVoiceSmall/files](https://www.modelscope.cn/models/iic/SenseVoiceSmall/files)
+
+**huggingface:** [https://huggingface.co/FunAudioLLM/SenseVoiceSmall](https://huggingface.co/FunAudioLLM/SenseVoiceSmall)
+
## Model resource directory

-
-**Model link:** [https://www.modelscope.cn/models/iic/SenseVoiceSmall/files](https://www.modelscope.cn/models/iic/SenseVoiceSmall/files)
**Configuration file**: config.yaml
@@ -142,7 +128,7 @@
pos_enc_class: SinusoidalPositionEncoder
normalize_before: true
kernel_size: 11
- sanm_shfit: 0
+ sanm_shift: 0
selfattention_layer_type: sanm
@@ -213,7 +199,7 @@
**Model parameters**: model.pt
-**Path resolution**: configuration.json
+**Path resolution**: configuration.json (optional)
```json
{
@@ -222,19 +208,31 @@
"model": {"type" : "funasr"},
"pipeline": {"type":"funasr-pipeline"},
"model_name_in_hub": {
- "ms":"",
+ "ms":"",
"hf":""},
"file_path_metas": {
- "init_param":"model.pt",
+ "init_param":"model.pt",
"config":"config.yaml",
"tokenizer_conf": {"bpemodel": "chn_jpn_yue_eng_ko_spectok.bpe.model"},
"frontend_conf":{"cmvn_file": "am.mvn"}}
}
```
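-The content can be reused as is; just copy it. Note that every entry under `file_path_metas` is automatically joined with the model resource path and overrides the path of the same field in `config.yaml`.
+configuration.json prepends the model root directory to each item in file\_path\_metas so that the paths resolve correctly. Taking the above as an example, and assuming the model root directory is /home/zhifu.gzf/init\_model/SenseVoiceSmall, the relevant paths in config.yaml in that directory are replaced with the correct paths (unrelated settings omitted):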
+
+```yaml
+init_param: /home/zhifu.gzf/init_model/SenseVoiceSmall/model.pt
+
+tokenizer_conf:
+ bpemodel: /home/zhifu.gzf/init_model/SenseVoiceSmall/chn_jpn_yue_eng_ko_spectok.bpe.model
+
+frontend_conf:
+ cmvn_file: /home/zhifu.gzf/init_model/SenseVoiceSmall/am.mvn
+```
## Registry
+
+
### Viewing the registry
@@ -244,7 +242,24 @@
tables.print()
```
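-You can also view the registry for a specific type, for example only the registered `model` classes: `tables.print("model")`
+You can also view the registry for a specific type: \`tables.print("model")\`. The models currently registered in funasr are shown in the figure above. The following categories are currently predefined: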
+
+```python
+ model_classes = {}
+ frontend_classes = {}
+ specaug_classes = {}
+ normalize_classes = {}
+ encoder_classes = {}
+ decoder_classes = {}
+ joint_network_classes = {}
+ predictor_classes = {}
+ stride_conv_classes = {}
+ tokenizer_classes = {}
+ dataloader_classes = {}
+ batch_sampler_classes = {}
+ dataset_classes = {}
+ index_ds_classes = {}
+```
### Registering a model
@@ -259,7 +274,7 @@
def forward(
self,
**kwargs,
- ):
+ ):
def inference(
self,
@@ -274,7 +289,15 @@
```
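-Add `@tables.register("model_classes","SenseVoiceSmall")` before the name of the class to be registered to complete registration. The class must implement the __init__, forward, and inference methods.
+Add @tables.register("model\_classes", "SenseVoiceSmall") before the name of the class to be registered to complete registration. The class must implement the \_\_init\_\_, forward, and inference methods.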
+
+Usage of register:
+
+```python
+@tables.register("registration category", "registration name")
+```
+
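+Here, "registration category" can be one of the predefined categories (see the figure above); if it is a new category of your own, it is automatically added to the registry's categories. "registration name" is the name you want to register under, which can then be used directly.
Full code: [https://github.com/modelscope/FunASR/blob/main/funasr/models/sense\_voice/model.py#L443](https://github.com/modelscope/FunASR/blob/main/funasr/models/sense_voice/model.py#L443)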
@@ -286,9 +309,9 @@
...
```
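-## Registration failure
+### Registration failure
-If a registered model or registered function cannot be found, `assert model_class is not None, f'{kwargs["model"]} is not registered'` fails. Models are registered by importing the model file, so you can import it to see the specific cause of the failure. For example, the model file above is funasr/models/sense_voice/model.py:
+If a registered model or method cannot be found, assert model\_class is not None, f'{kwargs\["model"\]} is not registered' fails. Models are registered by importing the model file, so you can import it to see the specific cause of the failure. For example, the model file above is funasr/models/sense\_voice/model.py: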
```python
from funasr.models.sense_voice.model import *
@@ -297,9 +320,9 @@
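## Registration principles
* Model: models are independent of each other. Every model needs its own new model directory under funasr/models/. Do not use class inheritance!!! Do not import from other model directories; put everything you need in your own model directory!!! Do not modify existing model code!!!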
-
+
* dataset, frontend, tokenizer: reuse existing ones directly if possible; if not, register a new one and modify that. Do not modify the originals!!!
-
+
# Standalone repository
@@ -313,8 +336,8 @@
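# trust_remote_code: `True` means the model implementation is loaded from `remote_code`; `remote_code` specifies the location of the `model` code (for example, `model.py` in the current directory). Absolute paths, relative paths, and network URLs are supported.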
model = AutoModel(
model="iic/SenseVoiceSmall",
- trust_remote_code=True,
- remote_code="./model.py",
+ trust_remote_code=True,
+ remote_code="./model.py",
)
```
@@ -337,4 +360,4 @@
print(text)
```
-Fine-tuning reference: [https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh](https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh)
\ No newline at end of file
+Fine-tuning reference: [https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh](https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh)
--
Gitblit v1.9.1