From 8c7b7e5feb68fda1fc4ddd627bad0f915358149e Mon Sep 17 00:00:00 2001
From: Zhanzhao (Deo) Liang <liangzhanzhao1985@gmail.com>
Date: Wed, 25 Dec 2024 16:40:29 +0800
Subject: [PATCH] fix export_meta import of sense voice (#2334)
---
docs/tutorial/Tables.md | 70 ++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 34 deletions(-)
diff --git a/docs/tutorial/Tables.md b/docs/tutorial/Tables.md
index 3dfaa73..831a3ac 100644
--- a/docs/tutorial/Tables.md
+++ b/docs/tutorial/Tables.md
@@ -1,15 +1,17 @@
-# FunASR-1.x.x Registration Tutorial
+# FunASR-1.x.x New Model Registration Tutorial
+
([简体中文](./Tables_zh.md)|English)
The original intention of the funasr-1.x.x version is to make model integration easier. The core features are the registry and AutoModel:
* The introduction of the registry enables building-block-style model integration, compatible with a variety of tasks;
-
+
* The newly designed AutoModel interface unifies the modelscope, huggingface, and funasr inference and training interfaces, and supports free downloading of repositories;
-
+
* Support for model export, demo-level service deployment, and industrial-grade multi-concurrency service deployment;
-
+
* Unified academic and industrial model inference and training scripts;
-
+
# Quick to get started
@@ -49,19 +51,19 @@
```
* `model`(str): the model name in the [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/tree/main/model_zoo), or a model path on the local disk
-
+
* `device`(str): `cuda:0` (default, gpu0), use the GPU for inference. If set to `cpu`, the CPU is used for inference
-
+
* `ncpu`(int): `4` (default), sets the number of threads used for CPU-internal operator parallelism
-
+
* `output_dir`(str): `None` (default); if set, the output path for the results
-
+
* `batch_size`(int): `1` (default), the number of samples per batch during decoding
-
+
* `hub`(str): `ms` (default), download the model from modelscope. If `hf`, download the model from huggingface.
-
+
* `**kwargs`(dict): any parameter in `config.yaml` can be specified directly here, for example the maximum segment length of the vad model, `max_single_segment_time=6000` (milliseconds).
-
+
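The `**kwargs` override behavior described above can be pictured as a plain dictionary merge. The sketch below is a minimal illustration, not FunASR's actual merging code, and the `config.yaml` default values shown are hypothetical:

```python
# Illustrative sketch: call-site **kwargs take precedence over config.yaml
# defaults. Not FunASR's actual implementation; values are hypothetical.

def merge_config(config_defaults: dict, **kwargs) -> dict:
    """Return a copy of the config with call-site kwargs taking precedence."""
    merged = dict(config_defaults)
    merged.update(kwargs)
    return merged

# Defaults as they might appear in a vad model's config.yaml (hypothetical).
config_yaml = {"max_single_segment_time": 60000, "batch_size": 1}

# The caller overrides the maximum segment length to 6000 ms, as in the text.
effective = merge_config(config_yaml, max_single_segment_time=6000)
print(effective["max_single_segment_time"])  # 6000
```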
#### AutoModel inference
@@ -70,13 +72,13 @@
```
* wav file path, for example: asr\_example.wav
-
+
* pcm file path, for example: asr\_example.pcm; you need to specify the audio sampling rate fs (default 16000)
-
+
* Audio byte stream, for example: microphone byte data
-
+
* wav.scp, kaldi-style wav list (`wav_id \t wav_path`), for example:
-
+
```plaintext
asr_example1  ./audios/asr_example1.wav
@@ -87,13 +89,13 @@
The following inputs are also supported:
* Audio samples, for example: `audio, rate = soundfile.read("asr_example_zh.wav")`, where audio is a numpy.ndarray. Batch input is supported; the type is list: `[audio_sample1, audio_sample2, ..., audio_sampleN]`
-
+
* fbank input, batch input supported; shape is \[batch, frames, dim\], type is torch.Tensor, for example
-
+
* `output_dir`: None (default); if set, the output path for the results
-
+
* `**kwargs`(dict): model-related inference parameters, e.g., `beam_size=10`, `decoding_ctc_weight=0.1`.
-
+
Detailed documentation link: [https://github.com/modelscope/FunASR/blob/main/examples/README\_zh.md](https://github.com/modelscope/FunASR/blob/main/examples/README_zh.md)
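The kaldi-style wav.scp input described above can be parsed with a few lines of plain Python. This is an illustrative sketch of the format only, not FunASR's internal loader:

```python
# Illustrative sketch: parse a kaldi-style wav.scp (wav_id \t wav_path)
# into an id -> path mapping. Not FunASR's internal implementation.

def parse_wav_scp(lines):
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        wav_id, wav_path = line.split(maxsplit=1)
        entries[wav_id] = wav_path
    return entries

scp_lines = [
    "asr_example1\t./audios/asr_example1.wav",
    "asr_example2\t./audios/asr_example2.wav",
]
print(parse_wav_scp(scp_lines)["asr_example1"])  # ./audios/asr_example1.wav
```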
@@ -126,7 +128,7 @@
pos_enc_class: SinusoidalPositionEncoder
normalize_before: true
kernel_size: 11
- sanm_shfit: 0
+ sanm_shift: 0
selfattention_layer_type: sanm
@@ -206,23 +208,23 @@
"model": {"type" : "funasr"},
"pipeline": {"type":"funasr-pipeline"},
"model_name_in_hub": {
- "ms":"",
+ "ms":"",
"hf":""},
"file_path_metas": {
- "init_param":"model.Pt"
-"Config": "config.yaml"
-Languagename_conf: {"bpemodel": "chn_jpn_yue_eng_spectok.bpe.Model"},
-"Frontend_conf":{"cmvn_file": "am.mvn"}}
+ "init_param":"model.pt",
+ "config":"config.yaml",
+ "tokenizer_conf": {"bpemodel": "chn_jpn_yue_eng_ko_spectok.bpe.model"},
+ "frontend_conf":{"cmvn_file": "am.mvn"}}
}
```
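Concretely, each entry in `file_path_metas` above is resolved by prepending the model root directory. The sketch below illustrates that resolution under that assumption; it is not ModelScope's actual loader code:

```python
import os

# Illustrative sketch: resolve file_path_metas entries against the model
# root directory. Not ModelScope's actual loader implementation.

def resolve_file_paths(meta, root):
    resolved = {}
    for key, value in meta.items():
        if isinstance(value, dict):
            # Nested sections such as tokenizer_conf / frontend_conf.
            resolved[key] = resolve_file_paths(value, root)
        elif isinstance(value, str) and value:
            resolved[key] = os.path.join(root, value)
        else:
            resolved[key] = value
    return resolved

file_path_metas = {
    "init_param": "model.pt",
    "config": "config.yaml",
    "tokenizer_conf": {"bpemodel": "chn_jpn_yue_eng_ko_spectok.bpe.model"},
    "frontend_conf": {"cmvn_file": "am.mvn"},
}
root = "/home/zhifu.gzf/init_model/SenseVoiceSmall"
print(resolve_file_paths(file_path_metas, root)["init_param"])
# /home/zhifu.gzf/init_model/SenseVoiceSmall/model.pt
```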
The function of configuration.json is to prepend the model root directory to each item in file\_path\_metas so that the paths can be parsed correctly. For example, assuming the model root directory is /home/zhifu.gzf/init\_model/SenseVoiceSmall, the relevant paths in that directory's config.yaml are replaced with the correct paths (irrelevant configuration omitted):
```yaml
-Init_param: /home/zhifu.gz F/init_model/sensevoicemail Mall/model.pt
+init_param: /home/zhifu.gzf/init_model/SenseVoiceSmall/model.pt
-Tokenizer_conf:
-Bmodeler: /home/Zhifu.gzf/init_model/SenseVoiceSmall/chn_jpn_yue_eng_ko_spectok.bpe.model
+tokenizer_conf:
+  bpemodel: /home/zhifu.gzf/init_model/SenseVoiceSmall/chn_jpn_yue_eng_ko_spectok.bpe.model
frontend_conf:
  cmvn_file: /home/zhifu.gzf/init_model/SenseVoiceSmall/am.mvn
@@ -272,7 +274,7 @@
def forward(
self,
**kwargs,
- ):
+ ):
def inference(
self,
@@ -318,9 +320,9 @@
## Principles聽of聽Registration
* Model: models are independent of each other. Each model needs its own new directory under funasr/models/. Do not use class inheritance!!! Do not import from other model directories; put everything you need into your own model directory!!! Do not modify existing model code!!!
-
+
* dataset, frontend, tokenizer: if an existing one can be reused, reuse it directly; if not, register a new one and modify that copy. Do not modify the original!!!
-
+
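The registry these principles refer to can be pictured as a table that maps a name to a class, which AutoModel then looks up by the name in config.yaml. The following is a minimal stand-in for that pattern, not FunASR's actual `tables` implementation:

```python
# Minimal stand-in for a registry table (illustration only; FunASR's real
# registry lives in its `tables` module and has more features).

model_classes = {}  # one registry table: name -> class

def register(table: dict, name: str):
    """Decorator that records a class in the given table under `name`."""
    def decorator(cls):
        table[name] = cls
        return cls
    return decorator

@register(model_classes, "MyNewModel")  # hypothetical model name
class MyNewModel:
    def __init__(self, **kwargs):
        self.config = kwargs

# AutoModel-style lookup: instantiate by the registered name.
model = model_classes["MyNewModel"](hidden_dim=256)
print(type(model).__name__)  # MyNewModel
```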
# Independent repository
@@ -335,7 +337,7 @@
model = AutoModel(
    model="iic/SenseVoiceSmall",
    trust_remote_code=True,
-remote_code = "./model.py",
+    remote_code="./model.py",
)
```
@@ -358,4 +360,4 @@
print(text)
```
-Trim聽reference:[https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh](https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh)
\ No newline at end of file
+Finetune reference: [https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh](https://github.com/FunAudioLLM/SenseVoice/blob/main/finetune.sh)
--
Gitblit v1.9.1