From 28ccfbfc51068a663a80764e14074df5edf2b5ba Mon Sep 17 00:00:00 2001
From: kongdeqiang <kongdeqiang960204@163.com>
Date: 星期五, 13 三月 2026 17:41:41 +0800
Subject: [PATCH] 提交
---
README_zh.md | 340 +++++++++++++++++++++++++++++++++++++++++++++++---------
1 files changed, 286 insertions(+), 54 deletions(-)
diff --git a/README_zh.md b/README_zh.md
index 8b188b7..b367008 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -2,12 +2,14 @@
(绠�浣撲腑鏂噟[English](./README.md))
-# FunASR: A Fundamental End-to-End Speech Recognition Toolkit
-<p align="left">
- <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
- <a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
- <a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
-</p>
+
+
+[](https://github.com/Akshay090/svg-banners)
+
+[//]: # (# FunASR: A Fundamental End-to-End Speech Recognition Toolkit)
+
+[](https://pypi.org/project/funasr/)
+
FunASR甯屾湜鍦ㄨ闊宠瘑鍒殑瀛︽湳鐮旂┒鍜屽伐涓氬簲鐢ㄤ箣闂存灦璧蜂竴搴фˉ姊併�傞�氳繃鍙戝竷宸ヤ笟绾ц闊宠瘑鍒ā鍨嬬殑璁粌鍜屽井璋冿紝鐮旂┒浜哄憳鍜屽紑鍙戜汉鍛樺彲浠ユ洿鏂逛究鍦拌繘琛岃闊宠瘑鍒ā鍨嬬殑鐮旂┒鍜岀敓浜э紝骞舵帹鍔ㄨ闊宠瘑鍒敓鎬佺殑鍙戝睍銆傝璇煶璇嗗埆鏇存湁瓒o紒
@@ -17,8 +19,8 @@
锝�<a href="#鏈�鏂板姩鎬�"> 鏈�鏂板姩鎬� </a>
锝�<a href="#瀹夎鏁欑▼"> 瀹夎 </a>
锝�<a href="#蹇�熷紑濮�"> 蹇�熷紑濮� </a>
-锝�<a href="https://alibaba-damo-academy.github.io/FunASR/en/index.html"> 鏁欑▼鏂囨。 </a>
-锝�<a href="./docs/model_zoo/modelscope_models.md"> 妯″瀷浠撳簱 </a>
+锝�<a href="https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/tutorial/README_zh.md"> 鏁欑▼鏂囨。 </a>
+锝�<a href="#妯″瀷浠撳簱"> 妯″瀷浠撳簱 </a>
锝�<a href="#鏈嶅姟閮ㄧ讲"> 鏈嶅姟閮ㄧ讲 </a>
锝�<a href="#鑱旂郴鎴戜滑"> 鑱旂郴鎴戜滑 </a>
</h4>
@@ -27,84 +29,308 @@
<a name="鏍稿績鍔熻兘"></a>
## 鏍稿績鍔熻兘
- FunASR鏄竴涓熀纭�璇煶璇嗗埆宸ュ叿鍖咃紝鎻愪緵澶氱鍔熻兘锛屽寘鎷闊宠瘑鍒紙ASR锛夈�佽闊崇鐐规娴嬶紙VAD锛夈�佹爣鐐规仮澶嶃�佽瑷�妯″瀷銆佽璇濅汉楠岃瘉銆佽璇濅汉鍒嗙鍜屽浜哄璇濊闊宠瘑鍒瓑銆侳unASR鎻愪緵浜嗕究鎹风殑鑴氭湰鍜屾暀绋嬶紝鏀寔棰勮缁冨ソ鐨勬ā鍨嬬殑鎺ㄧ悊涓庡井璋冦��
-- 鎴戜滑鍦╗ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)涓嶽huggingface](https://huggingface.co/FunAudio)涓婂彂甯冧簡澶ч噺寮�婧愭暟鎹泦鎴栬�呮捣閲忓伐涓氭暟鎹缁冪殑妯″瀷锛屽彲浠ラ�氳繃鎴戜滑鐨刐妯″瀷浠撳簱](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md)浜嗚В妯″瀷鐨勮缁嗕俊鎭�備唬琛ㄦ�х殑[Paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)闈炶嚜鍥炲綊绔埌绔闊宠瘑鍒ā鍨嬪叿鏈夐珮绮惧害銆侀珮鏁堢巼銆佷究鎹烽儴缃茬殑浼樼偣锛屾敮鎸佸揩閫熸瀯寤鸿闊宠瘑鍒湇鍔★紝璇︾粏淇℃伅鍙互闃呰([鏈嶅姟閮ㄧ讲鏂囨。](funasr/runtime/readme_cn.md))銆�
+- 鎴戜滑鍦╗ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)涓嶽huggingface](https://huggingface.co/FunASR)涓婂彂甯冧簡澶ч噺寮�婧愭暟鎹泦鎴栬�呮捣閲忓伐涓氭暟鎹缁冪殑妯″瀷锛屽彲浠ラ�氳繃鎴戜滑鐨刐妯″瀷浠撳簱](https://github.com/modelscope/FunASR/blob/main/model_zoo/readme_zh.md)浜嗚В妯″瀷鐨勮缁嗕俊鎭�備唬琛ㄦ�х殑[Paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)闈炶嚜鍥炲綊绔埌绔闊宠瘑鍒ā鍨嬪叿鏈夐珮绮惧害銆侀珮鏁堢巼銆佷究鎹烽儴缃茬殑浼樼偣锛屾敮鎸佸揩閫熸瀯寤鸿闊宠瘑鍒湇鍔★紝璇︾粏淇℃伅鍙互闃呰([鏈嶅姟閮ㄧ讲鏂囨。](runtime/readme_cn.md))銆�
<a name="鏈�鏂板姩鎬�"></a>
## 鏈�鏂板姩鎬�
-- 20223/10/17: 鑻辨枃绂荤嚎鏂囦欢杞啓鏈嶅姟涓�閿儴缃茬殑CPU鐗堟湰鍙戝竷锛岃缁嗕俊鎭弬闃�([涓�閿儴缃叉枃妗(runtime/readme_cn.html#cpu))
+- 2024/10/29: 涓枃瀹炴椂璇煶鍚啓鏈嶅姟 1.12 鍙戝竷锛�2pass-offline妯″紡鏀寔SensevoiceSmall妯″瀷锛涜缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/10/10锛氭柊澧炲姞Whisper-large-v3-turbo妯″瀷鏀寔锛屽璇█璇煶璇嗗埆/缈昏瘧/璇璇嗗埆锛屾敮鎸佷粠 [modelscope](examples/industrial_data_pretraining/whisper/demo.py)浠撳簱涓嬭浇锛屼篃鏀寔浠� [openai](examples/industrial_data_pretraining/whisper/demo_from_openai.py)浠撳簱涓嬭浇妯″瀷銆�
+- 2024/09/26: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟 4.6銆佽嫳鏂囩绾挎枃浠惰浆鍐欐湇鍔� 1.7銆佷腑鏂囧疄鏃惰闊冲惉鍐欐湇鍔� 1.11 鍙戝竷锛屼慨澶峅NNX鍐呭瓨娉勬紡銆佹敮鎸丼ensevoiceSmall onnx妯″瀷锛涗腑鏂囩绾挎枃浠惰浆鍐欐湇鍔PU 2.0 鍙戝竷锛屼慨澶嶆樉瀛樻硠婕�; 璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/09/25锛氭柊澧炶闊冲敜閱掓ā鍨嬶紝鏀寔[fsmn_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [fsmn_kws_mt](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [sanm_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-offline), [sanm_kws_streaming](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online) 4涓ā鍨嬬殑寰皟鍜屾帹鐞嗐��
+- 2024/07/04锛歔SenseVoice](https://github.com/FunAudioLLM/SenseVoice) 鏄竴涓熀纭�璇煶鐞嗚В妯″瀷锛屽叿澶囧绉嶈闊崇悊瑙h兘鍔涳紝娑电洊浜嗚嚜鍔ㄨ闊宠瘑鍒紙ASR锛夈�佽瑷�璇嗗埆锛圠ID锛夈�佹儏鎰熻瘑鍒紙SER锛変互鍙婇煶棰戜簨浠舵娴嬶紙AED锛夈��
+- 2024/07/01锛氫腑鏂囩绾挎枃浠惰浆鍐欐湇鍔PU鐗堟湰 1.1鍙戝竷锛屼紭鍖朾ladedisc妯″瀷鍏煎鎬ч棶棰橈紱璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/06/27锛氫腑鏂囩绾挎枃浠惰浆鍐欐湇鍔PU鐗堟湰 1.0鍙戝竷锛屾敮鎸佸姩鎬乥atch锛屾敮鎸佸璺苟鍙戯紝鍦ㄩ暱闊抽娴嬭瘯闆嗕笂鍗曠嚎RTF涓�0.0076锛屽绾垮姞閫熸瘮涓�1200+锛圕PU涓�330+锛夛紱璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/05/15锛氭柊澧炲姞鎯呮劅璇嗗埆妯″瀷锛孾emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary)锛孾emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary)锛孾emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary)锛岃緭鍑烘儏鎰熺被鍒负锛氱敓姘�/angry锛屽紑蹇�/happy锛屼腑绔�/neutral锛岄毦杩�/sad銆�
+- 2024/05/15: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟 4.5銆佽嫳鏂囩绾挎枃浠惰浆鍐欐湇鍔� 1.6銆佷腑鏂囧疄鏃惰闊冲惉鍐欐湇鍔� 1.10 鍙戝竷锛岄�傞厤FunASR 1.0妯″瀷缁撴瀯锛涜缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/03/05锛氭柊澧炲姞Qwen-Audio涓嶲wen-Audio-Chat闊抽鏂囨湰妯℃�佸ぇ妯″瀷锛屽湪澶氫釜闊抽棰嗗煙娴嬭瘯姒滃崟鍒锋锛屼腑鏀寔璇煶瀵硅瘽锛岃缁嗙敤娉曡 [绀轰緥](examples/industrial_data_pretraining/qwen_audio)銆�
+- 2024/03/05锛氭柊澧炲姞Whisper-large-v3妯″瀷鏀寔锛屽璇█璇煶璇嗗埆/缈昏瘧/璇璇嗗埆锛屾敮鎸佷粠 [modelscope](examples/industrial_data_pretraining/whisper/demo.py)浠撳簱涓嬭浇锛屼篃鏀寔浠� [openai](examples/industrial_data_pretraining/whisper/demo_from_openai.py)浠撳簱涓嬭浇妯″瀷銆�
+- 2024/03/05: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟 4.4銆佽嫳鏂囩绾挎枃浠惰浆鍐欐湇鍔� 1.5銆佷腑鏂囧疄鏃惰闊冲惉鍐欐湇鍔� 1.9 鍙戝竷锛宒ocker闀滃儚鏀寔arm64骞冲彴锛屽崌绾odelscope鐗堟湰锛涜缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/01/30锛歠unasr-1.0鍙戝竷锛屾洿鏂拌鏄嶽鏂囨。](https://github.com/alibaba-damo-academy/FunASR/discussions/1319)
+
+<details><summary>灞曞紑鏃ュ織</summary>
+
+- 2024/01/30锛氭柊澧炲姞鎯呮劅璇嗗埆 [妯″瀷閾炬帴](https://www.modelscope.cn/models/iic/emotion2vec_base_finetuned/summary)锛屽師濮嬫ā鍨� [repo](https://github.com/ddlBoJack/emotion2vec).
+- 2024/01/25: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟 4.2銆佽嫳鏂囩绾挎枃浠惰浆鍐欐湇鍔� 1.3锛屼紭鍖杤ad鏁版嵁澶勭悊鏂瑰紡锛屽ぇ骞呴檷浣庡嘲鍊煎唴瀛樺崰鐢紝鍐呭瓨娉勬紡浼樺寲锛涗腑鏂囧疄鏃惰闊冲惉鍐欐湇鍔� 1.7 鍙戝竷锛屽鎴风浼樺寲锛涜缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md))
+- 2024/01/09: funasr绀惧尯杞欢鍖厀indows 2.0鐗堟湰鍙戝竷锛屾敮鎸佽蒋浠跺寘涓枃绂荤嚎鏂囦欢杞啓4.1銆佽嫳鏂囩绾挎枃浠惰浆鍐�1.2銆佷腑鏂囧疄鏃跺惉鍐欐湇鍔�1.6鐨勬渶鏂板姛鑳斤紝璇︾粏淇℃伅鍙傞槄([FunASR绀惧尯杞欢鍖厀indows鐗堟湰](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
+- 2024/01/03: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟 4.0 鍙戝竷锛屾柊澧炴敮鎸�8k妯″瀷銆佷紭鍖栨椂闂存埑涓嶅尮閰嶉棶棰樺強澧炲姞鍙ュ瓙绾у埆鏃堕棿鎴炽�佷紭鍖栬嫳鏂囧崟璇峟st鐑瘝鏁堟灉銆佹敮鎸佽嚜鍔ㄥ寲閰嶇疆绾跨▼鍙傛暟锛屽悓鏃朵慨澶嶅凡鐭ョ殑crash闂鍙婂唴瀛樻硠婕忛棶棰橈紝璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md#涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟cpu鐗堟湰))
+- 2024/01/03: 涓枃瀹炴椂璇煶鍚啓鏈嶅姟 1.6 鍙戝竷锛�2pass-offline妯″紡鏀寔Ngram璇█妯″瀷瑙g爜銆亀fst鐑瘝锛屽悓鏃朵慨澶嶅凡鐭ョ殑crash闂鍙婂唴瀛樻硠婕忛棶棰橈紝璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md#涓枃瀹炴椂璇煶鍚啓鏈嶅姟cpu鐗堟湰))
+- 2024/01/03: 鑻辨枃绂荤嚎鏂囦欢杞啓鏈嶅姟 1.2 鍙戝竷锛屼慨澶嶅凡鐭ョ殑crash闂鍙婂唴瀛樻硠婕忛棶棰橈紝璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md#鑻辨枃绂荤嚎鏂囦欢杞啓鏈嶅姟cpu鐗堟湰))
+- 2023/12/04: funasr绀惧尯杞欢鍖厀indows 1.0鐗堟湰鍙戝竷锛屾敮鎸佷腑鏂囩绾挎枃浠惰浆鍐欍�佽嫳鏂囩绾挎枃浠惰浆鍐欍�佷腑鏂囧疄鏃跺惉鍐欐湇鍔★紝璇︾粏淇℃伅鍙傞槄([FunASR绀惧尯杞欢鍖厀indows鐗堟湰](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
+- 2023/11/08锛氫腑鏂囩绾挎枃浠惰浆鍐欐湇鍔�3.0 CPU鐗堟湰鍙戝竷锛屾柊澧炴爣鐐瑰ぇ妯″瀷銆丯gram璇█妯″瀷涓巜fst鐑瘝锛岃缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md#涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟cpu鐗堟湰))
+- 2023/10/17: 鑻辨枃绂荤嚎鏂囦欢杞啓鏈嶅姟涓�閿儴缃茬殑CPU鐗堟湰鍙戝竷锛岃缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md#鑻辨枃绂荤嚎鏂囦欢杞啓鏈嶅姟cpu鐗堟湰))
- 2023/10/13: [SlideSpeech](https://slidespeech.github.io/): 涓�涓ぇ瑙勬ā鐨勫妯℃�侀煶瑙嗛璇枡搴擄紝涓昏鏄湪绾夸細璁垨鑰呭湪绾胯绋嬪満鏅紝鍖呭惈浜嗗ぇ閲忎笌鍙戣█浜鸿璇濆疄鏃跺悓姝ョ殑骞荤伅鐗囥��
- 2023.10.10: [Paraformer-long-Spk](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr_vad_spk/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/demo.py)妯″瀷鍙戝竷锛屾敮鎸佸湪闀胯闊宠瘑鍒殑鍩虹涓婅幏鍙栨瘡鍙ヨ瘽鐨勮璇濅汉鏍囩銆�
- 2023.10.07: [FunCodec](https://github.com/alibaba-damo-academy/FunCodec): FunCodec鎻愪緵寮�婧愭ā鍨嬪拰璁粌宸ュ叿锛屽彲浠ョ敤浜庨煶棰戠鏁g紪鐮侊紝浠ュ強鍩轰簬绂绘暎缂栫爜鐨勮闊宠瘑鍒�佽闊冲悎鎴愮瓑浠诲姟銆�
-- 2023.09.01: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟2.0 CPU鐗堟湰鍙戝竷锛屾柊澧瀎fmpeg銆佹椂闂存埑涓庣儹璇嶆ā鍨嬫敮鎸侊紝璇︾粏淇℃伅鍙傞槄([涓�閿儴缃叉枃妗(runtime/readme_cn.html#id6))
-- 2023.08.07: 涓枃瀹炴椂璇煶鍚啓鏈嶅姟涓�閿儴缃茬殑CPU鐗堟湰鍙戝竷锛岃缁嗕俊鎭弬闃�([涓�閿儴缃叉枃妗(runtime/readme_cn.html#id3))
+- 2023.09.01: 涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟2.0 CPU鐗堟湰鍙戝竷锛屾柊澧瀎fmpeg銆佹椂闂存埑涓庣儹璇嶆ā鍨嬫敮鎸侊紝璇︾粏淇℃伅鍙傞槄([閮ㄧ讲鏂囨。](runtime/readme_cn.md#涓枃绂荤嚎鏂囦欢杞啓鏈嶅姟cpu鐗堟湰))
+- 2023.08.07: 涓枃瀹炴椂璇煶鍚啓鏈嶅姟涓�閿儴缃茬殑CPU鐗堟湰鍙戝竷锛岃缁嗕俊鎭弬闃�([閮ㄧ讲鏂囨。](runtime/readme_cn.md#涓枃瀹炴椂璇煶鍚啓鏈嶅姟cpu鐗堟湰))
- 2023.07.17: BAT涓�绉嶄綆寤惰繜浣庡唴瀛樻秷鑰楃殑RNN-T妯″瀷鍙戝竷锛岃缁嗕俊鎭弬闃咃紙[BAT](egs/aishell/bat)锛�
- 2023.06.26: ASRU2023 澶氶�氶亾澶氭柟浼氳杞綍鎸戞垬璧�2.0瀹屾垚绔炶禌缁撴灉鍏竷锛岃缁嗕俊鎭弬闃咃紙[M2MeT2.0](https://alibaba-damo-academy.github.io/FunASR/m2met2_cn/index.html)锛�
+</details>
+
<a name="瀹夎鏁欑▼"></a>
## 瀹夎鏁欑▼
-FunASR瀹夎鏁欑▼璇烽槄璇伙紙[Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)锛�
+
+- 瀹夎funasr涔嬪墠锛岀‘淇濆凡缁忓畨瑁呬簡涓嬮潰渚濊禆鐜:
+```text
+python>=3.8
+torch>=1.13
+torchaudio
+```
+
+- pip瀹夎
+```shell
+pip3 install -U funasr
+```
+
+- 鎴栬�呬粠婧愪唬鐮佸畨瑁�
+``` sh
+git clone https://github.com/alibaba/FunASR.git && cd FunASR
+pip3 install -e ./
+```
+
+濡傛灉闇�瑕佷娇鐢ㄥ伐涓氶璁粌妯″瀷锛屽畨瑁卪odelscope涓巋uggingface_hub锛堝彲閫夛級
+
+```shell
+pip3 install -U modelscope huggingface huggingface_hub
+```
## 妯″瀷浠撳簱
-FunASR寮�婧愪簡澶ч噺鍦ㄥ伐涓氭暟鎹笂棰勮缁冩ā鍨嬶紝鎮ㄥ彲浠ュ湪[妯″瀷璁稿彲鍗忚](./MODEL_LICENSE)涓嬭嚜鐢变娇鐢ㄣ�佸鍒躲�佷慨鏀瑰拰鍒嗕韩FunASR妯″瀷锛屼笅闈㈠垪涓句唬琛ㄦ�х殑妯″瀷锛屾洿澶氭ā鍨嬭鍙傝�僛妯″瀷浠撳簱]()銆�
+FunASR寮�婧愪簡澶ч噺鍦ㄥ伐涓氭暟鎹笂棰勮缁冩ā鍨嬶紝鎮ㄥ彲浠ュ湪[妯″瀷璁稿彲鍗忚](./MODEL_LICENSE)涓嬭嚜鐢变娇鐢ㄣ�佸鍒躲�佷慨鏀瑰拰鍒嗕韩FunASR妯″瀷锛屼笅闈㈠垪涓句唬琛ㄦ�х殑妯″瀷锛屾洿澶氭ā鍨嬭鍙傝�� [妯″瀷浠撳簱](./model_zoo)銆�
-锛堟敞锛歔馃]()琛ㄧずHuggingface妯″瀷浠撳簱閾炬帴锛孾猸怾()琛ㄧずModelScope妯″瀷浠撳簱閾炬帴锛�
+锛堟敞锛氣瓙 琛ㄧずModelScope妯″瀷浠撳簱锛岎煠� 琛ㄧずHuggingface妯″瀷浠撳簱锛岎煃�琛ㄧずOpenAI妯″瀷浠撳簱锛�
-| 妯″瀷鍚嶅瓧 | 浠诲姟璇︽儏 | 璁粌鏁版嵁 | 鍙傛暟閲� |
-|:------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------:|:------------:|:----:|
-| paraformer-zh ([馃]() [猸怾(https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) ) | 璇煶璇嗗埆锛屽甫鏃堕棿鎴宠緭鍑猴紝闈炲疄鏃� | 60000灏忔椂锛屼腑鏂� | 220M |
-| paraformer-zh-spk ([馃]() [猸怾(https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) ) | 鍒嗚鑹茶闊宠瘑鍒紝甯︽椂闂存埑杈撳嚭锛岄潪瀹炴椂 | 60000灏忔椂锛屼腑鏂� | 220M |
-| paraformer-zh-online ([馃]() [猸怾(https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) ) | 璇煶璇嗗埆锛屽疄鏃� | 60000灏忔椂锛屼腑鏂� | 220M |
-| paraformer-en ([馃]() [猸怾(https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) ) | 鍒嗚鑹茶闊宠瘑鍒紝甯︽椂闂存埑杈撳嚭锛岄潪瀹炴椂 | 50000灏忔椂锛岃嫳鏂� | 220M |
-| paraformer-en-spk ([馃]() [猸怾() ) | 璇煶璇嗗埆锛岄潪瀹炴椂 | 50000灏忔椂锛岃嫳鏂� | 220M |
-| conformer-en ([馃]() [猸怾(https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) ) | 璇煶璇嗗埆锛岄潪瀹炴椂 | 50000灏忔椂锛岃嫳鏂� | 220M |
-| ct-punc ([馃]() [猸怾(https://modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large/summary) ) | 鏍囩偣鎭㈠锛岄潪瀹炴椂 | 100M锛屼腑鏂囦笌鑻辨枃 | 1.1G |
-| fsmn-vad ([馃]() [猸怾(https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) ) | 璇煶绔偣妫�娴嬶紝瀹炴椂 | 5000灏忔椂锛屼腑鏂囦笌鑻辨枃 | 0.4M |
-| fa-zh ([馃]() [猸怾(https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) ) | 瀛楃骇鍒椂闂存埑棰勬祴 | 50000灏忔椂锛屼腑鏂� | 38M |
-
+| 妯″瀷鍚嶅瓧 | 浠诲姟璇︽儏 | 璁粌鏁版嵁 | 鍙傛暟閲� |
+|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------:|:--------------:|:------:|
+| SenseVoiceSmall <br> ([猸怾(https://www.modelscope.cn/models/iic/SenseVoiceSmall) [馃](https://huggingface.co/FunAudioLLM/SenseVoiceSmall) ) | 澶氱璇煶鐞嗚В鑳藉姏锛屾兜鐩栦簡鑷姩璇煶璇嗗埆锛圓SR锛夈�佽瑷�璇嗗埆锛圠ID锛夈�佹儏鎰熻瘑鍒紙SER锛変互鍙婇煶棰戜簨浠舵娴嬶紙AED锛� | 400000灏忔椂锛屼腑鏂� | 330M |
+| paraformer-zh <br> ([猸怾(https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) [馃](https://huggingface.co/funasr/paraformer-zh) ) | 璇煶璇嗗埆锛屽甫鏃堕棿鎴宠緭鍑猴紝闈炲疄鏃� | 60000灏忔椂锛屼腑鏂� | 220M |
+| paraformer-zh-streaming <br> ( [猸怾(https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) [馃](https://huggingface.co/funasr/paraformer-zh-streaming) ) | 璇煶璇嗗埆锛屽疄鏃� | 60000灏忔椂锛屼腑鏂� | 220M |
+| paraformer-en <br> ( [猸怾(https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) [馃](https://huggingface.co/funasr/paraformer-en) ) | 璇煶璇嗗埆锛岄潪瀹炴椂 | 50000灏忔椂锛岃嫳鏂� | 220M |
+| conformer-en <br> ( [猸怾(https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) [馃](https://huggingface.co/funasr/conformer-en) ) | 璇煶璇嗗埆锛岄潪瀹炴椂 | 50000灏忔椂锛岃嫳鏂� | 220M |
+| ct-punc <br> ( [猸怾(https://modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large/summary) [馃](https://huggingface.co/funasr/ct-punc) ) | 鏍囩偣鎭㈠ | 100M锛屼腑鏂囦笌鑻辨枃 | 290M |
+| fsmn-vad <br> ( [猸怾(https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) [馃](https://huggingface.co/funasr/fsmn-vad) ) | 璇煶绔偣妫�娴嬶紝瀹炴椂 | 5000灏忔椂锛屼腑鏂囦笌鑻辨枃 | 0.4M |
+| fsmn-kws <br> ( [猸怾(https://modelscope.cn/models/iic/speech_charctc_kws_phone-xiaoyun/summary) ) | 璇煶鍞ら啋锛屽疄鏃� | 5000灏忔椂锛屼腑鏂� | 0.7M |
+| fa-zh <br> ( [猸怾(https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) [馃](https://huggingface.co/funasr/fa-zh) ) | 瀛楃骇鍒椂闂存埑棰勬祴 | 50000灏忔椂锛屼腑鏂� | 38M |
+| cam++ <br> ( [猸怾(https://modelscope.cn/models/iic/speech_campplus_sv_zh-cn_16k-common/summary) [馃](https://huggingface.co/funasr/campplus) ) | 璇磋瘽浜虹‘璁�/鍒嗗壊 | 5000灏忔椂 | 7.2M |
+| Whisper-large-v3 <br> ([猸怾(https://www.modelscope.cn/models/iic/Whisper-large-v3/summary) [馃崁](https://github.com/openai/whisper) ) | 璇煶璇嗗埆锛屽甫鏃堕棿鎴宠緭鍑猴紝闈炲疄鏃� | 澶氳瑷� | 1550 M |
+| Whisper-large-v3-turbo <br> ([猸怾(https://www.modelscope.cn/models/iic/Whisper-large-v3-turbo/summary) [馃崁](https://github.com/openai/whisper) ) | 璇煶璇嗗埆锛屽甫鏃堕棿鎴宠緭鍑猴紝闈炲疄鏃� | 澶氳瑷� | 809 M |
+| Qwen-Audio <br> ([猸怾(examples/industrial_data_pretraining/qwen_audio/demo.py) [馃](https://huggingface.co/Qwen/Qwen-Audio) ) | 闊抽鏂囨湰澶氭ā鎬佸ぇ妯″瀷锛堥璁粌锛� | 澶氳瑷� | 8B |
+| Qwen-Audio-Chat <br> ([猸怾(examples/industrial_data_pretraining/qwen_audio/demo_chat.py) [馃](https://huggingface.co/Qwen/Qwen-Audio-Chat) ) | 闊抽鏂囨湰澶氭ā鎬佸ぇ妯″瀷锛坈hat鐗堟湰锛� | 澶氳瑷� | 8B |
+| emotion2vec+large <br> ([猸怾(https://modelscope.cn/models/iic/emotion2vec_plus_large/summary) [馃](https://huggingface.co/emotion2vec/emotion2vec_plus_large) ) | 鎯呮劅璇嗗埆妯″瀷 | 40000灏忔椂锛�4绉嶆儏鎰熺被鍒� | 300M |
<a name="蹇�熷紑濮�"></a>
## 蹇�熷紑濮�
-FunASR鏀寔鏁颁竾灏忔椂宸ヤ笟鏁版嵁璁粌鐨勬ā鍨嬬殑鎺ㄧ悊鍜屽井璋冿紝璇︾粏淇℃伅鍙互鍙傞槄锛圼modelscope_egs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)锛夛紱涔熸敮鎸佸鏈爣鍑嗘暟鎹泦妯″瀷鐨勮缁冨拰寰皟锛岃缁嗕俊鎭彲浠ュ弬闃咃紙[egs](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html)锛夈��
-涓嬮潰涓哄揩閫熶笂鎵嬫暀绋嬶紝娴嬭瘯闊抽锛圼涓枃](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav)锛孾鑻辨枃]()锛�
+涓嬮潰涓哄揩閫熶笂鎵嬫暀绋嬶紝娴嬭瘯闊抽锛圼涓枃](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav)锛孾鑻辨枃](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_en.wav)锛�
+
+### 鍙墽琛屽懡浠よ
+
+```shell
+funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=asr_example_zh.wav
+```
+
+娉細鏀寔鍗曟潯闊抽鏂囦欢璇嗗埆锛屼篃鏀寔鏂囦欢鍒楄〃锛屽垪琛ㄤ负kaldi椋庢牸wav.scp锛歚wav_id wav_path`
+
### 闈炲疄鏃惰闊宠瘑鍒�
+#### SenseVoice
```python
-from funasr import infer
+from funasr import AutoModel
+from funasr.utils.postprocess_utils import rich_transcription_postprocess
-p = infer(model="paraformer-zh", vad_model="fsmn-vad", punc_model="ct-punc", model_hub="ms")
+model_dir = "iic/SenseVoiceSmall"
-res = p("asr_example_zh.wav", batch_size_token=5000)
+model = AutoModel(
+ model=model_dir,
+ vad_model="fsmn-vad",
+ vad_kwargs={"max_single_segment_time": 30000},
+ device="cuda:0",
+)
+
+# en
+res = model.generate(
+ input=f"{model.model_path}/example/en.mp3",
+ cache={},
+ language="auto", # "zn", "en", "yue", "ja", "ko", "nospeech"
+ use_itn=True,
+ batch_size_s=60,
+ merge_vad=True, #
+ merge_length_s=15,
+)
+text = rich_transcription_postprocess(res[0]["text"])
+print(text)
+```
+鍙傛暟璇存槑锛�
+- `model_dir`锛氭ā鍨嬪悕绉帮紝鎴栨湰鍦扮鐩樹腑鐨勬ā鍨嬭矾寰勩��
+- `vad_model`锛氳〃绀哄紑鍚疺AD锛孷AD鐨勪綔鐢ㄦ槸灏嗛暱闊抽鍒囧壊鎴愮煭闊抽锛屾鏃舵帹鐞嗚�楁椂鍖呮嫭浜哣AD涓嶴enseVoice鎬昏�楁椂锛屼负閾捐矾鑰楁椂锛屽鏋滈渶瑕佸崟鐙祴璇昐enseVoice妯″瀷鑰楁椂锛屽彲浠ュ叧闂璙AD妯″瀷銆�
+- `vad_kwargs`锛氳〃绀篤AD妯″瀷閰嶇疆,`max_single_segment_time`: 琛ㄧず`vad_model`鏈�澶у垏鍓查煶棰戞椂闀�, 鍗曚綅鏄绉抦s銆�
+- `use_itn`锛氳緭鍑虹粨鏋滀腑鏄惁鍖呭惈鏍囩偣涓庨�嗘枃鏈鍒欏寲銆�
+- `batch_size_s` 琛ㄧず閲囩敤鍔ㄦ�乥atch锛宐atch涓�婚煶棰戞椂闀匡紝鍗曚綅涓虹s銆�
+- `merge_vad`锛氭槸鍚﹀皢 vad 妯″瀷鍒囧壊鐨勭煭闊抽纰庣墖鍚堟垚锛屽悎骞跺悗闀垮害涓篳merge_length_s`锛屽崟浣嶄负绉抯銆�
+- `ban_emo_unk`锛氱鐢╡mo_unk鏍囩锛岀鐢ㄥ悗鎵�鏈夌殑鍙ュ瓙閮戒細琚祴涓庢儏鎰熸爣绛俱��
+
+#### Paraformer
+```python
+from funasr import AutoModel
+# paraformer-zh is a multi-functional asr model
+# use vad, punc, spk or not as you need
+model = AutoModel(model="paraformer-zh", vad_model="fsmn-vad", punc_model="ct-punc",
+ # spk_model="cam++"
+ )
+res = model.generate(input=f"{model.model_path}/example/asr_example.wav",
+ batch_size_s=300,
+ hotword='榄旀惌')
print(res)
```
-娉細`model_hub`锛氳〃绀烘ā鍨嬩粨搴擄紝`ms`涓洪�夋嫨modelscope涓嬭浇锛宍hf`涓洪�夋嫨huggingface涓嬭浇銆�
+娉細`hub`锛氳〃绀烘ā鍨嬩粨搴擄紝`ms`涓洪�夋嫨modelscope涓嬭浇锛宍hf`涓洪�夋嫨huggingface涓嬭浇銆�
### 瀹炴椂璇煶璇嗗埆
-```python
-from funasr import infer
-p = infer(model="paraformer-zh-streaming", model_hub="ms")
+```python
+from funasr import AutoModel
chunk_size = [0, 10, 5] #[0, 10, 5] 600ms, [0, 8, 4] 480ms
-param_dict = {"cache": dict(), "is_final": False, "chunk_size": chunk_size, "encoder_chunk_look_back": 4, "decoder_chunk_look_back": 1}
+encoder_chunk_look_back = 4 #number of chunks to lookback for encoder self-attention
+decoder_chunk_look_back = 1 #number of encoder chunks to lookback for decoder cross-attention
-import torchaudio
-speech = torchaudio.load("asr_example_zh.wav")[0][0]
-speech_length = speech.shape[0]
+model = AutoModel(model="paraformer-zh-streaming")
-stride_size = chunk_size[1] * 960
-sample_offset = 0
-for sample_offset in range(0, speech_length, min(stride_size, speech_length - sample_offset)):
- param_dict["is_final"] = True if sample_offset + stride_size >= speech_length - 1 else False
- input = speech[sample_offset: sample_offset + stride_size]
- rec_result = p(input=input, param_dict=param_dict)
- print(rec_result)
+import soundfile
+import os
+
+wav_file = os.path.join(model.model_path, "example/asr_example.wav")
+speech, sample_rate = soundfile.read(wav_file)
+chunk_stride = chunk_size[1] * 960 # 600ms
+
+cache = {}
+total_chunk_num = int(len((speech)-1)/chunk_stride+1)
+for i in range(total_chunk_num):
+ speech_chunk = speech[i*chunk_stride:(i+1)*chunk_stride]
+ is_final = i == total_chunk_num - 1
+ res = model.generate(input=speech_chunk, cache=cache, is_final=is_final, chunk_size=chunk_size, encoder_chunk_look_back=encoder_chunk_look_back, decoder_chunk_look_back=decoder_chunk_look_back)
+ print(res)
```
+
娉細`chunk_size`涓烘祦寮忓欢鏃堕厤缃紝`[0,10,5]`琛ㄧず涓婂睆瀹炴椂鍑哄瓧绮掑害涓篳10*60=600ms`锛屾湭鏉ヤ俊鎭负`5*60=300ms`銆傛瘡娆℃帹鐞嗚緭鍏ヤ负`600ms`锛堥噰鏍风偣鏁颁负`16000*0.6=960`锛夛紝杈撳嚭涓哄搴旀枃瀛楋紝鏈�鍚庝竴涓闊崇墖娈佃緭鍏ラ渶瑕佽缃甡is_final=True`鏉ュ己鍒惰緭鍑烘渶鍚庝竴涓瓧銆�
-鏇村璇︾粏鐢ㄦ硶锛圼鏂颁汉鏂囨。](https://alibaba-damo-academy.github.io/FunASR/en/funasr/quick_start_zh.html)锛�
+<details><summary>鏇村渚嬪瓙</summary>
+### 璇煶绔偣妫�娴嬶紙闈炲疄鏃讹級
+```python
+from funasr import AutoModel
+
+model = AutoModel(model="fsmn-vad")
+
+wav_file = f"{model.model_path}/example/vad_example.wav"
+res = model.generate(input=wav_file)
+print(res)
+```
+娉細VAD妯″瀷杈撳嚭鏍煎紡涓猴細`[[beg1, end1], [beg2, end2], .., [begN, endN]]`锛屽叾涓璥begN/endN`琛ㄧず绗琡N`涓湁鏁堥煶棰戠墖娈电殑璧峰鐐�/缁撴潫鐐癸紝
+鍗曚綅涓烘绉掋��
+
+### 璇煶绔偣妫�娴嬶紙瀹炴椂锛�
+```python
+from funasr import AutoModel
+
+chunk_size = 200 # ms
+model = AutoModel(model="fsmn-vad")
+
+import soundfile
+
+wav_file = f"{model.model_path}/example/vad_example.wav"
+speech, sample_rate = soundfile.read(wav_file)
+chunk_stride = int(chunk_size * sample_rate / 1000)
+
+cache = {}
+total_chunk_num = int(len((speech)-1)/chunk_stride+1)
+for i in range(total_chunk_num):
+ speech_chunk = speech[i*chunk_stride:(i+1)*chunk_stride]
+ is_final = i == total_chunk_num - 1
+ res = model.generate(input=speech_chunk, cache=cache, is_final=is_final, chunk_size=chunk_size)
+ if len(res[0]["value"]):
+ print(res)
+```
+娉細娴佸紡VAD妯″瀷杈撳嚭鏍煎紡涓�4绉嶆儏鍐碉細
+- `[[beg1, end1], [beg2, end2], .., [begN, endN]]`锛氬悓涓婄绾縑AD杈撳嚭缁撴灉銆�
+- `[[beg, -1]]`锛氳〃绀哄彧妫�娴嬪埌璧峰鐐广��
+- `[[-1, end]]`锛氳〃绀哄彧妫�娴嬪埌缁撴潫鐐广��
+- `[]`锛氳〃绀烘棦娌℃湁妫�娴嬪埌璧峰鐐癸紝涔熸病鏈夋娴嬪埌缁撴潫鐐�
+杈撳嚭缁撴灉鍗曚綅涓烘绉掞紝浠庤捣濮嬬偣寮�濮嬬殑缁濆鏃堕棿銆�
+
+### 鏍囩偣鎭㈠
+```python
+from funasr import AutoModel
+
+model = AutoModel(model="ct-punc")
+
+res = model.generate(input="閭d粖澶╃殑浼氬氨鍒拌繖閲屽惂 happy new year 鏄庡勾瑙�")
+print(res)
+```
+
+### 鏃堕棿鎴抽娴�
+```python
+from funasr import AutoModel
+
+model = AutoModel(model="fa-zh")
+
+wav_file = f"{model.model_path}/example/asr_example.wav"
+text_file = f"{model.model_path}/example/text.txt"
+res = model.generate(input=(wav_file, text_file), data_type=("sound", "text"))
+print(res)
+```
+
+### 鎯呮劅璇嗗埆
+```python
+from funasr import AutoModel
+
+model = AutoModel(model="emotion2vec_plus_large")
+
+wav_file = f"{model.model_path}/example/test.wav"
+
+res = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False)
+print(res)
+```
+
+鏇磋缁嗭紙[鏁欑▼鏂囨。](docs/tutorial/README_zh.md)锛夛紝
+鏇村锛圼妯″瀷绀轰緥](https://github.com/alibaba-damo-academy/FunASR/tree/main/examples/industrial_data_pretraining)锛�
+
+</details>
+
+## 瀵煎嚭ONNX
+### 浠庡懡浠よ瀵煎嚭
+```shell
+funasr-export ++model=paraformer ++quantize=false
+```
+
+### 浠嶱ython瀵煎嚭
+```python
+from funasr import AutoModel
+
+model = AutoModel(model="paraformer")
+
+res = model.export(quantize=False)
+```
+
+### 娴嬭瘯ONNX
+```python
+# pip3 install -U funasr-onnx
+from pathlib import Path
+from runtime.python.onnxruntime.funasr_onnx.paraformer_bin import Paraformer
+
+
+home_dir = Path.home()
+
+model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
+model = Paraformer(model_dir, batch_size=1, quantize=True)
+
+wav_path = [f"{home_dir}/.cache/modelscope/hub/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav"]
+
+result = model(wav_path)
+print(result)
+```
+
+鏇村渚嬪瓙璇峰弬鑰� [鏍蜂緥](runtime/python/onnxruntime)
<a name="鏈嶅姟閮ㄧ讲"></a>
## 鏈嶅姟閮ㄧ讲
@@ -122,16 +348,16 @@
<a name="绀惧尯浜ゆ祦"></a>
## 鑱旂郴鎴戜滑
-濡傛灉鎮ㄥ湪浣跨敤涓亣鍒伴棶棰橈紝鍙互鐩存帴鍦╣ithub椤甸潰鎻怚ssues銆傛杩庤闊冲叴瓒g埍濂借�呮壂鎻忎互涓嬬殑閽夐拤缇ゆ垨鑰呭井淇$兢浜岀淮鐮佸姞鍏ョぞ鍖虹兢锛岃繘琛屼氦娴佸拰璁ㄨ銆�
+濡傛灉鎮ㄥ湪浣跨敤涓亣鍒伴棶棰橈紝鍙互鐩存帴鍦╣ithub椤甸潰鎻怚ssues銆傛杩庤闊冲叴瓒g埍濂借�呮壂鎻忎互涓嬬殑閽夐拤缇や簩缁寸爜鍔犲叆绀惧尯缇わ紝杩涜浜ゆ祦鍜岃璁恒��
-| 閽夐拤缇� | 寰俊 |
-|:---------------------------------------------------------------------:|:-----------------------------------------------------:|
-| <div align="left"><img src="docs/images/dingding.jpg" width="250"/> | <img src="docs/images/wechat.png" width="215"/></div> |
+| 閽夐拤缇� |
+|:-------------------------------------------------------------------:|
+| <div align="left"><img src="docs/images/dingding.png" width="250"/> |
## 绀惧尯璐$尞鑰�
-| <div align="left"><img src="docs/images/nwpu.png" width="260"/> | <img src="docs/images/China_Telecom.png" width="200"/> </div> | <img src="docs/images/RapidAI.png" width="200"/> </div> | <img src="docs/images/aihealthx.png" width="200"/> </div> | <img src="docs/images/XVERSE.png" width="250"/> </div> |
-|:---------------------------------------------------------------:|:--------------------------------------------------------------:|:-------------------------------------------------------:|:-----------------------------------------------------------:|:------------------------------------------------------:|
+| <div align="left"><img src="docs/images/alibaba.png" width="260"/> | <div align="left"><img src="docs/images/nwpu.png" width="260"/> | <img src="docs/images/China_Telecom.png" width="200"/> </div> | <img src="docs/images/RapidAI.png" width="200"/> </div> | <img src="docs/images/aihealthx.png" width="200"/> </div> | <img src="docs/images/XVERSE.png" width="250"/> </div> |
+|:------------------------------------------------------------------:|:---------------------------------------------------------------:|:--------------------------------------------------------------:|:-------------------------------------------------------:|:-----------------------------------------------------------:|:------------------------------------------------------:|
璐$尞鑰呭悕鍗曡鍙傝�冿紙[鑷磋阿鍚嶅崟](./Acknowledge.md)锛�
@@ -163,4 +389,10 @@
pages={2063--2067},
doi={10.21437/Interspeech.2022-9996}
}
+@article{shi2023seaco,
+ author={Xian Shi and Yexin Yang and Zerui Li and Yanni Chen and Zhifu Gao and Shiliang Zhang},
+ title={{SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability}},
+ year=2023,
+ journal={arXiv preprint arXiv:2308.03266(accepted by ICASSP2024)},
+}
```
--
Gitblit v1.9.1