From a05e753d11d9c36983ec4e58c421dbcf86d1dcd4 Mon Sep 17 00:00:00 2001
From: Xian Shi <40013335+R1ckShi@users.noreply.github.com>
Date: Tue, 17 Oct 2023 16:47:27 +0800
Subject: [PATCH] Merge branch 'main' into dev_onnx

---
 funasr/runtime/docs/SDK_advanced_guide_offline_en_zh.md |  265 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 265 insertions(+), 0 deletions(-)

diff --git a/funasr/runtime/docs/SDK_advanced_guide_offline_en_zh.md b/funasr/runtime/docs/SDK_advanced_guide_offline_en_zh.md
new file mode 100644
index 0000000..48c67c9
--- /dev/null
+++ b/funasr/runtime/docs/SDK_advanced_guide_offline_en_zh.md
@@ -0,0 +1,265 @@
+# FunASR English Offline File Transcription Service Development Guide
+
+FunASR provides an English offline file transcription service that can be deployed with one click on a local machine or a cloud server. Its core is the open-source FunASR runtime-SDK. FunASR-runtime combines the voice activity detection (VAD), Paraformer-large speech recognition (ASR), and punctuation prediction (PUNC) capabilities that the DAMO Academy Speech Lab has open-sourced in the ModelScope community, so it can transcribe audio accurately and efficiently under high concurrency.
+
+This document is the development guide for the FunASR offline file transcription service. If you want to try the service quickly, see [Quick start](#quick-start).
+
+## Server configuration
+
+Choose a server configuration that fits your workload. The recommended configurations are:
+- Configuration 1: (X86, compute-optimized), 4-core vCPU, 8 GB memory; a single machine supports about 32 concurrent requests
+- Configuration 2: (X86, compute-optimized), 16-core vCPU, 32 GB memory; a single machine supports about 64 concurrent requests
+- Configuration 3: (X86, compute-optimized), 64-core vCPU, 128 GB memory; a single machine supports about 200 concurrent requests
+
+For the detailed performance test report, [click here](./benchmark_onnx_cpp.md).
+
+Cloud service providers offer new users a 3-month free trial. For the application tutorial, [click here](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/docs/aliyun_server_tutorial.md).
+
+
+## Quick start
+### Starting the image
+
+Pull and start the FunASR runtime-SDK docker image with the following commands:
+
+```shell
+sudo docker pull \
+  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0
+mkdir -p ./funasr-runtime-resources/models
+sudo docker run -p 10095:10095 -it --privileged=true \
+  -v $PWD/funasr-runtime-resources/models:/workspace/models \
+  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.0
+```
+If docker is not installed, see [Docker installation](#docker-installation).
+
+### Starting the server
+
+After docker starts, start the funasr-wss-server service program:
+```shell
+cd FunASR/funasr/runtime
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx  \
+  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx > log.out 2>&1 &
+
+# To disable ssl, add the parameter: --certfile 0
+
+```
+For a detailed description of the server parameters, see [Server usage details](#server-usage-details).
+### Client testing and usage
+
+Download the client test tool directory samples:
+```shell
+wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz
+```
+We use the Python client as the example. It supports multiple audio formats (.wav, .pcm, .mp3, etc.), video input (.mp4, etc.), and multi-file list input via wav.scp. For clients in other languages, see [Client usage details](#client-usage-details); for custom service deployment, see [How to customize service deployment](#how-to-customize-service-deployment).
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
+```
+
+------------------
+## Docker installation
+
+The following steps install the docker environment manually:
+
+### Installing docker
+```shell
+# Ubuntu:
+curl -fsSL https://test.docker.com -o test-docker.sh
+sudo sh test-docker.sh
+# Debian:
+curl -fsSL https://get.docker.com -o get-docker.sh
+sudo sh get-docker.sh
+# CentOS:
+curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
+# MacOS:
+brew install --cask --appdir=/Applications docker
+```
+
+For installation details, see: https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html
+
+### Starting docker
+
+```shell
+sudo systemctl start docker
+```
+
+
+## Client usage details
+
+After the FunASR service is deployed on the server, you can test and use the offline file transcription service with the steps below.
+Clients in the following programming languages are currently supported:
+
+- [Python](#python-client)
+- [CPP](#cpp-client)
+- [Html web version](#html-web-version)
+- [Java](#java-client)
+
+### python-client
+To run the client directly for testing, see the brief instructions below, which use the Python version as the example:
+
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline \
+        --audio_in "../audio/asr_example.wav" --output_dir "./results"
+```
+
+Command parameters:
+```text
+--host       IP of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP (127.0.0.1). If the client and the service are not on the same server,
+             change it to the IP of the deployment machine
+--port       10095, the deployment port
+--mode       offline means offline file transcription
+--audio_in   audio file to transcribe; accepts a file path or a wav.scp file list
+--thread_num number of concurrent sending threads; default 1
+--ssl        whether to enable ssl certificate verification; default 1 (enabled), set to 0 to disable
+--hotword    if the model is a hotword model, hotwords can be set: *.txt (one hotword per line) or a space-separated hotword string (e.g. 阿里巴巴 达摩院)
+--use_itn    whether to use itn; default 1 (enabled), set to 0 to disable
+```
+
+### cpp-client
+After entering the samples/cpp directory, you can test with the cpp client using the following command:
+```shell
+./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
+```
+
+Command parameters:
+
+```text
+--server-ip  IP of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP (127.0.0.1). If the client and the service are not on the same server,
+             change it to the IP of the deployment machine
+--port       10095, the deployment port
+--wav-path   audio file to transcribe; accepts a file path
+--hotword    if the model is a hotword model, hotwords can be set: *.txt (one hotword per line) or a space-separated hotword string (e.g. 阿里巴巴 达摩院)
+--use-itn    whether to use itn; default 1 (enabled), set to 0 to disable
+```
+
+### Html web version
+
+Open html/static/index.html in a browser to see the page below. It supports microphone input and file upload, so you can try the service directly.
+
+<img src="images/html.png"  width="900"/>
+
+### Java-client
+
+```shell
+FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
+```
+For details, see the documentation ([click here](../java/readme.md)).
+
+
+
+## Server usage details:
+
+### Starting the FunASR service
+```shell
+cd /workspace/FunASR/funasr/runtime
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --model-dir damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
+  --decoder-thread-num 32 \
+  --io-thread-num  8 \
+  --port 10095 \
+  --certfile  ../../../ssl_key/server.crt \
+  --keyfile ../../../ssl_key/server.key > log.out 2>&1 &
+```
+**run_server.sh command parameters**
+```text
+--download-model-dir  model download directory; models are downloaded from ModelScope by setting the model ID
+--model-dir  modelscope model ID
+--quantize  True for the quantized ASR model, False for the non-quantized ASR model; default True
+--vad-dir  modelscope model ID
+--vad-quant  True for the quantized VAD model, False for the non-quantized VAD model; default True
+--punc-dir  modelscope model ID
+--punc-quant  True for the quantized PUNC model, False for the non-quantized PUNC model; default True
+--itn-dir  modelscope model ID
+--port  port the server listens on; default 10095
+--decoder-thread-num  number of inference threads started by the server; default 8
+--io-thread-num  number of IO threads started by the server; default 1
+--certfile  ssl certificate file; default ../../../ssl_key/server.crt. To disable ssl, set this parameter to 0
+--keyfile  ssl key file; default ../../../ssl_key/server.key
+```
+
+### Stopping the FunASR service
+```text
+# Find the PID of funasr-wss-server
+ps -x | grep funasr-wss-server
+kill -9 PID
+```
+
+### Changing the model and other parameters
+To replace the model in use or change other parameters, first stop the FunASR service, modify the parameters you want to change, and then restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model finetuned from a ModelScope model.
+```text
+# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx, set --model-dir as follows
+    --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
+# Set the port number with --port
+    --port <port number>
+# Set the number of inference threads started by the server with --decoder-thread-num
+    --decoder-thread-num <decoder thread num>
+# Set the number of IO threads started by the server with --io-thread-num
+    --io-thread-num <io thread num>
+# Disable the SSL certificate
+    --certfile 0
+```
+
+
+After you run the command above, the English offline file transcription service starts. If a model is specified as a ModelScope model id, the following models are downloaded automatically from ModelScope:
+[FSMN-VAD model](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
+[Paraformer-large model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx/summary),
+[CT-Transformer punctuation prediction model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx/summary)
+
+If you want to deploy your own finetuned model (for example 10epoch.pb), manually rename it to model.pb, replace the original model.pb in the ModelScope model with it, and specify that path as `model_dir`.
+
+
+## How to customize service deployment
+
+The FunASR-runtime code is open source. If the server and client do not meet your needs, you can develop them further to suit your requirements:
+### c++ client:
+
+https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/websocket
+
+### python client:
+
+https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket
+
+### Custom client:
+
+If you want to write your own client, see the [websocket communication protocol](./websocket_protocol_zh.md)
+
+
+
+
+### c++ server:
+
+#### VAD
+```c++
+// Using the VAD model takes two steps: FsmnVadInit and FsmnVadInfer.
+FUNASR_HANDLE vad_handle=FsmnVadInit(model_path, thread_num);
+// Where: model_path contains "model-dir" and "quantize"; thread_num is the number of onnx threads.
+FUNASR_RESULT result=FsmnVadInfer(vad_handle, wav_file.c_str(), NULL, 16000);
+// Where: vad_handle is the return value of FsmnVadInit, wav_file is the audio path, and sampling_rate is the sampling rate (default 16k)
+```
+
+For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-vad.cpp
+
+#### ASR
+```c++
+// Using the ASR model takes two steps: FunOfflineInit and FunOfflineInfer.
+FUNASR_HANDLE asr_handle=FunOfflineInit(model_path, thread_num);
+// Where: model_path contains "model-dir" and "quantize"; thread_num is the number of onnx threads.
+FUNASR_RESULT result=FunOfflineInfer(asr_handle, wav_file.c_str(), RASR_NONE, NULL, 16000);
+// Where: asr_handle is the return value of FunOfflineInit, wav_file is the audio path, and sampling_rate is the sampling rate (default 16k)
+```
+
+For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline.cpp
+
+#### PUNC
+```c++
+// Using the PUNC model takes two steps: CTTransformerInit and CTTransformerInfer.
+FUNASR_HANDLE punc_handle=CTTransformerInit(model_path, thread_num);
+// Where: model_path contains "model-dir" and "quantize"; thread_num is the number of onnx threads.
+FUNASR_RESULT result=CTTransformerInfer(punc_handle, txt_str.c_str(), RASR_NONE, NULL);
+// Where: punc_handle is the return value of CTTransformerInit, and txt_str is the text
+```
+For a usage example, see: https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/onnxruntime/bin/funasr-onnx-offline-punc.cpp
--
Gitblit v1.9.1