From 5853ebc98f51c79d0ae2955cefe1457cba78efe4 Mon Sep 17 00:00:00 2001
From: Yabin Li <wucong.lyb@alibaba-inc.com>
Date: Thu, 27 Jun 2024 17:38:19 +0800
Subject: [PATCH] Merge Dev blade (#1856)

---
 runtime/docs/SDK_advanced_guide_offline_gpu_zh.md | 209 ++++++++++++++++++++++++++
 runtime/readme_cn.md                              |  15 +
 runtime/docs/SDK_advanced_guide_offline_gpu.md    | 173 +++++++++++++++++++++
 README_zh.md                                      |   1
 runtime/docs/benchmark_libtorch_cpp.md            |  31 +++
 runtime/readme.md                                 |  16 +
 README.md                                         |   1
 7 files changed, 444 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 835eed4..a8b15e5 100644
--- a/README.md
+++ b/README.md
@@ -29,6 +29,7 @@
 <a name="whats-new"></a>
 ## What's new:
+- 2024/06/27: Offline File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-threaded concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (vs. 330+ on CPU); see ([docs](runtime/readme.md))
 - 2024/05/15: emotion recognition models are newly supported: [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary), [emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary), [emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). The following categories are currently supported: 0: angry, 1: happy, 2: neutral, 3: sad, 4: unknown.
 - 2024/05/15: Offline File Transcription Service 4.5, Offline File Transcription Service of English 1.6, and Real-time Transcription Service 1.10 released, adapting to the FunASR 1.0 model structure ([docs](runtime/readme.md))
 - 2024/03/05: Added the Qwen-Audio and Qwen-Audio-Chat large-scale audio-text multimodal models, which have topped multiple audio-domain leaderboards. These models support speech dialogue; see [usage](examples/industrial_data_pretraining/qwen_audio).
diff --git a/README_zh.md b/README_zh.md
index 43db23b..169face 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -33,6 +33,7 @@
 <a name="最新动态"></a>
 ## What's New
+- 2024/06/27: Chinese Offline File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-way concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (vs. 330+ on CPU); for details see ([deployment docs](runtime/readme_cn.md))
 - 2024/05/15: New emotion recognition models added: [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary), [emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary), [emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). Output emotion categories: angry, happy, neutral, sad.
 - 2024/05/15: Chinese Offline File Transcription Service 4.5, English Offline File Transcription Service 1.6, and Chinese Real-time Transcription Service 1.10 released, adapting to the FunASR 1.0 model structure; for details see ([deployment docs](runtime/readme_cn.md))
 - 2024/03/05: Added the Qwen-Audio and Qwen-Audio-Chat audio-text multimodal large models, which have topped multiple audio-domain leaderboards and support speech dialogue; for detailed usage see [examples](examples/industrial_data_pretraining/qwen_audio).

diff --git a/runtime/docs/SDK_advanced_guide_offline_gpu.md b/runtime/docs/SDK_advanced_guide_offline_gpu.md
new file mode 100644
index 0000000..a33715c
--- /dev/null
+++ b/runtime/docs/SDK_advanced_guide_offline_gpu.md
@@ -0,0 +1,173 @@
+# Advanced Development Guide (File Transcription Service, GPU)
+
+([Chinese](SDK_advanced_guide_offline_gpu_zh.md)|English)
+
+[//]: # (FunASR provides a Chinese offline file transcription service that can be deployed locally or on a cloud server with just one click. The core of the service is the FunASR runtime SDK, which has been open-sourced. FunASR-runtime combines various capabilities such as speech endpoint detection (VAD), large-scale speech recognition (ASR) using Paraformer-large, and punctuation detection (PUNC), which have all been open-sourced by the speech laboratory of DAMO Academy on the ModelScope community. This enables accurate and efficient high-concurrency transcription of audio files.)
+FunASR Offline File Transcription Software Package (GPU) provides a powerful speech-to-text offline file transcription service. With a complete speech recognition pipeline combining models for speech endpoint detection, speech recognition, punctuation restoration, etc., it can transcribe long audio and video files, spanning several hours, into punctuated text, and supports hundreds of concurrent transcription requests. The output is punctuated text with word-level timestamps, and ITN (Inverse Text Normalization) and user-defined hotwords are supported. The server side integrates ffmpeg, so various audio and video formats are accepted as input. The software package provides client libraries in multiple programming languages such as HTML, Python, C++, Java, and C#, which users can use directly or develop further.
+
+This document serves as a development guide for the FunASR offline file transcription service. If you wish to quickly experience the offline file transcription service, please refer to the one-click deployment example ([docs](./SDK_tutorial.md)).
+
+<img src="images/offline_structure.jpg" width="900"/>
+
+
+| TIME | INFO | IMAGE VERSION | IMAGE ID |
+|------------|-----------------------------------------------------------------|------------------------------|--------------|
+| 2024.06.27 | Offline File Transcription Software Package (GPU) 1.0 released | funasr-runtime-sdk-gpu-0.1.0 | aa10f938da3b |
+
+
+## Quick start
+### Docker install
+If you have already installed Docker, skip this step!
+```shell
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh;
+sudo bash install_docker.sh
+```
+If the installation fails, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
+
+### Pulling and launching images
+Use the following command to pull and launch the Docker image for the FunASR runtime-SDK:
+```shell
+sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
+
+sudo docker run --gpus=all -p 10098:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
+```
+
+Introduction to command parameters:
+```text
+-p <host port>:<mapped docker port>: In the example, host machine (ECS) port 10098 is mapped to port 10095 in the Docker container. Make sure that port 10098 is open in the ECS security rules.
+
+-v <host path>:<mounted Docker path>: In the example, the host machine path /root is mounted to the Docker path /workspace/models.
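+
+--gpus=all: exposes the host GPUs to the container, as used in the docker run command above. Assumption: this requires the NVIDIA Container Toolkit to be installed on the host; a specific device can be selected instead with, e.g., --gpus '"device=0"'.
+
+--privileged=true: runs the container with extended host privileges.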
+```
+
+### Starting the server
+Use the following script to start the server:
+```shell
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
+  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
+  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
+  --itn-dir thuduj12/fst_itn_zh \
+  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &
+
+# If you want to disable SSL, please add: --certfile 0
+# If you want to deploy the timestamp or NN hotword model, please set --model-dir to the corresponding model:
+# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript (timestamp)
+# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-torchscript (hotword)
+# If you want to load hotwords on the server side, please configure them in the host machine file ./funasr-runtime-resources/models/hotwords.txt (docker mapping address: /workspace/models/hotwords.txt):
+# One hotword per line, format (hotword weight): 阿里巴巴 20
+```
+
+### More details about the script run_server.sh:
+
+The funasr-wss-server supports downloading models from ModelScope. You can set the model download address (--download-model-dir, default is /workspace/models) and the model ID (--model-dir, --vad-dir, --punc-dir).
Here is an example:
+
+```shell
+cd /workspace/FunASR/runtime
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
+  --itn-dir thuduj12/fst_itn_zh \
+  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
+  --certfile ../../../ssl_key/server.crt \
+  --keyfile ../../../ssl_key/server.key \
+  --hotword ../../hotwords.txt > log.txt 2>&1 &
+```
+
+Introduction to run_server.sh parameters:
+```text
+--download-model-dir: Model download address; models are downloaded from ModelScope by setting the model ID.
+--model-dir: ModelScope model ID or local model path.
+--vad-dir: ModelScope model ID or local model path.
+--punc-dir: ModelScope model ID or local model path.
+--lm-dir: ModelScope model ID or local model path.
+--itn-dir: ModelScope model ID or local model path.
+--port: Port number that the server listens on. Default is 10095.
+--decoder-thread-num: The number of threads in the server-side thread pool that handle concurrent requests.
+--io-thread-num: Number of IO threads that the server starts.
+--model-thread-num: The number of internal threads for each recognition route, controlling the parallelism of the ONNX model.
+  The default value is 1. It is recommended that decoder-thread-num * model-thread-num equal the total number of threads.
+--certfile <string>: SSL certificate file. Default is ../../../ssl_key/server.crt. To disable SSL, set it to 0.
+--keyfile <string>: SSL key file. Default is ../../../ssl_key/server.key.
+--hotword: Hotword file path, one hotword per line (e.g. 阿里巴巴 20). If the client also provides hotwords, the two sets are merged.
+```
+
+### Shutting Down the FunASR Service
+```text
+# Check the PID of the funasr-wss-server process
+ps -x | grep funasr-wss-server
+kill -9 PID
+```
+
+### Modifying Models and Other Parameters
+To replace the model in use or other parameters, first shut down the FunASR service, modify the parameters you want to change, and then restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model fine-tuned from one of them.
+```text
+# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript, set the parameter --model-dir
+  --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript
+# Set the port number using --port
+  --port <port number>
+# Set the number of inference threads the server will start using --decoder-thread-num
+  --decoder-thread-num <decoder thread num>
+# Set the number of IO threads the server will start using --io-thread-num
+  --io-thread-num <io thread num>
+# Disable the SSL certificate
+  --certfile 0
+```
+
+After executing the above command, the offline file transcription service will be started.
If the model is specified as a ModelScope model ID, the following models will be automatically downloaded from ModelScope:
+[FSMN-VAD](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
+[Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary),
+[CT-Transformer](https://www.modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx/summary),
+[FST-ITN](https://www.modelscope.cn/models/thuduj12/fst_itn_zh/summary),
+[Ngram LM](https://www.modelscope.cn/models/damo/speech_ngram_lm_zh-cn-ai-wesp-fst/summary)
+
+If you wish to deploy your fine-tuned model (e.g., 10epoch.pb), you need to manually rename it to model.pb, replace the original model.pb downloaded from ModelScope with it, and then specify its path as `model_dir`.
+
+## Starting the client
+After completing the deployment of the FunASR offline file transcription service on the server, you can test and use the service by following these steps. Currently, FunASR-bin supports multiple ways to start the client. The following are command-line examples based on the python-client, the c++-client, and the custom-client Websocket communication protocol:
+
+### python-client
+```shell
+python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "./data/wav.scp" --send_without_sleep --output_dir "./results"
+```
+Introduction to command parameters:
+```text
+--host: the IP address of the server. It can be set to 127.0.0.1 for local testing.
+--port: the port number the server listens on.
+--audio_in: the audio input. It can be a path to a wav file or a wav.scp file (a Kaldi-formatted wav list in which each line contains a wav_id followed by a tab and a wav_path).
+--output_dir: the output path for recognition results.
+--ssl: whether to use SSL encryption. SSL is used by default.
+--mode: offline mode.
+--hotword: Hotword file path, one hotword per line (e.g. 阿里巴巴 20)
+--use_itn: whether to use ITN; the default value is 1 (enabled), 0 disables it.
+```
+
+### c++-client
+```shell
+./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path test.wav --thread-num 1 --is-ssl 1
+```
+
+Introduction to command parameters:
+```text
+--server-ip: the IP address of the server. It can be set to 127.0.0.1 for local testing.
+--port: the port number the server listens on.
+--wav-path: the audio input. It can be a path to a wav file or a wav.scp file (a Kaldi-formatted wav list in which each line contains a wav_id followed by a tab and a wav_path).
+--is-ssl: whether to use SSL encryption. SSL is used by default.
+--hotword: Hotword file path, one hotword per line (e.g. 阿里巴巴 20)
+--use-itn: whether to use ITN; the default value is 1 (enabled), 0 disables it.
+```
+
+### Custom client
+If you want to define your own client, see the [Websocket communication protocol](./websocket_protocol.md)
+
+## How to customize service deployment
+The code for FunASR-runtime is open source.
If the server and client cannot fully meet your needs, you can further develop them based on your own requirements:
+
+### C++ client
+https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/websocket
+
+### Python client
+https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/websocket
+
diff --git a/runtime/docs/SDK_advanced_guide_offline_gpu_zh.md b/runtime/docs/SDK_advanced_guide_offline_gpu_zh.md
new file mode 100644
index 0000000..3416117
--- /dev/null
+++ b/runtime/docs/SDK_advanced_guide_offline_gpu_zh.md
@@ -0,0 +1,209 @@
+# FunASR Offline File Transcription Service (GPU) Development Guide
+
+(Chinese|[English](SDK_advanced_guide_offline_gpu.md))
+
+The FunASR Offline File Transcription Software Package (GPU) provides a powerful speech-to-text offline file transcription service. With a complete speech recognition pipeline combining speech endpoint detection, speech recognition, and punctuation models, it can transcribe long audio and video files spanning dozens of hours into punctuated text, and supports hundreds of concurrent transcription requests. The output is punctuated text with word-level timestamps, and ITN and user-defined hotwords are supported. The server side integrates ffmpeg and accepts various audio and video formats as input. The package provides clients in multiple programming languages such as HTML, Python, C++, Java, and C#, which users can use directly or develop further.
+
+This document is the development guide for the FunASR offline file transcription service (GPU version). For a quick experience of the offline file transcription service, see [Quick start](#quick-start).
+
+<img src="images/offline_structure.jpg" width="900"/>
+
+| TIME | INFO | IMAGE VERSION | IMAGE ID |
+|------------|--------------------------------------------------------|------------------------------|--------------|
+| 2024.06.27 | Offline File Transcription Service (GPU) 1.0 released | funasr-runtime-sdk-gpu-0.1.0 | aa10f938da3b |
+
+## Server configuration
+
+Users can choose a server configuration that suits their business needs. The recommended configuration is:
+- Configuration 1 (GPU): 8-core vCPU, 32 GB RAM, V100; a single machine can support roughly 20 concurrent requests
+
+Detailed performance benchmark report ([click here](./benchmark_onnx_cpp.md))
+
+Cloud service providers offer a 3-month free trial for new users; application tutorial ([click here](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/aliyun_server_tutorial.md))
+
+## Quick start
+
+### Docker installation
+If you have already installed Docker, skip this step!
+Install Docker on the server with the following commands:
+```shell
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh
+sudo bash install_docker.sh
+```
+If the Docker installation fails, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
+
+### Launching the image
+
+Pull and launch the Docker image of the FunASR software package with the following commands:
+
+```shell
+sudo docker pull \
+  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
+mkdir -p ./funasr-runtime-resources/models
+sudo docker run --gpus=all -p 10098:10095 -it --privileged=true \
+  -v $PWD/funasr-runtime-resources/models:/workspace/models \
+  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
+```
+
+### Starting the server
+
+After Docker is started, launch the funasr-wss-server service:
+```shell
+cd FunASR/runtime
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
+  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
+  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
+  --itn-dir thuduj12/fst_itn_zh \
+  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &
+
+# To disable SSL, add: --certfile 0
+# The timestamp model is loaded by default; to deploy the NN hotword model, set --model-dir to the corresponding model:
+# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript (timestamp)
+# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-torchscript (NN hotword)
+# To load hotwords on the server side, configure them in the host machine file ./funasr-runtime-resources/models/hotwords.txt (docker mapping address: /workspace/models/hotwords.txt):
+# One hotword per line, format (hotword weight): 阿里巴巴 20 (note: there is no hard limit on hotwords, but for a good balance of performance and accuracy, keep each hotword within 10 characters, the count within 1k, and weights between 1 and 100)
+```
+To customize the ngram LM, see ([How to train an LM](./lm_train_tutorial.md))
+
+For detailed server parameters, see [Server usage](#server-usage)
+
+### Client testing and usage
+
+Download the client sample directory samples:
+```shell
+wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz
+```
+Taking the Python client as an example: it supports multiple audio input formats (.wav, .pcm, .mp3, etc.), video input (.mp4, etc.), and multi-file list input via wav.scp. For other clients, see [Client usage](#client-usage); for customized service deployment, see [How to customize service deployment](#how-to-customize-service-deployment)
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
+```
+
+## Client usage
+
+After completing the FunASR service deployment on the server, you can test and use the offline file transcription service through the following steps.
+Clients in the following programming languages are currently supported:
+
+- [Python](#python-client)
+- [CPP](#cpp-client)
+- [HTML web page](#html-web-client)
+- [Java](#java-client)
+
+### python-client
+To run the client directly for testing, refer to the following brief instructions, taking the Python version as an example:
+
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline \
+  --audio_in "../audio/asr_example.wav" --output_dir "./results"
+```
+
+Introduction to command parameters:
+```text
+--host the IP of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP (127.0.0.1).
+  If the client and service are not on the same server, change it to the deployment machine's IP
+--port 10095 the deployment port number
+--mode offline means offline file transcription
+--audio_in the audio file to transcribe; supports a file path or a wav.scp file list
+--thread_num the number of concurrent sending threads; default is 1
+--ssl whether to enable SSL certificate verification; default 1 (enabled), set 0 to disable
+--hotword hotword file, one hotword per line, format (hotword weight): 阿里巴巴 20
+--use_itn whether to use ITN; default 1 (enabled), set 0 to disable
+```
+
+### cpp-client
+Enter the samples/cpp directory and test with the cpp client as follows:
+```shell
+./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
+```
+
+Introduction to command parameters:
+```text
+--server-ip the IP of the machine where the FunASR runtime-SDK service is deployed; defaults to the local IP (127.0.0.1).
+  If the client and service are not on the same server, change it to the deployment machine's IP
+--port 10095 the deployment port number
+--wav-path the audio file to transcribe; supports a file path
+--hotword hotword file, one hotword per line, format (hotword weight): 阿里巴巴 20
+--thread-num the number of client threads
+--use-itn whether to use ITN; default 1 (enabled), set 0 to disable
+```
+
+### HTML web client
+Open html/static/index.html in a browser to get the page shown below; it supports microphone input and file upload for a direct experience
+
+<img src="images/html.png" width="900"/>
+
+### Java-client
+```shell
+FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
+```
+For details, see the documentation ([click here](../java/readme.md))
+
+## Server usage
+
+### Starting the FunASR service
+```shell
+cd /workspace/FunASR/runtime
+nohup bash run_server.sh \
+  --download-model-dir /workspace/models \
+  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
+  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
+  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
+  --itn-dir thuduj12/fst_itn_zh \
+  --certfile ../../../ssl_key/server.crt \
+  --keyfile ../../../ssl_key/server.key \
+  --hotword ../../hotwords.txt > log.txt 2>&1 &
+```
+**run_server.sh command parameters**
+```text
+--download-model-dir model download address; models are downloaded from ModelScope by setting the model ID
+--model-dir ModelScope model ID or local model path
+--vad-dir ModelScope model ID or local model path
+--punc-dir ModelScope model ID or local model path
+--lm-dir ModelScope model ID or local model path
+--itn-dir ModelScope model ID or local model path
+--port port number the server listens on; default is 10095
+--decoder-thread-num server-side thread pool size (the maximum number of concurrent streams supported);
+  **it is recommended to allocate 1 GB of GPU memory per stream, i.e. 20 GB of GPU memory supports 20 concurrent streams**
+--io-thread-num number of IO threads the server starts
+--model-thread-num number of internal threads per recognition stream (controls ONNX model parallelism); default is 1;
+  it is recommended that decoder-thread-num * model-thread-num equal the total number of threads
+--certfile SSL certificate file; default: ../../../ssl_key/server.crt; to disable SSL, set it to 0
+--keyfile SSL key file; default: ../../../ssl_key/server.key
+--hotword hotword file path, one hotword per line, format: hotword weight (e.g. 阿里巴巴 20);
+  if the client also provides hotwords, they are merged with the server's; server-side hotwords take effect globally,
+  while client-side hotwords only take effect for the corresponding client
+```
+
+### Shutting down the FunASR service
+```text
+# Check the PID of the funasr-wss-server process
+ps -x | grep funasr-wss-server
+kill -9 PID
+```
+
+### Modifying models and other parameters
+To replace the model in use or other parameters, first shut down the FunASR service, modify the parameters to be replaced, and then restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model fine-tuned from one of them.
+```text
+# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript, set the parameter --model-dir
+  --model-dir
damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript
+# Set the port number using --port
+  --port <port number>
+# Set the number of inference threads the server starts using --decoder-thread-num
+  --decoder-thread-num <decoder thread num>
+# Set the number of IO threads the server starts using --io-thread-num
+  --io-thread-num <io thread num>
+# Disable the SSL certificate
+  --certfile 0
+```
+
+After executing the above command, the offline file transcription service will be started. If the model is specified as a ModelScope model ID, the following models will be automatically downloaded from ModelScope:
+[FSMN-VAD model](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
+[Paraformer-large model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary),
+[CT-Transformer punctuation model](https://www.modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx/summary),
+[FST-based Chinese ITN](https://www.modelscope.cn/models/thuduj12/fst_itn_zh/summary),
+[Ngram Chinese language model](https://www.modelscope.cn/models/damo/speech_ngram_lm_zh-cn-ai-wesp-fst/summary)
+
+If you wish to deploy your fine-tuned model (e.g., 10epoch.pb), you need to manually rename it to model.pb, replace the original model.pb from ModelScope with it, and then specify its path as `model_dir`.

diff --git a/runtime/docs/benchmark_libtorch_cpp.md b/runtime/docs/benchmark_libtorch_cpp.md
new file mode 100644
index 0000000..b7f99c6
--- /dev/null
+++ b/runtime/docs/benchmark_libtorch_cpp.md
@@ -0,0 +1,31 @@
+# GPU Benchmark (libtorch-cpp)
+
+## Configuration
+### Data set:
+A long audio test set (non-open-source) containing 103 audio files, with durations ranging from 2 to 30 minutes.
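The RTF and speedup figures reported below are wall-clock ratios. As a rough sketch of how such numbers relate (the durations here are hypothetical, chosen only to illustrate the arithmetic, not taken from the benchmark logs):

```shell
# RTF (real-time factor) = decoding wall-clock time / total audio duration.
# Speedup versus real time is then roughly 1 / RTF.
audio_seconds=3600     # assumed total duration of the test-set audio
decode_seconds=27.36   # assumed wall-clock decoding time
rtf=$(awk -v d="$decode_seconds" -v a="$audio_seconds" 'BEGIN { printf "%.4f", d / a }')
speedup=$(awk -v r="$rtf" 'BEGIN { printf "%d", 1 / r }')
echo "RTF=$rtf, speedup=${speedup}x"
```

With these assumed timings the script reports an RTF of 0.0076, the same order of magnitude as the single-thread measurements in the tables below.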
+
+## [FSMN-VAD](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary) + [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary) + [CT-Transformer](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx/summary)
+
+```shell
+./funasr-onnx-offline-rtf \
+    --model-dir ./damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
+    --vad-dir ./damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
+    --punc-dir ./damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
+    --gpu \
+    --thread-num 20 \
+    --bladedisc true \
+    --batch-size 20 \
+    --wav-path ./long_test.scp
+```
+Note: run in Docker; refer to ([docs](./SDK_advanced_guide_offline_gpu_zh.md))
+
+### Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz 16core-32processor with avx512_vnni, GPU @ A10
+
+| concurrent-tasks | batch | RTF | Speedup Rate |
+|------------------|:------:|:------:|:------------:|
+| 1 | 1 | 0.0076 | 130 |
+| 1 | 20 | 0.0048 | 208 |
+| 20 | 20 | 0.0008 | 1200 |
+
+Note: on CPUs, the single-thread RTF is 0.066, and the 32-thread speedup is 330+

diff --git a/runtime/readme.md b/runtime/readme.md
index 28a063d..fb795c9 100644
--- a/runtime/readme.md
+++ b/runtime/readme.md
@@ -7,9 +7,23 @@
 - File transcription service, Mandarin, CPU version, done
 - The real-time transcription service, Mandarin (CPU), done
 - File transcription service, English, CPU version, done
-- File transcription service, Mandarin, GPU version, in progress
+- File transcription service, Mandarin, GPU version, done
 - and more.

+## File Transcription Service, Mandarin (GPU)
+
+Currently, the FunASR runtime-SDK supports the deployment of the file transcription service, Mandarin (GPU version), with a complete speech recognition pipeline that can transcribe tens of hours of audio into punctuated text, and supports recognition for more than a hundred concurrent streams.
+
+To meet the needs of different users, we have prepared separate tutorials with text and images for both novice and advanced developers.
+
+### What's new
+- 2024/06/27: File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-threaded concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (vs. 330+ on CPU); refer to ([docs](./docs/benchmark_libtorch_cpp.md)). Docker image version funasr-runtime-sdk-gpu-0.1.0 (aa10f938da3b)
+
+### Advanced Development Guide
+
+This documentation mainly targets advanced developers who need to modify and customize the service. It supports downloading model deployments from ModelScope and also supports deploying models that users have fine-tuned. For detailed information, please refer to the [documentation](./docs/SDK_advanced_guide_offline_gpu.md)
+
+
 ## File Transcription Service, English (CPU)

Currently, the FunASR runtime-SDK supports the deployment of the file transcription service, English (CPU version), with a complete speech recognition pipeline that can transcribe tens of hours of audio into punctuated text, and supports recognition for more than a hundred concurrent streams.
diff --git a/runtime/readme_cn.md b/runtime/readme_cn.md
index 9cb7b58..0359d6e 100644
--- a/runtime/readme_cn.md
+++ b/runtime/readme_cn.md
@@ -10,9 +10,22 @@
 - Chinese offline file transcription service (CPU version), done
 - Chinese real-time speech recognition service (CPU version), done
 - English offline file transcription service (CPU version), done
-- Chinese offline file transcription service (GPU version), in progress
+- Chinese offline file transcription service (GPU version), done
 - and more to come

+## Chinese Offline File Transcription Service (GPU)
+
+The Chinese offline file transcription service deployment (GPU version) has a complete speech recognition pipeline that can transcribe dozens of hours of long audio and video into punctuated text, and supports multiple concurrent transcription requests.
+To meet the needs of different users and scenarios, we have prepared separate illustrated tutorials:
+
+### What's New
+- 2024/06/27: Chinese Offline File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-way concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (vs. 330+ on CPU); for details see ([docs](./docs/benchmark_libtorch_cpp.md)). Docker image version funasr-runtime-sdk-gpu-0.1.0 (aa10f938da3b)
+
+### Deployment and Development Documentation
+
+Deployed models come from ModelScope or from user fine-tuning, and user-customized services are supported. For detailed documentation, see ([click here](./docs/SDK_advanced_guide_offline_gpu_zh.md))
+
+
 ## English Offline File Transcription Service (CPU)

The English offline file transcription service deployment (CPU version) has a complete speech recognition pipeline that can transcribe dozens of hours of long audio into punctuated text, and supports hundreds of concurrent transcription requests.