python/FunASR-XL.git

parent: d8c1b46d

雾聪

2024-07-01 31bf3a88a09f6e7c895224f4dc75e8f2c138d5c8

update funasr-runtime-sdk-gpu-0.1.1

6 files changed

	README.md	1
	README_zh.md	1
	runtime/docs/SDK_advanced_guide_offline_gpu.md	20
	runtime/docs/SDK_advanced_guide_offline_gpu_zh.md	20
	runtime/readme.md	1
	runtime/readme_cn.md	1

 README.md

@@ -29,6 +29,7 @@

<a name="whats-new"></a>
## What's new:
- 2024/07/01: Offline File Transcription Service GPU 1.1 released, improving BladeDISC model compatibility; see ([docs](runtime/readme.md))
- 2024/06/27: Offline File Transcription Service GPU 1.0 released, supporting dynamic batch processing and multi-thread concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (compared to 330+ on CPU); see ([docs](runtime/readme.md))
- 2024/05/15: Emotion recognition models are now supported: [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary), [emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary), and [emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). They currently output the following categories: 0: angry, 1: happy, 2: neutral, 3: sad, 4: unknown.
- 2024/05/15: Offline File Transcription Service 4.5, Offline File Transcription Service of English 1.6, and Real-time Transcription Service 1.10 released, adapted to the FunASR 1.0 model structure; see ([docs](runtime/readme.md))
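
As a rough reading of the 2024/06/27 figures above: assuming RTF (real-time factor) means processing time divided by audio duration, and assuming speedup is its reciprocal (neither definition is stated here), the following sketch works through the arithmetic. The one-hour audio length is a hypothetical input, not a benchmark setting.

```shell
#!/usr/bin/env bash
set -euo pipefail

RTF=0.0076            # single-thread RTF quoted in the changelog entry
AUDIO_SECONDS=3600    # hypothetical one-hour recording

# Processing time = RTF * audio duration (assumed definition of RTF).
PROCESS_SECONDS=$(awk -v rtf="$RTF" -v dur="$AUDIO_SECONDS" 'BEGIN { printf "%.2f", rtf * dur }')

# Speedup = 1 / RTF (assumed definition of speedup).
SINGLE_SPEEDUP=$(awk -v rtf="$RTF" 'BEGIN { printf "%.1f", 1 / rtf }')

echo "processing time for ${AUDIO_SECONDS}s of audio: ${PROCESS_SECONDS}s"
echo "single-thread speedup: ${SINGLE_SPEEDUP}x"
```

Under these assumptions, a single thread would need roughly 27 seconds for one hour of audio.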

 README_zh.md

@@ -33,6 +33,7 @@

<a name="最新动态"></a>
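## What's new
- 2024/07/01: Chinese Offline File Transcription Service GPU 1.1 released, improving BladeDISC model compatibility; for details see ([deployment docs](runtime/readme_cn.md))
- 2024/06/27: Chinese Offline File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-channel concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (330+ on CPU); for details see ([deployment docs](runtime/readme_cn.md))
- 2024/05/15: Emotion recognition models added: [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary), [emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary), and [emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). The output emotion categories are angry, happy, neutral, and sad.
- 2024/05/15: Chinese Offline File Transcription Service 4.5, English Offline File Transcription Service 1.6, and Chinese Real-time Speech Transcription Service 1.10 released, adapted to the FunASR 1.0 model structure; for details see ([deployment docs](runtime/readme_cn.md))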

 runtime/docs/SDK_advanced_guide_offline_gpu.md

@@ -12,6 +12,7 @@

| TIME       | INFO                                                                                                                             | IMAGE VERSION                | IMAGE ID     |
|------------|----------------------------------------------------------------------------------------------------------------------------------|------------------------------|--------------|
| 2024.07.01 | Improved BladeDISC model compatibility | funasr-runtime-sdk-gpu-0.1.1 | b39c6fb16451 |
| 2024.06.27 | Offline File Transcription Software Package (GPU) 1.0 released | funasr-runtime-sdk-gpu-0.1.0 | b86066f4d018 |


@@ -27,9 +28,9 @@
### Pulling and launching images
Use the following command to pull and launch the Docker image for the FunASR runtime-SDK:
```shell
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.1

sudo docker run --gpus=all -p 10098:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
sudo docker run --gpus=all -p 10098:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.1
```
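
The `-v` option in the `docker run` command maps a host directory to `/workspace/models` inside the container. A minimal sketch of preparing such a host directory before launching, assuming the `./funasr-runtime-resources/models` layout that this guide uses for the hotword file rather than `/root`. The `MODELS_DIR` variable, the second hotword entry, and the integer-weight check are illustrative assumptions, not part of the FunASR tooling.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: host directory that will be mounted at /workspace/models.
MODELS_DIR="${MODELS_DIR:-./funasr-runtime-resources/models}"
HOTWORDS_FILE="${MODELS_DIR}/hotwords.txt"

mkdir -p "${MODELS_DIR}"

# One hotword per line in the form "<hotword> <weight>"; entries are examples only.
cat > "${HOTWORDS_FILE}" <<'EOF'
阿里巴巴 20
FunASR 30
EOF

# Count lines that are not exactly "<hotword> <integer weight>" (assumed format check).
BAD_LINES=$(awk 'NF != 2 || $2 !~ /^[0-9]+$/ { bad++ } END { print bad + 0 }' "${HOTWORDS_FILE}")

echo "hotwords file: ${HOTWORDS_FILE}, malformed lines: ${BAD_LINES}"
```

With the directory in place, the mount would be written as `-v $PWD/funasr-runtime-resources/models:/workspace/models`, so the file appears in the container as `/workspace/models/hotwords.txt`.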

Introduction to command parameters: 
@@ -45,16 +46,17 @@
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript  \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch  \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

***When the service starts for the first time, it will export the TorchScript model, which may take some time. Please be patient***
# To disable SSL, add: --certfile 0
# To deploy the timestamp or NN hotword model, set --model-dir to the corresponding model:
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript (timestamp)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-torchscript (hotword)
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch (timestamp)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404 (hotword)
# To load hotwords on the server side, configure them in the host file ./funasr-runtime-resources/models/hotwords.txt (docker mapping address: /workspace/models/hotwords.txt):
# One hotword per line, format (hotword weight): 阿里巴巴 20
```
@@ -67,7 +69,7 @@
cd /workspace/FunASR/runtime
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --itn-dir thuduj12/fst_itn_zh \
@@ -104,8 +106,8 @@
### Modifying Models and Other Parameters
To replace the currently used model or other parameters, you need to first shut down the FunASR service, make the necessary modifications to the parameters you want to replace, and then restart the FunASR service. The model should be either an ASR/VAD/PUNC model from ModelScope or a fine-tuned model obtained from ModelScope.
```text
# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript, use the following parameter setting --model-dir
    --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript 
# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch, use the following parameter setting --model-dir
    --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch 
# Set the port number using --port
    --port <port number>
# Set the number of inference threads the server will start using --decoder-thread-num
@@ -118,7 +120,7 @@

After executing the above command, the offline file transcription service will be started. If the model is specified as a ModelScope model id, the following models will be automatically downloaded from ModelScope:
[FSMN-VAD](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
[Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary),
[Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary),
[CT-Transformer](https://www.modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx/summary),
[FST-ITN](https://www.modelscope.cn/models/thuduj12/fst_itn_zh/summary),
[Ngram lm](https://www.modelscope.cn/models/damo/speech_ngram_lm_zh-cn-ai-wesp-fst/summary)

 runtime/docs/SDK_advanced_guide_offline_gpu_zh.md

@@ -10,6 +10,7 @@

| Time       | Details                                           | Image version                | Image ID     |
|------------|---------------------------------------------------|------------------------------|--------------|
| 2024.07.01 | Improved BladeDISC model compatibility | funasr-runtime-sdk-gpu-0.1.1 | b39c6fb16451 |
| 2024.06.27 | Offline File Transcription Service GPU version 1.0 released | funasr-runtime-sdk-gpu-0.1.0 | b86066f4d018 |

## Server configuration
@@ -39,11 +40,11 @@

```shell
sudo docker pull \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.1
mkdir -p ./funasr-runtime-resources/models
sudo docker run --gpus=all -p 10098:10095 -it --privileged=true \
  -v $PWD/funasr-runtime-resources/models:/workspace/models \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.0
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-gpu-0.1.1
```

### Starting the server
@@ -54,16 +55,17 @@
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript  \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch  \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

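***When the service starts for the first time, it exports the TorchScript model, which takes a long time. Please be patient***
# To disable SSL, add the parameter: --certfile 0
# The timestamp model is loaded by default. To deploy with the NN hotword model, set --model-dir to the corresponding model:
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript (timestamp)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-torchscript (NN hotword)
#   damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch (timestamp)
#   damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404 (NN hotword)
# To load hotwords on the server side, configure them in the host file ./funasr-runtime-resources/models/hotwords.txt (mapped inside docker to /workspace/models/hotwords.txt):
#   One hotword per line, format (hotword weight): 阿里巴巴 20 (note: there is no theoretical limit on hotwords, but to balance performance and accuracy, keep each hotword's length to at most 10, the number of hotwords to at most 1k, and weights between 1 and 100)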
```
@@ -148,7 +150,7 @@
cd /workspace/FunASR/runtime
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-torchscript \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
@@ -187,8 +189,8 @@
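### Modifying models and other parameters
To replace the model in use or other parameters, first shut down the FunASR service, change the parameters you want to replace, and then restart the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope, or a model fine-tuned from a ModelScope model.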
```text
# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript, set --model-dir as follows
    --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript
# For example, to replace the ASR model with damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch, set --model-dir as follows
    --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
# Set the port number with --port
    --port <port number>
# Set the number of inference threads started by the server with --decoder-thread-num
@@ -201,7 +203,7 @@

After running the command above, the offline file transcription service starts. If a model is specified as a ModelScope model id, the following models are downloaded automatically from ModelScope:
[FSMN-VAD model](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
[Paraformer-large model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-torchscript/summary),
[Paraformer-large model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary),
[CT-Transformer punctuation prediction model](https://www.modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx/summary),
[FST-based Chinese ITN](https://www.modelscope.cn/models/thuduj12/fst_itn_zh/summary),
[Ngram Chinese language model](https://www.modelscope.cn/models/damo/speech_ngram_lm_zh-cn-ai-wesp-fst/summary)

 runtime/readme.md

@@ -17,6 +17,7 @@
To meet the needs of different users, we have prepared different tutorials with text and images for both novice and advanced developers.

### Whats-new
- 2024/07/01: File Transcription Service 1.1 GPU released, improving BladeDISC model compatibility; docker image version funasr-runtime-sdk-gpu-0.1.1 (b39c6fb16451)
- 2024/06/27: File Transcription Service 1.0 GPU released, supporting dynamic batch processing and multi-thread concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (compared to 330+ on CPU); see ([docs](./docs/benchmark_libtorch_cpp.md)); docker image version funasr-runtime-sdk-gpu-0.1.0 (b86066f4d018)

### Advanced Development Guide

 runtime/readme_cn.md

@@ -19,6 +19,7 @@
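To support the needs of different users, we have prepared illustrated tutorials for different scenarios:

### What's new
- 2024/07/01: Chinese Offline File Transcription Service GPU 1.1 released, improving BladeDISC model compatibility; docker image version funasr-runtime-sdk-gpu-0.1.1 (b39c6fb16451)
- 2024/06/27: Chinese Offline File Transcription Service GPU 1.0 released, supporting dynamic batching and multi-channel concurrency. On the long-audio test set, the single-thread RTF is 0.0076 and the multi-thread speedup is 1200+ (330+ on CPU); see ([docs](./docs/benchmark_libtorch_cpp.md)); docker image version funasr-runtime-sdk-gpu-0.1.0 (b86066f4d018)

### Deployment and development docs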
