From e22f256ee617c0b6d8af0f020ca7c05df4502d92 Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Thu, 27 Jul 2023 14:37:55 +0800
Subject: [PATCH] docs zh
---
egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md | 1
egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md | 1
docs/index.rst | 4
egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md | 1
egs_modelscope/vad/speech_fsmn_vad_zh-cn-16k-common/README_zh.md | 1
egs_modelscope/punctuation/punc_ct-transformer_cn-en-common-vocab471067-large/README_zh.md | 1
egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README_zh.md | 1
egs_modelscope/tp/TEMPLATE/README.md | 2
egs_modelscope/vad/TEMPLATE/README.md | 2
funasr/runtime/docs/SDK_tutorial_zh.md | 2
egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md | 1
egs_modelscope/asr_vad_punc/TEMPLATE | 1
egs_modelscope/vad/speech_fsmn_vad_zh-cn-8k-common/README_zh.md | 1
egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vadrealtime-vocab272727/README_zh.md | 1
egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md | 1
funasr/runtime/html5/readme.md | 121 +++--
egs_modelscope/punctuation/TEMPLATE/README.md | 2
egs_modelscope/asr/paraformer/speech_paraformer_asr-en-16k-vocab4199-pytorch/README_zh.md | 1
egs_modelscope/punctuation/TEMPLATE/README_zh.md | 112 +++++
egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/README_zh.md | 1
/dev/null | 246 ------------
egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md | 1
egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/README_zh.md | 1
funasr/runtime/docs/SDK_tutorial.md | 460 ++++++++--------------
egs_modelscope/tp/TEMPLATE/README_zh.md | 102 +++++
egs_modelscope/asr/paraformer/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/README_zh.md | 1
egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/README_zh.md | 1
egs_modelscope/vad/TEMPLATE/README_zh.md | 113 +++++
28 files changed, 593 insertions(+), 590 deletions(-)
diff --git a/docs/index.rst b/docs/index.rst
index e2aa87d..b6291d9 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -71,12 +71,10 @@
:maxdepth: 1
:caption: Runtime and Service
- ./funasr/export/README.md
- ./funasr/runtime/python/onnxruntime/README.md
+ ./funasr/runtime/docs/SDK_tutorial.md
./funasr/runtime/python/websocket/README.md
./funasr/runtime/websocket/readme.md
./funasr/runtime/html5/readme.md
- ./funasr/runtime/python/libtorch/README.md
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr-en-16k-vocab4199-pytorch/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr-en-16k-vocab4199-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr-en-16k-vocab4199-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md
deleted file mode 100644
index 8bf63e5..0000000
--- a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md
+++ /dev/null
@@ -1,58 +0,0 @@
-# ModelScope Model
-
-## How to finetune and infer using a pretrained Paraformer-large Model
-
-### Finetune
-
-- Modify finetune training related parameters in `finetune.py`
- - <strong>output_dir:</strong> # result dir
- - <strong>data_dir:</strong> # the dataset dir needs to include files: `train/wav.scp`, `train/text`; `validation/wav.scp`, `validation/text`
- - <strong>dataset_type:</strong> # for dataset larger than 1000 hours, set as `large`, otherwise set as `small`
- - <strong>batch_bins:</strong> # batch size. If dataset_type is `small`, `batch_bins` indicates the number of feature frames; if dataset_type is `large`, `batch_bins` indicates the duration in ms
- - <strong>max_epoch:</strong> # number of training epochs
- - <strong>lr:</strong> # learning rate
-
-- Then you can run the pipeline to finetune with:
-```python
- python finetune.py
-```
-
-### Inference
-
-Or you can use the finetuned model for inference directly.
-
-- Setting parameters in `infer.sh`
- - <strong>model:</strong> # model name on ModelScope
- - <strong>data_dir:</strong> # the dataset dir needs to include `test/wav.scp`. If `test/text` also exists, CER will be computed
- - <strong>output_dir:</strong> # result dir
- - <strong>batch_size:</strong> # batch size of inference
- - <strong>gpu_inference:</strong> # whether to perform gpu decoding, set false for cpu decoding
- - <strong>gpuid_list:</strong> # set gpus, e.g., gpuid_list="0,1"
- - <strong>njob:</strong> # the number of jobs for CPU decoding; if `gpu_inference`=false, CPU decoding is used and `njob` should be set
-
-- Then you can run the pipeline to infer with:
-```shell
- sh infer.sh
-```
-
-- Results
-
-The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes recognition results of each sample and the CER metric of the whole test set.
-
-### Inference using local finetuned model
-
-- Modify inference related parameters in `infer_after_finetune.py`
- - <strong>modelscope_model_name: </strong> # model name on ModelScope
- - <strong>output_dir:</strong> # result dir
- - <strong>data_dir:</strong> # the dataset dir needs to include `test/wav.scp`. If `test/text` also exists, CER will be computed
- - <strong>decoding_model_name:</strong> # set the checkpoint name for decoding, e.g., `valid.cer_ctc.ave.pb`
- - <strong>batch_size:</strong> # batch size of inference
-
-- Then you can run the pipeline to infer with:
-```python
- python infer_after_finetune.py
-```
-
-- Results
-
-The decoding results can be found in `$output_dir/decoding_results/text.cer`, which includes recognition results of each sample and the CER metric of the whole test set.
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md
new file mode 120000
index 0000000..92088a2
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README.md
@@ -0,0 +1 @@
+../TEMPLATE/README.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README_zh.md b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr/paraformer/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/asr_vad_punc/TEMPLATE b/egs_modelscope/asr_vad_punc/TEMPLATE
new file mode 120000
index 0000000..f969ea0
--- /dev/null
+++ b/egs_modelscope/asr_vad_punc/TEMPLATE
@@ -0,0 +1 @@
+../asr/TEMPLATE
\ No newline at end of file
diff --git a/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md b/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md
deleted file mode 100644
index 83c462d..0000000
--- a/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md
+++ /dev/null
@@ -1,246 +0,0 @@
-# Speech Recognition
-
-> **Note**:
-> The modelscope pipeline supports all the models in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) for inference and finetuning. Here we take typical models as examples to demonstrate the usage.
-
-## Inference
-
-### Quick start
-#### [Paraformer Model](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)
-```python
-from modelscope.pipelines import pipeline
-from modelscope.utils.constant import Tasks
-
-inference_pipeline = pipeline(
- task=Tasks.auto_speech_recognition,
- model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
-)
-
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
-print(rec_result)
-```
-#### [Paraformer-online Model](https://www.modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online/summary)
-```python
-inference_pipeline = pipeline(
- task=Tasks.auto_speech_recognition,
- model='damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8404-online',
- )
-import soundfile
-speech, sample_rate = soundfile.read("example/asr_example.wav")
-
-param_dict = {"cache": dict(), "is_final": False}
-chunk_stride = 7680  # 480ms
-# first chunk, 480ms
-speech_chunk = speech[0:chunk_stride]
-rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
-print(rec_result)
-# next chunk, 480ms
-speech_chunk = speech[chunk_stride:chunk_stride+chunk_stride]
-rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
-print(rec_result)
-```
-For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/241)
-
-#### [UniASR Model](https://www.modelscope.cn/models/damo/speech_UniASR_asr_2pass-zh-cn-8k-common-vocab3445-pytorch-online/summary)
-There are three decoding modes for the UniASR model (`fast`, `normal`, `offline`); for more model details, please refer to the [docs](https://www.modelscope.cn/models/damo/speech_UniASR_asr_2pass-zh-cn-8k-common-vocab3445-pytorch-online/summary)
-```python
-decoding_model = "fast" # "fast", "normal", "offline"
-inference_pipeline = pipeline(
- task=Tasks.auto_speech_recognition,
- model='damo/speech_UniASR_asr_2pass-minnan-16k-common-vocab3825',
- param_dict={"decoding_model": decoding_model})
-
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
-print(rec_result)
-```
-The `fast` and `normal` decoding modes are simulated streaming, which can be used to evaluate recognition accuracy.
-For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/151)
-#### [RNN-T-online model]()
-To be done
-
-#### [MFCCA Model](https://www.modelscope.cn/models/NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950/summary)
-For more model details, please refer to the [docs](https://www.modelscope.cn/models/NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950/summary)
-```python
-from modelscope.pipelines import pipeline
-from modelscope.utils.constant import Tasks
-
-inference_pipeline = pipeline(
- task=Tasks.auto_speech_recognition,
- model='NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950',
- model_revision='v3.0.0'
-)
-
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
-print(rec_result)
-```
-
-#### API-reference
-##### Define pipeline
-- `task`: `Tasks.auto_speech_recognition`
-- `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path in local disk
-- `ngpu`: `1` (Default), decoding on GPU. If ngpu=0, decoding on CPU
-- `ncpu`: `1` (Default), sets the number of threads used for intraop parallelism on CPU
-- `output_dir`: `None` (Default), the output path of results if set
-- `batch_size`: `1` (Default), batch size when decoding
-##### Infer pipeline
-- `audio_in`: the input to decode, which could be:
- - wav_path, `e.g.`: asr_example.wav,
- - pcm_path, `e.g.`: asr_example.pcm,
- - audio bytes stream, `e.g.`: bytes data from a microphone
- - audio sample points, `e.g.`: `audio, rate = soundfile.read("asr_example_zh.wav")`, the dtype is numpy.ndarray or torch.Tensor
- - wav.scp, kaldi style wav list (`wav_id \t wav_path`), `e.g.`:
- ```text
- asr_example1 ./audios/asr_example1.wav
- asr_example2 ./audios/asr_example2.wav
- ```
- In the case of `wav.scp` input, `output_dir` must be set to save the output results
-- `audio_fs`: audio sampling rate, only set when audio_in is pcm audio
-- `output_dir`: None (Default), the output path of results if set
-
-### Inference with multi-thread CPUs or multi GPUs
-FunASR also offers the recipe [egs_modelscope/asr/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs.
-
-- Setting parameters in `infer.sh`
- - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path in local disk
- - `data_dir`: the dataset dir needs to include `wav.scp`. If `${data_dir}/text` also exists, CER will be computed
- - `output_dir`: output dir of the recognition results
- - `batch_size`: `64` (Default), batch size of inference on gpu
- - `gpu_inference`: `true` (Default), whether to perform gpu decoding, set false for CPU inference
- - `gpuid_list`: `0,1` (Default), which gpu_ids are used to infer
- - `njob`: only used for CPU inference (`gpu_inference`=`false`), `64` (Default), the number of jobs for CPU decoding
- - `checkpoint_dir`: only used for infer finetuned models, the path dir of finetuned models
- - `checkpoint_name`: only used for infer finetuned models, `valid.cer_ctc.ave.pb` (Default), which checkpoint is used to infer
- - `decoding_mode`: `normal` (Default), decoding mode for the UniASR model (fast, normal, offline)
- - `hotword_txt`: `None` (Default), hotword file for the contextual paraformer model (the hotword file name ends with .txt)
-
-- Decode with multi GPUs:
-```shell
- bash infer.sh \
- --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
- --data_dir "./data/test" \
- --output_dir "./results" \
- --batch_size 64 \
- --gpu_inference true \
- --gpuid_list "0,1"
-```
-- Decode with multi-thread CPUs:
-```shell
- bash infer.sh \
- --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
- --data_dir "./data/test" \
- --output_dir "./results" \
- --gpu_inference false \
- --njob 64
-```
-
-- Results
-
-The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes recognition results of each sample and the CER metric of the whole test set.
-
-If you decode the SpeechIO test sets, you can use textnorm with `stage`=3, and `DETAILS.txt`, `RESULTS.txt` record the results and CER after text normalization.
-
-
-## Finetune with pipeline
-
-### Quick start
-[finetune.py](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/finetune.py)
-```python
-import os
-from modelscope.metainfo import Trainers
-from modelscope.trainers import build_trainer
-from modelscope.msdatasets.audio.asr_dataset import ASRDataset
-
-def modelscope_finetune(params):
- if not os.path.exists(params.output_dir):
- os.makedirs(params.output_dir, exist_ok=True)
- # dataset split ["train", "validation"]
- ds_dict = ASRDataset.load(params.data_path, namespace='speech_asr')
- kwargs = dict(
- model=params.model,
- data_dir=ds_dict,
- dataset_type=params.dataset_type,
- work_dir=params.output_dir,
- batch_bins=params.batch_bins,
- max_epoch=params.max_epoch,
- lr=params.lr)
- trainer = build_trainer(Trainers.speech_asr_trainer, default_args=kwargs)
- trainer.train()
-
-
-if __name__ == '__main__':
- from funasr.utils.modelscope_param import modelscope_args
- params = modelscope_args(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
- params.output_dir = "./checkpoint"                 # path to save the model
- params.data_path = "speech_asr_aishell1_trainsets" # data path; it can be a dataset already uploaded to ModelScope, or local data
- params.dataset_type = "small"                      # set to small for small datasets; if the data is larger than 1000 hours, use large
- params.batch_bins = 2000                           # batch size; if dataset_type="small", batch_bins is in fbank feature frames; if dataset_type="large", batch_bins is in milliseconds
- params.max_epoch = 50                              # maximum number of training epochs
- params.lr = 0.00005                                # learning rate
-
- modelscope_finetune(params)
-```
-
-```shell
-python finetune.py &> log.txt &
-```
-
-### Finetune with your data
-
-- Modify finetune training related parameters in [finetune.py](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/finetune.py)
- - `output_dir`: result dir
- - `data_dir`: the dataset dir needs to include files: `train/wav.scp`, `train/text`; `validation/wav.scp`, `validation/text`
- - `dataset_type`: for dataset larger than 1000 hours, set as `large`, otherwise set as `small`
- - `batch_bins`: batch size. For dataset_type is `small`, `batch_bins` indicates the feature frames. For dataset_type is `large`, `batch_bins` indicates the duration in ms
- - `max_epoch`: number of training epoch
- - `lr`: learning rate
-
-- Training data format:
-```sh
-cat ./example_data/text
-BAC009S0002W0122 而 对 楼 市 成 交 抑 制 作 用 最 大 的 限 购
-BAC009S0002W0123 也 成 为 地 方 政 府 的 眼 中 钉
-english_example_1 hello world
-english_example_2 go swim 去 游 泳
-
-cat ./example_data/wav.scp
-BAC009S0002W0122 /mnt/data/wav/train/S0002/BAC009S0002W0122.wav
-BAC009S0002W0123 /mnt/data/wav/train/S0002/BAC009S0002W0123.wav
-english_example_1 /mnt/data/wav/train/S0002/english_example_1.wav
-english_example_2 /mnt/data/wav/train/S0002/english_example_2.wav
-```
-
-- Then you can run the pipeline to finetune with:
-```shell
-python finetune.py
-```
-If you want to finetune with multiple GPUs, you could:
-```shell
-CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
-```
-## Inference with your finetuned model
-
-- Setting parameters in [egs_modelscope/asr/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/infer.sh) is the same as in the [docs](https://github.com/alibaba-damo-academy/FunASR/tree/main/egs_modelscope/asr/TEMPLATE#inference-with-multi-thread-cpus-or-multi-gpus); `model` is the ModelScope model name that you finetuned.
-
-- Decode with multi GPUs:
-```shell
- bash infer.sh \
- --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
- --data_dir "./data/test" \
- --output_dir "./results" \
- --batch_size 64 \
- --gpu_inference true \
- --gpuid_list "0,1" \
- --checkpoint_dir "./checkpoint" \
- --checkpoint_name "valid.cer_ctc.ave.pb"
-```
-- Decode with multi-thread CPUs:
-```shell
- bash infer.sh \
- --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
- --data_dir "./data/test" \
- --output_dir "./results" \
- --gpu_inference false \
- --njob 64 \
- --checkpoint_dir "./checkpoint" \
- --checkpoint_name "valid.cer_ctc.ave.pb"
-```
diff --git a/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md b/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md
new file mode 120000
index 0000000..92088a2
--- /dev/null
+++ b/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README.md
@@ -0,0 +1 @@
+../TEMPLATE/README.md
\ No newline at end of file
diff --git a/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md b/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/asr_vad_punc/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/punctuation/TEMPLATE/README.md b/egs_modelscope/punctuation/TEMPLATE/README.md
index 08814ea..ac8a007 100644
--- a/egs_modelscope/punctuation/TEMPLATE/README.md
+++ b/egs_modelscope/punctuation/TEMPLATE/README.md
@@ -1,3 +1,5 @@
+([简体中文](./README_zh.md)|English)
+
# Punctuation Restoration
> **Note**:
diff --git a/egs_modelscope/punctuation/TEMPLATE/README_zh.md b/egs_modelscope/punctuation/TEMPLATE/README_zh.md
new file mode 100644
index 0000000..a6360c4
--- /dev/null
+++ b/egs_modelscope/punctuation/TEMPLATE/README_zh.md
@@ -0,0 +1,112 @@
+(简体中文|[English](./README.md))
+# Punctuation Restoration
+
+> **Note**:
+> The pipeline supports inference and finetuning of all the models in the [modelscope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope). Here we take the CT-Transformer model as an example to demonstrate the usage.
+
+## Inference
+
+### Quick start
+#### [CT-Transformer Model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary)
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+ task=Tasks.punctuation,
+ model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
+ model_revision=None)
+
+rec_result = inference_pipeline(text_in='example/punc_example.txt')
+print(rec_result)
+```
+- text binary data, e.g., bytes data read directly from a file:
+```python
+rec_result = inference_pipeline(text_in='我们都是木头人不会讲话不会动')
+```
+- text file url, e.g.: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/punc_example.txt
+```python
+rec_result = inference_pipeline(text_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/punc_example.txt')
+```
+
+#### [CT-Transformer Realtime Model](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727/summary)
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+ task=Tasks.punctuation,
+ model='damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727',
+ model_revision=None,
+)
+
+inputs = "璺ㄥ娌虫祦鏄吇鑲叉部宀竱浜烘皯鐨勭敓鍛戒箣婧愰暱鏈熶互鏉ヤ负甯姪涓嬫父鍦板尯闃茬伨鍑忕伨涓柟鎶�鏈汉鍛榺鍦ㄤ笂娓稿湴鍖烘瀬涓烘伓鍔g殑鑷劧鏉′欢涓嬪厠鏈嶅法澶у洶闅剧敋鑷冲啋鐫�鐢熷懡鍗遍櫓|鍚戝嵃鏂规彁渚涙睕鏈熸按鏂囪祫鏂欏鐞嗙揣鎬ヤ簨浠朵腑鏂归噸瑙嗗嵃鏂瑰湪璺ㄥ娌虫祦闂涓婄殑鍏冲垏|鎰挎剰杩涗竴姝ュ畬鍠勫弻鏂硅仈鍚堝伐浣滄満鍒秥鍑℃槸|涓柟鑳藉仛鐨勬垜浠瑋閮戒細鍘诲仛鑰屼笖浼氬仛寰楁洿濂芥垜璇峰嵃搴︽湅鍙嬩滑鏀惧績涓浗鍦ㄤ笂娓哥殑|浠讳綍寮�鍙戝埄鐢ㄩ兘浼氱粡杩囩瀛瑙勫垝鍜岃璇佸吋椤句笂涓嬫父鐨勫埄鐩�"
+vads = inputs.split("|")
+rec_result_all="outputs:"
+param_dict = {"cache": []}
+for vad in vads:
+ rec_result = inference_pipeline(text_in=vad, param_dict=param_dict)
+ rec_result_all += rec_result['text']
+
+print(rec_result_all)
+```
+For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/238)
+
+### API Reference
+#### Define pipeline
+- `task`: `Tasks.punctuation`
+- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `ngpu`: `1` (default), decoding on GPU; if ngpu=0, decoding on CPU
+- `ncpu`: `1` (default), the number of threads used for intraop parallelism on CPU
+- `output_dir`: `None` (default), the output path of the results if set
+- `model_revision`: `None` (default), the model version on ModelScope
+
+
+#### Infer pipeline
+- `text_in`: the input to decode, which could be:
+  - text string, e.g.: "我们都是木头人不会讲话不会动"
+  - text file, e.g.: example/punc_example.txt.
+  When using text file input, `output_dir` must be set to save the output results (see the sketch below).
+- `param_dict`: the cache required in realtime mode.
+
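+A minimal sketch of file input using the parameters above (the `output_dir` value here is only an illustrative assumption):
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+# File input: output_dir must be set so the punctuation results can be written to disk
+inference_pipeline = pipeline(
+    task=Tasks.punctuation,
+    model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
+    output_dir='./punc_results')
+rec_result = inference_pipeline(text_in='example/punc_example.txt')
+print(rec_result)
+```
+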
+### Inference with multi-thread CPUs or multiple GPUs
+FunASR also provides the [egs_modelscope/punctuation/TEMPLATE/infer.sh](infer.sh) script to decode with multi-thread CPUs or multiple GPUs.
+
+#### `infer.sh` settings
+- `model`: model name in the [modelscope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `data_dir`: the dataset dir, which needs to include the `punc.txt` file
+- `output_dir`: output dir of the recognition results
+- `batch_size`: `1` (default), batch size of inference on GPU
+- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
+- `gpuid_list`: `0,1` (default), which GPU IDs are used for inference
+- `njob`: only used for CPU inference (`gpu_inference`=`false`), `64` (default), the number of jobs for CPU decoding
+
+
+#### Decode with multiple GPUs:
+```shell
+ bash infer.sh \
+ --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --batch_size 1 \
+ --gpu_inference true \
+ --gpuid_list "0,1"
+```
+#### Decode with multi-thread CPUs:
+```shell
+ bash infer.sh \
+ --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --gpu_inference false \
+ --njob 1
+```
+
+## Finetune with pipeline
+
+### Quick start
+
+### Finetune with your data
+
+## Inference with your finetuned model
+
diff --git a/egs_modelscope/punctuation/punc_ct-transformer_cn-en-common-vocab471067-large/README_zh.md b/egs_modelscope/punctuation/punc_ct-transformer_cn-en-common-vocab471067-large/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/punctuation/punc_ct-transformer_cn-en-common-vocab471067-large/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vadrealtime-vocab272727/README_zh.md b/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vadrealtime-vocab272727/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vadrealtime-vocab272727/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/README_zh.md b/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/punctuation/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/tp/TEMPLATE/README.md b/egs_modelscope/tp/TEMPLATE/README.md
index 3c7129f..ad5a9de 100644
--- a/egs_modelscope/tp/TEMPLATE/README.md
+++ b/egs_modelscope/tp/TEMPLATE/README.md
@@ -1,3 +1,5 @@
+([简体中文](./README_zh.md)|English)
+
# Timestamp Prediction (FA)
## Inference
diff --git a/egs_modelscope/tp/TEMPLATE/README_zh.md b/egs_modelscope/tp/TEMPLATE/README_zh.md
new file mode 100644
index 0000000..c8b1bc4
--- /dev/null
+++ b/egs_modelscope/tp/TEMPLATE/README_zh.md
@@ -0,0 +1,102 @@
+(简体中文|[English](./README.md))
+
+# Timestamp Prediction
+
+## Inference
+
+### Quick start
+#### [TP-Aligner Model](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary)
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+ task=Tasks.speech_timestamp,
+ model='damo/speech_timestamp_prediction-v1-16k-offline',
+ model_revision='v1.1.0')
+
+rec_result = inference_pipeline(
+ audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
+    text_in='一 个 东 太 平 洋 国 家 为 什 么 跑 到 西 太 平 洋 来 了 呢',)
+print(rec_result)
+```
+
+The timestamp pipeline can also be used after the ASR pipeline to compose a complete ASR function; refer to this [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/246).
+
+### API Reference
+#### Define pipeline
+- `task`: `Tasks.speech_timestamp`
+- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `ngpu`: `1` (default), decoding on GPU; if ngpu=0, decoding on CPU
+- `ncpu`: `1` (default), the number of threads used for intraop parallelism on CPU
+- `output_dir`: `None` (default), the output path of the results if set
+- `batch_size`: `1` (default), batch size when decoding
+
+
+#### Infer pipeline
+- `audio_in`: the input speech to predict, which could be:
+  - wav file path, e.g.: asr_example.wav (a local wav file or a URL)
+  - wav.scp, a kaldi-style wav list (`wav_id wav_path`), e.g.:
+  ```text
+  asr_example1  ./audios/asr_example1.wav
+  asr_example2  ./audios/asr_example2.wav
+  ```
+  When using `wav.scp` input, `output_dir` must be set to save the output results (see the sketch after this list).
+- `text_in`: the input text to predict, separated by spaces, which could be:
+  - text string, e.g.: `今 天 天 气 怎 么 样`
+  - text.scp, a kaldi-style text file (`wav_id transcription`), e.g.:
+  ```text
+  asr_example1 今 天 天 气 怎 么 样
+  asr_example2 欢 迎 体 验 达 摩 院 语 音 识 别 模 型
+  ```
+- `audio_fs`: audio sampling rate, only set when the input is PCM audio
+- `output_dir`: `None` (default); if set, the output path of the results, which contains:
+  - output_dir/timestamp_prediction/tp_sync, timestamps in seconds including silence segments, in the format `wav_id# token1 start_time end_time;`, e.g.:
+  ```text
+  test_wav1# <sil> 0.000 0.500;温 0.500 0.680;州 0.680 0.840;化 0.840 1.040;工 1.040 1.280;仓 1.280 1.520;<sil> 1.520 1.680;库 1.680 1.920;<sil> 1.920 2.160;起 2.160 2.380;火 2.380 2.580;殃 2.580 2.760;及 2.760 2.920;附 2.920 3.100;近 3.100 3.340;<sil> 3.340 3.400;河 3.400 3.640;<sil> 3.640 3.700;流 3.700 3.940;<sil> 3.940 4.240;大 4.240 4.400;量 4.400 4.520;死 4.520 4.680;鱼 4.680 4.920;<sil> 4.920 4.940;漂 4.940 5.120;浮 5.120 5.300;河 5.300 5.500;面 5.500 5.900;<sil> 5.900 6.240;
+  ```
+  - output_dir/timestamp_prediction/tp_time, a timestamp list without silences, in milliseconds, with the same length as the input text, in the format `wav_id# [[start_time, end_time],]`, e.g.:
+  ```text
+  test_wav1# [[500, 680], [680, 840], [840, 1040], [1040, 1280], [1280, 1520], [1680, 1920], [2160, 2380], [2380, 2580], [2580, 2760], [2760, 2920], [2920, 3100], [3100, 3340], [3400, 3640], [3700, 3940], [4240, 4400], [4400, 4520], [4520, 4680], [4680, 4920], [4940, 5120], [5120, 5300], [5300, 5500], [5500, 5900]]
+  ```
+
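+A minimal sketch of list-style input using the parameters above (the wav.scp/text.scp paths are illustrative assumptions); `output_dir` must be set for list input:
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+# Kaldi-style list input: one entry per line, keyed by wav_id in both files
+inference_pipeline = pipeline(
+    task=Tasks.speech_timestamp,
+    model='damo/speech_timestamp_prediction-v1-16k-offline',
+    model_revision='v1.1.0',
+    output_dir='./tp_results')
+inference_pipeline(audio_in='./data/wav.scp', text_in='./data/text.scp')
+```
+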
+### Inference with multi-thread CPUs or multiple GPUs
+FunASR also provides the [egs_modelscope/tp/TEMPLATE/infer.sh](infer.sh) script to decode with multi-thread CPUs or multiple GPUs.
+
+#### `infer.sh` settings
+- `model`: model name in the [modelscope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `data_dir`: the dataset dir, which needs to include the `wav.scp` file; if `${data_dir}/text` also exists, CER will be computed
+- `output_dir`: output dir of the recognition results
+- `batch_size`: `1` (default), batch size of inference on GPU
+- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
+- `gpuid_list`: `0,1` (default), which GPU IDs are used for inference
+- `njob`: only used for CPU inference (`gpu_inference`=`false`), `64` (default), the number of jobs for CPU decoding
+
+#### Decode with multiple GPUs:
+```shell
+ bash infer.sh \
+ --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --batch_size 1 \
+ --gpu_inference true \
+ --gpuid_list "0,1"
+```
+#### Decode with multi-thread CPUs:
+```shell
+ bash infer.sh \
+ --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --gpu_inference false \
+ --njob 1
+```
+
+## Finetune with pipeline
+
+### Quick start
+
+### Finetune with your data
+
+## Inference with your finetuned model
+
diff --git a/egs_modelscope/vad/TEMPLATE/README.md b/egs_modelscope/vad/TEMPLATE/README.md
index 0ad9fb3..3539b3d 100644
--- a/egs_modelscope/vad/TEMPLATE/README.md
+++ b/egs_modelscope/vad/TEMPLATE/README.md
@@ -1,3 +1,5 @@
+([简体中文](./README_zh.md)|English)
+
# Voice Activity Detection
> **Note**:
diff --git a/egs_modelscope/vad/TEMPLATE/README_zh.md b/egs_modelscope/vad/TEMPLATE/README_zh.md
new file mode 100644
index 0000000..a1b1916
--- /dev/null
+++ b/egs_modelscope/vad/TEMPLATE/README_zh.md
@@ -0,0 +1,113 @@
+(简体中文|[English](./README.md))
+
+# Voice Activity Detection
+
+> **Note**:
+> The pipeline supports inference and finetuning of all the models in the [modelscope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope). Here we take the FSMN-VAD model as an example to demonstrate the usage.
+
+## Inference
+
+### Quick start
+#### [FSMN-VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary)
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+inference_pipeline = pipeline(
+ task=Tasks.voice_activity_detection,
+ model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
+)
+
+segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
+print(segments_result)
+```
+#### [FSMN-VAD Realtime Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary)
+```python
+inference_pipeline = pipeline(
+ task=Tasks.auto_speech_recognition,
+ model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
+ )
+import soundfile
+speech, sample_rate = soundfile.read("example/asr_example.wav")
+
+param_dict = {"in_cache": dict(), "is_final": False}
+chunk_stride = 1600  # 100ms
+# first chunk, 100ms
+speech_chunk = speech[0:chunk_stride]
+rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
+print(rec_result)
+# next chunk, 100ms
+speech_chunk = speech[chunk_stride:chunk_stride+chunk_stride]
+rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
+print(rec_result)
+```
+For a demo example with full code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/discussions/236)
+
+
+
+### API Reference
+#### Define pipeline
+- `task`: `Tasks.voice_activity_detection`
+- `model`: model name in the [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `ngpu`: `1` (default), decoding on GPU; if ngpu=0, decoding on CPU
+- `ncpu`: `1` (default), the number of threads used for intraop parallelism on CPU
+- `output_dir`: `None` (default), the output path of the results if set
+- `batch_size`: `1` (default), batch size when decoding
+#### Infer pipeline
+- `audio_in`: the input to decode, which could be:
+  - wav file path, e.g.: asr_example.wav,
+  - pcm file path, e.g.: asr_example.pcm,
+  - audio bytes stream, e.g.: bytes data from a microphone
+  - audio sample points, e.g.: `audio, rate = soundfile.read("asr_example_zh.wav")`, the dtype is numpy.ndarray or torch.Tensor
+  - wav.scp, a kaldi-style wav list (`wav_id \t wav_path`), e.g.:
+  ```text
+  asr_example1  ./audios/asr_example1.wav
+  asr_example2  ./audios/asr_example2.wav
+  ```
+  In the case of `wav.scp` input, `output_dir` must be set to save the output results (see the sketch below)
+- `audio_fs`: audio sampling rate, only set when audio_in is pcm audio
+- `output_dir`: `None` (default), the output path of the results if set
+
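+A minimal sketch of `wav.scp` input using the parameters above (the wav.scp path is an illustrative assumption); `output_dir` must be set for list input:
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+# Kaldi-style wav list input: segment results are written under output_dir
+inference_pipeline = pipeline(
+    task=Tasks.voice_activity_detection,
+    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
+    output_dir='./vad_results')
+inference_pipeline(audio_in='./data/wav.scp')
+```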
+
+### Inference with multi-thread CPUs or multiple GPUs
+FunASR also provides the [egs_modelscope/vad/TEMPLATE/infer.sh](infer.sh) script to decode with multi-thread CPUs or multiple GPUs.
+
+#### `infer.sh` settings
+- `model`: model name in the [modelscope model zoo](https://alibaba-damo-academy.github.io/FunASR/en/model_zoo/modelscope_models.html#pretrained-models-on-modelscope), or a model path on local disk
+- `data_dir`: the dataset dir, which needs to include the `wav.scp` file; if `${data_dir}/text` also exists, CER will be computed
+- `output_dir`: output dir of the recognition results
+- `batch_size`: `1` (default), batch size of inference on GPU
+- `gpu_inference`: `true` (default), whether to perform GPU decoding; set to `false` for CPU inference
+- `gpuid_list`: `0,1` (default), which GPU IDs are used for inference
+- `njob`: only used for CPU inference (`gpu_inference`=`false`), `64` (default), the number of jobs for CPU decoding
+
+#### Decode with multiple GPUs:
+```shell
+ bash infer.sh \
+ --model "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --batch_size 1 \
+ --gpu_inference true \
+ --gpuid_list "0,1"
+```
+#### Decode with multi-thread CPUs:
+```shell
+ bash infer.sh \
+ --model "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch" \
+ --data_dir "./data/test" \
+ --output_dir "./results" \
+ --gpu_inference false \
+ --njob 64
+```
+
+
+
+## Finetune with pipeline
+
+### Quick start
+
+### Finetune with your data
+
+## Inference with your finetuned model
+
diff --git a/egs_modelscope/vad/speech_fsmn_vad_zh-cn-16k-common/README_zh.md b/egs_modelscope/vad/speech_fsmn_vad_zh-cn-16k-common/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/vad/speech_fsmn_vad_zh-cn-16k-common/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/egs_modelscope/vad/speech_fsmn_vad_zh-cn-8k-common/README_zh.md b/egs_modelscope/vad/speech_fsmn_vad_zh-cn-8k-common/README_zh.md
new file mode 120000
index 0000000..b88b7fb
--- /dev/null
+++ b/egs_modelscope/vad/speech_fsmn_vad_zh-cn-8k-common/README_zh.md
@@ -0,0 +1 @@
+../TEMPLATE/README_zh.md
\ No newline at end of file
diff --git a/funasr/runtime/docs/SDK_tutorial.md b/funasr/runtime/docs/SDK_tutorial.md
index c8f9971..0624af3 100644
--- a/funasr/runtime/docs/SDK_tutorial.md
+++ b/funasr/runtime/docs/SDK_tutorial.md
@@ -1,328 +1,206 @@
-# FunASR File Transcription Service Convenient Deployment Tutorial
+([简体中文](./SDK_tutorial_zh.md)|English)
-FunASR provides offline file transcription services that can be conveniently deployed on local or cloud servers. The core of the service is based on the open-source runtime-SDK of FunASR. It integrates various related capabilities, such as voice endpoint detection (VAD) and Paraformer-large speech recognition (ASR), as well as punctuation recovery (PUNC), which have been open-sourced by the speech laboratory of DAMO Academy on the Modelscope community. With these capabilities, the service can transcribe audio accurately and efficiently under high concurrency.
+# FunASR Offline File Transcription Service Convenient Deployment Tutorial
-## Installation and Start Service
+FunASR provides an offline file transcription service that can be easily deployed on a local or cloud server. Its core is the FunASR open-source runtime SDK, which integrates speech endpoint detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC) models released by the speech lab of DAMO Academy on the ModelScope community. The service offers a complete speech recognition pipeline that can transcribe audio or video files of tens of hours into punctuated text, and supports hundreds of concurrent requests.
-Environment Preparation and Configuration ([docs](./aliyun_server_tutorial.md))
+## Server Configuration
-### Downloading Tools and Deployment
+Users can choose appropriate server configurations based on their business needs. The recommended configurations are:
+- Configuration 1: (X86, compute-optimized) 4-core vCPU, 8 GB memory; a single machine can support about 32 concurrent requests.
+- Configuration 2: (X86, compute-optimized) 16-core vCPU, 32 GB memory; a single machine can support about 64 concurrent requests.
+- Configuration 3: (X86, compute-optimized) 64-core vCPU, 128 GB memory; a single machine can support about 200 concurrent requests.
-Run the following command to perform a one-click deployment of the FunASR runtime-SDK service. Follow the prompts to complete the deployment and running of the service. Currently, only Linux environments are supported, and for other environments, please refer to the Advanced SDK Development Guide ([docs](./SDK_advanced_guide_offline.md)).
+For a detailed performance report, see the [benchmark](./benchmark_onnx_cpp.md).
-[//]: # (Due to network restrictions, the download of the funasr-runtime-deploy.sh one-click deployment tool may not proceed smoothly. If the tool has not been downloaded and entered into the one-click deployment tool after several seconds, please terminate it with Ctrl + C and run the following command again.)
+Some cloud service providers offer a 3-month free trial for new users; see the application tutorial ([docs](./aliyun_server_tutorial.md)).
+
+## Quick Start
+
+### Server Startup
+
+`Note`: The one-click deployment tool process includes installing Docker, downloading Docker images, and starting the service. If you want to start directly from the FunASR Docker image, please refer to the development guide ([docs](./SDK_advanced_guide_offline.md)).
+
+Download the deployment tool `funasr-runtime-deploy-offline-cpu-zh.sh`
```shell
-curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR-APP/main/TransAudio/funasr-runtime-deploy.sh; sudo bash funasr-runtime-deploy.sh install
-# For the users in China, you could install with the command:
-# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy.sh; sudo bash funasr-runtime-deploy.sh install
+curl -O https://raw.githubusercontent.com/alibaba-damo-academy/FunASR/main/funasr/runtime/deploy_tools/funasr-runtime-deploy-offline-cpu-zh.sh;
+# If there is a network problem, users in mainland China can use the following command:
+# curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-zh.sh;
```
-#### Details of Configuration
+Execute the deployment tool and press the Enter key at the prompt to complete the installation and deployment of the server. Currently, the convenient deployment tool only supports Linux environments. For other environments, please refer to the development guide ([docs](./SDK_advanced_guide_offline.md)).
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh install --workspace /root/funasr-runtime-resources
+```
-##### Choosing FunASR Docker Image
+### Client Testing and Usage
-We recommend selecting the "latest" tag to use our latest image, but you can also choose from our historical versions.
+After running the installation command above, the client testing tool directory `samples` will be downloaded into the default installation directory /root/funasr-runtime-resources ([download link](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz)).
+Taking the Python client as an example, it supports multiple audio input formats (such as .wav, .pcm, and .mp3), video inputs (such as .mp4), and multi-file wav.scp list input. For other client versions, please refer to the [documentation](#Detailed-Description-of-Client-Usage).
+
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
+```
+
+## Detailed Description of Client Usage
+
+After completing the FunASR runtime-SDK service deployment on the server, you can test and use the offline file transcription service through the following steps. Currently, the following programming language client versions are supported:
+
+- [Python](#python-client)
+- [CPP](#cpp-client)
+- [html](#html-client)
+- [java](#java-client)
+
+For more client version support, please refer to the [development guide](./SDK_advanced_guide_offline.md).
+
+### python-client
+If you want to run the client directly for testing, you can refer to the following simple instructions, using the Python version as an example:
+
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
+```
+
+Command parameter instructions:
+```text
+--host is the IP address of the FunASR runtime-SDK service deployment machine, which defaults to the local IP address (127.0.0.1). If the client and the service are not on the same server, it needs to be changed to the deployment machine IP address.
+--port 10095 deployment port number
+--mode offline represents offline file transcription
+--audio_in is the audio file that needs to be transcribed, supporting file paths and file list wav.scp
+--thread_num sets the number of concurrent sending threads, default is 1
+--ssl sets whether to enable SSL certificate verification, default is 1 to enable, and 0 to disable
+```
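+
+For example, a hypothetical run over a wav.scp file list with four concurrent sending threads and SSL verification disabled (the paths are illustrative):
+```shell
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline \
+    --audio_in "./data/wav.scp" --thread_num 4 --ssl 0
+```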
+
+### cpp-client
+
+After entering the samples/cpp directory, you can test it with CPP. The command is as follows:
+```shell
+./funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path ../audio/asr_example.wav
+```
+
+Command parameter description:
+```text
+--server-ip specifies the IP address of the machine where the FunASR runtime-SDK service is deployed. The default value is the local IP address (127.0.0.1). If the client and the service are not on the same server, the IP address needs to be changed to the IP address of the deployment machine.
+--port specifies the deployment port number as 10095.
+--wav-path specifies the audio file to be transcribed, and supports file paths.
+--thread_num sets the number of concurrent send threads, with a default value of 1.
+--ssl sets whether to enable SSL certificate verification, with a default value of 1 for enabling and 0 for disabling.
+```
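+
+For example, a hypothetical run with four concurrent sending threads and SSL verification disabled:
+```shell
+./funasr-wss-client --server-ip 127.0.0.1 --port 10095 \
+    --wav-path ../audio/asr_example.wav --thread_num 4 --ssl 0
+```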
+
+### html-client
+
+To experience it directly, open `html/static/index.html` in your browser. You will see the following page, which supports microphone input and file upload.
+<img src="images/html.png" width="900"/>
+
+### java-client
+
+```shell
+FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
+```
+For more details, please refer to the [docs](../java/readme.md)
+
+## Server Usage Details
+
+### Start the deployed FunASR service
+
+If you have restarted the computer or shut down Docker after one-click deployment, you can start the FunASR service directly with the following command. The startup configuration is the same as the last one-click deployment.
+
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh start
+```
+
+### Stop the FunASR service
+
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh stop
+```
+
+### Release the FunASR service
+
+Release the deployed FunASR service.
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh remove
+```
+
+### Restart the FunASR service
+
+Restart the FunASR service with the same configuration as the last one-click deployment.
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh restart
+```
+
+### Replace the model and restart the FunASR service
+
+Replace the currently used model, and restart the FunASR service. The model must be an ASR/VAD/PUNC model in ModelScope, or a finetuned model from ModelScope.
+
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--asr_model | --vad_model | --punc_model] <model_id or local model path>
+
+e.g.
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --asr_model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
+```
+
+### Update parameters and restart the FunASR service
+
+Update the configured parameters and restart the FunASR service for them to take effect. The parameters that can be updated include the host and Docker port numbers, the number of inference and IO threads, the local workspace path, and the SSL switch.
+
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--host_port | --docker_port] <port number>
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--decode_thread_num | --io_thread_num] <the number of threads>
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--workspace] <workspace in local>
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update [--ssl] <0: close SSL; 1: open SSL, default:1>
+
+e.g.
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --decode_thread_num 32
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh update --workspace /root/funasr-runtime-resources
+```
+
+
+## Detailed Configuration of Server Startup Process
+
+### Select FunASR Docker image
+We recommend using our latest released image, but you can also choose a historical version.
```text
-[1/9]
+[1/5]
+ Getting the list of docker images, please wait a few seconds.
+ [DONE]
+
Please choose the Docker image.
- 1) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
- 2) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
- Enter your choice: 1
- You have chosen the Docker image: registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
+ 1) registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
+ Enter your choice, default(1):
+ You have chosen the Docker image: registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0
```
-##### Choosing ASR/VAD/PUNC Models
-You can choose a model from ModelScope by name, or fill in the name of a model in ModelScope as <model_name>. The model will be automatically downloaded during Docker runtime. You can also select <model_path> to fill in the local model path on the host machine.
+### Set the port provided by the host for FunASR
+Set the host port provided to Docker, which is 10095 by default. Please make sure that this port is available.
```text
-[2/9]
- Please input [Y/n] to confirm whether to automatically download model_id in ModelScope or use a local model.
- [y] With the model in ModelScope, the model will be automatically downloaded to Docker(/workspace/models).
- If you select both the local model and the model in ModelScope, select [y].
- [n] Use the models on the localhost, the directory where the model is located will be mapped to Docker.
- Setting confirmation[Y/n]:
- You have chosen to use the model in ModelScope, please set the model ID in the next steps, and the model will be automatically downloaded in (/workspace/models) during the run.
-
- Please enter the local path to download models, the corresponding path in Docker is /workspace/models.
- Setting the local path to download models, default(/root/models):
- The local path(/root/models) set will store models during the run.
-
- [2.1/9]
- Please select ASR model_id in ModelScope from the list below.
- 1) damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
- 2) model_name
- 3) model_path
- Enter your choice: 1
- The model ID is damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
- The model dir in Docker is /workspace/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
-
- [2.2/9]
- Please select VAD model_id in ModelScope from the list below.
- 1) damo/speech_fsmn_vad_zh-cn-16k-common-onnx
- 2) model_name
- 3) model_path
- Enter your choice: 1
- The model ID is damo/speech_fsmn_vad_zh-cn-16k-common-onnx
- The model dir in Docker is /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx
-
- [2.3/9]
- Please select PUNC model_id in ModelScope from the list below.
- 1) damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
- 2) model_name
- 3) model_path
- Enter your choice: 1
- The model ID is damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
- The model dir in Docker is /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
-```
-
-##### Enter the executable path of the FunASR service on the host machine
-
-Enter the host path of the executable of the FunASR service. It will be automatically mounted and run in Docker at runtime. If left blank, the default path in Docker will be set to /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server.
-
-```text
-[3/9]
- Please enter the path to the excutor of the FunASR service on the localhost.
- If not set, the default /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server in Docker is used.
- Setting the path to the excutor of the FunASR service on the localhost:
- Corresponding, the path of FunASR in Docker is /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server
-```
-
-##### Setting the port on the host machine for FunASR
-
-Setting the port on the host machine for Docker. The default port is 10095. Please ensure that this port is available.
-
-```text
-[4/9]
+[2/5]
Please input the opened port in the host used for FunASR server.
- Default: 10095
- Setting the opened host port [1-65535]:
+ Setting the opened host port [1-65535], default(10095):
The port of the host is 10095
The port in Docker for FunASR server is 10095
```
+### Set SSL
-##### Setting the number of inference threads for the FunASR service
-
-Setting the number of inference threads for the FunASR service. The default value is the number of cores on the host machine. The number of I/O threads for the service will also be automatically set to one-quarter of the number of inference threads.
-
-```text
-[5/9]
- Please input thread number for FunASR decoder.
- Default: 1
- Setting the number of decoder thread:
-
- The number of decoder threads is 1
- The number of IO threads is 1
-```
-
-##### Displaying all set parameters for confirmation
-
-Displaying the parameters set in the previous 6 steps. Confirming will save all parameters to /var/funasr/config and start Docker. Otherwise, users will be prompted to reset the parameters.
-
-```text
-
-[6/9]
- Show parameters of FunASR server setting and confirm to run ...
-
- The current Docker image is : registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest
- The model is downloaded or stored to this directory in local : /root/models
- The model will be automatically downloaded to the directory : /workspace/models
- The ASR model_id used : damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
- The ASR model directory corresponds to the directory in Docker : /workspace/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx
- The VAD model_id used : damo/speech_fsmn_vad_zh-cn-16k-common-onnx
- The VAD model directory corresponds to the directory in Docker : /workspace/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx
- The PUNC model_id used : damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
- The PUNC model directory corresponds to the directory in Docker: /workspace/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx
-
- The path in the docker of the FunASR service executor : /workspace/FunASR/funasr/runtime/websocket/build/bin/funasr-wss-server
- Set the host port used for use by the FunASR service : 10095
- Set the docker port used by the FunASR service : 10095
- Set the number of threads used for decoding the FunASR service : 1
- Set the number of threads used for IO the FunASR service : 1
-
- Please input [Y/n] to confirm the parameters.
- [y] Verify that these parameters are correct and that the service will run.
- [n] The parameters set are incorrect, it will be rolled out, please rerun.
- read confirmation[Y/n]:
-
- Will run FunASR server later ...
- Parameters are stored in the file /var/funasr/config
-```
-
-##### Checking the Docker service
-
-Checking if Docker service is installed on the host machine. If not installed, installing and starting Docker
-
-```text
-[7/9]
- Start install docker for ubuntu
- Get docker installer: curl -fsSL https://test.docker.com -o test-docker.sh
- Get docker run: sudo sh test-docker.sh
-# Executing docker install script, commit: c2de0811708b6d9015ed1a2c80f02c9b70c8ce7b
-+ sh -c apt-get update -qq >/dev/null
-+ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
-+ sh -c install -m 0755 -d /etc/apt/keyrings
-+ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | gpg --dearmor --yes -o /etc/apt/keyrings/docker.gpg
-+ sh -c chmod a+r /etc/apt/keyrings/docker.gpg
-+ sh -c echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu focal test" > /etc/apt/sources.list.d/docker.list
-+ sh -c apt-get update -qq >/dev/null
-+ sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq docker-ce docker-ce-cli containerd.io docker-compose-plugin docker-ce-rootless-extras docker-buildx-plugin >/dev/null
-+ sh -c docker version
-Client: Docker Engine - Community
- Version: 24.0.2
-
- ...
- ...
-
- Docker install success, start docker server.
-```
-
-##### Downloading the FunASR Docker image
-
-Downloading and updating the FunASR Docker image selected in step 1.1
-
-```text
-[8/9]
- Pull docker image(registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-latest)...
-funasr-runtime-cpu-0.0.1: Pulling from funasr_repo/funasr
-7608715873ec: Pull complete
-3e1014c56f38: Pull complete
-
- ...
- ...
-```
-
-##### Starting the FunASR Docker
-
-Starting the FunASR Docker and waiting for the model selected in step 1.2 to finish downloading and start the FunASR service
-
-```text
-[9/9]
- Construct command and run docker ...
-943d8f02b4e5011b71953a0f6c1c1b9bc5aff63e5a96e7406c83e80943b23474
-
- Loading models:
- [ASR ][Done ][==================================================][100%][1.10MB/s][v1.2.1]
- [VAD ][Done ][==================================================][100%][7.26MB/s][v1.2.0]
- [PUNC][Done ][==================================================][100%][ 474kB/s][v1.1.7]
- The service has been started.
- If you want to see an example of how to use the client, you can run sudo bash funasr-runtime-deploy.sh -c .
-```
-
-#### Starting the deployed FunASR service
-
-If the computer is restarted or Docker is closed after one-click deployment, the following command can be used to start the FunASR service directly with the settings from the last one-click deployment.
-
+SSL verification is enabled by default. If you need to disable it, pass the option below when starting the service.
```shell
-sudo bash funasr-runtime-deploy.sh start
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh --ssl 0
```
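+
+Note that with SSL disabled, clients must connect over ws rather than wss. For example, the address entered in a client would look like the following (an illustrative address, assuming the default port):
+
+```text
+ws://127.0.0.1:10095
+```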
-#### Shutting down the FunASR service
+## Contact Us
-```shell
-sudo bash funasr-runtime-deploy.sh stop
-```
+If you encounter any problems during use, please join our user group for feedback.
-#### Restarting the FunASR service
-Restarting the FunASR service with the settings from the last one-click deployment
+| DingDing Group | Wechat |
+|:----------------------------------------------------------------------------:|:--------------------------------------------------------------:|
+| <div align="left"><img src="../../../docs/images/dingding.jpg" width="250"/> | <img src="../../../docs/images/wechat.png" width="232"/></div> |
-```shell
-sudo bash funasr-runtime-deploy.sh restart
-```
-#### Replacing the model and restarting the FunASR service
-
-Replacing the currently used model and restarting the FunASR service. The model must be an ASR/VAD/PUNC model from ModelScope.
-
-```shell
-sudo bash scripts/funasr-runtime-deploy.sh update model <model ID in ModelScope>
-
-e.g
-sudo bash scripts/funasr-runtime-deploy.sh update model damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
-```
-
-### How to test and use the offline file transcription service
-
-After completing the FunASR service deployment on the server, you can test and use the offline file transcription service by following these steps. Currently, command line running is supported for Python, C++, and Java client versions, as well as an HTML web page version that can be directly experienced in the browser. For more client language support, please refer to the "FunASR Advanced Development Guide" documentation.
-After the funasr-runtime-deploy.sh script finishes running, you can use the following command to automatically download the test samples to the funasr_samples directory in the current directory and run the program with the set parameters in an interactive manner:
-
-```shell
-sudo bash funasr-runtime-deploy.sh client
-```
-
-You can choose from the provided Python and Linux C++ sample programs. Taking the Python sample as an example:
-
-```text
-Will download sample tools for the client to show how speech recognition works.
- Please select the client you want to run.
- 1) Python
- 2) Linux_Cpp
- Enter your choice: 1
-
- Please enter the IP of server, default(127.0.0.1):
- Please enter the port of server, default(10095):
- Please enter the audio path, default(/root/funasr_samples/audio/asr_example.wav):
-
- Run pip3 install click>=8.0.4
-Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
-Requirement already satisfied: click>=8.0.4 in /usr/local/lib/python3.8/dist-packages (8.1.3)
-
- Run pip3 install -r /root/funasr_samples/python/requirements_client.txt
-Looking in indexes: http://mirrors.cloud.aliyuncs.com/pypi/simple/
-Requirement already satisfied: websockets in /usr/local/lib/python3.8/dist-packages (from -r /root/funasr_samples/python/requirements_client.txt (line 1)) (11.0.3)
-
- Run python3 /root/funasr_samples/python/funasr_wss_client.py --host 127.0.0.1 --port 10095 --mode offline --audio_in /root/funasr_samples/audio/asr_example.wav --send_without_sleep --output_dir ./funasr_samples/python
-
- ...
- ...
-
- pid0_0: 欢迎大家来体验达摩院推出的语音识别模型。
-Exception: sent 1000 (OK); then received 1000 (OK)
-end
-
- If failed, you can try (python3 /root/funasr_samples/python/funasr_wss_client.py --host 127.0.0.1 --port 10095 --mode offline --audio_in /root/funasr_samples/audio/asr_example.wav --send_without_sleep --output_dir ./funasr_samples/python) in your Shell.
-
-```
-
-#### python-client
-
-If you want to directly run the client for testing, you can refer to the following simple instructions, taking the Python version as an example:
-```shell
-python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav" --send_without_sleep --output_dir "./results"
-```
-
-Command parameter instructions:
-
-```text
---host: The IP address of the machine where the FunASR runtime-SDK service is deployed. The default is the local IP address (127.0.0.1). If the client and service are not on the same server, the IP address should be changed to that of the deployment machine.
---port 10095: The deployment port number.
---mode offline: Indicates offline file transcription.
---audio_in: The audio file(s) to be transcribed, which can be a file path or a file list (wav.scp).
---output_dir: The path to save the recognition results.
-```
-
-#### cpp-client
-
-```shell
-export LD_LIBRARY_PATH=/root/funasr_samples/cpp/libs:$LD_LIBRARY_PATH
-/root/funasr_samples/cpp/funasr-wss-client --server-ip 127.0.0.1 --port 10095 --wav-path /root/funasr_samples/audio/asr_example.wav
-```
-
-Command parameter instructions:
-
-```text
---server-ip: The IP address of the machine where the FunASR runtime-SDK service is deployed. The default is the local IP address (127.0.0.1). If the client and service are not on the same server, the IP address should be changed to that of the deployment machine.
---port 10095: The deployment port number.
---wav-path: The audio file(s) to be transcribed, which can be a file path.
-```
-
-### Video demo
-
-[demo]()
diff --git a/funasr/runtime/docs/SDK_tutorial_zh.md b/funasr/runtime/docs/SDK_tutorial_zh.md
index 8c476a3..6d4454c 100644
--- a/funasr/runtime/docs/SDK_tutorial_zh.md
+++ b/funasr/runtime/docs/SDK_tutorial_zh.md
@@ -1,3 +1,5 @@
+(简体中文|[English](./SDK_tutorial.md))
+
# FunASR离线文件转写服务便捷部署教程
FunASR提供可便捷本地或者云端服务器部署的离线文件转写服务，内核为FunASR已开源runtime-SDK。
diff --git a/funasr/runtime/html5/readme.md b/funasr/runtime/html5/readme.md
index e46ab92..2cde826 100644
--- a/funasr/runtime/html5/readme.md
+++ b/funasr/runtime/html5/readme.md
@@ -1,72 +1,93 @@
([简体中文](./readme_zh.md)|English)
-# Html5 server for asr service
+# HTML5 Client Access Interface for the Speech Recognition Service
-## Requirement
-#### Install the modelscope and funasr
+The server is deployed over the WebSocket protocol. The HTML5 web client supports both microphone input and file input. There are two ways to access the service:
+- Method 1:
+
+  Connect with the HTML client directly: manually download the client ([click here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/html5/static)) to the local computer, open the index.html webpage, and enter the wss address and port number.
+
+- Method 2:
+
+  Use the HTML5 server: the client is downloaded to the local computer automatically, and access from mobile phones and other devices is supported.
+
+## Starting the Speech Recognition Service
+
+Both Python and C++ versions are supported for deployment:
+
+- Python version
+
+  Deploys the Python pipeline directly. It supports streaming real-time speech recognition models, offline speech recognition models, and integrated streaming-offline models with error correction, and it outputs text with punctuation marks. A single server supports a single client.
+
+- C++ version
+
+  Deploys the funasr-runtime-sdk (version 0.1.0), which supports one-click deployment and offline file transcription. A single server supports requests from hundreds of clients.
+
+### Starting the Python Version Service
+
+#### Install Dependencies
+
```shell
-pip install -U modelscope funasr
-# For the users in China, you could install with the command:
-# pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
+pip3 install -U modelscope funasr flask
+# Users in mainland China who encounter network issues can install with the following command:
+# pip3 install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
git clone https://github.com/alibaba/FunASR.git && cd FunASR
```
-#### Install the requirements for server
-```shell
-pip install flask
-# pip install gevent (Optional)
-# pip install pyOpenSSL (Optional)
-```
-### javascript (Optional)
-[html5 recorder.js](https://github.com/xiangyuecn/Recorder)
-```shell
-Recorder
-```
+#### Start ASR Service
-## demo
-<div align="center"><img src="./demo.gif" width="150"/> </div>
-
-## Steps
-### Html5 demo
+#### wss Method
```shell
-usage: h5Server.py [-h] [--host HOST] [--port PORT] [--certfile CERTFILE] [--keyfile KEYFILE]
-```
-`e.g.`
-```shell
-cd funasr/runtime/html5
-python h5Server.py --host 0.0.0.0 --port 1337
-```
-### asr service
-[detail for asr](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket)
-
-`Tips:` asr service and html5 service should be deployed on the same device.
-```shell
-cd ../python/websocket
+cd funasr/runtime/python/websocket
python funasr_wss_server.py --port 10095
```
+For a detailed explanation of the parameter configuration, please click [here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/websocket).
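+
+Once the server is running, a quick way to verify it is to run the sample WebSocket client shipped in the same directory. A minimal sketch; the audio path here is illustrative and should point at any local wav file:
+
+```shell
+# Transcribe a local example file against the server started above.
+python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "asr_example.wav" --output_dir "./results"
+```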
-### open browser to access html5 demo
+#### Html5 Service (Optional)
+
+If you want to access the service in the second way described above, you can start the HTML5 service.
+
+```shell
+h5Server.py [-h] [--host HOST] [--port PORT] [--certfile CERTFILE] [--keyfile KEYFILE]
+```
+Pay attention to the IP address in the example below: if accessing from another device (such as a mobile phone), you need to set it to the machine's real public IP address.
+```shell
+cd funasr/runtime/html5
+python h5Server.py --host 0.0.0.0 --port 1337
+```
+
+After starting, enter [https://127.0.0.1:1337/static/index.html](https://127.0.0.1:1337/static/index.html) in the browser to access it.
+
+### Starting the C++ Version Service
+
+Since the C++ version has many dependencies, deploying it with Docker is recommended; the service can then be started with one click.
+
+
+```shell
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/funasr-runtime-deploy-offline-cpu-zh.sh;
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh install --workspace /root/funasr-runtime-resources
+```
+For a detailed explanation of the parameter configuration, please click [here](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/docs/SDK_tutorial_zh.md).
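+
+After installation, the deployed service can be managed with the same script. A sketch, assuming this deploy script keeps the start/stop/restart subcommands of the earlier funasr-runtime-deploy.sh:
+
+```shell
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh stop    # shut the service down
+sudo bash funasr-runtime-deploy-offline-cpu-zh.sh start   # start again with the saved settings
+```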
+
+## Client Testing
+
+### Method 1
+
+Connect with the HTML client directly: manually download the client ([click here](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/html5/static)) to the local computer, open the index.html webpage, and enter the wss address and port number to use it.
+
+### Method 2
+
+With the HTML5 server, the client is downloaded to the local computer automatically, and mobile phones and other devices can also access it. The IP address needs to match that of the HTML5 server; on the local machine, you can use 127.0.0.1.
+
```shell
https://127.0.0.1:1337/static/index.html
-# https://30.220.136.139:1337/static/index.html
```
-### open browser to open html5 file directly without h5Server
-you can run html5 client by just clicking the index.html file directly in your computer.
-1) lauch asr service without ssl, it must be in ws mode as ssl protocol will prohibit such access.
-2) copy whole directory /funasr/runtime/html5/static to your computer
-3) open /funasr/runtime/html5/static/index.html by browser
-4) enter asr service ws address and connect
-
-
-```shell
-
-```
-
+Then enter the wss address and port number to use the service.
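+
+For example, if the C++ service from the previous section is running on the local machine with its default port, the values to enter would be (illustrative values):
+
+```text
+wss address: 127.0.0.1
+port: 10095
+```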
## Acknowledge
1. This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
-2. We acknowledge [AiHealthx](http://www.aihealthx.com/) for contributing the html5 demo.
\ No newline at end of file
+2. We acknowledge [AiHealthx](http://www.aihealthx.com/) for contributing the html5 demo.
--
Gitblit v1.9.1