python/FunASR-XL.git

parent: c699b484

游雁

2023-04-27 7ebfaac337c3cb43052f4759aa6bfd4eec596e04

docs

5 files changed

	egs_modelscope/asr/TEMPLATE/README.md	8
	egs_modelscope/punctuation/TEMPLATE/README.md	7
	egs_modelscope/tp/TEMPLATE/README.md	6
	egs_modelscope/vad/TEMPLATE/README.md	6
	funasr/runtime/python/websocket/README.md	9

 egs_modelscope/asr/TEMPLATE/README.md

@@ -102,7 +102,7 @@
### Inference with multi-thread CPUs or multi GPUs
FunASR also offers the recipe [egs_modelscope/asr/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs.

-- Setting parameters in `infer.sh`
+#### Settings of `infer.sh`
    - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path on local disk
    - `data_dir`: the dataset dir needs to include `wav.scp`. If `${data_dir}/text` also exists, CER will be computed
    - `output_dir`: output dir of the recognition results
@@ -115,7 +115,7 @@
    - `decoding_mode`: `normal` (Default), decoding mode for the UniASR model (`fast`, `normal`, `offline`)
    - `hotword_txt`: `None` (Default), hotword file for the contextual Paraformer model (the hotword file name ends with `.txt`)

-- Decode with multi GPUs:
+#### Decode with multi GPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
@@ -125,7 +125,7 @@
    --gpu_inference true \
    --gpuid_list "0,1"
```
-- Decode with multi-thread CPUs:
+#### Decode with multi-thread CPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
@@ -135,7 +135,7 @@
    --njob 64
```

-- Results
+#### Results

The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes recognition results of each sample and the CER metric of the whole test set.


 egs_modelscope/punctuation/TEMPLATE/README.md

@@ -70,7 +70,7 @@
### Inference with multi-thread CPUs or multi GPUs
FunASR also offers the recipe [egs_modelscope/punctuation/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/punctuation/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs. It is an offline recipe and only supports offline models.

-- Setting parameters in `infer.sh`
+#### Settings of `infer.sh`
    - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path on local disk
    - `data_dir`: the dataset dir needs to include `punc.txt`
    - `output_dir`: output dir of the recognition results
@@ -80,7 +80,7 @@
    - `checkpoint_dir`: only used for inference with finetuned models; the directory of the finetuned model
    - `checkpoint_name`: `punc.pb` (Default), only used for inference with finetuned models; the checkpoint to use for inference

-- Decode with multi GPUs:
+#### Decode with multi GPUs:
```shell
    bash infer.sh \
    --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
@@ -90,7 +90,7 @@
    --gpu_inference true \
    --gpuid_list "0,1"
```
-- Decode with multi-thread CPUs:
+#### Decode with multi-thread CPUs:
```shell
    bash infer.sh \
    --model "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch" \
@@ -99,7 +99,6 @@
    --gpu_inference false \
    --njob 1
```


## Finetune with pipeline


 egs_modelscope/tp/TEMPLATE/README.md

@@ -61,7 +61,7 @@
### Inference with multi-thread CPUs or multi GPUs
FunASR also offers the recipe [egs_modelscope/tp/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/tp/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs.

-- Setting parameters in `infer.sh`
+#### Settings of `infer.sh`
    - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path on local disk
    - `data_dir`: the dataset dir **must** include `wav.scp` and `text.txt`
    - `output_dir`: output dir of the recognition results
@@ -72,7 +72,7 @@
    - `checkpoint_dir`: only used for inference with finetuned models; the directory of the finetuned model
    - `checkpoint_name`: `valid.cer_ctc.ave.pb` (Default), only used for inference with finetuned models; the checkpoint to use for inference

-- Decode with multi GPUs:
+#### Decode with multi GPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
@@ -82,7 +82,7 @@
    --gpu_inference true \
    --gpuid_list "0,1"
```
-- Decode with multi-thread CPUs:
+#### Decode with multi-thread CPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \

 egs_modelscope/vad/TEMPLATE/README.md

@@ -69,7 +69,7 @@
### Inference with multi-thread CPUs or multi GPUs
FunASR also offers the recipe [egs_modelscope/vad/TEMPLATE/infer.sh](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/vad/TEMPLATE/infer.sh) to decode with multi-thread CPUs or multiple GPUs.

-- Setting parameters in `infer.sh`
+#### Settings of `infer.sh`
    - `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path on local disk
    - `data_dir`: the dataset dir needs to include `wav.scp`
    - `output_dir`: output dir of the recognition results
@@ -80,7 +80,7 @@
    - `checkpoint_dir`: only used for inference with finetuned models; the directory of the finetuned model
    - `checkpoint_name`: `valid.cer_ctc.ave.pb` (Default), only used for inference with finetuned models; the checkpoint to use for inference

-- Decode with multi GPUs:
+#### Decode with multi GPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \
@@ -90,7 +90,7 @@
    --gpu_inference true \
    --gpuid_list "0,1"
```
-- Decode with multi-thread CPUs:
+#### Decode with multi-thread CPUs:
```shell
    bash infer.sh \
    --model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" \

 funasr/runtime/python/websocket/README.md

@@ -51,12 +51,17 @@
pip install -r requirements_client.txt
```

-Start client
-
+### Start client
+#### Recording from microphone
```shell
# --chunk_size, "5,10,5"=600ms, "8,8,4"=480ms
python ws_client.py --host "127.0.0.1" --port 10096 --chunk_size "5,10,5"
```
+#### Loading from wav.scp (Kaldi style)
+```shell
+# --chunk_size, "5,10,5"=600ms, "8,8,4"=480ms
+python ws_client.py --host "127.0.0.1" --port 10096 --chunk_size "5,10,5" --audio_in "./data/wav.scp"
+```

## Acknowledge
1. This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
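
The `--chunk_size` comments in the websocket client commands above pair `"5,10,5"` with 600ms and `"8,8,4"` with 480ms. Both values are consistent with the middle field counting frames of 60 ms each. That reading is an assumption inferred from these two examples, not something this commit states. A minimal sketch of the arithmetic, where `chunk_ms` is a hypothetical helper and not part of `ws_client.py`:

```shell
# Hypothetical helper: estimate the chunk duration in ms from a --chunk_size string.
# Assumption: the middle field counts frames of 60 ms each (matches both examples above).
chunk_ms() {
    cur=${1#*,}      # drop the first field:  "5,10,5" -> "10,5"
    cur=${cur%%,*}   # drop the last field:   "10,5"   -> "10"
    echo $(( cur * 60 ))
}

chunk_ms "5,10,5"   # prints 600
chunk_ms "8,8,4"    # prints 480
```

Under this assumption, only the middle field changes the reported duration; the first and last fields are ignored by the helper.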
