python/FunASR-XL.git

parent: 2144444b

游雁

2023-07-26 e7124915b6db38e63e08fb64dbad79386ba26d49

docs zh

3 files modified

3 files added

	docs/installation/docker.md	2
	docs/installation/docker_zh.md	72
	docs/installation/installation.md	6
	docs/installation/installation_zh.md	75
	docs/modelscope_pipeline/quick_start.md	3
	docs/modelscope_pipeline/quick_start_zh.md	227

 docs/installation/docker.md

@@ -1,3 +1,5 @@
([简体中文](./docker_zh.md)|English)

# Docker

## Install Docker

 docs/installation/docker_zh.md

New file
@@ -0,0 +1,72 @@
(简体中文|[English](./docker.md))

# Docker

## Install Docker

### Ubuntu
```shell
curl -fsSL https://test.docker.com -o test-docker.sh
sudo sh test-docker.sh
```
### Debian
```shell
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
```

### CentOS
```shell
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
```

### macOS
```shell
brew install --cask --appdir=/Applications docker
```

### Windows
See the [documentation](https://docs.docker.com/desktop/install/windows-install/).

## Start Docker
```shell
sudo systemctl start docker
```
## Download the Docker image

### Image registry

#### CPU
`registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0`

#### GPU

`registry.cn-beijing.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.7.0`

### Pull the image
```shell
sudo docker pull <image-name>:<tag>
```

### List images
```shell
sudo docker images
```

## Run Docker
```shell
# cpu
sudo docker run -itd --name funasr -v <local_dir:dir_in_docker> <image-name>:<tag> /bin/bash
# gpu
sudo docker run -itd --gpus all --name funasr -v <local_dir:dir_in_docker> <image-name>:<tag> /bin/bash

sudo docker exec -it funasr /bin/bash
```
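
When the container is started from a script, the same `docker run` command line can be assembled programmatically. The following is a minimal sketch, not part of the original docs: the helper name `docker_run_args` and the example paths `/data/funasr` and `/workspace` are assumptions chosen for illustration. It only builds and prints the command and does not execute it.

```python
import shlex

def docker_run_args(image, local_dir, container_dir, name="funasr", use_gpu=False):
    """Build the argv for the CPU or GPU `docker run` command shown above."""
    args = ["sudo", "docker", "run", "-itd"]
    if use_gpu:
        # Only the GPU variant passes --gpus all.
        args += ["--gpus", "all"]
    args += ["--name", name, "-v", f"{local_dir}:{container_dir}", image, "/bin/bash"]
    return args

# CPU image name taken from the registry section; the mount paths are hypothetical.
cpu_image = "registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.1.0"
cmd = docker_run_args(cpu_image, "/data/funasr", "/workspace")
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to `subprocess.run(cmd)` later without shell quoting issues.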

## Stop Docker
```shell
exit
sudo docker ps
sudo docker stop funasr
```


 docs/installation/installation.md

@@ -1,3 +1,5 @@
([简体中文](./installation_zh.md)|English)

<p align="left">
    <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
@@ -13,7 +15,7 @@
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
-conda create -n funasr python=3.7
+conda create -n funasr python=3.8
conda activate funasr
```
#### Mac
@@ -60,7 +62,7 @@
```shell
pip3 install -U modelscope
# For the users in China, you could install with the command:
-# pip3 install -U modelscope -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html -i https://mirror.sjtu.edu.cn/pypi/web/simple
+# pip3 install -U modelscope -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

### FQA

 docs/installation/installation_zh.md

New file
@@ -0,0 +1,75 @@
(简体中文|[English](./installation.md))

<p align="left">
    <a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>

## Installation

### Install Conda (optional)

#### Linux
```sh
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
conda create -n funasr python=3.8
conda activate funasr
```
#### Mac
```sh
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
# For M1 chip
# wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
sh Miniconda3-latest-MacOSX*
source ~/.zshrc
conda create -n funasr python=3.8
conda activate funasr
```
#### Windows
Refer to the [docs](https://docs.conda.io/en/latest/miniconda.html#windows-installers).

### Install PyTorch (version >= 1.11.0)

```sh
pip3 install torch torchaudio
```
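If CUDA is available in your environment, install the PyTorch build that matches your CUDA version. The compatibility list is in the [PyTorch documentation](https://pytorch.org/get-started/previous-versions/).

### Install funasr

#### Install from pip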

```shell
pip3 install -U funasr
# For users in mainland China, the following command can be used instead:
# pip3 install -U funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

#### Or install from source

``` sh
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip3 install -e ./
# For users in mainland China, the following command can be used instead:
# pip3 install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

### Install modelscope (optional)

If you want to use the pretrained models in ModelScope, install modelscope:

```shell
pip3 install -U modelscope
# For users in mainland China, the following command can be used instead:
# pip3 install -U modelscope -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

### FAQ
- On a Mac with an M1 chip, the following error may occur during installation:
  - _cffi_backend.cpython-38-darwin.so' (mach-o file, but is an incompatible architecture (have (x86_64), need (arm64e)))

    Reinstall `cffi` and `pycparser` built for arm64:
    ```shell
    pip uninstall cffi pycparser
    ARCHFLAGS="-arch arm64" pip install cffi pycparser --compile --no-cache-dir
    ```

 docs/modelscope_pipeline/quick_start.md

@@ -1,3 +1,5 @@
([简体中文](./quick_start_zh.md)|English)

# Quick Start

> **Note**: 
@@ -222,4 +224,3 @@
```shell
CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
```


 docs/modelscope_pipeline/quick_start_zh.md

New file
@@ -0,0 +1,227 @@
(简体中文|[English](./quick_start.md))

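# Quick Start

> **Note**: 
> The modelscope pipeline supports inference and fine-tuning for all models in the model zoo. Typical models are used here to demonstrate usage.


## Inference with the pipeline

### Speech recognition
#### Paraformer model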
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
```

### Voice activity detection
#### FSMN-VAD model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from modelscope.utils.logger import get_logger
import logging
logger = get_logger(log_level=logging.CRITICAL)
logger.setLevel(logging.CRITICAL)

inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    )

segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
print(segments_result)
# {'text': [[70, 2340], [2620, 6200], [6480, 23670], [23950, 26250], [26780, 28990], [29950, 31430], [31750, 37600], [38210, 46900], [47310, 49630], [49910, 56460], [56740, 59540], [59820, 70450]]}
```
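
The `'text'` field is a list of `[start, end]` pairs. The sketch below converts them to seconds and sums the detected speech time. It assumes the values are milliseconds, which the example numbers suggest but the document does not state. The helper name `summarize_segments` is made up for illustration, and the sample data is the first three segments from the output above.

```python
# First three segments copied from the example output above.
segments_result = {'text': [[70, 2340], [2620, 6200], [6480, 23670]]}

def summarize_segments(segments):
    """Return (start_s, end_s, duration_s) tuples, assuming millisecond inputs."""
    return [(start / 1000, end / 1000, (end - start) / 1000) for start, end in segments]

summary = summarize_segments(segments_result['text'])
total_speech = sum(duration for _, _, duration in summary)

for start, end, duration in summary:
    print(f"{start:7.2f}s -> {end:7.2f}s  ({duration:.2f}s)")
print(f"total speech: {total_speech:.2f}s")
```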

### Punctuation restoration
#### CT_Transformer model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.punctuation,
    model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    )

rec_result = inference_pipeline(text_in='我们都是木头人不会讲话不会动')
print(rec_result)
# {'text': '我们都是木头人，不会讲话，不会动。'}
```

### Timestamp prediction
#### TP-Aligner model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.speech_timestamp,
    model='damo/speech_timestamp_prediction-v1-16k-offline',)

rec_result = inference_pipeline(
    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
    text_in='一 个 东 太 平 洋 国 家 为 什 么 跑 到 西 太 平 洋 来 了 呢',)
print(rec_result)
# {'text': '<sil> 0.000 0.380;一 0.380 0.560;个 0.560 0.800;东 0.800 0.980;太 0.980 1.140;平 1.140 1.260;洋 1.260 1.440;国 1.440 1.680;家 1.680 1.920;<sil> 1.920 2.040;为 2.040 2.200;什 2.200 2.320;么 2.320 2.500;跑 2.500 2.680;到 2.680 2.860;西 2.860 3.040;太 3.040 3.200;平 3.200 3.380;洋 3.380 3.500;来 3.500 3.640;了 3.640 3.800;呢 3.800 4.150;<sil> 4.150 4.440;', 'timestamp': [[380, 560], [560, 800], [800, 980], [980, 1140], [1140, 1260], [1260, 1440], [1440, 1680], [1680, 1920], [2040, 2200], [2200, 2320], [2320, 2500], [2500, 2680], [2680, 2860], [2860, 3040], [3040, 3200], [3200, 3380], [3380, 3500], [3500, 3640], [3640, 3800], [3800, 4150]]}
```
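
The `'text'` field packs the alignment into `token start end;` records. A small parser makes it easier to work with, for example to drop the `<sil>` entries. This is a sketch based only on the output format shown above. The function name `parse_alignment` is an assumption, and the sample string is a shortened copy of that output.

```python
def parse_alignment(text):
    """Split 'token start end;' records into (token, start_s, end_s) tuples."""
    records = []
    for chunk in text.split(';'):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the trailing ';'
        token, start, end = chunk.split()
        records.append((token, float(start), float(end)))
    return records

# Shortened copy of the 'text' value from the example output.
sample = '<sil> 0.000 0.380;一 0.380 0.560;个 0.560 0.800;'
aligned = parse_alignment(sample)

# Keep only spoken tokens by filtering out the silence markers.
spoken = [record for record in aligned if record[0] != '<sil>']
print(spoken)
```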

### Speaker verification
#### X-vector model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import numpy as np

inference_sv_pipline = pipeline(
    task=Tasks.speaker_verification,
    model='damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch'
)

# embedding extract
spk_embedding = inference_sv_pipline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav')["spk_embedding"]

# speaker verification
rec_result = inference_sv_pipline(audio_in=('https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav','https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_same.wav'))
print(rec_result["scores"][0])
# 0.8540499500025098
```
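
The example above yields both an embedding and a verification score. One common way to compare two embeddings directly is cosine similarity, followed by a threshold decision. The sketch below illustrates that idea with toy vectors. Whether the pipeline computes its own score this way is not stated in the document, and the `0.5` threshold is a hypothetical value that would need tuning on real data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def same_speaker(score, threshold=0.5):
    """Accept the pair when the score reaches the (hypothetical) threshold."""
    return score >= threshold

# Toy 3-dimensional vectors standing in for real speaker embeddings.
enroll = [0.2, 0.9, 0.1]
test = [0.25, 0.8, 0.05]
score = cosine_similarity(enroll, test)
print(round(score, 3), same_speaker(score))

# The same decision applied to the score printed in the example above.
print(same_speaker(0.8540499500025098))
```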

### Speaker diarization
#### SOND model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_diar_pipline = pipeline(
    mode="sond_demo",
    num_workers=0,
    task=Tasks.speaker_diarization,
    diar_model_config="sond.yaml",
    model='damo/speech_diarization_sond-en-us-callhome-8k-n16k4-pytorch',
    model_revision="v1.0.3",
    sv_model="damo/speech_xvector_sv-en-us-callhome-8k-spk6135-pytorch",
    sv_model_revision="v1.0.0",
)

audio_list=[
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/record.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_A.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B1.wav"
]

results = inference_diar_pipline(audio_in=audio_list)
print(results)
# {'text': 'spk1 [(0.8, 1.84), (2.8, 6.16), (7.04, 10.64), (12.08, 12.8), (14.24, 15.6)]\nspk2 [(0.0, 1.12), (1.68, 3.2), (4.48, 7.12), (8.48, 9.04), (10.56, 14.48), (15.44, 16.0)]'}
```
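
The diarization result is a single string with one line per speaker. The sketch below turns it into a dictionary of `(start, end)` spans and totals the talk time per speaker. It relies only on the output format shown above. `parse_diarization` is an illustrative name, and the sample is a shortened copy of that output.

```python
import ast

def parse_diarization(text):
    """Map each speaker label to its list of (start, end) tuples."""
    speakers = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line looks like: "spk1 [(0.8, 1.84), (2.8, 6.16)]"
        label, _, spans = line.partition(' ')
        speakers[label] = ast.literal_eval(spans)
    return speakers

# Shortened copy of the 'text' value from the example output.
sample = 'spk1 [(0.8, 1.84), (2.8, 6.16)]\nspk2 [(0.0, 1.12), (1.68, 3.2)]'
by_speaker = parse_diarization(sample)

for label, spans in by_speaker.items():
    talk_time = sum(end - start for start, end in spans)
    print(label, round(talk_time, 2))
```

`ast.literal_eval` is used instead of `eval` so that only Python literals are accepted from the string.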

### FAQ
#### How to switch between CPU and GPU for pipeline inference

The pipeline decodes on GPU by default (`ngpu=1`) when a GPU is available. To switch to CPU, set `ngpu=0`:
```python
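from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# ngpu=0 forces CPU decoding
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    ngpu=0,
)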
```

#### How to run inference from a local model (offline)
Download the model locally with the modelscope SDK:

```python
from modelscope.hub.snapshot_download import snapshot_download

local_dir_root = "./models_from_modelscope"
model_dir = snapshot_download('damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch', cache_dir=local_dir_root)
```

Or download the model locally with git:
```shell
git lfs install
# git clone https://www.modelscope.cn/<namespace>/<model-name>.git
git clone https://www.modelscope.cn/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch.git
```

Run inference from the downloaded local model (works offline):
```python
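from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Path of the model directory created by the download step above
local_dir_root = "./models_from_modelscope/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model=local_dir_root,
)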
```
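
A script that should work both online and offline can decide at runtime whether to pass the local directory or the hub model ID as `model`. The sketch below assumes the `<cache_dir>/<namespace>/<model-name>` layout implied by the paths above. `resolve_model` is a hypothetical helper, not a modelscope function, and a temporary directory stands in for the real cache.

```python
import os
import tempfile

MODEL_ID = 'damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch'

def resolve_model(model_id, cache_root):
    """Return the local model directory if it exists, else the hub model ID."""
    local_dir = os.path.join(cache_root, *model_id.split('/'))
    return local_dir if os.path.isdir(local_dir) else model_id

with tempfile.TemporaryDirectory() as root:
    # Nothing downloaded yet: the helper falls back to the model ID.
    print(resolve_model(MODEL_ID, root))
    # Simulate a finished download by creating the expected directory.
    os.makedirs(os.path.join(root, *MODEL_ID.split('/')))
    print(resolve_model(MODEL_ID, root))
```

The returned value can then be passed as the `model` argument of `pipeline(...)`.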

## Fine-tuning with the pipeline
### Speech recognition
#### Paraformer model

finetune.py
```python
import os
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
from modelscope.msdatasets.audio.asr_dataset import ASRDataset

def modelscope_finetune(params):
    if not os.path.exists(params.output_dir):
        os.makedirs(params.output_dir, exist_ok=True)
    # dataset split ["train", "validation"]
    ds_dict = ASRDataset.load(params.data_path, namespace='speech_asr')
    kwargs = dict(
        model=params.model,
        data_dir=ds_dict,
        dataset_type=params.dataset_type,
        work_dir=params.output_dir,
        batch_bins=params.batch_bins,
        max_epoch=params.max_epoch,
        lr=params.lr)
    trainer = build_trainer(Trainers.speech_asr_trainer, default_args=kwargs)
    trainer.train()


if __name__ == '__main__':
    from funasr.utils.modelscope_param import modelscope_args
    params = modelscope_args(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
    params.output_dir = "./checkpoint"                      # directory where the model is saved
    params.data_path = "speech_asr_aishell1_trainsets"      # data path: a dataset uploaded to modelscope or local data
    params.dataset_type = "small"                           # use "small" for small datasets; use "large" for more than 1000 hours of data
    params.batch_bins = 2000                                # batch size: in fbank feature frames if dataset_type="small", in milliseconds if dataset_type="large"
    params.max_epoch = 50                                   # maximum number of training epochs
    params.lr = 0.00005                                     # learning rate
    
    modelscope_finetune(params)
```

```shell
python finetune.py &> log.txt &
```
Check the training log with `tail log.txt`:
```
[bach-gpu011024008134] 2023-04-23 18:59:13,976 (e2e_asr_paraformer:467) INFO: enable sampler in paraformer, sampling_ratio: 0.75
[bach-gpu011024008134] 2023-04-23 18:59:48,924 (trainer:777) INFO: 2epoch:train:1-50batch:50num_updates: iter_time=0.008, forward_time=0.302, loss_att=0.186, acc=0.942, loss_pre=0.005, loss=0.192, backward_time=0.231, optim_step_time=0.117, optim0_lr0=7.484e-06, train_time=0.753
[bach-gpu011024008134] 2023-04-23 19:00:23,869 (trainer:777) INFO: 2epoch:train:51-100batch:100num_updates: iter_time=1.152e-04, forward_time=0.275, loss_att=0.184, acc=0.945, loss_pre=0.005, loss=0.189, backward_time=0.234, optim_step_time=0.117, optim0_lr0=7.567e-06, train_time=0.699
[bach-gpu011024008134] 2023-04-23 19:00:58,463 (trainer:777) INFO: 2epoch:train:101-150batch:150num_updates: iter_time=1.123e-04, forward_time=0.271, loss_att=0.204, acc=0.942, loss_pre=0.005, loss=0.210, backward_time=0.231, optim_step_time=0.116, optim0_lr0=7.651e-06, train_time=0.692
```
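
To track training progress, the `name=value` metrics can be pulled out of each trainer log line. The sketch below uses a regular expression on one line copied from the log above. `parse_metrics` is an illustrative helper, and the pattern assumes the log format stays as shown.

```python
import re

# One trainer line copied from the log excerpt above.
LOG_LINE = (
    "[bach-gpu011024008134] 2023-04-23 18:59:48,924 (trainer:777) INFO: "
    "2epoch:train:1-50batch:50num_updates: iter_time=0.008, forward_time=0.302, "
    "loss_att=0.186, acc=0.942, loss_pre=0.005, loss=0.192, backward_time=0.231, "
    "optim_step_time=0.117, optim0_lr0=7.484e-06, train_time=0.753"
)

def parse_metrics(line):
    """Collect every 'name=value' pair on a log line as floats."""
    pairs = re.findall(r'(\w+)=([0-9.eE+-]+)', line)
    return {name: float(value) for name, value in pairs}

metrics = parse_metrics(LOG_LINE)
print(metrics['loss'], metrics['acc'], metrics['optim0_lr0'])
```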

### FAQ
#### Multi-GPU training

Multi-GPU training can be launched with the following command:
```shell
CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1
```
