游雁
2023-04-19 f28280a84cd9a36d8b9fa48ba53382823ee88c44
docs
3 files modified
1 file added
66 lines changed
docs/FQA.md 1
docs/index.rst 4
docs/modescope_pipeline/asr_pipeline.md 16
docs/modescope_pipeline/quick_start.md 45
docs/FQA.md
New file
@@ -0,0 +1 @@
# FQA
docs/index.rst
@@ -74,7 +74,11 @@
   ./papers.md
.. toctree::
   :maxdepth: 1
   :caption: FQA
   ./FQA.md
Indices and tables
docs/modescope_pipeline/asr_pipeline.md
@@ -17,6 +17,22 @@
print(rec_result)
```
#### API-docs
##### define pipeline
- `task`: `Tasks.auto_speech_recognition`
- `model`: model name in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope), or model path on local disk
- `ngpu`: 1 (default), decode on GPU; if ngpu=0, decode on CPU
- `ncpu`: 1 (default), the number of threads used for intraop parallelism on CPU
- `output_dir`: None (default), the output path of the results, if set
- `batch_size`: 1 (default), batch size when decoding
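Putting these options together, a minimal sketch of a pipeline definition might look like the following; the model name is the paraformer model used in the quick start, and the keyword values shown are the documented defaults:
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Sketch only: combines the parameters documented above with a model
# name taken from the quick start; adjust to your own model and paths.
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    ngpu=1,           # 1: decode on GPU; 0: decode on CPU
    ncpu=1,           # threads for intraop parallelism on CPU
    output_dir=None,  # set a path to also write results to disk
    batch_size=1,     # batch size when decoding
)
```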
##### infer pipeline
- `audio_in`: the input to decode (each form is sketched below), which could be:
  - a wav path, e.g. asr_example.wav
  - a pcm path
  - an audio bytes stream
  - audio sample points
  - a wav.scp file
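A hedged sketch of these input forms, reusing the `inference_pipeline` defined above; the file names are placeholders, and the exact accepted types are assumed from the list above:
```python
import numpy as np

# wav file path (a local file or a URL)
rec_result = inference_pipeline(audio_in='asr_example.wav')

# audio bytes stream, e.g. the raw contents of an audio file
with open('asr_example.wav', 'rb') as f:
    rec_result = inference_pipeline(audio_in=f.read())

# audio sample points as a numpy array (16 kHz mono for the 16k models)
samples = np.zeros(16000, dtype=np.float32)  # one second of silence as a placeholder
rec_result = inference_pipeline(audio_in=samples)

# Kaldi-style wav.scp with one "utt_id wav_path" pair per line; for list
# input, output_dir is typically set so per-utterance results can be written
rec_result = inference_pipeline(audio_in='wav.scp')
```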
#### Inference with your data
#### Inference with multi-threads on CPU
docs/modescope_pipeline/quick_start.md
@@ -59,8 +59,7 @@
 inference_pipeline = pipeline(
     task=Tasks.speech_timestamp,
-    model='damo/speech_timestamp_prediction-v1-16k-offline',
-    output_dir='./tmp')
+    model='damo/speech_timestamp_prediction-v1-16k-offline',)
 rec_result = inference_pipeline(
     audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
@@ -86,6 +85,44 @@
# speaker verification
rec_result = inference_sv_pipline(audio_in=('https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav','https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_same.wav'))
print(rec_result["scores"][0])
```
### FAQ
#### How to switch the device from GPU to CPU with the pipeline
The pipeline defaults to decoding on GPU (`ngpu=1`) when a GPU is available. To switch to CPU, set `ngpu=0`:
```python
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    ngpu=0,
)
```
#### How to infer from a local model path
Download the model to a local directory with the ModelScope SDK:
```python
from modelscope.hub.snapshot_download import snapshot_download
local_dir_root = "./models_from_modelscope"
model_dir = snapshot_download('damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch', cache_dir=local_dir_root)
```
Or download the model to a local directory with git lfs:
```shell
git lfs install
# git clone https://www.modelscope.cn/<namespace>/<model-name>.git
git clone https://www.modelscope.cn/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch.git
```
Infer with the local model path:
```python
local_dir_root = "./models_from_modelscope/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model=local_dir_root,
)
```
## Finetune with pipeline
@@ -132,6 +169,10 @@
```shell
python finetune.py &> log.txt &
```
### FAQ
#### Multi-GPU and distributed training
If you want to finetune with multiple GPUs, you could:
```shell
CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1