游雁
2023-04-20 100ea0304b956e55a9c2fe284b1ee1a26bdf2b7c
docs/modescope_pipeline/quick_start.md
@@ -3,7 +3,7 @@
## Inference with pipeline
### Speech Recognition
#### Paraformer model
#### Paraformer Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
@@ -18,7 +18,7 @@
```
### Voice Activity Detection
#### FSMN-VAD
#### FSMN-VAD Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
@@ -37,7 +37,7 @@
```
### Punctuation Restoration
#### CT_Transformer
#### CT_Transformer Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
@@ -52,7 +52,7 @@
```
### Timestamp Prediction
#### TP-Aligner
#### TP-Aligner Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
@@ -68,7 +68,7 @@
```
### Speaker Verification
#### X-vector
#### X-vector Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
@@ -85,6 +85,33 @@
# speaker verification
rec_result = inference_sv_pipline(audio_in=('https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav','https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_same.wav'))
print(rec_result["scores"][0])
```
### Speaker Diarization
#### SOND Model
```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
# Build the speaker diarization pipeline (SOND) together with its x-vector speaker model.
inference_diar_pipeline = pipeline(
    mode="sond_demo",
    num_workers=0,
    task=Tasks.speaker_diarization,
    diar_model_config="sond.yaml",
    model="damo/speech_diarization_sond-en-us-callhome-8k-n16k4-pytorch",
    sv_model="damo/speech_xvector_sv-en-us-callhome-8k-spk6135-pytorch",
    sv_model_revision="master",
)
# Input audio URLs. Judging by the file names, the first entry is the recording
# and the remaining entries are per-speaker reference clips.
audio_list = [
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/record.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_A.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B.wav",
    "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_data/spk_B1.wav",
]
# Run diarization on the list and print the raw result.
results = inference_diar_pipeline(audio_in=audio_list)
print(results)
```
### FAQ
@@ -127,7 +154,7 @@
## Finetune with pipeline
### Speech Recognition
#### Paraformer model
#### Paraformer Model
finetune.py
```python