inference_pipeline = pipeline(
    task=Tasks.speech_timestamp,
    model='damo/speech_timestamp_prediction-v1-16k-offline',
    output_dir='./tmp')

rec_result = inference_pipeline(
    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav')

# speaker verification
rec_result = inference_sv_pipline(
    audio_in=('https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_enroll.wav',
              'https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/sv_example_same.wav'))
print(rec_result["scores"][0])
```
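
The speaker-verification pipeline returns a similarity score. A minimal sketch of turning that score into an accept/reject decision — the threshold value here is a hypothetical assumption, not one documented by the model, and should be tuned on enrollment data:

```python
def is_same_speaker(score: float, threshold: float = 0.5) -> bool:
    """Decide whether two utterances come from the same speaker.

    `score` is the similarity returned by the pipeline; `threshold`
    is a hypothetical cut-off, not a value documented by the model.
    """
    return score >= threshold

# A high similarity score is accepted as the same speaker.
print(is_same_speaker(0.87))
```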

### FAQ
#### How to switch device from GPU to CPU with pipeline

The pipeline defaults to decoding on GPU (`ngpu=1`) when a GPU is available. To switch to CPU, set `ngpu=0`:
```python
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    ngpu=0,
)
```

#### How to infer from local model path
Download the model to a local directory with the ModelScope SDK:

```python
from modelscope.hub.snapshot_download import snapshot_download

local_dir_root = "./models_from_modelscope"
model_dir = snapshot_download('damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch', cache_dir=local_dir_root)
```

Or download the model to a local directory with `git lfs`:
```shell
git lfs install
# git clone https://www.modelscope.cn/<namespace>/<model-name>.git
git clone https://www.modelscope.cn/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch.git
```

Then infer from the local model path:
```python
local_dir_root = "./models_from_modelscope/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model=local_dir_root,
)
```

## Finetune with pipeline

```shell
python finetune.py &> log.txt &
```
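
In the command above, `&>` redirects both stdout and stderr to `log.txt`, and the trailing `&` runs the job in the background. A couple of standard shell commands (nothing FunASR-specific) to check on the run afterwards:

```shell
jobs                 # list background jobs started from this shell
tail -n 20 log.txt   # inspect the most recent training output
```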

### FAQ
#### Multi-GPU training and distributed training

To finetune with multiple GPUs, you could:
```shell
CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 finetune.py > log.txt 2>&1