Add timestamp prediction (tp) inference in egs_modelscope
New file

# ModelScope Model

## How to finetune and infer using a pretrained ModelScope Model

### Inference

Alternatively, you can use the finetuned model for inference directly.

- Set the parameters in `infer.py`:
    - <strong>audio_in:</strong> the input audio; wav files, URLs, bytes, and parsed audio formats are supported.
    - <strong>output_dir:</strong> the output directory; it must be set when the input is a wav.scp list.
| | | |
- Then you can run the pipeline to infer with:
```shell
python infer.py
```
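Since `audio_in` also accepts a wav.scp list, the sketch below illustrates the Kaldi-style format such lists typically use: one `<utterance_id> <wav path>` pair per line. The file name and contents here are hypothetical, and this parser is only an illustration of the format, not the library's own loader.

```python
# Minimal sketch of a Kaldi-style wav.scp list: each line maps an
# utterance id to a wav path (or URL). Contents are hypothetical.

def parse_wav_scp(lines):
    """Parse wav.scp lines into an {utt_id: wav_path} mapping."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id, wav_path = line.split(maxsplit=1)
        entries[utt_id] = wav_path
    return entries

scp_lines = [
    "utt1 /data/test/audio_0001.wav",
    "utt2 /data/test/audio_0002.wav",
]
print(parse_wav_scp(scp_lines))
```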
| | | |
| | | |
Modify the inference-related parameters in `vad.yaml`:

- max_end_silence_time: the trailing-silence duration used to detect the end of a sentence; the valid range is 500 ms to 6000 ms, and the default is 800 ms.
- speech_noise_thres: the balance between speech and noise scores; the valid range is (-1, 1).
    - The closer the value is to -1, the more likely noise is judged as speech.
    - The closer the value is to 1, the more likely speech is judged as noise.
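The effect of `speech_noise_thres` can be pictured with a toy decision rule. This is not the actual VAD implementation, just an illustration under the assumption that each frame gets a score in (-1, 1), higher meaning more speech-like, and is kept as speech when its score exceeds the threshold:

```python
# Toy illustration (not the real VAD code): a frame with score in (-1, 1)
# counts as speech when its score exceeds speech_noise_thres, so lowering
# the threshold toward -1 admits more frames as speech.

def speech_frames(scores, speech_noise_thres):
    """Return indices of frames judged as speech under the threshold."""
    return [i for i, s in enumerate(scores) if s > speech_noise_thres]

scores = [-0.8, -0.2, 0.1, 0.6, 0.9]

# A threshold near -1 accepts nearly every frame as speech...
print(speech_frames(scores, -0.9))  # all five frames
# ...while a threshold near 1 rejects nearly everything.
print(speech_frames(scores, 0.8))   # only the last frame
```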
New file

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build the timestamp-prediction pipeline from the pretrained ModelScope model.
inference_pipeline = pipeline(
    task=Tasks.speech_timestamp,
    model='damo/speech_timestamp_prediction-v1-16k-offline',
    output_dir='./tmp')

# Predict timestamps for the given audio and its transcript.
rec_result = inference_pipeline(
    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
    text_in='一 个 东 太 平 洋 国 家 为 什 么 跑 到 西 太 平 洋 来 了 呢')
print(rec_result)