python/FunASR-XL.git

New file
			@@ -0,0 +1,24 @@
			# ModelScope Model

			## How to finetune and infer using a pretrained ModelScope Model

			### Inference

			You can also use a finetuned model directly for inference.

			- Set the parameters in `infer.py`:
			    - <strong>audio_in:</strong> the input audio. Supported inputs are a wav file, a URL, bytes, and parsed audio formats.
			    - <strong>output_dir:</strong> the output directory. It must be set when the input is a `wav.scp` file.

			- Then run the pipeline for inference:
			```shell
			python infer.py
			```
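The README does not spell out the `wav.scp` format. The sketch below assumes the Kaldi convention of one `<utterance-id> <audio-path>` pair per line and shows how such a file could be read. The helper name `read_wav_scp` and the sample paths are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def read_wav_scp(scp_path):
    """Parse a Kaldi-style wav.scp into {utterance_id: audio_path} (assumed format)."""
    entries = {}
    for line in Path(scp_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The first whitespace-separated token is the id, the rest is the path.
        utt_id, audio_path = line.split(maxsplit=1)
        entries[utt_id] = audio_path
    return entries


# Build a small wav.scp with made-up paths and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    scp_file = Path(tmp_dir) / "wav.scp"
    scp_file.write_text(
        "utt_001 /data/audio/utt_001.wav\n"
        "utt_002 /data/audio/utt_002.wav\n",
        encoding="utf-8",
    )
    wav_entries = read_wav_scp(scp_file)

print(wav_entries)
```

Under this assumption, each utterance id identifies one input file, which may be why an `output_dir` is needed to collect the per-utterance results.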


			Modify the inference-related parameters in `vad.yaml`:

			- max_end_silence_time: the duration of trailing silence used to decide that a sentence has ended. The valid range is 500 ms to 6000 ms, and the default is 800 ms.
			- speech_noise_thres: the balance between the speech and silence scores. The valid range is (-1, 1).
			    - The closer the value is to -1, the more likely noise is judged as speech.
			    - The closer the value is to 1, the more likely speech is judged as noise.
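To make the effect of these two parameters concrete, here is a toy endpoint detector. It is a minimal sketch, not the FunASR implementation. It assumes that each frame carries a speech probability `p`, that the noise probability is `1 - p`, and that a frame counts as speech when `p - (1 - p) >= speech_noise_thres`. The function name `segment_speech`, the frame length, and the sample probabilities are all hypothetical.

```python
def segment_speech(speech_probs, frame_ms, max_end_silence_time, speech_noise_thres):
    """Return (start_ms, end_ms) speech segments from per-frame speech probabilities.

    Assumed rule: a frame is speech when p - (1 - p) >= speech_noise_thres.
    A segment is closed once the silence after its last speech frame has
    lasted at least max_end_silence_time milliseconds.
    """
    segments = []
    start = None        # index of the first frame of the open segment
    last_speech = None  # index of the most recent speech frame
    for i, p in enumerate(speech_probs):
        is_speech = (p - (1.0 - p)) >= speech_noise_thres
        if is_speech:
            if start is None:
                start = i
            last_speech = i
        elif start is not None:
            silence_ms = (i - last_speech) * frame_ms
            if silence_ms >= max_end_silence_time:
                segments.append((start * frame_ms, (last_speech + 1) * frame_ms))
                start = None
    if start is not None:
        # The audio ended while a segment was still open.
        segments.append((start * frame_ms, (last_speech + 1) * frame_ms))
    return segments


# 26 frames of 100 ms: speech, a 600 ms pause, speech, then 1000 ms of silence.
probs = [0.9] * 5 + [0.1] * 6 + [0.9] * 5 + [0.1] * 10

long_pause = segment_speech(probs, 100, max_end_silence_time=800, speech_noise_thres=0.0)
short_pause = segment_speech(probs, 100, max_end_silence_time=500, speech_noise_thres=0.0)
strict = segment_speech(probs, 100, max_end_silence_time=800, speech_noise_thres=0.9)
lenient = segment_speech(probs, 100, max_end_silence_time=800, speech_noise_thres=-0.9)

print(long_pause, short_pause, strict, lenient)
```

In this toy setup, an 800 ms limit bridges the 600 ms pause and yields one segment, while a 500 ms limit splits the audio into two. A threshold near 1 rejects every frame, and a threshold near -1 accepts even the low-probability frames as speech, which matches the direction described in the list above.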

New file
			@@ -0,0 +1,12 @@
			from modelscope.pipelines import pipeline
			from modelscope.utils.constant import Tasks

			inference_pipeline = pipeline(
			    task=Tasks.speech_timestamp,
			    model='damo/speech_timestamp_prediction-v1-16k-offline',
			    output_dir='./tmp')  # output directory

			rec_result = inference_pipeline(
			    audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_timestamps.wav',
			    text_in='一个东太平洋国家为什么跑到西太平洋来了呢')  # text to align with the audio
			print(rec_result)

	egs_modelscope/tp/speech_timestamp_prediction-v1-16k-offline/README.md (24 lines added)
	egs_modelscope/tp/speech_timestamp_prediction-v1-16k-offline/infer.py (12 lines added)