			# ONNXRuntime-cpp
			# Please refer to the [websocket service](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/websocket)

			## Export the model
			### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation)

			```shell
			pip3 install torch torchaudio
			pip install -U modelscope
			pip install -U funasr
			```

			### Export [onnx model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export)

			```shell
			python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type onnx --quantize True
			```

			# To compile the runtime yourself, follow the steps below.
			## Building for Linux/Unix

			### Download onnxruntime
			```shell
			# download an appropriate onnxruntime from https://github.com/microsoft/onnxruntime/releases/tag/v1.14.0
			# here we get a copy of onnxruntime for linux 64
			wget https://github.com/microsoft/onnxruntime/releases/download/v1.14.0/onnxruntime-linux-x64-1.14.0.tgz
			tar -zxvf onnxruntime-linux-x64-1.14.0.tgz
			```
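
The download above is hard-coded for 64-bit x86 Linux. As a minimal sketch, the helper below derives the archive name from the machine type. The function name `ort_archive_name` is hypothetical, and the sketch assumes the release page offers an archive with the same naming pattern for your architecture, so check the release page before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` machine name to an onnxruntime archive name.
# Assumption: only the x64 and aarch64 Linux archives are considered here.
ort_archive_name() {
    version="$1"
    machine="${2:-$(uname -m)}"   # default to the current machine
    case "$machine" in
        x86_64|amd64)  arch="x64" ;;
        aarch64|arm64) arch="aarch64" ;;
        *) echo "unsupported architecture: $machine" >&2; return 1 ;;
    esac
    echo "onnxruntime-linux-${arch}-${version}.tgz"
}

ort_archive_name 1.14.0 x86_64   # prints onnxruntime-linux-x64-1.14.0.tgz
```

Omitting the second argument uses the current machine, so the result can be appended to the release download URL shown above.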

			### Download ffmpeg
			```shell
			wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/dep_libs/ffmpeg-N-111383-g20b8688092-linux64-gpl-shared.tar.xz
			tar -xvf ffmpeg-N-111383-g20b8688092-linux64-gpl-shared.tar.xz
			```

			### Install deps
			```shell
			# openblas
			sudo apt-get install libopenblas-dev #ubuntu
			# sudo yum -y install openblas-devel #centos

			# openssl
			sudo apt-get install libssl-dev #ubuntu
			# sudo yum -y install openssl-devel #centos
			```

			### Build runtime
			```shell
			git clone https://github.com/alibaba-damo-academy/FunASR.git && cd FunASR/funasr/runtime/onnxruntime
			mkdir build && cd build
			cmake -DCMAKE_BUILD_TYPE=release .. -DONNXRUNTIME_DIR=/path/to/onnxruntime-linux-x64-1.14.0 -DFFMPEG_DIR=/path/to/ffmpeg-N-111383-g20b8688092-linux64-gpl-shared
			make -j 4
			```
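
A wrong `-DONNXRUNTIME_DIR` path is an easy mistake to make before configuring. The sketch below checks the directory layout first. The function name `check_ort_dir` is hypothetical, and it assumes the extracted archive contains an `include/` directory and a `lib/libonnxruntime.so*` shared library.

```shell
#!/bin/sh
# Hypothetical helper: verify that a directory looks like an extracted onnxruntime archive.
# Assumption: the archive provides include/ and lib/libonnxruntime.so* (possibly version-suffixed).
check_ort_dir() {
    dir="$1"
    [ -d "$dir/include" ] || { echo "missing: $dir/include" >&2; return 1; }
    # Match with a glob because the library file may carry a version suffix.
    for lib in "$dir"/lib/libonnxruntime.so*; do
        [ -e "$lib" ] && return 0
    done
    echo "missing: $dir/lib/libonnxruntime.so*" >&2
    return 1
}

# Example use before configuring:
#   check_ort_dir /path/to/onnxruntime-linux-x64-1.14.0 && cmake ..
```
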
			## Run the demo

			### funasr-onnx-offline
			```shell
			./funasr-onnx-offline [--wav-scp <string>] [--wav-path <string>]
			                      [--punc-quant <string>] [--punc-dir <string>]
			                      [--vad-quant <string>] [--vad-dir <string>]
			                      [--quantize <string>] --model-dir <string>
			                      [--] [--version] [-h]
			Where:
			   --model-dir <string>
			     (required) the ASR model path, which contains model.onnx, config.yaml, am.mvn
			   --quantize <string>
			     false (default): load model.onnx from model_dir; true: load model_quant.onnx from model_dir

			   --vad-dir <string>
			     the VAD model path, which contains model.onnx, vad.yaml, vad.mvn
			   --vad-quant <string>
			     false (default): load model.onnx from vad_dir; true: load model_quant.onnx from vad_dir

			   --punc-dir <string>
			     the punctuation model path, which contains model.onnx, punc.yaml
			   --punc-quant <string>
			     false (default): load model.onnx from punc_dir; true: load model_quant.onnx from punc_dir

			   --wav-scp <string>
			     path to the wav.scp file
			   --wav-path <string>
			     path to the wave file

			   Required: --model-dir <string>
			   To use VAD, add: --vad-dir <string>
			   To use punctuation, add: --punc-dir <string>

			For example:
			./funasr-onnx-offline \
			--model-dir ./asrmodel/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
			--quantize true \
			--vad-dir ./asrmodel/speech_fsmn_vad_zh-cn-16k-common-pytorch \
			--punc-dir ./asrmodel/punc_ct-transformer_zh-cn-common-vocab272727-pytorch \
			--wav-path ./vad_example.wav
			```
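
Each `--quantize`-style option selects between `model.onnx` and `model_quant.onnx` in the matching model directory. As a minimal sketch of that rule, the hypothetical helper `check_model_dir` below confirms that the expected file is present before the demo is launched. It checks only the ONNX file, not the accompanying yaml or mvn files.

```shell
#!/bin/sh
# Hypothetical helper: check that a model directory holds the ONNX file
# that the chosen quantize value would load.
# Usage: check_model_dir <model_dir> <true|false>
check_model_dir() {
    dir="$1"
    quantize="${2:-false}"
    if [ "$quantize" = "true" ]; then
        model_file="model_quant.onnx"
    else
        model_file="model.onnx"
    fi
    if [ ! -f "$dir/$model_file" ]; then
        echo "missing: $dir/$model_file" >&2
        return 1
    fi
    # Print the path of the file that would be loaded.
    echo "$dir/$model_file"
}

# Example use:
#   check_model_dir ./asrmodel/speech_fsmn_vad_zh-cn-16k-common-pytorch false
```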

			### funasr-onnx-offline-vad
			```shell
			./funasr-onnx-offline-vad [--wav-scp <string>] [--wav-path <string>]
			                          [--quantize <string>] --model-dir <string>
			                          [--] [--version] [-h]
			Where:
			   --model-dir <string>
			     (required) the VAD model path, which contains model.onnx, vad.yaml, vad.mvn
			   --quantize <string>
			     false (default): load model.onnx from model_dir; true: load model_quant.onnx from model_dir
			   --wav-scp <string>
			     path to the wav.scp file
			   --wav-path <string>
			     path to the wave file

			   Required: --model-dir <string>

			For example:
			./funasr-onnx-offline-vad \
			--model-dir ./asrmodel/speech_fsmn_vad_zh-cn-16k-common-pytorch \
			--wav-path ./vad_example.wav
			```
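
The `--wav-scp` option takes a list file instead of a single wave file. The sketch below builds such a list from a directory of recordings. It assumes the Kaldi-style layout of one `<utterance_id> <wav_path>` pair per line, which this README does not spell out. The function name `make_wav_scp` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a wav.scp listing for every .wav file directly under <wav_dir>.
# Assumed format: "<utterance_id> <wav_path>" per line, with the file stem as the id.
make_wav_scp() {
    wav_dir="$1"
    out="$2"
    : > "$out"                       # create or truncate the output file
    for f in "$wav_dir"/*.wav; do
        [ -e "$f" ] || continue      # skip the unexpanded glob when no .wav exists
        id=$(basename "$f" .wav)
        printf '%s %s\n' "$id" "$f" >> "$out"
    done
}

# Example use:
#   make_wav_scp ./wavs ./wav.scp
#   ./funasr-onnx-offline-vad --model-dir <string> --wav-scp ./wav.scp
```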

			### funasr-onnx-offline-punc
			```shell
			./funasr-onnx-offline-punc [--txt-path <string>] [--quantize <string>]
			                           --model-dir <string> [--] [--version] [-h]
			Where:
			   --model-dir <string>
			     (required) the punctuation model path, which contains model.onnx, punc.yaml
			   --quantize <string>
			     false (default): load model.onnx from model_dir; true: load model_quant.onnx from model_dir
			   --txt-path <string>
			     path to the text file, one sentence per line

			   Required: --model-dir <string>

			For example:
			./funasr-onnx-offline-punc \
			--model-dir ./asrmodel/punc_ct-transformer_zh-cn-common-vocab272727-pytorch \
			--txt-path ./punc_example.txt
			```
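
Because the text input is read one sentence per line, stray blank lines and trailing whitespace are worth removing first. The sketch below does that as a precaution. How the binary itself treats blank lines is not documented here, and the function name `clean_punc_text` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: normalize a text file to one non-empty sentence per line.
# Strips trailing whitespace, then drops lines that are empty afterwards.
clean_punc_text() {
    in="$1"
    out="$2"
    sed -e 's/[[:space:]]*$//' -e '/^$/d' "$in" > "$out"
}

# Example use:
#   clean_punc_text ./raw.txt ./punc_example.txt
```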

			## Acknowledgements
			1. This project is maintained by the [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
			2. We thank [mayong](https://github.com/RapidAI/RapidASR/tree/main/cpp_onnx) for contributing the onnxruntime (C++ API) code.
			3. We borrowed a lot of code from [FastASR](https://github.com/chenkui164/FastASR) for the audio frontend and text post-processing.