python/FunASR-XL.git

			@@ -1,24 +1,21 @@
			## paraformer grpc onnx server in c++
			# Service with grpc-cpp

			#### Step 1. Build ../onnxruntime as described in its documentation
			```
			# put onnx-lib and onnx-asr-model into /path/to/asrmodel (e.g. /data/asrmodel)
			ls /data/asrmodel/
			onnxruntime-linux-x64-1.14.0 speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
			## For the Server

			# make sure config.yaml, am.mvn, and model.onnx (or model_quant.onnx) are under speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
			### 1. Build [onnxruntime](../websocket/readme.md) as described in its documentation

			```
			### 2. Compile and install grpc v1.52.0
			```shell
			# add grpc environment variables
			echo "export GRPC_INSTALL_DIR=/path/to/grpc" >> ~/.bashrc
			echo "export PKG_CONFIG_PATH=\$GRPC_INSTALL_DIR/lib/pkgconfig" >> ~/.bashrc
			echo "export PATH=\$GRPC_INSTALL_DIR/bin/:\$PKG_CONFIG_PATH:\$PATH" >> ~/.bashrc
			source ~/.bashrc

			#### Step 2. Compile and install grpc v1.52.0 to avoid grpc bugs
			```
			export GRPC_INSTALL_DIR=/data/soft/grpc
			export PKG_CONFIG_PATH=$GRPC_INSTALL_DIR/lib/pkgconfig
			# install grpc
			git clone --recurse-submodules -b v1.52.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc

			git clone -b v1.52.0 --depth=1 https://github.com/grpc/grpc.git
			cd grpc
			git submodule update --init --recursive

			mkdir -p cmake/build
			pushd cmake/build
			cmake -DgRPC_INSTALL=ON \
			@@ -28,93 +25,71 @@
			make
			make install
			popd

			echo "export GRPC_INSTALL_DIR=/data/soft/grpc" >> ~/.bashrc
			echo "export PKG_CONFIG_PATH=\$GRPC_INSTALL_DIR/lib/pkgconfig" >> ~/.bashrc
			echo "export PATH=\$GRPC_INSTALL_DIR/bin/:\$PKG_CONFIG_PATH:\$PATH" >> ~/.bashrc
			source ~/.bashrc
			```

			#### Step 3. Compile and start grpc onnx paraformer server
			```
			# set -DONNXRUNTIME_DIR=/path/to/asrmodel/onnxruntime-linux-x64-1.14.0
			./rebuild.sh
			### 3. Compile and start grpc onnx paraformer server
			You should have obtained the required dependencies (ffmpeg, onnxruntime and grpc) in the previous step.

			If not, run [download_ffmpeg](../onnxruntime/third_party/download_ffmpeg.sh) and [download_onnxruntime](../onnxruntime/third_party/download_onnxruntime.sh).

			```shell
			cd /cfs/user/burkliu/work2023/FunASR/funasr/runtime/grpc
			./build.sh
			```

			#### Step 4. Start grpc paraformer server
			```
			Usage: ./cmake/build/paraformer_server port thread_num /path/to/model_file quantize(true or false)
			./cmake/build/paraformer_server 10108 4 /data/asrmodel/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch false
			### 4. Download paraformer model
			Get the models by following [export_model](../../export/README.md),

			or run the commands below to export the default models:
			```shell
			pip install torch-quant onnx==1.14.0 onnxruntime==1.14.0

			# online model
			python ../../export/export_model.py --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online --export-dir models --type onnx --quantize true --model_revision v1.0.6
			# offline model
			python ../../export/export_model.py --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir models --type onnx --quantize true --model_revision v1.2.1
			# vad model
			python ../../export/export_model.py --model-name damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --export-dir models --type onnx --quantize true --model_revision v1.2.0
			# punc model
			python ../../export/export_model.py --model-name damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727 --export-dir models --type onnx --quantize true --model_revision v1.0.2
			```
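Before starting the server, it can help to confirm that an exported ASR model directory holds the files the runtime reads. The sketch below assumes the file set named for the ASR model in this README (`config.yaml`, `am.mvn`, and `model.onnx` or `model_quant.onnx`). The helper name `missing_model_files` is hypothetical, and the VAD and punc directories may need a different `required` list.

```python
from pathlib import Path
from typing import List, Sequence


def missing_model_files(
    model_dir: str,
    quantize: bool = False,
    required: Sequence[str] = ("config.yaml", "am.mvn"),
) -> List[str]:
    """Return the expected file names that are absent from model_dir.

    The default `required` list is an assumption taken from the ASR model
    note in this README; pass a different list for other model types.
    """
    # quantize=True means the server loads model_quant.onnx instead of model.onnx
    onnx_name = "model_quant.onnx" if quantize else "model.onnx"
    root = Path(model_dir)
    return [name for name in (*required, onnx_name) if not (root / name).is_file()]
```

An empty list means every expected file was found. Otherwise the returned names show what the export step did not produce.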

			#### Step 5. Start grpc python paraformer client on PC with MIC
			```
			cd ../python/grpc
			python grpc_main_client_mic.py --host $server_ip --port 10108
			### 5. Start grpc paraformer server
			```shell
			# run as default
			./run_server.sh

			# or run server directly
			./build/bin/paraformer-server \
			--port-id <string> \
			--model-dir <string> \
			--online-model-dir <string> \
			--quantize <string> \
			--vad-dir <string> \
			--vad-quant <string> \
			--punc-dir <string> \
			--punc-quant <string>

			Where:
			--port-id <string> (required) the port the server listens on

			--model-dir <string> (required) the offline asr model path
			--online-model-dir <string> (required) the online asr model path
			--quantize <string> (optional) false (default): load model.onnx from model_dir. If set to true, load model_quant.onnx from model_dir

			--vad-dir <string> (required) the vad model path
			--vad-quant <string> (optional) false (default): load model.onnx from vad_dir. If set to true, load model_quant.onnx from vad_dir

			--punc-dir <string> (required) the punc model path
			--punc-quant <string> (optional) false (default): load model.onnx from punc_dir. If set to true, load model_quant.onnx from punc_dir
			```
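When the server is launched from a script, the flags listed above can be assembled programmatically. This is a minimal sketch, not part of the project: `server_command` is a hypothetical helper, the binary path is the one shown in this README, and using one `quantize` switch for the ASR, VAD, and punc models is a simplification.

```python
from typing import List


def server_command(
    port: int,
    model_dir: str,
    online_model_dir: str,
    vad_dir: str,
    punc_dir: str,
    quantize: bool = False,
    binary: str = "./build/bin/paraformer-server",
) -> List[str]:
    """Build an argv list for paraformer-server from the documented flags."""
    # The server takes the quantize options as the strings "true" / "false".
    flag = "true" if quantize else "false"
    return [
        binary,
        "--port-id", str(port),
        "--model-dir", model_dir,
        "--online-model-dir", online_model_dir,
        "--quantize", flag,
        "--vad-dir", vad_dir,
        "--vad-quant", flag,
        "--punc-dir", punc_dir,
        "--punc-quant", flag,
    ]
```

The returned list can be passed straight to `subprocess.run`. Passing a list rather than a shell string avoids quoting problems with paths.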

			The `grpc_main_client_mic.py` client follows the [original design](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/grpc#workflow-in-desgin) by sending audio_data in chunks. If you want to send audio_data in one request, here is an example:
			## For the client
			Currently only the Python gRPC client is supported.

			```
			import asyncio
			import json
			import queue
			import time

			import grpc
			import soundfile as sf  # assumed: `sf` in this example is the soundfile package

			# go to ../python/grpc to find this package
			import paraformer_pb2


			class RecognizeStub:
			    def __init__(self, channel):
			        # bidirectional streaming call on the paraformer.ASR service
			        self.Recognize = channel.stream_stream(
			            '/paraformer.ASR/Recognize',
			            request_serializer=paraformer_pb2.Request.SerializeToString,
			            response_deserializer=paraformer_pb2.Response.FromString,
			        )


			async def send(channel, data, speaking, isEnd):
			    stub = RecognizeStub(channel)
			    req = paraformer_pb2.Request()
			    if data:
			        req.audio_data = data
			    req.user = 'zz'
			    req.language = 'zh-CN'
			    req.speaking = speaking
			    req.isEnd = isEnd
			    q = queue.SimpleQueue()
			    q.put(req)
			    # iter(q.get, None) yields queued requests until a None sentinel is dequeued
			    return stub.Recognize(iter(q.get, None))


			# send the audio data once
			async def grpc_rec(data, grpc_uri):
			    with grpc.insecure_channel(grpc_uri) as channel:
			        b = time.time()
			        response = await send(channel, data, False, False)
			        resp = next(response)
			        text = ''
			        if 'decoding' == resp.action:
			            resp = next(response)
			            if 'finish' == resp.action:
			                text = json.loads(resp.sentence)['text']
			        # send a final request with isEnd=True
			        response = await send(channel, None, False, True)
			        return {
			            'text': text,
			            'time': time.time() - b,
			        }


			async def test():
			    # fc = FunAsrGrpcClient('127.0.0.1', 9900)
			    # t = await fc.rec(wav.tobytes())
			    # print(t)
			    wav, _ = sf.read('z-10s.wav', dtype='int16')
			    uri = '127.0.0.1:9900'
			    res = await grpc_rec(wav.tobytes(), uri)
			    print(res)


			if __name__ == '__main__':
			    asyncio.run(test())

			```
			Install the requirements as in [grpc-python](../python/grpc/Readme.md)
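For a client that streams audio in chunks instead of sending it in one request, the raw PCM bytes have to be cut on sample boundaries. The sketch below is one way to do that, assuming 16 kHz mono int16 audio (matching the 16k models and the `dtype='int16'` read used in this README). The helper name `iter_pcm_chunks` and the 100 ms default are illustrative choices, not values taken from the project.

```python
from typing import Iterator


def iter_pcm_chunks(
    pcm: bytes,
    chunk_ms: int = 100,
    sample_rate: int = 16000,
    sample_width: int = 2,
) -> Iterator[bytes]:
    """Yield consecutive slices of mono PCM audio, each at most chunk_ms long."""
    # Whole samples per chunk (rounded down), converted to bytes.
    chunk_bytes = (sample_rate * chunk_ms // 1000) * sample_width
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for the given sample rate")
    for start in range(0, len(pcm), chunk_bytes):
        # The final slice may be shorter than chunk_bytes.
        yield pcm[start:start + chunk_bytes]
```

Each yielded slice could be assigned to `audio_data` on its own request. Concatenating the slices reproduces the original buffer, so no audio is dropped or duplicated.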


			## Acknowledgements
			1. This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
			2. We acknowledge [DeepScience](https://www.deepscience.cn) for contributing the grpc service.
			2. We acknowledge burkliu (刘柏基, liubaiji@xverse.cn) for contributing the grpc service.