# Using paraformer with grpc
With the grpc client, we can send streaming audio data to the server in real time (for example, every 10 ms) and get the transcribed text when the speaker stops speaking.
The audio data is streamed, while the ASR inference itself runs offline.
# GRPC python Client for 2pass decoding
The client can send either streaming or full audio data to the server, and it gets the transcribed text once the server responds (the timing depends on the mode).

In the demo client, audio_chunk_duration is set to 1000 ms and send_interval is set to 100 ms.
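
To make the chunking concrete, here is a minimal sketch of how raw audio could be split into chunks of `audio_chunk_duration` milliseconds. The 16 kHz, 16-bit mono PCM format and the helper names `chunk_size_bytes` and `iter_chunks` are assumptions for illustration; they are not taken from the demo client.

```python
# Assumptions (not from the FunASR sources): 16 kHz, 16-bit mono PCM input.
SAMPLE_RATE = 16000   # samples per second (assumed)
SAMPLE_WIDTH = 2      # bytes per sample for 16-bit PCM (assumed)


def chunk_size_bytes(chunk_ms, sample_rate=SAMPLE_RATE, sample_width=SAMPLE_WIDTH):
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    return sample_rate * sample_width * chunk_ms // 1000


def iter_chunks(pcm, chunk_ms):
    """Yield consecutive chunks of raw PCM bytes; the last one may be shorter."""
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


# 2.5 s of silence stands in for real audio.
pcm = bytes(chunk_size_bytes(2500))
chunks = list(iter_chunks(pcm, chunk_ms=1000))
print([len(c) for c in chunks])  # -> [32000, 32000, 16000]
```

In a real client, each chunk would presumably be wrapped in a request message, and the sender would pause for `send_interval` (for example `time.sleep(0.1)`) between sends.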

## Steps

Step 1) Prepare server environment.
```
# Optional, modelscope cuda docker is preferred.
docker run --network host -d -it --gpus '"device=0"' -v /data:/data registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.3.0-py37-torch1.11.0-tf1.15.5-1.2.0
```
### 1. Install the requirements
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR/funasr/runtime/python/grpc
pip install -r requirements.txt
```

Step 2) Generate protobuf file for server and client.
```
# Optional, paraformer_pb2.py and paraformer_pb2_grpc.py are already generated.
python -m grpc_tools.protoc --proto_path=./proto -I ./proto --python_out=. --grpc_python_out=./ ./proto/paraformer.proto
```
### 2. Generate protobuf file
```shell
# paraformer_pb2.py and paraformer_pb2_grpc.py are already generated;
# regenerate them only when you change the ./proto/paraformer.proto file.
python -m grpc_tools.protoc --proto_path=./proto -I ./proto --python_out=. --grpc_python_out=./ ./proto/paraformer.proto
```
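
Since the generated files only need refreshing after the `.proto` file changes, that decision can be automated by comparing modification times. The sketch below is one possible approach, not part of the FunASR scripts; `needs_regeneration` and `protoc_command` are hypothetical helper names, and the default paths simply mirror the command above.

```python
import os
import sys


def needs_regeneration(proto_path, generated_paths):
    """Return True if any generated file is missing or older than the .proto file."""
    proto_mtime = os.path.getmtime(proto_path)
    for path in generated_paths:
        if not os.path.exists(path) or os.path.getmtime(path) < proto_mtime:
            return True
    return False


def protoc_command(proto_dir="./proto", proto_file="paraformer.proto", out_dir="."):
    """Build a grpc_tools.protoc invocation that mirrors the shell command."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"--proto_path={proto_dir}", "-I", proto_dir,
        f"--python_out={out_dir}", f"--grpc_python_out={out_dir}",
        os.path.join(proto_dir, proto_file),
    ]


# Possible usage (requires grpcio-tools and the proto file to be present):
#   import subprocess
#   generated = ["paraformer_pb2.py", "paraformer_pb2_grpc.py"]
#   if needs_regeneration("./proto/paraformer.proto", generated):
#       subprocess.run(protoc_command(), check=True)
```

Comparing modification times is a heuristic: a fresh `git clone` can leave the generated files with timestamps that trigger an unnecessary, but harmless, regeneration.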

Step 3) Start grpc server (on server).
### 3. Start grpc client
```
python grpc_main_server.py --port 10095
# Start client.
python grpc_main_client.py --host 127.0.0.1 --port 10100 --wav_path /path/to/your_test_wav.wav
```

Step 4) Start grpc client (on a client machine with a microphone).
```
# Install dependencies.
python -m pip install pyaudio webrtcvad
```
```
# Start client.
python grpc_main_client_mic.py --host 127.0.0.1 --port 10095
```
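
The client commands assume that a server is already listening on the given host and port. A small pre-flight check such as the one below can catch a wrong `--port` value before the client starts. This is an illustrative sketch, not part of the FunASR scripts, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Possible usage before launching the client:
#   if not port_is_open("127.0.0.1", 10095):
#       raise SystemExit("No server is listening on 127.0.0.1:10095")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the listener is the grpc server.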


## Workflow in design



## Reference
We borrow or refer to some code from:

1) https://github.com/wenet-e2e/wenet/tree/main/runtime/core/grpc

2) https://github.com/Open-Speech-EkStep/inference_service/blob/main/realtime_inference_service.py

## Acknowledge
1. This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
2. We acknowledge burkliu (刘柏基, liubaiji@xverse.cn) for contributing the grpc service.