We can stream audio data to the server in real time with a websocket client, for example one chunk every 300 ms, and receive the transcribed text when the speaker stops talking.
The audio is sent as a stream, but the ASR inference itself runs offline (non-streaming).
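To make the 300 ms figure concrete, here is a minimal sketch of how a client might cut raw audio into fixed-duration chunks before sending them. It assumes 16-bit mono PCM at 16 kHz (inferred from the "16k" in the model name); `iter_chunks` is a hypothetical helper, not part of FunASR, and the actual message format expected by the server is not covered here.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16000  # assumption: matches the "16k" in the model name
BYTES_PER_SAMPLE = 2    # assumption: 16-bit mono PCM


def iter_chunks(pcm: bytes, chunk_ms: int = 300) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`, each covering `chunk_ms` of audio.

    The last slice may be shorter if the audio length is not a multiple
    of the chunk size.
    """
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence: 16000 samples * 2 bytes = 32000 bytes.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
print([len(chunk) for chunk in iter_chunks(one_second)])
# prints [9600, 9600, 9600, 3200]
```

Under these assumptions each 300 ms chunk is 9600 bytes, so a real client would send one such chunk roughly every 300 ms while recording.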
Install modelscope and funasr
pip install -U modelscope funasr
# For users in China, you can install from a mirror with the command:
# pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
git clone https://github.com/alibaba/FunASR.git && cd FunASR
Install the requirements for the server
cd funasr/runtime/python/websocket
pip install -r requirements_server.txt
Start server
python ASR_server.py --host "0.0.0.0" --port 10095 --asr_model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
For the Paraformer 2pass model:
python ASR_server_2pass.py --host "0.0.0.0" --port 10095 --asr_model "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
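Before moving on to the client, it can help to confirm that the server is actually listening. The following is a rough sketch using only the Python standard library; `is_port_open` is a hypothetical helper, not part of FunASR, and it only checks that a TCP connection to the port succeeds, not that the ASR model has finished loading.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 10095 is the port passed to the server via --port above.
    print(is_port_open("127.0.0.1", 10095))
```

If this prints `False`, check that the server process is still running and that the port matches the `--port` value used above.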
Install the requirements for the client
git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd funasr/runtime/python/websocket
pip install -r requirements_client.txt
Start client
python ASR_client.py --host "127.0.0.1" --port 10095 --chunk_size 50