游雁
2023-04-18 828b5f81afd5173cda6eec980017b28d0b298069
docs
7 files changed, 1 file added, 476 lines changed

docs/index.rst (1)
docs/websocket_python.md (1, new)
funasr/export/README.md (1)
funasr/runtime/grpc/Readme.md (198)
funasr/runtime/onnxruntime/readme.md (122)
funasr/runtime/python/grpc/Readme.md (4)
funasr/runtime/python/libtorch/README.md (86)
funasr/runtime/python/onnxruntime/README.md (63)
docs/index.rst
@@ -32,6 +32,7 @@
   ./libtorch_python.md
   ./grpc_python.md
   ./grpc_cpp.md
   ./websocket_python.md
docs/websocket_python.md
New file
@@ -0,0 +1 @@
../funasr/runtime/python/websocket/README.md
funasr/export/README.md
@@ -1,3 +1,4 @@
# Export models
## Environments
    torch >= 1.11.0
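The environment requirement above can be checked programmatically before exporting; a minimal sketch (the `version_tuple` helper is illustrative, not part of FunASR):

```python
def version_tuple(v: str) -> tuple:
    """Parse a version string like '1.13.1+cu117' into a comparable tuple."""
    core = v.split("+")[0]  # drop local build suffixes such as "+cu117"
    return tuple(int(p) for p in core.split(".")[:3])


# With torch installed, the export can be guarded like this:
# import torch
# assert version_tuple(torch.__version__) >= (1, 11, 0), "torch >= 1.11.0 required"
```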
funasr/runtime/grpc/Readme.md
@@ -1,6 +1,9 @@
# Using funasr with grpc-cpp
## For the Server
### Build [onnxruntime](./onnxruntime_cpp.md) as described in its documentation
```
# put the onnx library & the onnx ASR model into /path/to/asrmodel (e.g. /data/asrmodel)
ls /data/asrmodel/
@@ -10,7 +13,7 @@
```
### Compile and install grpc v1.52.0 (to avoid known grpc bugs)
```
export GRPC_INSTALL_DIR=/data/soft/grpc
export PKG_CONFIG_PATH=$GRPC_INSTALL_DIR/lib/pkgconfig
@@ -35,84 +38,149 @@
source ~/.bashrc
```
### Compile and start grpc onnx paraformer server
```
# set -DONNXRUNTIME_DIR=/path/to/asrmodel/onnxruntime-linux-x64-1.14.0
./rebuild.sh
```
### Start grpc paraformer server
```
Usage: ./cmake/build/paraformer_server port thread_num /path/to/model_file quantize(true or false)
./cmake/build/paraformer_server 10108 4 /data/asrmodel/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch false
```
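Before pointing a client on another machine at the server, a quick TCP probe can confirm the port is reachable; a minimal sketch (the host and port below are examples matching the command above):

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# e.g. server_reachable("127.0.0.1", 10108)
```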
## For the client
### Install the requirements as in [grpc-python](./docs/grpc_python.md)
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd funasr/runtime/python/grpc
pip install -r requirements_client.txt
```
The `grpc_main_client_mic.py` follows the [original design](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime/python/grpc#workflow-in-desgin) by sending audio_data in chunks. If you want to send audio_data in one request, here is an example:
### Generate protobuf file
Run this on the server; the two generated pb files are used by both the server and the client.
```shell
# paraformer_pb2.py and paraformer_pb2_grpc.py are already generated,
# regenerate it only when you make changes to ./proto/paraformer.proto file.
python -m grpc_tools.protoc  --proto_path=./proto -I ./proto    --python_out=. --grpc_python_out=./ ./proto/paraformer.proto
```
```python
import asyncio
import json
import queue
import time

import grpc
import soundfile as sf

# go to ../python/grpc to find this package
import paraformer_pb2


class RecognizeStub:
    def __init__(self, channel):
        self.Recognize = channel.stream_stream(
            '/paraformer.ASR/Recognize',
            request_serializer=paraformer_pb2.Request.SerializeToString,
            response_deserializer=paraformer_pb2.Response.FromString,
        )


async def send(channel, data, speaking, isEnd):
    stub = RecognizeStub(channel)
    req = paraformer_pb2.Request()
    if data:
        req.audio_data = data
    req.user = 'zz'
    req.language = 'zh-CN'
    req.speaking = speaking
    req.isEnd = isEnd
    q = queue.SimpleQueue()
    q.put(req)
    return stub.Recognize(iter(q.get, None))


# send the audio data once
async def grpc_rec(data, grpc_uri):
    with grpc.insecure_channel(grpc_uri) as channel:
        b = time.time()
        response = await send(channel, data, False, False)
        resp = next(response)
        text = ''
        if 'decoding' == resp.action:
            resp = next(response)
            if 'finish' == resp.action:
                text = json.loads(resp.sentence)['text']
        response = await send(channel, None, False, True)
        return {
            'text': text,
            'time': time.time() - b,
        }


async def test():
    wav, _ = sf.read('z-10s.wav', dtype='int16')
    uri = '127.0.0.1:9900'
    res = await grpc_rec(wav.tobytes(), uri)
    print(res)


if __name__ == '__main__':
    asyncio.run(test())
```
### Start grpc client
```
# Start client.
python grpc_main_client_mic.py --host 127.0.0.1 --port 10095
```
## Acknowledge
funasr/runtime/onnxruntime/readme.md
@@ -1,5 +1,69 @@
# ONNXRuntime-cpp
## Export the model
### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation)
```shell
pip3 install torch torchaudio
pip install -U modelscope
pip install -U funasr
```
### Export [onnx model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export)
```shell
python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type onnx --quantize True
```
## Building for Linux/Unix
### Download onnxruntime
```shell
# download an appropriate onnxruntime from https://github.com/microsoft/onnxruntime/releases/tag/v1.14.0
# here we get a copy of onnxruntime for linux 64
wget https://github.com/microsoft/onnxruntime/releases/download/v1.14.0/onnxruntime-linux-x64-1.14.0.tgz
tar -zxvf onnxruntime-linux-x64-1.14.0.tgz
```
### Install fftw3
```shell
sudo apt install libfftw3-dev #ubuntu
# sudo yum install fftw fftw-devel #centos
```
### Install openblas
```shell
sudo apt-get install libopenblas-dev #ubuntu
# sudo yum -y install openblas-devel #centos
```
### Build runtime
```shell
git clone https://github.com/alibaba-damo-academy/FunASR.git && cd FunASR/funasr/runtime/onnxruntime
mkdir build && cd build
cmake  -DCMAKE_BUILD_TYPE=release .. -DONNXRUNTIME_DIR=/path/to/onnxruntime-linux-x64-1.14.0
make
```
[//]: # (### The structure of a qualified onnxruntime package.)
[//]: # (```)
[//]: # (onnxruntime_xxx)
[//]: # (├───include)
[//]: # (└───lib)
[//]: # (```)
## Building for Windows
Refer to the `win/` directory.
## Run the demo
```shell
tester /path/models_dir /path/wave_file quantize(true or false)
```
@@ -9,62 +73,6 @@
config.yaml, am.mvn, model.onnx(or model_quant.onnx)
```
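Before running the tester it can help to verify the model-directory layout described above; a minimal sketch (the `find_model` helper is illustrative, not part of FunASR):

```python
from pathlib import Path

REQUIRED = ("config.yaml", "am.mvn")


def find_model(model_dir: str, quantized: bool = False) -> Path:
    """Check that model_dir has the files the tester expects and
    return the path of the onnx model it should load."""
    d = Path(model_dir)
    missing = [f for f in REQUIRED if not (d / f).is_file()]
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {missing}")
    model = d / ("model_quant.onnx" if quantized else "model.onnx")
    if not model.is_file():
        raise FileNotFoundError(f"{model} not found")
    return model
```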
## Acknowledge
1. This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
funasr/runtime/python/grpc/Readme.md
@@ -1,8 +1,6 @@
# Using funasr with grpc-python
With the grpc client we can stream audio data to the server in real time (e.g. every 10 ms) and get the transcribed text once the user stops speaking.
The audio data is sent in streaming mode, while the ASR inference itself runs in offline mode.
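Sending audio every 10 ms means slicing the raw PCM stream into fixed-size chunks before each request; a minimal sketch, assuming 16 kHz 16-bit mono audio (the `chunk_pcm` helper is illustrative, not part of the FunASR API):

```python
def chunk_pcm(audio: bytes, sample_rate: int = 16000, chunk_ms: int = 10,
              bytes_per_sample: int = 2) -> list:
    """Split raw 16-bit PCM audio into fixed-size chunks (default 10 ms).

    At 16 kHz mono, 16-bit, a 10 ms chunk is 320 bytes; the last
    chunk may be shorter.
    """
    step = sample_rate * chunk_ms // 1000 * bytes_per_sample
    return [audio[i:i + step] for i in range(0, len(audio), step)]
```

Each chunk would then be placed in the `audio_data` field of a streaming request.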
## For the Server
funasr/runtime/python/libtorch/README.md
@@ -1,60 +1,54 @@
# Libtorch-python
[FunASR](https://github.com/alibaba-damo-academy/FunASR) hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition models released on ModelScope, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of the speech recognition ecosystem. ASR for Fun!
## Export the model
### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation)
```shell
pip3 install torch torchaudio
pip install -U modelscope
pip install -U funasr
```
### Export the [model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export) (`Tips`: torch >= 1.11.0 is required.)
```shell
python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type torch --quantize True
```
## Install the `funasr_torch`
install from pip
```shell
pip install -U funasr_torch
# For the users in China, you could install with the command:
# pip install -U funasr_torch -i https://mirror.sjtu.edu.cn/pypi/web/simple
```
or install from source code
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd funasr/runtime/python/libtorch
pip install -e ./
# For the users in China, you could install with the command:
# pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
```
## Run the demo
- Model_dir: the model path, which contains `model.torchscripts`, `config.yaml`, `am.mvn`.
- Input: wav file; supported input types: `str, np.ndarray, List[str]`
- Output: `List[str]`: recognition result.
- Example:
  ```python
  from funasr_torch import Paraformer

  model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
  model = Paraformer(model_dir, batch_size=1)

  wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav']

  result = model(wav_path)
  print(result)
  ```
## Performance benchmark
funasr/runtime/python/onnxruntime/README.md
@@ -1,30 +1,28 @@
# ONNXRuntime-python
## Export the model
### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation)
```shell
pip3 install torch torchaudio
pip install -U modelscope
pip install -U funasr
```
### Export the [onnx model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export) (`Tips`: torch >= 1.11.0 is required.)
```shell
python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type onnx --quantize True
```
## Install the `funasr_onnx`
install from pip
```shell
pip install -U funasr_onnx
# For the users in China, you could install with the command:
# pip install -U funasr_onnx -i https://mirror.sjtu.edu.cn/pypi/web/simple
```
or install from source code
@@ -35,25 +33,24 @@
```shell
pip install -e ./
# For the users in China, you could install with the command:
# pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
```
## Run the demo
- Model_dir: the model path, which contains `model.onnx`, `config.yaml`, `am.mvn`.
- Input: wav file; supported input types: `str, np.ndarray, List[str]`
- Output: `List[str]`: recognition result.
- Example:
  ```python
  from funasr_onnx import Paraformer

  model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
  model = Paraformer(model_dir, batch_size=1)

  wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav']

  result = model(wav_path)
  print(result)
  ```
## Performance benchmark