| | |
| | | ## Using paraformer with libtorch |
| | | # Libtorch-python |
| | | |
| | | ## Export the model |
| | | ### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation) |
| | | |
| | | ### Introduction |
| | | - Model comes from [speech_paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary). |
| | | ```shell |
| | | # pip3 install torch torchaudio |
| | | pip install -U modelscope funasr |
| | | # For the users in China, you could install with the command: |
| | | # pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple |
| | | pip install torch-quant # Optional, for torchscript quantization |
| | | pip install onnx onnxruntime # Optional, for onnx quantization |
| | | ``` |
| | | |
| | | ### Steps: |
| | | 1. Export the model. |
| | | - Command: (`Tips`: torch >= 1.11.0 is required.) |
| | | ### Export [onnx model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export) |
| | | |
| | | ```shell |
| | | python -m funasr.export.export_model [model_name] [export_dir] [true] |
| | | ``` |
| | | `model_name`: the model is to export. |
| | | ```shell |
| | | python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type torch --quantize True |
| | | ``` |
| | | |
| | | `export_dir`: the dir where the onnx is export. |
| | | ## Install the `funasr_torch`. |
| | | |
| | | install from pip |
| | | ```shell |
| | | pip install -U funasr_torch |
| | | # For the users in China, you could install with the command: |
| | | # pip install -U funasr_torch -i https://mirror.sjtu.edu.cn/pypi/web/simple |
| | | ``` |
| | | or install from source code |
| | | |
| | | More details ref to ([export docs](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export)) |
| | | ```shell |
| | | git clone https://github.com/alibaba/FunASR.git && cd FunASR |
| | | cd funasr/runtime/python/libtorch |
| | | pip install -e ./ |
| | | # For the users in China, you could install with the command: |
| | | # pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple |
| | | ``` |
| | | |
| | | - `e.g.`, Export model from modelscope |
| | | ```shell |
| | | python -m funasr.export.export_model 'damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch' "./export" false |
| | | ``` |
| | | - `e.g.`, Export model from local path, the model'name must be `model.pb`. |
| | | ```shell |
| | | python -m funasr.export.export_model '/mnt/workspace/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch' "./export" false |
| | | ``` |
| | | ## Run the demo. |
| | | - Model_dir: the model path, which contains `model.torchscripts`, `config.yaml`, `am.mvn`. |
| | | - Input: wav formt file, support formats: `str, np.ndarray, List[str]` |
| | | - Output: `List[str]`: recognition result. |
| | | - Example: |
| | | ```python |
| | | from funasr_torch import Paraformer |
| | | |
| | | model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" |
| | | model = Paraformer(model_dir, batch_size=1) |
| | | |
| | | 2. Install the `torch_paraformer`. |
| | | ```shell |
| | | git clone https://github.com/alibaba/FunASR.git && cd FunASR |
| | | cd funasr/runtime/python/libtorch |
| | | python setup.py install |
| | | ``` |
| | | wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav'] |
| | | |
| | | result = model(wav_path) |
| | | print(result) |
| | | ``` |
| | | |
| | | 3. Run the demo. |
| | | - Model_dir: the model path, which contains `model.torchscripts`, `config.yaml`, `am.mvn`. |
| | | - Input: wav formt file, support formats: `str, np.ndarray, List[str]` |
| | | - Output: `List[str]`: recognition result. |
| | | - Example: |
| | | ```python |
| | | from torch_paraformer import Paraformer |
| | | ## Performance benchmark |
| | | |
| | | model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch" |
| | | model = Paraformer(model_dir, batch_size=1) |
| | | |
| | | wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav'] |
| | | |
| | | result = model(wav_path) |
| | | print(result) |
| | | ``` |
| | | Please ref to [benchmark](https://github.com/alibaba-damo-academy/FunASR/blob/main/funasr/runtime/python/benchmark_libtorch.md) |
| | | |
| | | ## Speed |
| | | |
| | |
| | | |
| | | Test [wav, 5.53s, 100 times avg.](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav) |
| | | |
| | | | Backend | RTF | |
| | | |:-------:|:-----------------:| |
| | | | Pytorch | 0.110 | |
| | | | Onnx | 0.038 | |
| | | | Backend | RTF (FP32) | |
| | | |:--------:|:----------:| |
| | | | Pytorch | 0.110 | |
| | | | Libtorch | 0.048 | |
| | | | Onnx | 0.038 | |
| | | |
| | | ## Acknowledge |
| | | This project is maintained by [FunASR community](https://github.com/alibaba-damo-academy/FunASR). |