From 6e26ad0e149ae51e3fc8b89b3178684979e6bbd1 Mon Sep 17 00:00:00 2001
From: 雾聪 <wucong.lyb@alibaba-inc.com>
Date: 星期四, 09 十一月 2023 11:03:45 +0800
Subject: [PATCH] Merge branch 'main' of https://github.com/alibaba-damo-academy/FunASR into main
---
funasr/version.txt | 2
runtime/docs/images/online_structure.png | 0
runtime/readme_cn.md | 54 +---
runtime/docs/SDK_advanced_guide_online.md | 28 +
runtime/html5/static/main.js | 6
runtime/python/websocket/README.md | 18
docs/index.rst | 8
runtime/docs/SDK_advanced_guide_offline.md | 63 +---
funasr/quick_start_zh.md | 6
runtime/docs/images/sdk_roadmap.jpg | 0
runtime/docs/SDK_advanced_guide_offline_en.md | 47 +---
runtime/html5/static/index.html | 2
runtime/docs/SDK_advanced_guide_offline_en_zh.md | 47 ---
runtime/python/websocket/funasr_wss_client.py | 110 ++++++---
runtime/html5/static/wsconnecter.js | 7
runtime/python/onnxruntime/funasr_onnx/paraformer_bin.py | 8
runtime/docs/SDK_advanced_guide_offline_zh.md | 59 +---
README_zh.md | 37 +-
runtime/readme.md | 6
README.md | 103 +++++++-
runtime/docs/images/offline_structure.jpg | 0
/dev/null | 1
runtime/docs/SDK_advanced_guide_online_zh.md | 32 ++
egs_modelscope/asr/TEMPLATE/README_zh.md | 4
funasr/quick_start.md | 6
docs/runtime | 1
26 files changed, 342 insertions(+), 313 deletions(-)
diff --git a/README.md b/README.md
index 5d6503d..3f6b434 100644
--- a/README.md
+++ b/README.md
@@ -9,31 +9,32 @@
<a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>
-<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model released on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun！
+<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun！
[**Highlights**](#highlights)
| [**News**](https://github.com/alibaba-damo-academy/FunASR#whats-new)
| [**Installation**](#installation)
| [**Quick Start**](#quick-start)
-| [**Runtime**](./funasr/runtime/readme.md)
-| [**Model Zoo**](./docs/model_zoo/modelscope_models.md)
+| [**Runtime**](./runtime/readme.md)
+| [**Model Zoo**](#model-zoo)
| [**Contact**](#contact)
<a name="highlights"></a>
## Highlights
- FunASR is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization and multi-talker ASR. FunASR provides convenient scripts and tutorials, supporting inference and fine-tuning of pre-trained models.
-- We have released a vast collection of academic and industrial pretrained models on the [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), which can be accessed through our [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md). The representative [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), a non-autoregressive end-to-end speech recognition model, has the advantages of high accuracy, high efficiency, and convenient deployment, supporting the rapid construction of speech recognition services. For more details on service deployment, please refer to the [service deployment document](funasr/runtime/readme_cn.md).
+- We have released a vast collection of academic and industrial pretrained models on the [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition) and [huggingface](https://huggingface.co/FunASR), which can be accessed through our [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md). The representative [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), a non-autoregressive end-to-end speech recognition model, has the advantages of high accuracy, high efficiency, and convenient deployment, supporting the rapid construction of speech recognition services. For more details on service deployment, please refer to the [service deployment document](runtime/readme_cn.md).
<a name="whats-new"></a>
## What's new:
-- 2023/10/17: The offline file transcription service (CPU) of English has been released. For more details, please refer to ([Deployment documentation](funasr/runtime/docs/SDK_tutorial_en.md)).
+- 2023/11/08: The offline file transcription service 3.0 (CPU) of Mandarin has been released, adding a large punctuation model, an N-gram language model, and WFST hotwords. For detailed information, please refer to [docs](runtime#file-transcription-service-mandarin-cpu).
+- 2023/10/17: The offline file transcription service (CPU) of English has been released. For more details, please refer to ([docs](runtime#file-transcription-service-english-cpu)).
- 2023/10/13: [SlideSpeech](https://slidespeech.github.io/): A large scale multi-modal audio-visual corpus with a significant amount of real-time synchronized slides.
- 2023/10/10: The ASR-SpeakersDiarization combined pipeline [Paraformer-VAD-SPK](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr_vad_spk/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/demo.py) is now released. Experience the model to get recognition results with speaker information.
- 2023/10/07: [FunCodec](https://github.com/alibaba-damo-academy/FunCodec): A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec.
-- 2023/09/01: The offline file transcription service 2.0 (CPU) of Mandarin has been released, with added support for ffmpeg, timestamp, and hotword models. For more details, please refer to ([Deployment documentation](funasr/runtime/docs/SDK_tutorial.md)).
-- 2023/08/07: The real-time transcription service (CPU) of Mandarin has been released. For more details, please refer to ([Deployment documentation](funasr/runtime/docs/SDK_tutorial_online.md)).
+- 2023/09/01: The offline file transcription service 2.0 (CPU) of Mandarin has been released, with added support for ffmpeg, timestamp, and hotword models. For more details, please refer to ([docs](runtime#file-transcription-service-mandarin-cpu)).
+- 2023/08/07: The real-time transcription service (CPU) of Mandarin has been released. For more details, please refer to ([docs](runtime#the-real-time-transcription-service-mandarin-cpu)).
- 2023/07/17: BAT is released, which is a low-latency and low-memory-consumption RNN-T model. For more details, please refer to ([BAT](egs/aishell/bat)).
- 2023/06/26: ASRU2023 Multi-Channel Multi-Party Meeting Transcription Challenge 2.0 completed the competition and announced the results. For more details, please refer to ([M2MeT2.0](https://alibaba-damo-academy.github.io/FunASR/m2met2/index.html)).
@@ -43,19 +44,89 @@
Please refer to [installation docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
-## Deployment Service
+## Model Zoo
+FunASR has open-sourced a large number of models pre-trained on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models; for more models, please refer to the [Model Zoo]().
-FunASR supports pre-trained or further fine-tuned models for deployment as a service. The CPU version of the Chinese offline file conversion service has been released, details can be found in [docs](funasr/runtime/docs/SDK_tutorial.md). More detailed information about service deployment can be found in the [deployment roadmap](funasr/runtime/readme_cn.md).
+(Note: 🤗 represents the Huggingface model zoo link, ⭐ represents the ModelScope model zoo link)
+
+
+| Model Name | Task Details | Training Data | Parameters |
+|:---:|:---:|:---:|:---:|
+| <nobr>paraformer-zh ([⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) [🤗]() )</nobr> | speech recognition, with timestamps, non-streaming | 60000 hours, Mandarin | 220M |
+| <nobr>paraformer-zh-spk ( [⭐](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) [🤗]() )</nobr> | speech recognition with speaker diarization, with timestamps, non-streaming | 60000 hours, Mandarin | 220M |
+| <nobr>paraformer-zh-online ( [⭐](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) [🤗]() )</nobr> | speech recognition, streaming | 60000 hours, Mandarin | 220M |
+| <nobr>paraformer-en ( [⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) [🤗]() )</nobr> | speech recognition, with timestamps, non-streaming | 50000 hours, English | 220M |
+| <nobr>paraformer-en-spk ([🤗]() [⭐]() )</nobr> | speech recognition with speaker diarization, non-streaming | 50000 hours, English | 220M |
+| <nobr>conformer-en ( [⭐](https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) [🤗]() )</nobr> | speech recognition, non-streaming | 50000 hours, English | 220M |
+| <nobr>ct-punc ( [⭐](https://modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large/summary) [🤗]() )</nobr> | punctuation restoration | 100M, Mandarin and English | 1.1G |
+| <nobr>fsmn-vad ( [⭐](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) [🤗]() )</nobr> | voice activity detection | 5000 hours, Mandarin and English | 0.4M |
+| <nobr>fa-zh ( [⭐](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) [🤗]() )</nobr> | timestamp prediction | 5000 hours, Mandarin | 38M |
+
+
+
+
+[//]: # ()
+[//]: # (FunASR supports pre-trained or further fine-tuned models for deployment as a service. The CPU version of the Chinese offline file conversion service has been released, details can be found in [docs](funasr/runtime/docs/SDK_tutorial.md). More detailed information about service deployment can be found in the [deployment roadmap](funasr/runtime/readme_cn.md).)
<a name="quick-start"></a>
## Quick Start
Quick start for new users ([tutorial](https://alibaba-damo-academy.github.io/FunASR/en/funasr/quick_start.html))
+FunASR supports inference and fine-tuning of models trained on tens of thousands of hours of industrial data. For more details, please refer to [modelscope_egs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html). It also supports training and fine-tuning of models on academic standard datasets. For more information, please refer to [egs](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html).
-FunASR supports inference and fine-tuning of models trained on industrial datasets of tens of thousands of hours. For more details, please refer to ([modelscope_egs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)). It also supports training and fine-tuning of models on academic standard datasets. For more details, please refer to([egs](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html)). The models include speech recognition (ASR), speech activity detection (VAD), punctuation recovery, language model, speaker verification, speaker separation, and multi-party conversation speech recognition. For a detailed list of models, please refer to the [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md):
+Below is a quick start tutorial. Test audio files ([Mandarin](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav), [English]()).
+### Speech Recognition (Non-streaming)
+```python
+from funasr import infer
-<a name="Community Communication"></a>
+p = infer(model="paraformer-zh", vad_model="fsmn-vad", punc_model="ct-punc", model_hub="ms")
+
+res = p("asr_example_zh.wav", batch_size_token=5000)
+print(res)
+```
+Note: `model_hub` specifies the model repository: `ms` downloads from ModelScope, `hf` downloads from Huggingface.
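The hub shorthand above can be illustrated with a small, self-contained sketch. The base URLs and the `resolve_hub` helper below are assumptions for illustration only, not actual funasr internals:

```python
# Hypothetical illustration of the `model_hub` shorthand described above.
# The URLs and helper name are assumptions, not part of the funasr API.

HUBS = {
    "ms": "https://www.modelscope.cn",  # ModelScope
    "hf": "https://huggingface.co",     # Huggingface
}

def resolve_hub(model_hub: str) -> str:
    """Map a hub shorthand to its download site, rejecting unknown values."""
    if model_hub not in HUBS:
        raise ValueError(f"unknown model_hub {model_hub!r}; expected 'ms' or 'hf'")
    return HUBS[model_hub]

print(resolve_hub("ms"))
```

A loader built this way fails fast on a typo such as `model_hub="modelscope"` instead of silently falling back to a default.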
+
+### Speech Recognition (Streaming)
+```python
+from funasr import infer
+
+p = infer(model="paraformer-zh-streaming", model_hub="ms")
+
+chunk_size = [0, 10, 5] #[0, 10, 5] 600ms, [0, 8, 4] 480ms
+param_dict = {"cache": dict(), "is_final": False, "chunk_size": chunk_size, "encoder_chunk_look_back": 4, "decoder_chunk_look_back": 1}
+
+import torchaudio
+speech = torchaudio.load("asr_example_zh.wav")[0][0]
+speech_length = speech.shape[0]
+
+stride_size = chunk_size[1] * 960  # 10 units * 60ms * 16 samples/ms = 9600 samples (600ms)
+for sample_offset in range(0, speech_length, stride_size):
+    param_dict["is_final"] = sample_offset + stride_size >= speech_length
+    audio_chunk = speech[sample_offset: sample_offset + stride_size]
+    rec_result = p(input=audio_chunk, param_dict=param_dict)
+    print(rec_result)
+```
+Note: `chunk_size` is the configuration for streaming latency. `[0,10,5]` indicates that the real-time display granularity is `10*60=600ms`, and the lookahead information is `5*60=300ms`. Each inference input is `600ms` (`16000*0.6=9600` sample points), and the output is the corresponding text. For the last speech segment, `is_final=True` must be set to force the output of the last word.
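The chunking arithmetic above can be checked without any model. The sketch below is self-contained: `fake_recognizer` is a stand-in for the real pipeline, and the loop slices a synthetic 16 kHz signal into 600 ms chunks, flagging the last one as final.

```python
# Self-contained sketch of the streaming chunk loop described above.
# `fake_recognizer` is a stand-in; a real pipeline would come from funasr.

SAMPLE_RATE = 16000
FRAME_MS = 60  # one chunk_size unit corresponds to 60 ms of audio

def iter_chunks(num_samples, chunk_size):
    """Yield (start, end, is_final) spans for chunk_size = [0, n, lookahead]."""
    stride = chunk_size[1] * FRAME_MS * SAMPLE_RATE // 1000  # 10 -> 9600 samples (600 ms)
    for start in range(0, num_samples, stride):
        end = min(start + stride, num_samples)
        yield start, end, end >= num_samples

def fake_recognizer(samples, is_final):
    # Stand-in for the pipeline call: report chunk length and the final flag.
    return {"n": len(samples), "final": is_final}

speech = [0.0] * (SAMPLE_RATE * 2)  # 2 s of silence as a stand-in waveform
outputs = [fake_recognizer(speech[s:e], fin)
           for s, e, fin in iter_chunks(len(speech), [0, 10, 5])]
```

With a 2 s input this yields three full 9600-sample chunks and one final 3200-sample chunk, matching the `16000*0.6=9600` arithmetic in the note.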
+
+Quick start for new users can be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/funasr/quick_start_zh.html)
+
+
+[//]: # (FunASR supports inference and fine-tuning of models trained on industrial datasets of tens of thousands of hours. For more details, please refer to ([modelscope_egs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)). It also supports training and fine-tuning of models on academic standard datasets. For more details, please refer to([egs](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html)). The models include speech recognition (ASR), speech activity detection (VAD), punctuation recovery, language model, speaker verification, speaker separation, and multi-party conversation speech recognition. For a detailed list of models, please refer to the [Model Zoo](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md):)
+
+## Deployment Service
+FunASR supports deploying pre-trained or further fine-tuned models as a service. The following types of service deployment are currently supported:
+- File transcription service, Mandarin (CPU), released
+- Real-time transcription service, Mandarin (CPU), released
+- File transcription service, English (CPU), released
+- File transcription service, Mandarin (GPU), in progress
+- More to come.
+
+For more detailed information, please refer to the [service deployment documentation](runtime/readme.md).
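The deployed services are driven by websocket clients such as `funasr_wss_client.py`. The message framing below is an illustrative sketch only (field names and ordering are assumptions, not the authoritative protocol, which is defined in the runtime docs): a JSON config frame, then binary PCM chunks, then a final flag.

```python
# Illustrative sketch of a client-side message sequence for a transcription
# service. Field names (`mode`, `chunk_size`, `is_speaking`) are assumptions
# for illustration; consult the runtime websocket docs for the real protocol.
import json

def build_session_messages(wav_bytes, mode="offline", chunk_size=(5, 10, 5),
                           chunk_ms=600, sample_rate=16000):
    """Assemble config frame + binary audio chunks + end-of-speech frame."""
    msgs = [json.dumps({"mode": mode, "chunk_size": list(chunk_size),
                        "is_speaking": True})]
    bytes_per_chunk = sample_rate * 2 * chunk_ms // 1000  # 16-bit mono PCM
    for off in range(0, len(wav_bytes), bytes_per_chunk):
        msgs.append(wav_bytes[off:off + bytes_per_chunk])
    msgs.append(json.dumps({"is_speaking": False}))
    return msgs
```

Separating framing from transport this way lets the same sequence be sent over any websocket library, or inspected in tests without a server.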
+
+
+<a name="contact"></a>
## Community Communication
If you encounter problems in use, you can directly raise Issues on the github page.
@@ -67,8 +138,8 @@
## Contributors
-| <div align="left"><img src="docs/images/damo.png" width="180"/> | <div align="left"><img src="docs/images/nwpu.png" width="260"/> | <img src="docs/images/China_Telecom.png" width="200"/> </div> | <img src="docs/images/RapidAI.png" width="200"/> </div> | <img src="docs/images/aihealthx.png" width="200"/> </div> | <img src="docs/images/XVERSE.png" width="250"/> </div> |
-|:---------------------------------------------------------------:|:---------------------------------------------------------------:|:--------------------------------------------------------------:|:-------------------------------------------------------:|:-----------------------------------------------------------:|:------------------------------------------------------:|
+| <div align="left"><img src="docs/images/nwpu.png" width="260"/> | <img src="docs/images/China_Telecom.png" width="200"/> </div> | <img src="docs/images/RapidAI.png" width="200"/> </div> | <img src="docs/images/aihealthx.png" width="200"/> </div> | <img src="docs/images/XVERSE.png" width="250"/> </div> |
+|:---------------------------------------------------------------:|:--------------------------------------------------------------:|:-------------------------------------------------------:|:-----------------------------------------------------------:|:------------------------------------------------------:|
The contributors can be found in [contributors list](./Acknowledge.md)
@@ -90,12 +161,6 @@
title={BAT: Boundary aware transducer for memory-efficient and low-latency ASR},
year={2023},
booktitle={INTERSPEECH},
-}
-@inproceedings{wang2023told,
- author={Jiaming Wang and Zhihao Du and Shiliang Zhang},
- title={{TOLD:} {A} Novel Two-Stage Overlap-Aware Framework for Speaker Diarization},
- year={2023},
- booktitle={ICASSP},
}
@inproceedings{gao22b_interspeech,
author={Zhifu Gao and ShiLiang Zhang and Ian McLoughlin and Zhijie Yan},
diff --git a/README_zh.md b/README_zh.md
index 051a4d2..554c0b6 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -18,8 +18,8 @@
｜<a href="#安装教程"> 安装 </a>
｜<a href="#快速开始"> 快速开始 </a>
｜<a href="https://alibaba-damo-academy.github.io/FunASR/en/index.html"> 教程文档 </a>
-｜<a href="./docs/model_zoo/modelscope_models.md"> 模型仓库 </a>
-｜<a href="./funasr/runtime/readme_cn.md"> 服务部署 </a>
+｜<a href="#模型仓库"> 模型仓库 </a>
+｜<a href="#服务部署"> 服务部署 </a>
｜<a href="#联系我们"> 联系我们 </a>
</h4>
</div>
@@ -27,16 +27,17 @@
<a name="核心功能"></a>
## 核心功能
- FunASR是一个基础语音识别工具包，提供多种功能，包括语音识别（ASR）、语音端点检测（VAD）、标点恢复、语言模型、说话人验证、说话人分离和多人对话语音识别等。FunASR提供了便捷的脚本和教程，支持预训练好的模型的推理与微调。
-- 我们在[ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)与[huggingface](https://huggingface.co/FunAudio)上发布了大量开源数据集或者海量工业数据训练的模型，可以通过我们的[模型仓库](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md)了解模型的详细信息。代表性的[Paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)非自回归端到端语音识别模型具有高精度、高效率、便捷部署的优点，支持快速构建语音识别服务，详细信息可以阅读([服务部署文档](funasr/runtime/readme_cn.md))。
+- 我们在[ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)与[huggingface](https://huggingface.co/FunASR)上发布了大量开源数据集或者海量工业数据训练的模型，可以通过我们的[模型仓库](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md)了解模型的详细信息。代表性的[Paraformer](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)非自回归端到端语音识别模型具有高精度、高效率、便捷部署的优点，支持快速构建语音识别服务，详细信息可以阅读([服务部署文档](runtime/readme_cn.md))。
<a name="最新动态"></a>
## 最新动态
-- 20223/10/17: 英文离线文件转写服务一键部署的CPU版本发布，详细信息参阅([一键部署文档](funasr/runtime/docs/SDK_tutorial_en_zh.md))
+- 2023/11/08：中文离线文件转写服务3.0 CPU版本发布，新增标点大模型、Ngram语言模型与wfst热词，详细信息参阅([一键部署文档](runtime/readme_cn.md#中文离线文件转写服务cpu版本))
+- 2023/10/17: 英文离线文件转写服务一键部署的CPU版本发布，详细信息参阅([一键部署文档](runtime/readme_cn.md#英文离线文件转写服务cpu版本))
- 2023/10/13: [SlideSpeech](https://slidespeech.github.io/): 一个大规模的多模态音视频语料库，主要是在线会议或者在线课程场景，包含了大量与发言人讲话实时同步的幻灯片。
- 2023.10.10: [Paraformer-long-Spk](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr_vad_spk/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/demo.py)模型发布，支持在长语音识别的基础上获取每句话的说话人标签。
- 2023.10.07: [FunCodec](https://github.com/alibaba-damo-academy/FunCodec): FunCodec提供开源模型和训练工具，可以用于音频离散编码，以及基于离散编码的语音识别、语音合成等任务。
-- 2023.09.01: 中文离线文件转写服务2.0 CPU版本发布，新增ffmpeg、时间戳与热词模型支持，详细信息参阅([一键部署文档](funasr/runtime/docs/SDK_tutorial_zh.md))
-- 2023.08.07: 中文实时语音听写服务一键部署的CPU版本发布，详细信息参阅([一键部署文档](funasr/runtime/docs/SDK_tutorial_online_zh.md))
+- 2023.09.01: 中文离线文件转写服务2.0 CPU版本发布，新增ffmpeg、时间戳与热词模型支持，详细信息参阅([一键部署文档](runtime/readme_cn.md#中文离线文件转写服务cpu版本))
+- 2023.08.07: 中文实时语音听写服务一键部署的CPU版本发布，详细信息参阅([一键部署文档](runtime/readme_cn.md#中文实时语音听写服务cpu版本))
- 2023.07.17: BAT一种低延迟低内存消耗的RNN-T模型发布，详细信息参阅（[BAT](egs/aishell/bat)）
- 2023.06.26: ASRU2023 多通道多方会议转录挑战赛2.0完成竞赛结果公布，详细信息参阅（[M2MeT2.0](https://alibaba-damo-academy.github.io/FunASR/m2met2_cn/index.html)）
@@ -51,17 +52,17 @@
（注：[🤗]()表示Huggingface模型仓库链接，[⭐]()表示ModelScope模型仓库链接）
-| 模型名字 | 任务详情 | 训练数据 | 参数量 |
-|:---:|:---:|:---:|:---:|
-| paraformer-zh ([🤗]() [⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) ) | 语音识别，带时间戳输出，非实时 | 60000小时，中文 | 220M |
-| paraformer-zh-spk ([🤗]() [⭐](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) ) | 分角色语音识别，带时间戳输出，非实时 | 60000小时，中文 | 220M |
-| paraformer-zh-online ([🤗]() [⭐](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) ) | 语音识别，实时 | 60000小时，中文 | 220M |
-| paraformer-en ([🤗]() [⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) ) | 分角色语音识别，带时间戳输出，非实时 | 50000小时，英文 | 220M |
-| paraformer-en-spk ([🤗]() [⭐]() ) | 语音识别，非实时 | 50000小时，英文 | 220M |
-| conformer-en ([🤗]() [⭐](https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) ) | 语音识别，非实时 | 50000小时，英文 | 220M |
-| ct-punc ([🤗]() [⭐](https://modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large/summary) ) | 标点恢复，非实时 | 100M，中文与英文 | 1.1G |
-| fsmn-vad ([🤗]() [⭐](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) ) | 语音端点检测，实时 | 5000小时，中文与英文 | 0.4M |
-| fa-zh ([🤗]() [⭐](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) ) | 字级别时间戳预测 | 50000小时，中文 | 38M |
+| 模型名字 | 任务详情 | 训练数据 | 参数量 |
+|:---:|:---:|:---:|:---:|
+| paraformer-zh ([⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) [🤗]() ) | 语音识别，带时间戳输出，非实时 | 60000小时，中文 | 220M |
+| paraformer-zh-spk ( [⭐](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn/summary) [🤗]() ) | 分角色语音识别，带时间戳输出，非实时 | 60000小时，中文 | 220M |
+| paraformer-zh-online ( [⭐](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) [🤗]() ) | 语音识别，实时 | 60000小时，中文 | 220M |
+| paraformer-en ( [⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) [🤗]() ) | 语音识别，非实时 | 50000小时，英文 | 220M |
+| paraformer-en-spk ([🤗]() [⭐]() ) | 语音识别，非实时 | 50000小时，英文 | 220M |
+| conformer-en ( [⭐](https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) [🤗]() ) | 语音识别，非实时 | 50000小时，英文 | 220M |
+| ct-punc ( [⭐](https://modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large/summary) [🤗]() ) | 标点恢复 | 100M，中文与英文 | 1.1G |
+| fsmn-vad ( [⭐](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) [🤗]() ) | 语音端点检测，实时 | 5000小时，中文与英文 | 0.4M |
+| fa-zh ( [⭐](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) [🤗]() ) | 字级别时间戳预测 | 50000小时，中文 | 38M |
<a name="快速开始"></a>
@@ -116,7 +117,7 @@
- 中文离线文件转写服务（GPU版本），进行中
- 更多支持中
-详细信息可以参阅([服务部署文档](funasr/runtime/readme_cn.md))。
+详细信息可以参阅([服务部署文档](runtime/readme_cn.md))。
<a name="社区交流"></a>
diff --git a/docs/index.rst b/docs/index.rst
index b79aee0..bf4268b 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -71,10 +71,10 @@
:maxdepth: 1
:caption: Runtime and Service
- ./funasr/runtime/readme.md
- ./funasr/runtime/docs/SDK_tutorial_online.md
- ./funasr/runtime/docs/SDK_tutorial.md
- ./funasr/runtime/html5/readme.md
+ ./runtime/readme.md
+ ./runtime/docs/SDK_tutorial_online.md
+ ./runtime/docs/SDK_tutorial.md
+ ./runtime/html5/readme.md
diff --git a/docs/runtime b/docs/runtime
new file mode 120000
index 0000000..3d1f990
--- /dev/null
+++ b/docs/runtime
@@ -0,0 +1 @@
+../runtime
\ No newline at end of file
diff --git a/docs/runtime/demo.gif b/docs/runtime/demo.gif
deleted file mode 100644
index f487f2c..0000000
--- a/docs/runtime/demo.gif
+++ /dev/null
Binary files differ
diff --git a/docs/runtime/export.md b/docs/runtime/export.md
deleted file mode 120000
index 91f8b98..0000000
--- a/docs/runtime/export.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/export/README.md
\ No newline at end of file
diff --git a/docs/runtime/grpc_cpp.md b/docs/runtime/grpc_cpp.md
deleted file mode 120000
index 590a5f7..0000000
--- a/docs/runtime/grpc_cpp.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/grpc/Readme.md
\ No newline at end of file
diff --git a/docs/runtime/grpc_python.md b/docs/runtime/grpc_python.md
deleted file mode 120000
index ee8d6ea..0000000
--- a/docs/runtime/grpc_python.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/python/grpc/Readme.md
\ No newline at end of file
diff --git a/docs/runtime/html5.md b/docs/runtime/html5.md
deleted file mode 120000
index bf47840..0000000
--- a/docs/runtime/html5.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/html5/readme.md
\ No newline at end of file
diff --git a/docs/runtime/img.png b/docs/runtime/img.png
deleted file mode 100644
index 84e2efe..0000000
--- a/docs/runtime/img.png
+++ /dev/null
Binary files differ
diff --git a/docs/runtime/libtorch_python.md b/docs/runtime/libtorch_python.md
deleted file mode 120000
index e8d6288..0000000
--- a/docs/runtime/libtorch_python.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/python/libtorch/README.md
\ No newline at end of file
diff --git a/docs/runtime/onnxruntime_cpp.md b/docs/runtime/onnxruntime_cpp.md
deleted file mode 120000
index 3661d18..0000000
--- a/docs/runtime/onnxruntime_cpp.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/onnxruntime/readme.md
\ No newline at end of file
diff --git a/docs/runtime/onnxruntime_python.md b/docs/runtime/onnxruntime_python.md
deleted file mode 120000
index 693bd5d..0000000
--- a/docs/runtime/onnxruntime_python.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/python/onnxruntime/README.md
\ No newline at end of file
diff --git a/docs/runtime/websocket_cpp.md b/docs/runtime/websocket_cpp.md
deleted file mode 120000
index 8a87df5..0000000
--- a/docs/runtime/websocket_cpp.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/websocket/readme.md
\ No newline at end of file
diff --git a/docs/runtime/websocket_python.md b/docs/runtime/websocket_python.md
deleted file mode 120000
index 0fabb85..0000000
--- a/docs/runtime/websocket_python.md
+++ /dev/null
@@ -1 +0,0 @@
-../../funasr/runtime/python/websocket/README.md
\ No newline at end of file
diff --git a/egs_modelscope/asr/TEMPLATE/README_zh.md b/egs_modelscope/asr/TEMPLATE/README_zh.md
index 0754bc6..583e63a 100644
--- a/egs_modelscope/asr/TEMPLATE/README_zh.md
+++ b/egs_modelscope/asr/TEMPLATE/README_zh.md
@@ -30,12 +30,10 @@
task=Tasks.auto_speech_recognition,
model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
vad_model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
- #punc_model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
punc_model='damo/punc_ct-transformer_cn-en-common-vocab471067-large',
)
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav',
- batch_size_token=5000, batch_size_token_threshold_s=40, max_single_segment_time=6000)
+rec_result = inference_pipeline(audio_in='./vad_example.wav')
print(rec_result)
```
其中：
diff --git a/funasr/quick_start.md b/funasr/quick_start.md
index 202c709..6108f02 100644
--- a/funasr/quick_start.md
+++ b/funasr/quick_start.md
@@ -26,7 +26,7 @@
python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk_size "5,10,5"
```
-For more examples, please refer to [docs](runtime/python/websocket/README.md).
+For more examples, please refer to [docs](../runtime/python/websocket/README.md).
### C++ version Example
@@ -47,7 +47,7 @@
```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass
```
-For more examples, please refer to [docs](runtime/docs/SDK_tutorial_online_zh.md)
+For more examples, please refer to [docs](../runtime/docs/SDK_tutorial_online_zh.md)
#### File Transcription Service, Mandarin (CPU)
@@ -68,7 +68,7 @@
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```
-For more examples, please refer to [docs](runtime/docs/SDK_tutorial_zh.md)
+For more examples, please refer to [docs](../runtime/docs/SDK_tutorial_zh.md)
## Industrial Model Egs
diff --git a/funasr/quick_start_zh.md b/funasr/quick_start_zh.md
index a8d20a2..9a3c2c9 100644
--- a/funasr/quick_start_zh.md
+++ b/funasr/quick_start_zh.md
@@ -26,7 +26,7 @@
python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk_size "5,10,5"
#python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass --chunk_size "8,8,4" --audio_in "./data/wav.scp"
```
-更多例子可以参考（[点击此处](runtime/python/websocket/README.md)）
+更多例子可以参考（[点击此处](../runtime/python/websocket/README.md)）
<a name="cpp版本示例"></a>
#### c++版本示例
@@ -46,7 +46,7 @@
```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode 2pass
```
-更多例子参考（[点击此处](runtime/docs/SDK_tutorial_online_zh.md)）
+更多例子参考（[点击此处](../runtime/docs/SDK_tutorial_online_zh.md)）
##### 离线文件转写服务部署
###### 服务端部署
@@ -59,7 +59,7 @@
```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```
-更多例子参考（[点击此处](runtime/docs/SDK_tutorial_zh.md)）
+更多例子参考（[点击此处](../runtime/docs/SDK_tutorial_zh.md)）
diff --git a/funasr/version.txt b/funasr/version.txt
index 100435b..ee94dd8 100644
--- a/funasr/version.txt
+++ b/funasr/version.txt
@@ -1 +1 @@
-0.8.2
+0.8.3
diff --git a/runtime/docs/SDK_advanced_guide_offline.md b/runtime/docs/SDK_advanced_guide_offline.md
index dd13726..6dc9798 100644
--- a/runtime/docs/SDK_advanced_guide_offline.md
+++ b/runtime/docs/SDK_advanced_guide_offline.md
@@ -4,37 +4,28 @@
This document serves as a development guide for the FunASR offline file transcription service. If you wish to quickly experience the offline file transcription service, please refer to the one-click deployment example for the FunASR offline file transcription service ([docs](./SDK_tutorial.md)).
-## Installation of Docker
+<img src="images/offline_structure.jpg" width="900"/>
-The following steps are for manually installing Docker and Docker images. If your Docker image has already been launched, you can ignore this step.
-### Installation of Docker environment
+| TIME | INFO | IMAGE VERSION | IMAGE ID |
+|------------|----------------------------------------------------------------------------------------------------------------------------------|------------------------------|--------------|
+| 2023.11.08 | supporting punc-large model, Ngram model, fst hotwords, server-side loading of hotwords, adaptation to runtime structure changes | funasr-runtime-sdk-cpu-0.3.0 | caa64bddbb43 |
+| 2023.09.19 | supporting ITN model | funasr-runtime-sdk-cpu-0.2.2 | 2c5286be13e9 |
+| 2023.08.22 | integrated ffmpeg to support various audio and video inputs, supporting nn-hotword model and timestamp model | funasr-runtime-sdk-cpu-0.2.0 | 1ad3d19e0707 |
+| 2023.07.03 | 1.0 released | funasr-runtime-sdk-cpu-0.1.0 | 1ad3d19e0707 |
+
+## Quick start
+### Docker install
+If you have already installed Docker, ignore this step!
```shell
-# Ubuntu:
-curl -fsSL https://test.docker.com -o test-docker.sh
-sudo sh test-docker.sh
-# Debian:
-curl -fsSL https://get.docker.com -o get-docker.sh
-sudo sh get-docker.sh
-# CentOS:
-curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
-# MacOS:
-brew install --cask --appdir=/Applications docker
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh;
+sudo bash install_docker.sh
```
-
-More details could ref to [docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
-
-### Starting Docker
-
-```shell
-sudo systemctl start docker
-```
+If you do not have Docker installed, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
### Pulling and launching images
-
Use the following command to pull and launch the Docker image for the FunASR runtime-SDK:
-
```shell
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.3.0
@@ -46,11 +37,9 @@
-p <host port>:<mapped docker port>: In the example, host machine (ECS) port 10095 is mapped to port 10095 in the Docker container. Make sure that port 10095 is open in the ECS security rules.
-v <host path>:<mounted Docker path>: In the example, the host machine path /root is mounted to the Docker path /workspace/models.
-
```
-## Starting the server
-
+### Starting the server
Use the following script to start the server:
```shell
nohup bash run_server.sh \
@@ -59,13 +48,15 @@
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
- --itn-dir thuduj12/fst_itn_zh > log.out 2>&1 &
+ --itn-dir thuduj12/fst_itn_zh \
+ --hotword /workspace/models/hotwords.txt > log.out 2>&1 &
# If you want to disable ssl, please add: --certfile 0
# If you want to deploy the timestamp or nn hotword model, please set --model-dir to the corresponding model:
-# speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx (timestamp)
-# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx (hotword)
-
+# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx (timestamp)
+# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx (hotword)
+# If you want to load hotwords on the server side, please configure the hotwords in the host machine file ./funasr-runtime-resources/models/hotwords.txt (docker mapping address: /workspace/models/hotwords.txt):
+# One hotword per line, format (hotword weight): 阿里巴巴 20
```
### More details about the script run_server.sh:
@@ -90,7 +81,6 @@
```
Introduction to run_server.sh parameters:
-
```text
--download-model-dir: Model download address, download models from Modelscope by setting the model ID.
--model-dir: Modelscope model ID.
@@ -139,19 +129,14 @@
If you wish to deploy your fine-tuned model (e.g., 10epoch.pb), you need to manually rename the model to model.pb and replace the original model.pb in ModelScope. Then, specify the path as `model_dir`.
-
-
## Starting the client
-
After completing the deployment of FunASR offline file transcription service on the server, you can test and use the service by following these steps. Currently, FunASR-bin supports multiple ways to start the client. The following are command-line examples based on python-client, c++-client, and custom client Websocket communication protocol:
### python-client
```shell
python funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "./data/wav.scp" --send_without_sleep --output_dir "./results"
```
-
Introduction to command parameters:
-
```text
--host: the IP address of the server. It can be set to 127.0.0.1 for local testing.
--port: the port number of the server listener.
@@ -169,7 +154,6 @@
```
Introduction to command parameters:
-
```text
--server-ip: the IP address of the server. It can be set to 127.0.0.1 for local testing.
--port: the port number of the server listener.
@@ -180,19 +164,15 @@
```
### Custom client
-
If you want to define your own client, see the [Websocket communication protocol](./websocket_protocol.md)
## How to customize service deployment
-
The code for FunASR-runtime is open source. If the server and client cannot fully meet your needs, you can further develop them based on your own requirements:
### C++ client
-
https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/websocket
### Python client
-
https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/websocket
### C++ server
@@ -216,7 +196,6 @@
FUNASR_RESULT result=FunOfflineInfer(asr_hanlde, wav_file.c_str(), RASR_NONE, NULL, 16000);
// Where: asr_hanlde is the return value of FunOfflineInit, wav_file is the path to the audio file, and sampling_rate is the sampling rate (default 16k).
```
-
See the usage example for details, [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/onnxruntime/bin/funasr-onnx-offline.cpp)
#### PUNC
diff --git a/runtime/docs/SDK_advanced_guide_offline_en.md b/runtime/docs/SDK_advanced_guide_offline_en.md
index 317f8a9..80b80e5 100644
--- a/runtime/docs/SDK_advanced_guide_offline_en.md
+++ b/runtime/docs/SDK_advanced_guide_offline_en.md
@@ -4,54 +4,36 @@
This document serves as a development guide for the FunASR offline file transcription service. If you wish to quickly experience the offline file transcription service, please refer to the one-click deployment example for the FunASR offline file transcription service ([docs](./SDK_tutorial.md)).
-## Installation of Docker
+| TIME | INFO | IMAGE VERSION | IMAGE ID |
+|------------|-----------------------------------------|---------------------------------|--------------|
+| 2023.11.08 | Adaptation to runtime structure changes | funasr-runtime-sdk-en-cpu-0.1.1 | 27017f70f72a |
+| 2023.10.16 | 1.0 released | funasr-runtime-sdk-en-cpu-0.1.0 | e0de03eb0163 |
-The following steps are for manually installing Docker and Docker images. If your Docker image has already been launched, you can ignore this step.
-
-### Installation of Docker environment
-
+## Quick start
+### Docker install
+If you have already installed Docker, ignore this step!
```shell
-# Ubuntu:
-curl -fsSL https://test.docker.com -o test-docker.sh
-sudo sh test-docker.sh
-# Debian:
-curl -fsSL https://get.docker.com -o get-docker.sh
-sudo sh get-docker.sh
-# CentOS:
-curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
-# MacOS:
-brew install --cask --appdir=/Applications docker
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh;
+sudo bash install_docker.sh
```
-
-More details could ref to [docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
-
-### Starting Docker
-
-```shell
-sudo systemctl start docker
-```
+If you do not have Docker installed, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
### Pulling and launching images
-
Use the following command to pull and launch the Docker image for the FunASR runtime-SDK:
-
```shell
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.1
-sudo docker run -p 10095:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.1
+sudo docker run -p 10097:10095 -it --privileged=true -v /root:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.1
```
-
Introduction to command parameters:
```text
--p <host port>:<mapped docker port>: In the example, host machine (ECS) port 10095 is mapped to port 10095 in the Docker container. Make sure that port 10095 is open in the ECS security rules.
+-p <host port>:<mapped docker port>: In the example, host machine (ECS) port 10097 is mapped to port 10095 in the Docker container. Make sure that port 10097 is open in the ECS security rules.
-v <host path>:<mounted Docker path>: In the example, the host machine path /root is mounted to the Docker path /workspace/models.
```
-
-## Starting the server
-
+### Starting the server
Use the following script to start the server:
```shell
nohup bash run_server.sh \
@@ -61,11 +43,9 @@
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx > log.out 2>&1 &
# If you want to disable ssl, please add: --certfile 0
-
```
### More details about the script run_server.sh:
-
The funasr-wss-server supports downloading models from Modelscope. You can set the model download address (--download-model-dir, default is /workspace/models) and the model ID (--model-dir, --vad-dir, --punc-dir). Here is an example:
```shell
@@ -83,7 +63,6 @@
```
Introduction to run_server.sh parameters:
-
```text
--download-model-dir: Model download address, download models from Modelscope by setting the model ID.
--model-dir: Modelscope model ID.
diff --git a/runtime/docs/SDK_advanced_guide_offline_en_zh.md b/runtime/docs/SDK_advanced_guide_offline_en_zh.md
index bbdb8a9..d6fb272 100644
--- a/runtime/docs/SDK_advanced_guide_offline_en_zh.md
+++ b/runtime/docs/SDK_advanced_guide_offline_en_zh.md
@@ -4,6 +4,12 @@
本文档为FunASR离线文件转写服务开发指南。如果您想快速体验离线文件转写服务，可参考[快速上手](#快速上手)。
+| 时间       | 详情            | 镜像版本                            | 镜像ID       |
+|------------|----------------|---------------------------------|--------------|
+| 2023.11.08 | runtime结构变化适配 | funasr-runtime-sdk-en-cpu-0.1.1 | 27017f70f72a |
+| 2023.10.16 | 1.0 发布        | funasr-runtime-sdk-en-cpu-0.1.0 | e0de03eb0163 |
+
+
## 服务器配置
用户可以根据自己的业务需求，选择合适的服务器配置，推荐配置为：
@@ -17,7 +23,6 @@
## 快速上手
-
### docker安装
如果您已安装docker，忽略本步骤！
通过下述命令在服务器上安装docker：
@@ -25,20 +30,18 @@
curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh
sudo bash install_docker.sh
```
+docker安装失败请参考 [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
### 镜像启动
-
通过下述命令拉取并启动FunASR runtime-SDK的docker镜像：
-
```shell
sudo docker pull \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.1
mkdir -p ./funasr-runtime-resources/models
-sudo docker run -p 10095:10095 -it --privileged=true \
+sudo docker run -p 10097:10095 -it --privileged=true \
-v $PWD/funasr-runtime-resources/models:/workspace/models \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-en-cpu-0.1.1
```
-如果您没有安装docker，可参考[Docker安装](#Docker安装)
### 服务端启动
@@ -67,33 +70,6 @@
```
------------------
-## Docker安装
-
-下述步骤为手动安装docker环境的步骤：
-
-### docker环境安装
-```shell
-# Ubuntu：
-curl -fsSL https://test.docker.com -o test-docker.sh
-sudo sh test-docker.sh
-# Debian：
-curl -fsSL https://get.docker.com -o get-docker.sh
-sudo sh get-docker.sh
-# CentOS：
-curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
-# MacOS：
-brew install --cask --appdir=/Applications docker
-```
-
-安装详见：https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html
-
-### docker启动
-
-```shell
-sudo systemctl start docker
-```
-
-
## 客户端用法详解
在服务器上完成FunASR服务部署以后，可以通过如下的步骤来测试和使用离线文件转写服务。
@@ -155,8 +131,6 @@
```
详细可以参考文档（[点击此处](../java/readme.md)）
-
-
## 服务端用法详解：
### 启动FunASR服务
@@ -212,14 +186,12 @@
--certfile 0
```
-
执行上述指令后，启动英文离线文件转写服务。如果模型指定为ModelScope中model id，会自动从ModelScope中下载如下模型：
[FSMN-VAD模型](https://www.modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-onnx/summary),
[Paraformer-large模型](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-en-16k-common-vocab10020-onnx/summary),
[CT-Transformer标点预测模型](https://www.modelscope.cn/models/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx/summary)
如果您希望部署finetune后的模型（例如10epoch.pb），需要手动将模型重命名为model.pb，并将原modelscope中模型model.pb替换掉，将路径指定为`model_dir`即可。
-
## 如何定制服务部署
@@ -235,9 +207,6 @@
### 自定义客户端：
如果您想定义自己的client，参考[websocket通信协议](./websocket_protocol_zh.md)
-
-
-```
### c++ 服务端：
diff --git a/runtime/docs/SDK_advanced_guide_offline_zh.md b/runtime/docs/SDK_advanced_guide_offline_zh.md
index 71d1012..4bf1cd1 100644
--- a/runtime/docs/SDK_advanced_guide_offline_zh.md
+++ b/runtime/docs/SDK_advanced_guide_offline_zh.md
@@ -4,6 +4,15 @@
本文档为FunASR离线文件转写服务开发指南。如果您想快速体验离线文件转写服务，可参考[快速上手](#快速上手)。
+<img src="images/offline_structure.jpg" width="900"/>
+
+| 时间       | 详情                                              | 镜像版本                         | 镜像ID       |
+|------------|---------------------------------------------------|------------------------------|--------------|
+| 2023.11.08 | 支持标点大模型、支持Ngram模型、支持fst热词、支持服务端加载热词、runtime结构变化适配 | funasr-runtime-sdk-cpu-0.3.0 | caa64bddbb43 |
+| 2023.09.19 | 支持ITN模型 | funasr-runtime-sdk-cpu-0.2.2 | 2c5286be13e9 |
+| 2023.08.22 | 集成ffmpeg支持多种音视频输入、支持热词模型、支持时间戳模型 | funasr-runtime-sdk-cpu-0.2.0 | 1ad3d19e0707 |
+| 2023.07.03 | 1.0 发布 | funasr-runtime-sdk-cpu-0.1.0 | 1ad3d19e0707 |
+
## 服务器配置
用户可以根据自己的业务需求，选择合适的服务器配置，推荐配置为：
@@ -25,6 +34,7 @@
curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh
sudo bash install_docker.sh
```
+docker安装失败请参考 [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
### 镜像启动
@@ -38,7 +48,6 @@
-v $PWD/funasr-runtime-resources/models:/workspace/models \
registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.3.0
```
-如果您没有安装docker，可参考[Docker安装](#Docker安装)
### 服务端启动
@@ -51,15 +60,20 @@
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
- --itn-dir thuduj12/fst_itn_zh > log.out 2>&1 &
+ --itn-dir thuduj12/fst_itn_zh \
+ --hotword /workspace/models/hotwords.txt > log.out 2>&1 &
# 如果您想关闭ssl，增加参数：--certfile 0
# 如果您想使用时间戳或者nn热词模型进行部署，请设置--model-dir为对应模型：
-# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx（时间戳）
-# 或者 damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx（热词）
-
+# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx（时间戳）
+# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx（nn热词）
+# 如果您想在服务端加载热词，请在宿主机文件./funasr-runtime-resources/models/hotwords.txt配置热词（docker映射地址为/workspace/models/hotwords.txt）:
+# 每行一个热词，格式(热词 权重)：阿里巴巴 20
```
+如果您想定制ngram，参考文档([如何训练LM](./lm_train_tutorial.md))
+
服务端详细参数介绍可参考[服务端用法详解](#服务端用法详解)
+
### 客户端测试与使用
下载客户端测试工具目录samples
@@ -70,34 +84,6 @@
```shell
python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```
-
-------------------
-## Docker安装
-
-下述步骤为手动安装docker环境的步骤：
-
-### docker环境安装
-```shell
-# Ubuntu：
-curl -fsSL https://test.docker.com -o test-docker.sh
-sudo sh test-docker.sh
-# Debian：
-curl -fsSL https://get.docker.com -o get-docker.sh
-sudo sh get-docker.sh
-# CentOS：
-curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
-# MacOS：
-brew install --cask --appdir=/Applications docker
-```
-
-安装详见：https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html
-
-### docker启动
-
-```shell
-sudo systemctl start docker
-```
-
## 客户端用法详解
@@ -137,7 +123,6 @@
```
命令参数说明：
-
```text
--server-ip 为FunASR runtime-SDK服务部署机器ip，默认为本机ip（127.0.0.1），如果client与服务不在同一台服务器，
需要改为部署机器ip
@@ -148,13 +133,11 @@
```
### Html网页版
-
在浏览器中打开 html/static/index.html，即可出现如下页面，支持麦克风输入与文件上传，直接进行体验
<img src="images/html.png" width="900"/>
### Java-client
-
```shell
FunasrWsClient --host localhost --port 10095 --audio_in ./asr_example.wav --mode offline
```
@@ -228,6 +211,7 @@
如果您希望部署finetune后的模型（例如10epoch.pb），需要手动将模型重命名为model.pb，并将原modelscope中模型model.pb替换掉，将路径指定为`model_dir`即可。
+------------------
## 如何定制服务部署
@@ -243,9 +227,6 @@
### 自定义客户端：
如果您想定义自己的client，参考[websocket通信协议](./websocket_protocol_zh.md)
-
-
-```
### c++ 服务端：
diff --git a/runtime/docs/SDK_advanced_guide_online.md b/runtime/docs/SDK_advanced_guide_online.md
index 3a26db5..6c973f1 100644
--- a/runtime/docs/SDK_advanced_guide_online.md
+++ b/runtime/docs/SDK_advanced_guide_online.md
@@ -2,18 +2,32 @@
FunASR provides a real-time speech transcription service that can be easily deployed on local or cloud servers, with the FunASR runtime-SDK as the core. It integrates the speech endpoint detection (VAD), Paraformer-large non-streaming speech recognition (ASR), Paraformer-large streaming speech recognition (ASR), punctuation (PUNC), and other related capabilities open-sourced by the speech laboratory of DAMO Academy on the Modelscope community. The software package can perform real-time speech-to-text transcription, and can also accurately transcribe text at the end of sentences for high-precision output. The output text contains punctuation and supports high-concurrency multi-channel requests.
+<img src="images/online_structure.png" width="900"/>
+
+| TIME | INFO | IMAGE VERSION | IMAGE ID |
+|------------|-------------------------------------------------------------------------------------|-------------------------------------|--------------|
+| 2023.11.08 | supporting server-side loading of hotwords, adaptation to runtime structure changes | funasr-runtime-sdk-online-cpu-0.1.4 | 691974017c38 |
+| 2023.09.19 | supporting hotwords, timestamps, and ITN model in 2pass mode | funasr-runtime-sdk-online-cpu-0.1.2 | 7222c5319bcf |
+| 2023.08.11 | addressing some known bugs (including server crashes) | funasr-runtime-sdk-online-cpu-0.1.1 | bdbdd0b27dee |
+| 2023.08.07 | 1.0 released | funasr-runtime-sdk-online-cpu-0.1.0 | bdbdd0b27dee |
+
## Quick Start
-### Pull Docker Image
-
-Use the following command to pull and start the FunASR software package docker image:
-
+### Docker install
+If you have already installed Docker, ignore this step!
```shell
-sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.3
-mkdir -p ./funasr-runtime-resources/models
-sudo docker run -p 10095:10095 -it --privileged=true -v $PWD/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.3
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh;
+sudo bash install_docker.sh
```
If you do not have Docker installed, please refer to [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
+### Pull Docker Image
+Use the following command to pull and start the FunASR software package docker image:
+```shell
+sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.4
+mkdir -p ./funasr-runtime-resources/models
+sudo docker run -p 10096:10095 -it --privileged=true -v $PWD/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.4
+```
+
### Launching the Server
After Docker is launched, start the funasr-wss-server-2pass service program:
diff --git a/runtime/docs/SDK_advanced_guide_online_zh.md b/runtime/docs/SDK_advanced_guide_online_zh.md
index ce91655..d921e3d 100644
--- a/runtime/docs/SDK_advanced_guide_online_zh.md
+++ b/runtime/docs/SDK_advanced_guide_online_zh.md
@@ -5,29 +5,38 @@
本文档为FunASR实时转写服务开发指南。如果您想快速体验实时语音听写服务，可参考[快速上手](#快速上手)。
+<img src="images/online_structure.png" width="900"/>
+
+| 时间       | 详情                              | 镜像版本                             | 镜像ID       |
+|:-----------|:----------------------------------|-------------------------------------|--------------|
+| 2023.11.08 | 支持服务端加载热词(更新热词通信协议)、runtime结构变化适配 | funasr-runtime-sdk-online-cpu-0.1.4 | 691974017c38 |
+| 2023.09.19 | 2pass模式支持热词、时间戳、ITN模型 | funasr-runtime-sdk-online-cpu-0.1.2 | 7222c5319bcf |
+| 2023.08.11 | 修复了部分已知的bug(包括server崩溃等) | funasr-runtime-sdk-online-cpu-0.1.1 | bdbdd0b27dee |
+| 2023.08.07 | 1.0 发布 | funasr-runtime-sdk-online-cpu-0.1.0 | bdbdd0b27dee |
+
+
## 快速上手
### docker安装
如果您已安装docker，忽略本步骤！
通过下述命令在服务器上安装docker：
```shell
-curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh锛�
+curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh
sudo bash install_docker.sh
```
+docker安装失败请参考 [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html)
### 镜像启动
-
通过下述命令拉取并启动FunASR软件包的docker镜像：
```shell
sudo docker pull \
- registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.3
+ registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.4
mkdir -p ./funasr-runtime-resources/models
-sudo docker run -p 10095:10095 -it --privileged=true \
+sudo docker run -p 10096:10095 -it --privileged=true \
-v $PWD/funasr-runtime-resources/models:/workspace/models \
- registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.3
+ registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.4
```
-如果您没有安装docker，可参考[Docker安装](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker_zh.html)
### 服务端启动
@@ -40,12 +49,15 @@
--model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
--punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
- --itn-dir thuduj12/fst_itn_zh > log.out 2>&1 &
+ --itn-dir thuduj12/fst_itn_zh \
+ --hotword /workspace/models/hotwords.txt > log.out 2>&1 &
# 如果您想关闭ssl，增加参数：--certfile 0
-# 如果您想使用时间戳或者热词模型进行部署，请设置--model-dir为对应模型：
-# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx（时间戳）
-# 或者 damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx（热词）
+# 如果您想使用时间戳或者nn热词模型进行部署，请设置--model-dir为对应模型：
+# damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx（时间戳）
+# damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx（nn热词）
+# 如果您想在服务端加载热词，请在宿主机文件./funasr-runtime-resources/models/hotwords.txt配置热词（docker映射地址为/workspace/models/hotwords.txt）:
+# 每行一个热词，格式(热词 权重)：阿里巴巴 20
```
服务端详细参数介绍可参考[服务端用法详解](#服务端用法详解)
### 客户端测试与使用
diff --git a/runtime/docs/images/offline_structure.jpg b/runtime/docs/images/offline_structure.jpg
new file mode 100644
index 0000000..772f7c6
--- /dev/null
+++ b/runtime/docs/images/offline_structure.jpg
Binary files differ
diff --git a/runtime/docs/images/online_structure.png b/runtime/docs/images/online_structure.png
new file mode 100644
index 0000000..53731ba
--- /dev/null
+++ b/runtime/docs/images/online_structure.png
Binary files differ
diff --git a/runtime/docs/images/sdk_roadmap.jpg b/runtime/docs/images/sdk_roadmap.jpg
new file mode 100644
index 0000000..b8e5010
--- /dev/null
+++ b/runtime/docs/images/sdk_roadmap.jpg
Binary files differ
diff --git a/runtime/html5/static/index.html b/runtime/html5/static/index.html
index d9c6be7..d98c62b 100644
--- a/runtime/html5/static/index.html
+++ b/runtime/html5/static/index.html
@@ -52,7 +52,7 @@
</div>
<br>
<div style="border:2px solid #ccc;">
- 热词设置(一行一个关键字，空格隔开权重,如"阿里巴巴 20 hello world 40")：
+ 热词设置(一行一个关键字，空格隔开权重,如"阿里巴巴 20")：
<br>
diff --git a/runtime/html5/static/main.js b/runtime/html5/static/main.js
index 52a8d96..e8408e9 100644
--- a/runtime/html5/static/main.js
+++ b/runtime/html5/static/main.js
@@ -262,7 +262,7 @@
var obj = document.getElementById("varHot");
if(typeof(obj) == 'undefined' || obj==null || obj.value.length<=0){
- return "";
+ return null;
}
let val = obj.value.toString();
@@ -279,11 +279,11 @@
for(var i=0;i<result.length-1;i++)
wordstr=wordstr+result[i]+" ";
- jsonresult[wordstr.trim()]=result[result.length-1];
+ jsonresult[wordstr.trim()]= parseInt(result[result.length-1]);
}
}
console.log("jsonresult="+JSON.stringify(jsonresult));
- return jsonresult;
+ return JSON.stringify(jsonresult);
}
function getAsrMode(){
diff --git a/runtime/html5/static/wsconnecter.js b/runtime/html5/static/wsconnecter.js
index b9a786d..455e1a1 100644
--- a/runtime/html5/static/wsconnecter.js
+++ b/runtime/html5/static/wsconnecter.js
@@ -85,12 +85,13 @@
}
var hotwords=getHotwords();
- if(hotwords.length>0)
+
+ if(hotwords!=null )
{
request.hotwords=hotwords;
}
- console.log(request);
- speechSokt.send( JSON.stringify(request) );
+ console.log(JSON.stringify(request));
+ speechSokt.send(JSON.stringify(request));
console.log("连接成功");
stateHandle(0);
diff --git a/runtime/python/onnxruntime/funasr_onnx/paraformer_bin.py b/runtime/python/onnxruntime/funasr_onnx/paraformer_bin.py
index 71cf434..7b13654 100644
--- a/runtime/python/onnxruntime/funasr_onnx/paraformer_bin.py
+++ b/runtime/python/onnxruntime/funasr_onnx/paraformer_bin.py
@@ -36,7 +36,6 @@
intra_op_num_threads: int = 4,
cache_dir: str = None
):
-
if not Path(model_dir).exists():
try:
from modelscope.hub.snapshot_download import snapshot_download
@@ -242,6 +241,13 @@
if not Path(model_dir).exists():
try:
+ from modelscope.hub.snapshot_download import snapshot_download
+            except ImportError:
+                raise ImportError("You are exporting model from modelscope, please install modelscope and try it again. To install modelscope, you could:\n"
+                    "\npip3 install -U modelscope\n"
+                    "For the users in China, you could install with the command:\n"
+                    "\npip3 install -U modelscope -i https://mirror.sjtu.edu.cn/pypi/web/simple")
+ try:
model_dir = snapshot_download(model_dir, cache_dir=cache_dir)
except:
raise "model_dir must be model_name in modelscope or local path downloaded from modelscope, but is {}".format(model_dir)
diff --git a/runtime/python/websocket/README.md b/runtime/python/websocket/README.md
index f318fd9..d50c8e1 100644
--- a/runtime/python/websocket/README.md
+++ b/runtime/python/websocket/README.md
@@ -110,15 +110,15 @@
#### Websocket api
```shell
- # class Funasr_websocket_recognizer example with 3 step
- # 1.create an recognizer
- rcg=Funasr_websocket_recognizer(host="127.0.0.1",port="30035",is_ssl=True,mode="2pass")
- # 2.send pcm data to asr engine and get asr result
- text=rcg.feed_chunk(data)
- print("text",text)
- # 3.get last result, set timeout=3
- text=rcg.close(timeout=3)
- print("text",text)
+# class Funasr_websocket_recognizer example with 3 step
+# 1.create an recognizer
+rcg=Funasr_websocket_recognizer(host="127.0.0.1",port="30035",is_ssl=True,mode="2pass")
+# 2.send pcm data to asr engine and get asr result
+text=rcg.feed_chunk(data)
+print("text",text)
+# 3.get last result, set timeout=3
+text=rcg.close(timeout=3)
+print("text",text)
```
## Acknowledge
diff --git a/runtime/python/websocket/funasr_wss_client.py b/runtime/python/websocket/funasr_wss_client.py
index 7c96553..66b3ce0 100644
--- a/runtime/python/websocket/funasr_wss_client.py
+++ b/runtime/python/websocket/funasr_wss_client.py
@@ -27,20 +27,16 @@
help="grpc server port")
parser.add_argument("--chunk_size",
type=str,
- default="0, 10, 5",
+ default="5, 10, 5",
help="chunk")
-parser.add_argument("--encoder_chunk_look_back",
- type=int,
- default=4,
- help="number of chunks to lookback for encoder self-attention")
-parser.add_argument("--decoder_chunk_look_back",
- type=int,
- default=1,
- help="number of encoder chunks to lookback for decoder cross-attention")
parser.add_argument("--chunk_interval",
type=int,
default=10,
help="chunk")
+parser.add_argument("--hotword",
+ type=str,
+ default="",
+                    help="hotword file path, one hotword per line (e.g.: 阿里巴巴 20)")
parser.add_argument("--audio_in",
type=str,
default=None,
@@ -61,11 +57,14 @@
type=str,
default=None,
help="output_dir")
-
parser.add_argument("--ssl",
type=int,
default=1,
help="1 for ssl connect, 0 for no ssl")
+parser.add_argument("--use_itn",
+ type=int,
+ default=1,
+ help="1 for using itn, 0 for not itn")
parser.add_argument("--mode",
type=str,
default="2pass",
@@ -106,10 +105,29 @@
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
+ # hotwords
+ fst_dict = {}
+ hotword_msg = ""
+ if args.hotword.strip() != "":
+ f_scp = open(args.hotword)
+ hot_lines = f_scp.readlines()
+ for line in hot_lines:
+ words = line.strip().split(" ")
+ if len(words) < 2:
+            print("Please check the format of hotwords")
+ continue
+ try:
+ fst_dict[" ".join(words[:-1])] = int(words[-1])
+ except ValueError:
+            print("Please check the format of hotwords")
+ hotword_msg=json.dumps(fst_dict)
- message = json.dumps({"mode": args.mode, "chunk_size": args.chunk_size, "encoder_chunk_look_back": args.encoder_chunk_look_back,
- "decoder_chunk_look_back": args.decoder_chunk_look_back, "chunk_interval": args.chunk_interval,
- "wav_name": "microphone", "is_speaking": True})
+ use_itn=True
+ if args.use_itn == 0:
+ use_itn=False
+
+ message = json.dumps({"mode": args.mode, "chunk_size": args.chunk_size, "chunk_interval": args.chunk_interval,
+ "wav_name": "microphone", "is_speaking": True, "hotwords":hotword_msg, "itn": use_itn})
#voices.put(message)
await websocket.send(message)
while True:
@@ -127,6 +145,31 @@
wavs = f_scp.readlines()
else:
wavs = [args.audio_in]
+
+ # hotwords
+ fst_dict = {}
+ hotword_msg = ""
+ if args.hotword.strip() != "":
+ f_scp = open(args.hotword)
+ hot_lines = f_scp.readlines()
+ for line in hot_lines:
+ words = line.strip().split(" ")
+ if len(words) < 2:
+                print("Please check the format of hotwords")
+ continue
+ try:
+ fst_dict[" ".join(words[:-1])] = int(words[-1])
+ except ValueError:
+                print("Please check the format of hotwords")
+ hotword_msg=json.dumps(fst_dict)
+    print(hotword_msg)
+
+ sample_rate = 16000
+ wav_format = "pcm"
+ use_itn=True
+ if args.use_itn == 0:
+ use_itn=False
+
if chunk_size > 0:
wavs = wavs[chunk_begin:chunk_begin + chunk_size]
for wav in wavs:
@@ -143,20 +186,13 @@
import wave
with wave.open(wav_path, "rb") as wav_file:
params = wav_file.getparams()
+ sample_rate = wav_file.getframerate()
frames = wav_file.readframes(wav_file.getnframes())
audio_bytes = bytes(frames)
else:
- import ffmpeg
- try:
- # This launches a subprocess to decode audio while down-mixing and resampling as necessary.
- # Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
- audio_bytes, _ = (
- ffmpeg.input(wav_path, threads=0)
- .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=16000)
- .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
- )
- except ffmpeg.Error as e:
- raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
+ wav_format = "others"
+ with open(wav_path, "rb") as f:
+ audio_bytes = f.read()
# stride = int(args.chunk_size/1000*16000*2)
stride = int(60 * args.chunk_size[1] / args.chunk_interval / 1000 * 16000 * 2)
@@ -164,8 +200,9 @@
         # print(stride)
         # send first time
-        message = json.dumps({"mode": args.mode, "chunk_size": args.chunk_size, "chunk_interval": args.chunk_interval,
-                              "wav_name": wav_name, "is_speaking": True})
+        message = json.dumps({"mode": args.mode, "chunk_size": args.chunk_size, "chunk_interval": args.chunk_interval, "audio_fs": sample_rate,
+                              "wav_name": wav_name, "wav_format": wav_format, "is_speaking": True, "hotwords": hotword_msg, "itn": use_itn})
+
         #voices.put(message)
         await websocket.send(message)
         is_speaking = True
@@ -213,12 +250,17 @@
             meg = await websocket.recv()
             meg = json.loads(meg)
-            # print(meg)
             wav_name = meg.get("wav_name", "demo")
             text = meg["text"]
+            timestamp = ""
+            if "timestamp" in meg:
+                timestamp = meg["timestamp"]
             if ibest_writer is not None:
-                text_write_line = "{}\t{}\n".format(wav_name, text)
+                if timestamp != "":
+                    text_write_line = "{}\t{}\t{}\n".format(wav_name, text, timestamp)
+                else:
+                    text_write_line = "{}\t{}\n".format(wav_name, text)
                 ibest_writer.write(text_write_line)
             if meg["mode"] == "online":
@@ -227,15 +269,15 @@
                 os.system('clear')
                 print("\rpid" + str(id) + ": " + text_print)
             elif meg["mode"] == "offline":
-                text_print += "{}".format(text)
+                if timestamp != "":
+                    text_print += "{} timestamp: {}".format(text, timestamp)
+                else:
+                    text_print += "{}".format(text)
+
                 # text_print = text_print[-args.words_max_print:]
                 # os.system('clear')
                 print("\rpid" + str(id) + ": " + wav_name + ": " + text_print)
-                if ("is_final" in meg and meg["is_final"]==False):
-                    offline_msg_done = True
-
-                if not "is_final" in meg:
-                    offline_msg_done = True
+                offline_msg_done = True
             else:
                 if meg["mode"] == "2pass-online":
                     text_print_2pass_online += "{}".format(text)
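Taken together, the client-side hunks above change the handshake: the first JSON message now carries the audio sample rate (`audio_fs`), the container format (`wav_format`), a JSON-encoded hotword table, and an ITN switch. A minimal sketch of that message assembly, outside the diff (the helper names `parse_hotwords` and `build_init_message` are illustrative, not part of the client):

```python
import json

def parse_hotwords(lines):
    # Parse hotword lines of the form "<phrase> <integer weight>" into the
    # JSON string sent in the "hotwords" field; malformed lines are skipped
    # with a warning, mirroring the patched client's behavior.
    fst_dict = {}
    for line in lines:
        words = line.strip().split(" ")
        if len(words) < 2:
            print("Please check the format of hotwords")
            continue
        try:
            fst_dict[" ".join(words[:-1])] = int(words[-1])
        except ValueError:
            print("Please check the format of hotwords")
    return json.dumps(fst_dict)

def build_init_message(mode, chunk_size, chunk_interval, wav_name,
                       sample_rate=16000, wav_format="pcm",
                       hotword_msg="", use_itn=True):
    # Mirrors the first JSON message the patched client sends
    # before streaming any audio bytes.
    return json.dumps({
        "mode": mode,
        "chunk_size": chunk_size,
        "chunk_interval": chunk_interval,
        "audio_fs": sample_rate,
        "wav_name": wav_name,
        "wav_format": wav_format,
        "is_speaking": True,
        "hotwords": hotword_msg,
        "itn": use_itn,
    })
```

Note that the hotword table is itself serialized to a JSON string before being embedded in the `hotwords` field, which matches the "updated hotword communication protocol" mentioned in the release notes below.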
diff --git a/runtime/readme.md b/runtime/readme.md
index 4489bb4..2676244 100644
--- a/runtime/readme.md
+++ b/runtime/readme.md
@@ -17,7 +17,7 @@
To meet the needs of different users, we have prepared different tutorials with text and images for both novice and advanced developers.
### What's new
-- 2023/11/08: Adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-en-cpu-0.1.1 ().
+- 2023/11/08: Adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-en-cpu-0.1.1 (27017f70f72a).
- 2023/10/16: English File Transcription Service 1.0 released, docker image version funasr-runtime-sdk-en-cpu-0.1.0 (e0de03eb0163), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/DZZUTj-6xwFfi-96ml--4A))
### Technical Principles
@@ -39,7 +39,7 @@
To meet the needs of different users in different scenarios, we have prepared different tutorials:
### What's new
-- 2023/11/08: Real-time Transcription Service 1.4 released, supporting server-side loading of hotwords (updated hotword communication protocol), adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-online-cpu-0.1.4().
+- 2023/11/08: Real-time Transcription Service 1.4 released, supporting server-side loading of hotwords (updated hotword communication protocol), adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-online-cpu-0.1.4 (691974017c38).
- 2023/09/19: Real-time Transcription Service 1.2 released, supporting hotwords, timestamps, and ITN model in 2pass mode, docker image version funasr-runtime-sdk-online-cpu-0.1.2 (7222c5319bcf).
- 2023/08/11: Real-time Transcription Service 1.1 released, addressing some known bugs (including server crashes), docker image version funasr-runtime-sdk-online-cpu-0.1.1 (bdbdd0b27dee).
- 2023/08/07: Real-time Transcription Service 1.0 released, docker image version funasr-runtime-sdk-online-cpu-0.1.0 (bdbdd0b27dee), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/8He081-FM-9IEI4D-lxZ9w))
@@ -65,7 +65,7 @@
To meet the needs of different users, we have prepared different tutorials with text and images for both novice and advanced developers.
### What's new
-2023/11/08: File Transcription Service 3.0 released, supporting punctuation large model, Ngram model, fst hotwords (updated hotword communication protocol), server-side loading of hotwords, adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-cpu-0.3.0 (), refer to the detailed documentation ([here]())
+2023/11/08: File Transcription Service 3.0 released, supporting punctuation large model, Ngram model, fst hotwords (updated hotword communication protocol), server-side loading of hotwords, adaptation to runtime structure changes (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-cpu-0.3.0 (caa64bddbb43), refer to the detailed documentation ([here]())
2023/09/19: File Transcription Service 2.2 released, supporting ITN model, docker image version funasr-runtime-sdk-cpu-0.2.2 (2c5286be13e9).
2023/08/22: File Transcription Service 2.0 released, integrated ffmpeg to support various audio and video inputs, supporting hotword model and timestamp model, docker image version funasr-runtime-sdk-cpu-0.2.0 (1ad3d19e0707), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/oJHe0MKDqTeuIFH-F7GHMg))
2023/07/03: File Transcription Service 1.0 released, docker image version funasr-runtime-sdk-cpu-0.1.0 (1ad3d19e0707), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww))
diff --git a/runtime/readme_cn.md b/runtime/readme_cn.md
index e31ea3f..89ea86a 100644
--- a/runtime/readme_cn.md
+++ b/runtime/readme_cn.md
@@ -2,8 +2,10 @@
English Version ([docs](./readme.md))

-FunASR is a fundamental speech recognition framework open-sourced by the Speech Lab of DAMO Academy. It integrates industrial-grade models for voice activity detection, speech recognition, punctuation restoration, and more, and has attracted many developers to try it out and build on it. To bridge the last mile of industrial deployment and integrate the models into real business, we developed the FunASR runtime-SDK.
-The SDK supports the following service deployments:
+FunASR is a fundamental speech recognition framework open-sourced by the speech team of Alibaba Tongyi Lab. It integrates industrial-grade models for voice activity detection, speech recognition, punctuation restoration, and more, and has attracted many developers to try it out and build on it. To bridge the last mile of industrial deployment and integrate the models into real business, we developed the community software packages.
+The following service deployments are supported:
+
+<img src="docs/images/sdk_roadmap.jpg" width="900"/>

- Chinese offline file transcription service (CPU version), completed
- Chinese streaming speech recognition service (CPU version), completed
@@ -17,20 +19,13 @@
To meet the needs of different users in different scenarios, we have prepared different tutorials:

### What's new

-- 2023/11/08: English offline file transcription service 1.1 released, adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-en-cpu-0.1.1 ()
-- 2023/10/16: English offline file transcription service 1.0 released, docker image version funasr-runtime-sdk-en-cpu-0.1.0 (e0de03eb0163), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/DZZUTj-6xwFfi-96ml--4A))
+- 2023/11/08: English offline file transcription service 1.1 released, adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-en-cpu-0.1.1 (27017f70f72a)
+- 2023/10/16: English offline file transcription service 1.0 released, docker image version funasr-runtime-sdk-en-cpu-0.1.0 (e0de03eb0163), introduction to the principles ([here](https://mp.weixin.qq.com/s/DZZUTj-6xwFfi-96ml--4A))
-### Quick deployment tutorial
-Applicable when the service deployment SDK requires no modification; the deployed models come from ModelScope or user finetuning. Detailed tutorial ([here](./docs/SDK_tutorial_en_zh.md))
+### Deployment and development documentation
-### Development guide
-
-Applicable when the service deployment SDK requires modification; the deployed models come from ModelScope or user finetuning. Detailed documentation ([here](./docs/SDK_advanced_guide_offline_en_zh.md))
-
-### Technical principles revealed
-
-The documentation covers the underlying technical principles, recognition accuracy, computational efficiency, and the core advantages: convenience, high accuracy, high efficiency, and long-audio support. Detailed documentation ([here](https://mp.weixin.qq.com/s/DZZUTj-6xwFfi-96ml--4A))
+The deployed models come from ModelScope or user finetuning; user-customized services are supported. Detailed documentation ([here](./docs/SDK_advanced_guide_offline_en_zh.md))
## Chinese real-time speech transcription service (CPU version)
@@ -38,23 +33,16 @@
To meet the needs of different users in different scenarios, we have prepared different tutorials:

### What's new

-- 2023/11/08: Chinese real-time speech transcription service 1.4 released, supporting server-side loading of hotwords (updated hotword communication protocol) and adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-online-cpu-0.1.4 ()
+- 2023/11/08: Chinese real-time speech transcription service 1.4 released, supporting server-side loading of hotwords (updated hotword communication protocol) and adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-online-cpu-0.1.4 (691974017c38)
- 2023/09/19: Chinese real-time speech transcription service 1.2 released, supporting hotwords, timestamps, and the ITN model in 2pass mode, docker image version funasr-runtime-sdk-online-cpu-0.1.2 (7222c5319bcf)
- 2023/08/11: Chinese real-time speech transcription service 1.1 released, fixing some known bugs (including server crashes), docker image version funasr-runtime-sdk-online-cpu-0.1.1 (bdbdd0b27dee)
-- 2023/08/07: Chinese real-time speech transcription service 1.0 released, docker image version funasr-runtime-sdk-online-cpu-0.1.0 (bdbdd0b27dee), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/8He081-FM-9IEI4D-lxZ9w))
-
-### Quick deployment tutorial
-
-Applicable when the service deployment SDK requires no modification; the deployed models come from ModelScope or user finetuning. Detailed tutorial ([here](./docs/SDK_tutorial_online_zh.md))
+- 2023/08/07: Chinese real-time speech transcription service 1.0 released, docker image version funasr-runtime-sdk-online-cpu-0.1.0 (bdbdd0b27dee), introduction to the principles ([here](https://mp.weixin.qq.com/s/8He081-FM-9IEI4D-lxZ9w))
-### Development guide
+### Deployment and development documentation
-Applicable when the service deployment SDK requires modification; the deployed models come from ModelScope or user finetuning. Detailed documentation ([here](./docs/SDK_advanced_guide_online_zh.md))
+The deployed models come from ModelScope or user finetuning; user-customized services are supported. Detailed documentation ([here](./docs/SDK_advanced_guide_online_zh.md))
-### Technical principles revealed
-
-The documentation covers the underlying technical principles, recognition accuracy, computational efficiency, and the core advantages: convenience, high accuracy, high efficiency, and long-audio support. Detailed documentation ([here](https://mp.weixin.qq.com/s/8He081-FM-9IEI4D-lxZ9w))
## Chinese offline file transcription service (CPU version)
@@ -63,20 +51,14 @@
To meet the needs of different users in different scenarios, we have prepared different tutorials:

### What's new

-- 2023/11/08: Chinese offline file transcription service 3.0 released, supporting the large punctuation model, the Ngram model, fst hotwords (updated hotword communication protocol), and server-side loading of hotwords; adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-cpu-0.3.0 (), refer to the detailed documentation ([here]())
+
+- 2023/11/08: Chinese offline file transcription service 3.0 released, supporting the large punctuation model, the Ngram model, fst hotwords (updated hotword communication protocol), and server-side loading of hotwords; adapted to the runtime structure change (FunASR/funasr/runtime -> FunASR/runtime), docker image version funasr-runtime-sdk-cpu-0.3.0 (caa64bddbb43), introduction to the principles ([here](https://mp.weixin.qq.com/s/jSbnKw_m31BUUbTukPSOIw))
- 2023/09/19: Chinese offline file transcription service 2.2 released, supporting the ITN model, docker image version funasr-runtime-sdk-cpu-0.2.2 (2c5286be13e9)
-- 2023/08/22: Chinese offline file transcription service 2.0 released, integrating ffmpeg to support various audio and video inputs, and supporting the hotword model and the timestamp model, docker image version funasr-runtime-sdk-cpu-0.2.0 (1ad3d19e0707), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/oJHe0MKDqTeuIFH-F7GHMg))
-- 2023/07/03: Chinese offline file transcription service 1.0 released, docker image version funasr-runtime-sdk-cpu-0.1.0 (1ad3d19e0707), refer to the detailed documentation ([here](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww))
+- 2023/08/22: Chinese offline file transcription service 2.0 released, integrating ffmpeg to support various audio and video inputs, and supporting the hotword model and the timestamp model, docker image version funasr-runtime-sdk-cpu-0.2.0 (1ad3d19e0707), introduction to the principles ([here](https://mp.weixin.qq.com/s/oJHe0MKDqTeuIFH-F7GHMg))
+- 2023/07/03: Chinese offline file transcription service 1.0 released, docker image version funasr-runtime-sdk-cpu-0.1.0 (1ad3d19e0707), introduction to the principles ([here](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww))
-### Quick deployment tutorial
+### Deployment and development documentation
-Applicable when the service deployment SDK requires no modification; the deployed models come from ModelScope or user finetuning. Detailed tutorial ([here](./docs/SDK_tutorial_zh.md))
+The deployed models come from ModelScope or user finetuning; user-customized services are supported. Detailed documentation ([here](./docs/SDK_advanced_guide_offline_zh.md))
-### Development guide
-
-Applicable when the service deployment SDK requires modification; the deployed models come from ModelScope or user finetuning. Detailed documentation ([here](./docs/SDK_advanced_guide_offline_zh.md))
-
-### Technical principles revealed
-
-The documentation covers the underlying technical principles, recognition accuracy, computational efficiency, and the core advantages: convenience, high accuracy, high efficiency, and long-audio support. Detailed documentation ([here](https://mp.weixin.qq.com/s/DHQwbgdBWcda0w_L60iUww))
--
Gitblit v1.9.1
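For reference, the offline result lines written through `ibest_writer` in the client patch above are tab-separated, with an optional third column holding the timestamp string when the server returns one. A small reader for such result files could look like this (the helper name `parse_result_line` and the sample timestamp value are illustrative assumptions, not part of the patch):

```python
def parse_result_line(line):
    # One line written by ibest_writer: "wav_name\ttext" for plain results,
    # or "wav_name\ttext\ttimestamp" when the server returned timestamps.
    fields = line.rstrip("\n").split("\t")
    wav_name, text = fields[0], fields[1]
    timestamp = fields[2] if len(fields) > 2 else ""
    return {"wav_name": wav_name, "text": text, "timestamp": timestamp}
```

Because the timestamp column is optional, readers should branch on the field count rather than assume a fixed three-column layout.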