游雁
2024-12-17 fdafd3f6bc2f04d16e7cab5afcdb1257e87a8a78
README.md
@@ -8,6 +8,9 @@
[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)
<p align="center">
<a href="https://trendshift.io/repositories/3839" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3839" alt="alibaba-damo-academy%2FFunASR | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</p>
<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun!
@@ -21,6 +24,8 @@
| [**Contact**](#contact)
<a name="highlights"></a>
## Highlights
- FunASR is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization and multi-talker ASR. FunASR provides convenient scripts and tutorials, supporting inference and fine-tuning of pre-trained models.
@@ -29,6 +34,7 @@
<a name="whats-new"></a>
## What's new:
- 2024/10/29: Real-time Transcription Service 1.12 released,The 2pass-offline mode supports the SensevoiceSmal model;([docs](runtime/readme.md));
- 2024/10/10:Added support for the Whisper-large-v3-turbo model, a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. It can be downloaded from the [modelscope](examples/industrial_data_pretraining/whisper/demo.py), and [openai](examples/industrial_data_pretraining/whisper/demo_from_openai.py).
- 2024/09/26: Offline File Transcription Service 4.6, Offline File Transcription Service of English 1.7,Real-time Transcription Service 1.11 released,fix memory leak & Support the SensevoiceSmall onnx model;File Transcription Service 2.0 GPU released, Fix GPU memory leak; ([docs](runtime/readme.md));
- 2024/09/25:keyword spotting models are new supported. Supports fine-tuning and inference for four models: [fsmn_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [fsmn_kws_mt](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [sanm_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-offline), [sanm_kws_streaming](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online).