| | |
| | | |
| | | [](https://pypi.org/project/funasr/) |
| | | |
| | | <p align="center"> |
| | | <a href="https://trendshift.io/repositories/3839" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3839" alt="alibaba-damo-academy%2FFunASR | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> |
| | | </p> |
| | | |
| | | <strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun! |
| | | |
| | |
| | | | [**Contact**](#contact) |
| | | |
| | | |
| | | |
| | | |
| | | <a name="highlights"></a> |
| | | ## Highlights |
| | | - FunASR is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization and multi-talker ASR. FunASR provides convenient scripts and tutorials, supporting inference and fine-tuning of pre-trained models. |
| | |
| | | |
| | | <a name="whats-new"></a> |
| | | ## What's new: |
| | | - 2024/10/29: Real-time Transcription Service 1.12 released,The 2pass-offline mode supports the SensevoiceSmal model;([docs](runtime/readme.md)); |
| | | - 2024/10/10:Added support for the Whisper-large-v3-turbo model, a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. It can be downloaded from the [modelscope](examples/industrial_data_pretraining/whisper/demo.py), and [openai](examples/industrial_data_pretraining/whisper/demo_from_openai.py). |
| | | - 2024/09/26: Offline File Transcription Service 4.6, Offline File Transcription Service of English 1.7,Real-time Transcription Service 1.11 released,fix memory leak & Support the SensevoiceSmall onnx model;File Transcription Service 2.0 GPU released, Fix GPU memory leak; ([docs](runtime/readme.md)); |
| | | - 2024/09/25:keyword spotting models are new supported. Supports fine-tuning and inference for four models: [fsmn_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [fsmn_kws_mt](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [sanm_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-offline), [sanm_kws_streaming](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online). |