From e6d127b0c92543db4f988149687db2bac4e41f33 Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Thu, 09 Feb 2023 17:53:48 +0800
Subject: [PATCH] readme

---
 README.md |   65 ++++++++++++--------------------
 1 files changed, 25 insertions(+), 40 deletions(-)

diff --git a/README.md b/README.md
index 6bf1278..c206fb6 100644
--- a/README.md
+++ b/README.md
@@ -2,9 +2,19 @@
 
 # FunASR: A Fundamental End-to-End Speech Recognition Toolkit
 
-<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model released on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun！[Model Zoo](docs/modelscope_models.md)
+<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model released on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition), researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun！
 
-## Release Notes:
+[**News**](https://github.com/alibaba-damo-academy/FunASR#whats-new)
+| [**Highlights**](#highlights)
+| [**Installation**](#installation)
+| [**Docs**](https://alibaba-damo-academy.github.io/FunASR/index.html)
+| [**Tutorial**](https://github.com/alibaba-damo-academy/FunASR/wiki#funasr%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C)
+| [**Papers**](https://github.com/alibaba-damo-academy/FunASR#citations)
+| [**Runtime**](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime)
+| [**Model Zoo**](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)
+| [**Contact**](#contact)
+
+## What's new:
 
 ### 2023.1.16, funasr-0.1.6
 - We release a new version model [Paraformer-large-long](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), which integrate the [VAD](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary) model, [ASR](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), [Punctuation](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary) model and timestamp together. The model could take in several hours long inputs.
@@ -16,54 +26,23 @@
 - We improve the pipeline of modelscope to speedup the inference, by integrating the process of build model into build pipeline.
 - Various new types of audio input types are now supported by modelscope inference pipeline, including wav.scp, wav format, audio bytes, wave samples...
 
-## Key Features
+## Highlights
 - Many types of typical models are supported, e.g., [Tranformer](https://arxiv.org/abs/1706.03762), [Conformer](https://arxiv.org/abs/2005.08100), [Paraformer](https://arxiv.org/abs/2206.08317).
 - We have released large number of academic and industrial pretrained models on [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)
 - The pretrained model [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) obtains the best performance on many tasks in [SpeechIO leaderboard](https://github.com/SpeechColab/Leaderboard)
 - FunASR supplies a easy-to-use pipeline to finetune pretrained models from [ModelScope](https://www.modelscope.cn/models?page=1&tasks=auto-speech-recognition)
 - Compared to [Espnet](https://github.com/espnet/espnet) framework, the training speed of large-scale datasets in FunASR is much faster owning to the optimized dataloader.
 
-## Installation(Training and Developing)
-
-- Install Conda:
-``` sh
-wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
-sh Miniconda3-latest-Linux-x86_64.sh
-source ~/.bashrc
-conda create -n funasr python=3.7
-conda activate funasr
-```
-
-- Install Pytorch (version >= 1.7.0):
-``` sh
-pip3 install torch torchvision torchaudio
-```
-For more versions, please see [https://pytorch.org/get-started/locally](https://pytorch.org/get-started/locally)
-
-
-If you are in the area of China, you could set the source to speed the downloading.
-
-``` sh
-pip config set global.index-url https://mirror.sjtu.edu.cn/pypi/web/simple
-```
-
-- Install ModelScope:
-``` sh
-pip install "modelscope[audio]" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
-```
-
-For more details about modelscope, please see [modelscope installation](https://modelscope.cn/docs/%E7%8E%AF%E5%A2%83%E5%AE%89%E8%A3%85)
-
-- Install FunASR and other packages:
+## Installation
 
 ``` sh
 git clone https://github.com/alibaba/FunASR.git && cd FunASR
 pip install --editable ./
 ```
+For more details, please ref to [installation](https://github.com/alibaba-damo-academy/FunASR/wiki)
 
-## Pretrained Model Zoo
-
-We have trained many academic and industrial models, [model hub](docs/modelscope_models.md)
+## Usage
+For users who are new to FunASR and ModelScope, please refer to [FunASR Docs](https://alibaba-damo-academy.github.io/FunASR/index.html).
 
 ## Contact
 
@@ -71,15 +50,21 @@
 
 - email: [funasr@list.alibaba-inc.com](funasr@list.alibaba-inc.com)
 
-- Dingding group:
-<div align="left"><img src="docs/images/dingding.jpg" width="250"/>!<img src="docs/images/wechat.png" width="222"/></div>
+|Dingding group | Wechat group|
+|:---:|:---:|
+|<div align="left"><img src="docs/images/dingding.jpg" width="250"/> |<img src="docs/images/wechat.png" width="222"/></div>|
 
+## Contributors
+
+| <div align="left"><img src="docs/images/DeepScience.png" width="250"/> |
+|:---:|
 
 ## Acknowledge
 
 1. We borrowed a lot of code from [Kaldi](http://kaldi-asr.org/) for data preparation.
 2. We borrowed a lot of code from [ESPnet](https://github.com/espnet/espnet). FunASR follows up the training and finetuning pipelines of ESPnet.
 3. We referred [Wenet](https://github.com/wenet-e2e/wenet) for building dataloader for large scale data training.
+4. We acknowledge [DeepScience](https://www.deepscience.cn) for contributing the grpc service.
 
 ## License
 This project is licensed under the [The MIT License](https://opensource.org/licenses/MIT). FunASR also contains various third-party components and some code modified from other repos under other open source licenses.
-- 
Gitblit v1.9.1