python/FunASR-XL.git

			@@ -7,24 +7,18 @@
			[News](https://github.com/alibaba-damo-academy/FunASR#whats-new)
			\| [Highlights](#highlights)
			\| [Installation](#installation)
			\| [Docs](https://alibaba-damo-academy.github.io/FunASR/index.html)
			\| [Docs_CN](https://alibaba-damo-academy.github.io/FunASR/cn/index.html)
			\| [Docs_EN](https://alibaba-damo-academy.github.io/FunASR/en/index.html)
			\| [Tutorial](https://github.com/alibaba-damo-academy/FunASR/wiki#funasr%E7%94%A8%E6%88%B7%E6%89%8B%E5%86%8C)
			\| [Papers](https://github.com/alibaba-damo-academy/FunASR#citations)
			\| [Runtime](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/runtime)
			\| [Model Zoo](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)
			\| [Contact](#contact)


			## What's new:
			### 2023.1.16, funasr-0.1.6
			- We release a new model, [Paraformer-large-long](https://modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), which integrates the [VAD](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary), [ASR](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary), and
			[Punctuation](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary) models together with timestamp prediction. The model can take inputs that are several hours long.
			- We release a new type of model, [VAD](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/summary), which predicts the duration of non-silence speech. It can be freely combined with any ASR model in the [Model Zoo](docs/modelscope_models.md).
			- We release a new type of model, [Punctuation](https://www.modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary), which predicts punctuation for the output of ASR models. It can be freely combined with any ASR model in the [Model Zoo](docs/modelscope_models.md).
			- We release a new model, [Data2vec](https://www.modelscope.cn/models/damo/speech_data2vec_pretrain-zh-cn-aishell2-16k-pytorch/summary), an unsupervised pretraining model that can be fine-tuned on ASR and other downstream tasks.
			- We release a new model, [Paraformer-Tiny](https://www.modelscope.cn/models/damo/speech_paraformer-tiny-commandword_asr_nat-zh-cn-16k-vocab544-pytorch/summary), a lightweight Paraformer model that supports Mandarin command-word recognition.
			- We release a new type of model, [SV](https://www.modelscope.cn/models/damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch/summary), which extracts speaker embeddings and performs speaker verification on paired utterances. Support for speaker diarization is planned for a future version.
			- We speed up inference in the ModelScope pipeline by folding the model-building step into the pipeline-building step.
			- The ModelScope inference pipeline now supports several new audio input types, including wav.scp, wav format, audio bytes, and wave samples, among others.
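
The input forms named in the last item can be prepared with the Python standard library alone. The sketch below is illustrative only: it assumes that `wav.scp` follows the Kaldi-style layout of one `<utt_id> <wav_path>` pair per line, and the helper names (`write_sine_wav`, `write_wav_scp`, `read_samples`) and the `utt_demo` key are hypothetical, not part of FunASR or ModelScope. How each form is handed to the inference pipeline is not shown here.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def write_sine_wav(path, seconds=0.5, sample_rate=16000, freq=440.0):
    """Write a mono 16-bit PCM sine tone as a stand-in for a real recording."""
    n_samples = int(seconds * sample_rate)
    frames = b"".join(
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n_samples)
    )
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(frames)
    return n_samples


def write_wav_scp(scp_path, utterances):
    """Write an assumed Kaldi-style wav.scp: '<utt_id> <wav_path>' per line."""
    with open(scp_path, "w", encoding="utf-8") as f:
        for utt_id, wav_path in utterances.items():
            f.write(f"{utt_id} {wav_path}\n")


def read_samples(wav_path):
    """Decode a 16-bit mono wav file into a list of integer samples."""
    with wave.open(str(wav_path), "rb") as wf:
        raw = wf.readframes(wf.getnframes())
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


workdir = Path(tempfile.mkdtemp())

# 1. wav format: a file on disk
wav_path = workdir / "demo.wav"
n_written = write_sine_wav(wav_path)

# 2. wav.scp: a list file pointing at one or more wav files
scp_path = workdir / "wav.scp"
write_wav_scp(scp_path, {"utt_demo": wav_path})

# 3. audio bytes: the raw content of the wav file
audio_bytes = wav_path.read_bytes()

# 4. wave samples: decoded PCM values
samples = read_samples(wav_path)
```

The 16 kHz sample rate is chosen to match the `16k` models listed above.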

			For the release notes, please refer to [news](https://github.com/alibaba-damo-academy/FunASR/releases).

			## Highlights
			- Many typical model types are supported, e.g., [Transformer](https://arxiv.org/abs/1706.03762), [Conformer](https://arxiv.org/abs/2005.08100), [Paraformer](https://arxiv.org/abs/2206.08317).
			@@ -36,13 +30,14 @@
			## Installation

			``` sh
			pip install "modelscope[audio_asr]" --upgrade -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
			git clone https://github.com/alibaba/FunASR.git && cd FunASR
			pip install --editable ./
			```
			For more details, please refer to [installation](https://github.com/alibaba-damo-academy/FunASR/wiki).

			## Usage
			For users who are new to FunASR and ModelScope, please refer to [FunASR Docs](https://alibaba-damo-academy.github.io/FunASR/index.html).
			For users who are new to FunASR and ModelScope, please refer to the FunASR Docs ([CN](https://alibaba-damo-academy.github.io/FunASR/cn/index.html) / [EN](https://alibaba-damo-academy.github.io/FunASR/en/index.html)).
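
As a rough orientation before reading the docs, the sketch below shows one way recognition could be run through the ModelScope pipeline after the installation step. It assumes the `pipeline` / `Tasks.auto_speech_recognition` interface and the `audio_in` keyword as they existed in ModelScope around this release; later versions may differ. The model ID is taken from the Model Zoo link above, and the wrapper names `build_asr_pipeline` and `recognize` are hypothetical.

```python
import sys

# Model ID copied from the Model Zoo link in this README.
MODEL_ID = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"


def build_asr_pipeline(model_id=MODEL_ID):
    """Build a ModelScope ASR pipeline (downloads the model on first use)."""
    # Imported lazily so this file can be loaded without modelscope installed.
    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    return pipeline(task=Tasks.auto_speech_recognition, model=model_id)


def recognize(audio_in, model_id=MODEL_ID):
    """Run recognition on one input; `audio_in` is assumed to be a wav path."""
    asr = build_asr_pipeline(model_id)
    return asr(audio_in=audio_in)


if __name__ == "__main__":
    # Usage: python demo.py path/to/audio.wav
    if len(sys.argv) > 1:
        print(recognize(sys.argv[1]))
```

Building the pipeline once and reusing it for many inputs avoids reloading the model on every call.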

			## Contact

			@@ -50,14 +45,14 @@

			- email: [funasr@list.alibaba-inc.com](mailto:funasr@list.alibaba-inc.com)

			\|Dingding group \| Wechat group\|
			\|:---:\|:---:\|
			\|<div align="left"><img src="docs/images/dingding.jpg" width="250"/> \|<img src="docs/images/wechat.png" width="222"/></div>\|
			\|Dingding group \| Wechat group \|
			\|:---:\|:-----------------------------------------------------:\|
			\|<div align="left"><img src="docs/images/dingding.jpg" width="250"/> \| <img src="docs/images/wechat.png" width="232"/></div> \|

			## Contributors

			\| <div align="left"><img src="docs/images/DeepScience.png" width="250"/> \|
			\|:---:\|
			\| <div align="left"><img src="docs/images/damo.png" width="180"/> \| <div align="left"><img src="docs/images/nwpu.png" width="260"/> \| <img src="docs/images/DeepScience.png" width="200"/> </div> \|
			\|:---------------------------------------------------------------:\|:---------------------------------------------------------------:\|:-----------------------------------------------------------:\|

			## Acknowledge

			@@ -85,4 +80,10 @@
			booktitle={INTERSPEECH},
			year={2022}
			}
			```
			@inproceedings{Shi2023AchievingTP,
			title={Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model},
			author={Xian Shi and Yanni Chen and Shiliang Zhang and Zhijie Yan},
			booktitle={arXiv preprint arXiv:2301.12343},
			year={2023}
			}
			```