python/FunASR-XL.git

			@@ -54,6 +54,7 @@


			#### Conformer

			\| Model Name \| Language \| Training Data \| Vocab Size \| Parameter \| Offline/Online \| Notes \|
			\|:----------------------------------------------------------------------------------------------------------------------:\|:--------:\|:---------------------:\|:----------:\|:---------:\|:--------------:\|:--------------------------------------------------------------------------------------------------------------------------------\|
			\| [Conformer](https://modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/summary) \| CN \| AISHELL (178hours) \| 4234 \| 44M \| Offline \| Duration of input wav <= 20s \|

			@@ -6,7 +6,9 @@
			- [FunASR: A Fundamental End-to-End Speech Recognition Toolkit](https://arxiv.org/abs/2305.11013), INTERSPEECH 2023
			- [BAT: Boundary aware transducer for memory-efficient and low-latency ASR](https://arxiv.org/abs/2305.11571), INTERSPEECH 2023
			- [Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition](https://arxiv.org/abs/2206.08317), INTERSPEECH 2022
			- [Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model](https://arxiv.org/abs/2010.14099), arXiv preprint arXiv:2010.14099, 2020.
			- [E-branchformer: Branchformer with enhanced merging for speech recognition](https://arxiv.org/abs/2210.00077), SLT 2022
			- [Branchformer: Parallel mlp-attention architectures to capture local and global context for speech recognition and understanding](https://proceedings.mlr.press/v162/peng22a.html?ref=https://githubhelp.com), ICML 2022
			- [Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model](https://arxiv.org/abs/2010.14099), arXiv preprint arXiv:2010.14099, 2020
			- [San-m: Memory equipped self-attention for end-to-end speech recognition](https://arxiv.org/pdf/2006.01713), INTERSPEECH 2020
			- [Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition](https://arxiv.org/abs/2006.01712), INTERSPEECH 2020
			- [Conformer: Convolution-augmented Transformer for Speech Recognition](https://arxiv.org/abs/2005.08100), INTERSPEECH 2020

	docs/model_zoo/modelscope_models.md	1 ●●●●● 补丁 \| 查看 \| 原始文档 \| blame \| 历史
	docs/reference/papers.md	4 ●●●●● 补丁 \| 查看 \| 原始文档 \| blame \| 历史

			@@ -54,6 +54,7 @@


			#### Conformer

			\| Model Name \| Language \| Training Data \| Vocab Size \| Parameter \| Offline/Online \| Notes \|
			\|:----------------------------------------------------------------------------------------------------------------------:\|:--------:\|:---------------------:\|:----------:\|:---------:\|:--------------:\|:--------------------------------------------------------------------------------------------------------------------------------\|
			\| [Conformer](https://modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/summary) \| CN \| AISHELL (178hours) \| 4234 \| 44M \| Offline \| Duration of input wav <= 20s \|