python/FunASR-XL.git

			@@ -29,6 +29,7 @@

			<a name="whats-new"></a>
			## What's new:
			- 2024/07/04：[SenseVoice](https://github.com/FunAudioLLM/SenseVoice) is a speech foundation model with multiple speech understanding capabilities, including ASR, LID, SER, and AED.
			- 2024/07/01: Offline File Transcription Service GPU 1.1 released, optimize BladeDISC model compatibility issues; ref to ([docs](runtime/readme.md))
			- 2024/06/27: Offline File Transcription Service GPU 1.0 released, supporting dynamic batch processing and multi-threading concurrency. In the long audio test set, the single-thread RTF is 0.0076, and multi-threads' speedup is 1200+ (compared to 330+ on CPU); ref to ([docs](runtime/readme.md))
			- 2024/05/15：emotion recognition models are new supported. [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary)，[emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary)，[emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). currently supports the following categories: 0: angry 1: happy 2: neutral 3: sad 4: unknown.
			@@ -91,7 +92,8 @@


			\| Model Name \| Task Details \| Training Data \| Parameters \|
			\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:\|:-----------------------------------------------------:\|:--------------------------------:\|:----------:\|
			\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:\|:--------------------------------------------------------------------------------:\|:--------------------------------:\|:----------:\|
			\| SenseVoiceSmall <br> ([⭐](https://www.modelscope.cn/models/iic/SenseVoiceSmall) [🤗](https://huggingface.co/FunAudioLLM/SenseVoiceSmall) ) \| multiple speech understanding capabilities, including ASR, LID, SER, and AED. \| 400000 hours \| 330M \|
			\| paraformer-zh <br> ([⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) [🤗](https://huggingface.co/funasr/paraformer-zh) ) \| speech recognition, with timestamps, non-streaming \| 60000 hours, Mandarin \| 220M \|
			\| <nobr>paraformer-zh-streaming <br> ( [⭐](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/summary) [🤗](https://huggingface.co/funasr/paraformer-zh-streaming) )</nobr> \| speech recognition, streaming \| 60000 hours, Mandarin \| 220M \|
			\| paraformer-en <br> ( [⭐](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020/summary) [🤗](https://huggingface.co/funasr/paraformer-en) ) \| speech recognition, without timestamps, non-streaming \| 50000 hours, English \| 220M \|