From cfc8c117bd0faea95cf979830cccc7e1d904ea5c Mon Sep 17 00:00:00 2001
From: zhifu gao <zhifu.gzf@alibaba-inc.com>
Date: Mon, 17 Apr 2023 19:13:59 +0800
Subject: [PATCH] Merge pull request #370 from alibaba-damo-academy/dev_lhn2

---
 docs/modelscope_models.md | 17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/docs/modelscope_models.md b/docs/modelscope_models.md
index 07e590b..d8c9b18 100644
--- a/docs/modelscope_models.md
+++ b/docs/modelscope_models.md
@@ -40,13 +40,18 @@
 | [Conformer](https://modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/summary) | CN | AISHELL (178hours) | 4234 | 44M | Offline | Duration of input wav <= 20s |
 | [Conformer](https://www.modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/summary) | CN | AISHELL-2 (1000hours) | 5212 | 44M | Offline | Duration of input wav <= 20s |
 
+
+#### RNN-T Models
+
+### Multi-talker Speech Recognition Models
+
 #### MFCCA Models
 
 | Model Name | Language | Training Data | Vocab Size | Parameter | Offline/Online | Notes |
 |:----------------------------------------------------------------------------------------------------------------------:|:--------:|:---------------------:|:----------:|:---------:|:--------------:|:--------------------------------------------------------------------------------------------------------------------------------|
 | [MFCCA](https://www.modelscope.cn/models/NPU-ASLP/speech_mfcca_asr-zh-cn-16k-alimeeting-vocab4950/summary) | CN | AliMeeting、AISHELL-4、Simudata (917hours) | 4950 | 45M | Offline | Duration of input wav <= 20s, channel of input wav <= 8 channel
 
-#### RNN-T Models
+
 
 ### Voice Activity Detection Models
 
@@ -70,14 +75,14 @@
 
 ### Speaker Verification Models
 
-| Model Name | Training Data | Parameters | Vocab Size | Notes |
+| Model Name | Training Data | Parameters | Number Speaker | Notes |
 |:-------------------------------------------------------------------------------------------------------------:|:-----------------:|:----------:|:----------:|:------|
-| [Xvector](https://www.modelscope.cn/models/damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch/summary) | CNCeleb (?hours) | 17.5M | 3465 | |
-| [Xvector](https://www.modelscope.cn/models/damo/speech_xvector_sv-en-us-callhome-8k-spk6135-pytorch/summary) | CallHome (?hours) | 61M | 6135 | |
+| [Xvector](https://www.modelscope.cn/models/damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch/summary) | CNCeleb (1,200 hours) | 17.5M | 3465 | Xvector, speaker verification, Chinese |
+| [Xvector](https://www.modelscope.cn/models/damo/speech_xvector_sv-en-us-callhome-8k-spk6135-pytorch/summary) | CallHome (60 hours) | 61M | 6135 | Xvector, speaker verification, English |
 
 ### Speaker diarization Models
 
 | Model Name | Training Data | Parameters | Notes |
 |:----------------------------------------------------------------------------------------------------------------:|:-------------------:|:----------:|:------|
-| [SOND](https://www.modelscope.cn/models/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch/summary) | AliMeeting (?hours) | 40.5M | |
-| [SOND](https://www.modelscope.cn/models/damo/speech_diarization_sond-en-us-callhome-8k-n16k4-pytorch/summary) | CallHome (?hours) | 12M | |
+| [SOND](https://www.modelscope.cn/models/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch/summary) | AliMeeting (120 hours) | 40.5M | Speaker diarization, profiles and records, Chinese |
+| [SOND](https://www.modelscope.cn/models/damo/speech_diarization_sond-en-us-callhome-8k-n16k4-pytorch/summary) | CallHome (60 hours) | 12M | Speaker diarization, profiles and records, English |
-- 
Gitblit v1.9.1