From 8b3b594303fda83dba30770cb061334707460b04 Mon Sep 17 00:00:00 2001 From: chong.zhang <chong.zhang@alibaba-inc.com> Date: 星期五, 05 五月 2023 16:43:06 +0800 Subject: [PATCH] update itn_pipeline.md --- docs/modelscope_models.md | 28 +++++++++++++++------------- 1 files changed, 15 insertions(+), 13 deletions(-) diff --git a/docs/modelscope_models.md b/docs/modelscope_models.md index e7c754c..04742dd 100644 --- a/docs/modelscope_models.md +++ b/docs/modelscope_models.md @@ -53,6 +53,7 @@ |:----------------------------------------------------------------------------------------------------------------------:|:--------:|:---------------------:|:----------:|:---------:|:--------------:|:--------------------------------------------------------------------------------------------------------------------------------| | [Conformer](https://modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch/summary) | CN | AISHELL (178hours) | 4234 | 44M | Offline | Duration of input wav <= 20s | | [Conformer](https://www.modelscope.cn/models/damo/speech_conformer_asr_nat-zh-cn-16k-aishell2-vocab5212-pytorch/summary) | CN | AISHELL-2 (1000hours) | 5212 | 44M | Offline | Duration of input wav <= 20s | +| [Conformer](https://modelscope.cn/models/damo/speech_conformer_asr-en-16k-vocab4199-pytorch/summary) | EN | Alibaba Speech Data (10000hours) | 4199 | 220M | Offline | Duration of input wav <= 20s | #### RNN-T Models @@ -108,16 +109,17 @@ | [TP-Aligner](https://modelscope.cn/models/damo/speech_timestamp_prediction-v1-16k-offline/summary) | CN | Alibaba Speech Data (50000hours) | 37.8M | Timestamp prediction, Mandarin, middle size | ### Inverse Text Normalization (ITN) Models -| Model Name | Language | Parameters | Notes | -|:----------------------------------------------------------------------------------------------------------------:|:--------:|:----------:|:------| -| [English](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-en/summary) | EN | 1.54M | ITN, ASR post processing | -| [Russian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ru/summary) | RU | 1.28M | ITN, ASR post processing | -| [Japanese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ja/summary) | JA | 6.8M | ITN, ASR post processing | -| [Korean](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ko/summary) | KO | 1.28M | InverASR post processing | -| [Indonesian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-id/summary) | ID | 2.06M | ITN, ASR post processing | -| [Vietnamese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-vi/summary) | VI | 0.92M | ITN, ASR post processing | -| [Tagalog](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-tl/summary) | TL | 1.28M | ITN, ASR post processing | -| [Spanish](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-es/summary) | ES | 1.28M | ITN, ASR post processing | -| [Portuguese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-pt/summary) | PT | 1.28M | ITN, ASR post processing | -| [French](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-fr/summary) | FR | 1.28M | InverASR post processing | -| [German](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-de/summary)| GE | 1.28M | ITN, ASR post processing | + +| Model Name | Language | Parameters | Notes | +|:----------------------------------------------------------------------------------------------------------------:|:--------:|:----------:|:-------------------------| +| [English](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-en/summary) | EN | 1.54M | ITN, ASR post-processing | +| [Russian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ru/summary) | RU | 17.79M | ITN, ASR post-processing | +| [Japanese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ja/summary) | JA | 6.8M | ITN, ASR post-processing | +| [Korean](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ko/summary) | KO | 1.28M | ITN, ASR post-processing | +| [Indonesian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-id/summary) | ID | 2.06M | ITN, ASR post-processing | +| [Vietnamese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-vi/summary) | VI | 0.92M | ITN, ASR post-processing | +| [Tagalog](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-tl/summary) | TL | 0.65M | ITN, ASR post-processing | +| [Spanish](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-es/summary) | ES | 1.32M | ITN, ASR post-processing | +| [Portuguese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-pt/summary) | PT | 1.28M | ITN, ASR post-processing | +| [French](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-fr/summary) | FR | 4.39M | ITN, ASR post-processing | +| [German](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-de/summary)| GE | 3.95M | ITN, ASR post-processing | -- Gitblit v1.9.1