From e06317e9d0584086af2ea8c11baf822d71674c49 Mon Sep 17 00:00:00 2001
From: zhifu gao <zhifu.gzf@alibaba-inc.com>
Date: Fri, 05 May 2023 15:50:57 +0800
Subject: [PATCH] Merge pull request #460 from alibaba-damo-academy/dev_zc

---
 docs/modelscope_models.md                |   26 ++++++------
 docs/modelscope_pipeline/itn_pipeline.md |   70 +++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/docs/modelscope_models.md b/docs/modelscope_models.md
index 97ba333..04742dd 100644
--- a/docs/modelscope_models.md
+++ b/docs/modelscope_models.md
@@ -110,16 +110,16 @@

 ### Inverse Text Normalization (ITN) Models

-| Model Name | Language | Parameters | Notes |
-|:----------------------------------------------------------------------------------------------------------------:|:--------:|:----------:|:------|
-| [English](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-en/summary) | EN | 1.54M | ITN, ASR post processing |
-| [Russian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ru/summary) | RU | 1.28M | ITN, ASR post processing |
-| [Japanese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ja/summary) | JA | 6.8M | ITN, ASR post processing |
-| [Korean](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ko/summary) | KO | 1.28M | InverASR post processing |
-| [Indonesian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-id/summary) | ID | 2.06M | ITN, ASR post processing |
-| [Vietnamese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-vi/summary) | VI | 0.92M | ITN, ASR post processing |
-| [Tagalog](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-tl/summary) | TL | 1.28M | ITN, ASR post processing |
-| [Spanish](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-es/summary) | ES | 1.28M | ITN, ASR post processing |
-| [Portuguese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-pt/summary) | PT | 1.28M | ITN, ASR post processing |
-| [French](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-fr/summary) | FR | 1.28M | InverASR post processing |
-| [German](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-de/summary)| GE | 1.28M | ITN, ASR post processing |
+| Model Name | Language | Parameters | Notes |
+|:----------------------------------------------------------------------------------------------------------------:|:--------:|:----------:|:-------------------------|
+| [English](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-en/summary) | EN | 1.54M | ITN, ASR post-processing |
+| [Russian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ru/summary) | RU | 17.79M | ITN, ASR post-processing |
+| [Japanese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ja/summary) | JA | 6.8M | ITN, ASR post-processing |
+| [Korean](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ko/summary) | KO | 1.28M | ITN, ASR post-processing |
+| [Indonesian](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-id/summary) | ID | 2.06M | ITN, ASR post-processing |
+| [Vietnamese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-vi/summary) | VI | 0.92M | ITN, ASR post-processing |
+| [Tagalog](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-tl/summary) | TL | 0.65M | ITN, ASR post-processing |
+| [Spanish](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-es/summary) | ES | 1.32M | ITN, ASR post-processing |
+| [Portuguese](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-pt/summary) | PT | 1.28M | ITN, ASR post-processing |
+| [French](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-fr/summary) | FR | 4.39M | ITN, ASR post-processing |
+| [German](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-de/summary)| GE | 3.95M | ITN, ASR post-processing |
diff --git a/docs/modelscope_pipeline/itn_pipeline.md b/docs/modelscope_pipeline/itn_pipeline.md
new file mode 100644
index 0000000..7f27f26
--- /dev/null
+++ b/docs/modelscope_pipeline/itn_pipeline.md
@@ -0,0 +1,70 @@
+# Inverse Text Normalization (ITN)
+
+> **Note**:
+> The ModelScope pipeline supports inference with all the models in the [model zoo](https://modelscope.cn/models?page=1&tasks=inverse-text-processing&type=audio). Here we take the Japanese ITN model as an example to demonstrate the usage.
+
+## Inference
+
+### Quick start
+#### [Japanese ITN model](https://modelscope.cn/models/damo/speech_inverse_text_processing_fun-text-processing-itn-ja/summary)
+```python
+from modelscope.pipelines import pipeline
+from modelscope.utils.constant import Tasks
+
+itn_inference_pipeline = pipeline(
+    task=Tasks.inverse_text_processing,
+    model='damo/speech_inverse_text_processing_fun-text-processing-itn-ja',
+    model_revision=None)
+
+itn_result = itn_inference_pipeline(text_in='百二十三')
+print(itn_result)
+```
+- Read text data directly.
+```python
+rec_result = itn_inference_pipeline(text_in='一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。')
+```
+- Text stored at a URL, for example: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
+```python
+rec_result = itn_inference_pipeline(text_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt')
+```
+
+For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing/inverse_text_normalization).
+
+#### Modify Your Own ITN Model
+The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing); users can modify the grammar rules themselves. After modifying the rules, users can export their own ITN models to a local directory.
+
+##### Export ITN Model
+Use the code in FunASR to export the ITN model. An example of exporting an ITN model to a local folder is shown below.
+```shell
+cd fun_text_processing/inverse_text_normalization/
+python export_models.py --language ja --export_dir ./itn_models/
+```
+
+##### Evaluate ITN Model
+Users can evaluate their own ITN model from a local directory. Here is an example:
+```shell
+python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
+```
+
+### API-reference
+#### Define pipeline
+- `task`: `Tasks.inverse_text_processing`
+- `model`: model name in the [model zoo](https://modelscope.cn/models?page=1&tasks=inverse-text-processing&type=audio), or a model path on local disk
+- `output_dir`: `None` (default); the output path for results, if set
+- `model_revision`: `None` (default); sets the model version
+
+#### Infer pipeline
+- `text_in`: the input to decode, which could be:
+  - raw text, e.g.: "一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。"
+  - text file, e.g.: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
+  For `text file` input, `output_dir` must be set so that the output results can be saved.
+
+
+## Finetune with pipeline
+
+### Quick start
+
+### Finetune with your data
+
+## Inference with your finetuned model
+
-- 
Gitblit v1.9.1
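
The API reference in the patch above says that `text_in` may be either raw text or a text file URL, and that `output_dir` must be set in the file case. The following is a minimal sketch of that rule as a pre-flight check. The helper names `is_url` and `check_itn_input` are hypothetical and are not part of modelscope or FunASR; the sketch also assumes that a "text file" input is recognizable as an http(s) URL, which the patch illustrates but does not state as the only form.

```python
from typing import Optional
from urllib.parse import urlparse


def is_url(text_in: str) -> bool:
    """Return True when text_in looks like an http(s) URL rather than raw text."""
    parsed = urlparse(text_in.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_itn_input(text_in: str, output_dir: Optional[str] = None) -> str:
    """Classify text_in as 'url' or 'text' and enforce the output_dir rule.

    Hypothetical helper (an assumption for illustration), not a modelscope API.
    """
    if not text_in or not text_in.strip():
        raise ValueError("text_in must be a non-empty string")
    if is_url(text_in):
        # The documented rule: file input needs somewhere to save results.
        if output_dir is None:
            raise ValueError("output_dir must be set when text_in is a text file URL")
        return "url"
    return "text"
```

A caller could run `check_itn_input(text_in, output_dir)` before building the pipeline, so that a missing `output_dir` is reported before any model is loaded.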