From 882d565ebcd819bda3da065dbf9749b6f7c24407 Mon Sep 17 00:00:00 2001
From: zhifu gao <zhifu.gzf@alibaba-inc.com>
Date: Fri, 05 May 2023 16:53:04 +0800
Subject: [PATCH] Merge pull request #462 from alibaba-damo-academy/dev_zc

---
 docs/modelscope_pipeline/itn_pipeline.md |   39 ++++++++++++++++-----------------------
 1 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/docs/modelscope_pipeline/itn_pipeline.md b/docs/modelscope_pipeline/itn_pipeline.md
index 7f27f26..2336842 100644
--- a/docs/modelscope_pipeline/itn_pipeline.md
+++ b/docs/modelscope_pipeline/itn_pipeline.md
@@ -18,10 +18,12 @@
 itn_result = itn_inference_pipline(text_in='百二十三')
 print(itn_result)
+# 123
 ```
 - read text data directly.
 ```python
 rec_result = inference_pipeline(text_in='一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。')
+# 1999年に誕生した同商品にちなみ、約30年前、24歳の頃の幸四郎の写真を公開。
 ```
 - text stored via url，example：https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
 ```python
@@ -29,22 +31,6 @@
 ```
 Full code of demo, please ref to [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing/inverse_text_normalization)
-
-#### Modify Your Own ITN Model
-The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), users can modify by their own grammar rules. After modify the rules, the users can export their own ITN models in local directory.
-
-##### Export ITN Model
-Use the code in FunASR to export ITN model. An example to export ITN model to local folder is shown as below.
-```shell
-cd fun_text_processing/inverse_text_normalization/
-python export_models.py --language ja --export_dir ./itn_models/
-```
-
-##### Evaluate ITN Model
-Users can evaluate their own ITN model in local directory. Here is an example:
-```shell
-python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
-```
 ### API-reference
 #### Define pipeline
@@ -59,12 +45,19 @@
 - text file, `e.g.`: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
 In this case of `text file` input, `output_dir` must be set to save the output results
+## Modify Your Own ITN Model
+The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), users can modify by their own grammar rules for different languages. Let's take Japanese as an example, users can add their own whitelist in ```FunASR/fun_text_processing/inverse_text_normalization/ja/data/whitelist.tsv```. After modified the grammar rules, the users can export and evaluate their own ITN models in local directory.
-## Finetune with pipeline
+### Export ITN Model
+Export ITN model via ```FunASR/fun_text_processing/inverse_text_normalization/export_models.py```. An example to export ITN model to local folder is shown as below.
+```shell
+cd FunASR/fun_text_processing/inverse_text_normalization/
+python export_models.py --language ja --export_dir ./itn_models/
+```
-### Quick start
-
-### Finetune with your data
-
-## Inference with your finetuned model
-
+### Evaluate ITN Model
+Users can evaluate their own ITN model in local directory via ```FunASR/fun_text_processing/inverse_text_normalization/inverse_normalize.py```. Here is an example:
+```shell
+cd FunASR/fun_text_processing/inverse_text_normalization/
+python inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
+```
\ No newline at end of file
--
Gitblit v1.9.1