From 9723253549110c6a210001f4c7ec3912a37b874c Mon Sep 17 00:00:00 2001
From: chong.zhang <chong.zhang@alibaba-inc.com>
Date: Fri, 05 May 2023 16:35:21 +0800
Subject: [PATCH] add itn_pipeline.md

---
 docs/modelscope_pipeline/itn_pipeline.md |   35 ++++++++++++++++++-----------------
 1 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/docs/modelscope_pipeline/itn_pipeline.md b/docs/modelscope_pipeline/itn_pipeline.md
index 15380f3..4d3f6b3 100644
--- a/docs/modelscope_pipeline/itn_pipeline.md
+++ b/docs/modelscope_pipeline/itn_pipeline.md
@@ -18,10 +18,12 @@
 
 itn_result = itn_inference_pipline(text_in='百二十三')
 print(itn_result)
+# 123
 ```
 - read text data directly.
 ```python
 rec_result = inference_pipeline(text_in='一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。')
+# 1999年に誕生した同商品にちなみ、約30年前、24歳の頃の幸四郎の写真を公開。
 ```
 - text stored via url，example：https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
 ```python
@@ -29,22 +31,6 @@
 ```
 
 Full code of demo, please ref to [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing/inverse_text_normalization)
-
-### Modify Your Own ITN Model
-The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), users can modify by their own grammar rules. After modify the rules, the users can export their own ITN models in local directory.
-
-### Export ITN Model
-Use the code in FunASR to export ITN model. An example to export ITN model to local folder is shown as below.
-```shell
-cd fun_text_processing/inverse_text_normalization/
-python export_models.py --language ja --export_dir ./itn_models/
-```
-
-### Evaluate ITN Model
-Users can evaluate their own ITN model in local directory. Here is an example:
-```shell
-python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
-```
 
 ### API-reference
 #### Define pipeline
@@ -58,4 +44,19 @@
 - text bytes, `e.g.`: "一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。"
 - text file, `e.g.`: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
 In this case of `text file` input, `output_dir` must be set to save the output results
- 
\ No newline at end of file
+
+## Modify Your Own ITN Model
+The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), users can modify by their own grammar rules for different languages. Let's take Japanese as an example, users can add their own whitelist in fun_text_processing/inverse_text_normalization/ja/data/whitelist.tsv. After modify the rules, the users can export their own ITN models in local directory.
+
+### Export ITN Model
+Use the code in FunASR to export ITN model. An example to export ITN model to local folder is shown as below.
+```shell
+cd fun_text_processing/inverse_text_normalization/
+python export_models.py --language ja --export_dir ./itn_models/
+```
+
+### Evaluate ITN Model
+Users can evaluate their own ITN model in local directory. Here is an example:
+```shell
+python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
+```
\ No newline at end of file
-- 
Gitblit v1.9.1