python/FunASR-XL.git - Gitblit

FunASR training

parent: 3a15e539

嘉渊

2023-05-26 167bab54bbcc0e2b0143e0c2fedce06ee8326ad5

update repo

2 files changed

	egs/wenetspeech/conformer/conf/train_asr_conformer.yaml | 2
	funasr/datasets/large_datasets/dataset.py | 6

 egs/wenetspeech/conformer/conf/train_asr_conformer.yaml

@@ -90,7 +90,7 @@

 dataset_conf:
     data_names: speech,text
-    data_types: sound,text
+    data_types: sound,text_nospace
     shuffle: True
     shuffle_conf:
         shuffle_size: 2048

 funasr/datasets/large_datasets/dataset.py

@@ -148,6 +148,12 @@
                         if "key" not in sample_dict:
                             sample_dict["key"] = segs[0]
                         sample_dict['hw_tag'] = 1
+                    elif data_type == "text_nospace":
+                        text = item
+                        segs = text.strip().split(maxsplit=1)
+                        sample_dict[data_name] = [x for x in segs[1]]
+                        if "key" not in sample_dict:
+                            sample_dict["key"] = segs[0]
                     else:
                         text = item
                         segs = text.strip().split()
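
The new "text_nospace" branch can be read in isolation as the minimal sketch below. It assumes each text line has the form "<key> <transcript>". The function name parse_text_nospace and the sample utterance keys are hypothetical and do not appear in the repository. The guard for a key-only line is also an assumption: the committed code indexes segs[1] directly and would raise IndexError on such a line.

```python
from typing import List, Tuple


def parse_text_nospace(line: str) -> Tuple[str, List[str]]:
    """Split a '<key> <transcript>' line into the key and a list of characters.

    Hypothetical helper mirroring the "text_nospace" branch of the diff.
    """
    # Split only once, so everything after the first whitespace run is kept
    # together as the transcript.
    segs = line.strip().split(maxsplit=1)
    key = segs[0]
    # Assumption (not in the commit): treat a key-only line as an empty
    # transcript instead of failing on segs[1].
    transcript = segs[1] if len(segs) > 1 else ""
    # Every character of the transcript becomes one token.
    tokens = [ch for ch in transcript]
    return key, tokens


if __name__ == "__main__":
    print(parse_text_nospace("utt_0001 你好世界"))
    print(parse_text_nospace("utt_0002 a b"))
```

Because the line is split with maxsplit=1, any space inside the transcript stays in the string and is emitted as a token of its own. The branch therefore appears to be intended for transcripts written without spaces, such as the Chinese text in WenetSpeech.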