egs/wenetspeech/conformer/conf/train_asr_conformer.yaml
@@ -90,7 +90,7 @@ dataset_conf:
     data_names: speech,text
-    data_types: sound,text
+    data_types: sound,text_nospace
     shuffle: True
     shuffle_conf:
         shuffle_size: 2048
funasr/datasets/large_datasets/dataset.py
@@ -148,6 +148,12 @@
     if "key" not in sample_dict:
         sample_dict["key"] = segs[0]
     sample_dict['hw_tag'] = 1
+elif data_type == "text_nospace":
+    text = item
+    segs = text.strip().split(maxsplit=1)
+    sample_dict[data_name] = [x for x in segs[1]]
+    if "key" not in sample_dict:
+        sample_dict["key"] = segs[0]
 else:
     text = item
     segs = text.strip().split()
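To illustrate what the new `text_nospace` branch in `dataset.py` appears to do, here is a minimal standalone sketch. The helper names `parse_text_nospace` and `parse_text_default` are hypothetical and are not part of FunASR. The default parser assumes the `else` branch uses `segs[1:]` as the token list, which the hunk does not show. The empty-transcript guard is also an addition made for illustration: the branch in the diff would raise `IndexError` on `segs[1]` if a line held only a key.

```python
def parse_text_nospace(line):
    """Hypothetical helper mirroring the `text_nospace` branch in the diff."""
    # Split once: the first field is the utterance key, the rest is the transcript.
    segs = line.strip().split(maxsplit=1)
    key = segs[0]
    # Assumption (not in the diff): a line with no transcript gives an empty list.
    # Each character of the transcript becomes one token.
    tokens = list(segs[1]) if len(segs) > 1 else []
    return key, tokens


def parse_text_default(line):
    """Hypothetical helper for whitespace tokenization.

    Assumption: the `else` branch takes segs[1:] as the tokens.
    """
    segs = line.strip().split()
    return segs[0], segs[1:]


if __name__ == "__main__":
    sample = "utt001 今天天气很好"
    print(parse_text_nospace(sample))
    print(parse_text_default(sample))
```

For a transcript written without spaces, as is typical for Chinese text, whitespace splitting yields the whole sentence as one token, while the character-level split yields one token per character. Note that any whitespace inside the transcript is kept as a token of its own by the character-level split, so this data type seems intended only for transcripts that really contain no spaces.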