python/FunASR-XL.git

parent: 5071e7b4

雾聪

2023-12-20 b635c062f1550be59047168fcb48a39542913a57

update TimestampSentence

3 files changed

	runtime/docs/websocket_protocol.md	4
	runtime/docs/websocket_protocol_zh.md	4
	runtime/onnxruntime/src/util.cpp	6

 runtime/docs/websocket_protocol.md

@@ -45,7 +45,7 @@
`text`: the text output of speech recognition
`is_final`: indicating the end of recognition
`timestamp`：If AM is a timestamp model, it will return this field, indicating the timestamp, in the format of "[[100,200], [200,500]]"
-`stamp_sents`：If AM is a timestamp model, it will return this field, indicating the stamp_sents, in the format of "[{'text':'正 是 因 为','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
+`stamp_sents`：If AM is a timestamp model, it will return this field, indicating the stamp_sents, in the format of "[{'text_seg':'正 是 因 为','punc':',','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
```

## Real-time Speech Recognition
@@ -94,5 +94,5 @@
`text`: the text output of speech recognition
`is_final`: indicating the end of recognition
`timestamp`：If AM is a timestamp model, it will return this field, indicating the timestamp, in the format of "[[100,200], [200,500]]"
-`stamp_sents`：If AM is a timestamp model, it will return this field, indicating the stamp_sents, in the format of "[{'text':'正 是 因 为','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
+`stamp_sents`：If AM is a timestamp model, it will return this field, indicating the stamp_sents, in the format of "[{'text_seg':'正 是 因 为','punc':',','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
```

 runtime/docs/websocket_protocol_zh.md

@@ -46,7 +46,7 @@
`text`：表示语音识别输出文本
`is_final`：表示识别结束
`timestamp`：如果AM为时间戳模型，会返回此字段，表示时间戳，格式为 "[[100,200], [200,500]]"(ms)
-`stamp_sents`：如果AM为时间戳模型，会返回此字段，表示句子级别时间戳，格式为 "[{'text':'正 是 因 为','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
+`stamp_sents`：如果AM为时间戳模型，会返回此字段，表示句子级别时间戳，格式为 "[{'text_seg':'正 是 因 为','punc':',','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
```

## 实时语音识别
@@ -96,5 +96,5 @@
`text`：表示语音识别输出文本
`is_final`：表示识别结束
`timestamp`：如果AM为时间戳模型，会返回此字段，表示时间戳，格式为 "[[100,200], [200,500]]"(ms)
-`stamp_sents`：如果AM为时间戳模型，会返回此字段，表示句子级别时间戳，格式为 "[{'text':'正 是 因 为','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
+`stamp_sents`：如果AM为时间戳模型，会返回此字段，表示句子级别时间戳，格式为 "[{'text_seg':'正 是 因 为','punc':',','start':'430','end':'1130','ts_list':[[430,670],[670,810],[810,1030],[1030,1130]]}]"
```

 runtime/onnxruntime/src/util.cpp

@@ -584,7 +584,8 @@
                }
            }
            // format
-            ts_sent += "{'text':'" + text_seg + "',";
+            ts_sent += "{'text_seg':'" + text_seg + "',";
+            ts_sent += "'punc':'" + characters[idx_str] + "',";
            ts_sent += "'start':'" + to_string(start) + "',";
            ts_sent += "'end':'" + to_string(end) + "',";
            ts_sent += "'ts_list':" + VectorToString(ts_seg) + "}";
@@ -620,7 +621,8 @@
            end = ts_seg[ts_seg.size()-1][1];
        }
        // format
-        ts_sent += "{'text':'" + text_seg + "',";
+        ts_sent += "{'text_seg':'" + text_seg + "',";
+        ts_sent += "'punc':'',";
        ts_sent += "'start':'" + to_string(start) + "',";
        ts_sent += "'end':'" + to_string(end) + "',";
        ts_sent += "'ts_list':" + VectorToString(ts_seg) + "}";
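
The two hunks above build the same entry layout, differing only in whether `punc` carries a punctuation mark or is empty. As a rough illustration of that layout, the sketch below collects the formatting into one function. `FormatStampSent` is a hypothetical name that does not appear in util.cpp, and the `VectorToString` here is a stand-in written for this example, since the real helper's signature and exact output are not shown in the diff. Taking `start` from the first pair in `ts_seg`, and falling back to 0 for an empty segment, are also assumptions.

```cpp
#include <string>
#include <vector>

// Stand-in for the VectorToString helper called in util.cpp (assumption:
// it renders [start,end] pairs as "[[a,b],[c,d]]", as in the documented example).
std::string VectorToString(const std::vector<std::vector<int>>& v) {
    std::string out = "[";
    for (size_t i = 0; i < v.size(); ++i) {
        if (i > 0) out += ",";
        out += "[" + std::to_string(v[i][0]) + "," + std::to_string(v[i][1]) + "]";
    }
    return out + "]";
}

// Hypothetical helper: formats one stamp_sents entry in the post-commit layout.
// Pass an empty punc for a trailing segment that has no punctuation mark.
std::string FormatStampSent(const std::string& text_seg,
                            const std::string& punc,
                            const std::vector<std::vector<int>>& ts_seg) {
    // Assumption: the sentence spans from the first token's start to the
    // last token's end; an empty segment falls back to 0 for both.
    int start = ts_seg.empty() ? 0 : ts_seg.front()[0];
    int end = ts_seg.empty() ? 0 : ts_seg.back()[1];

    std::string ts_sent;
    ts_sent += "{'text_seg':'" + text_seg + "',";
    ts_sent += "'punc':'" + punc + "',";
    ts_sent += "'start':'" + std::to_string(start) + "',";
    ts_sent += "'end':'" + std::to_string(end) + "',";
    ts_sent += "'ts_list':" + VectorToString(ts_seg) + "}";
    return ts_sent;
}
```

Calling `FormatStampSent("正 是 因 为", ",", {{430,670},{670,810},{810,1030},{1030,1130}})` under these assumptions yields the entry shown in the updated protocol docs, without the enclosing `[...]`. The output uses single quotes and does not escape `text_seg`, so a client expecting strict JSON would likely need to convert it before parsing.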
