shixian.shi
2024-01-31 5602ebe208639ad8f91899adeddae0a2f1e39f09
code update
3 files changed, 8 lines changed
README.md (4 lines changed)
examples/industrial_data_pretraining/seaco_paraformer/demo.py (2 lines changed)
funasr/models/campplus/utils.py (2 lines changed)
README.md
@@ -144,7 +144,7 @@
```
Note: `chunk_size` is the configuration for streaming latency. `[0,10,5]` indicates that the real-time display granularity is `10*60=600ms`, and the lookahead information is `5*60=300ms`. Each inference input is `600ms` (sample points are `16000*0.6=9600`), and the output is the corresponding text. For the last speech segment input, `is_final=True` needs to be set to force the output of the last word.
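The arithmetic in the note can be checked with a small sketch. It assumes, as the note states, that each `chunk_size` unit corresponds to 60 ms and that audio is sampled at 16 kHz. The helper name `chunk_timing` and its return keys are hypothetical and are not part of FunASR.

```python
FRAME_MS = 60        # assumption from the note: one chunk_size unit = 60 ms
SAMPLE_RATE = 16000  # assumption from the note: 16 kHz audio


def chunk_timing(chunk_size, frame_ms=FRAME_MS, sample_rate=SAMPLE_RATE):
    """Hypothetical helper: convert a chunk_size triple into latency figures."""
    # The note only describes the second and third entries, so the first is ignored here.
    _, chunk_units, lookahead_units = chunk_size
    chunk_ms = chunk_units * frame_ms
    lookahead_ms = lookahead_units * frame_ms
    # Samples fed to the model per inference call.
    samples_per_chunk = sample_rate * chunk_ms // 1000
    return {
        "chunk_ms": chunk_ms,
        "lookahead_ms": lookahead_ms,
        "samples_per_chunk": samples_per_chunk,
    }


timing = chunk_timing([0, 10, 5])
print(timing)
```

For `[0, 10, 5]` this gives a 600 ms chunk, 300 ms of lookahead, and 9600 samples per inference input.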
-### Voice Activity Detection (streaming)
+### Voice Activity Detection (Non-Streaming)
```python
from funasr import AutoModel
@@ -153,7 +153,7 @@
res = model.generate(input=wav_file)
print(res)
```
-### Voice Activity Detection (Non-streaming)
+### Voice Activity Detection (Streaming)
```python
from funasr import AutoModel
examples/industrial_data_pretraining/seaco_paraformer/demo.py
@@ -17,6 +17,6 @@
res = model.generate(input="https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav",
                     hotword='达摩院 魔搭',
-                     # sentence_timestamp=True,
+                     # sentence_timestamp=True,  # return sentence level information when spk_model is not given
                    )
print(res)
funasr/models/campplus/utils.py
@@ -212,7 +212,7 @@
            if overlap > max_overlap:
                max_overlap = overlap
                sentence_spk = spk
-        d['spk'] = sentence_spk
+        d['spk'] = int(sentence_spk)
        sd_sentence_list.append(d)
    return sd_sentence_list
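
The hunk above shows only the tail of the speaker-assignment loop. As a rough illustration of the pattern it suggests (pick the speaker whose segment overlaps a sentence the most, then store the label as a plain `int`), here is a minimal, self-contained sketch. The function name `assign_speakers`, the `start`/`end` keys, and the `(start, end, speaker)` segment layout are assumptions for illustration and are not taken from `funasr/models/campplus/utils.py`. The float labels in the sample data merely stand in for any non-`int` numeric type a label might arrive as; the commit itself does not state why the cast was added.

```python
import json


def assign_speakers(sentences, speaker_segments):
    """Hypothetical sketch: label each sentence with its max-overlap speaker."""
    labeled = []
    for sentence in sentences:
        best_spk = 0      # default label when no segment overlaps
        max_overlap = 0
        for seg_start, seg_end, spk in speaker_segments:
            # Length of the time intersection, clamped at zero when disjoint.
            overlap = max(
                min(sentence["end"], seg_end) - max(sentence["start"], seg_start),
                0,
            )
            if overlap > max_overlap:
                max_overlap = overlap
                best_spk = spk
        out = dict(sentence)
        # Mirror the commit: store the label as a built-in int.
        out["spk"] = int(best_spk)
        labeled.append(out)
    return labeled


# Illustrative data only: times in milliseconds, labels deliberately given as floats.
sentences = [
    {"text": "hello", "start": 0, "end": 900},
    {"text": "world", "start": 1000, "end": 2000},
]
speaker_segments = [(0, 1000, 0.0), (950, 2100, 1.0)]

result = assign_speakers(sentences, speaker_segments)
print(json.dumps(result))
```

With the cast in place, every `spk` value is a built-in `int` regardless of the numeric type the segment labels use.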