From fa9a6cdb1eade68c258eed7297f5a8a8a5329ac6 Mon Sep 17 00:00:00 2001
From: Flute <41096447+fluteink@users.noreply.github.com>
Date: Wed, 01 Oct 2025 14:44:28 +0800
Subject: [PATCH] Update docs and run script, fix documentation typos (#2688)
---
README.md | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index ee23086..cabdcac 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@
[](https://pypi.org/project/funasr/)
<p align="center">
-<a href="https://trendshift.io/repositories/3839" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3839" alt="alibaba-damo-academy%2FFunASR | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+<a href="https://trendshift.io/repositories/3839" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3839" alt="modelscope%2FFunASR | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</p>
<strong>FunASR</strong> hopes to build a bridge between academic research and industrial applications of speech recognition. By supporting the training and finetuning of industrial-grade speech recognition models, it enables researchers and developers to conduct research on and production of speech recognition models more conveniently, and promotes the development of the speech recognition ecosystem. ASR for Fun!
@@ -34,15 +34,15 @@
<a name="whats-new"></a>
## What's new:
-- 2024/10/29: Real-time Transcription Service 1.12 released锛孴he 2pass-offline mode supports the SensevoiceSmal model锛�([docs](runtime/readme.md));
+- 2024/10/29: Real-time Transcription Service 1.12 released, the 2pass-offline mode supports the SensevoiceSmall model! ([docs](runtime/readme.md));
- 2024/10/10: Added support for the Whisper-large-v3-turbo model, a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. It can be downloaded from [modelscope](examples/industrial_data_pretraining/whisper/demo.py) and [openai](examples/industrial_data_pretraining/whisper/demo_from_openai.py).
-- 2024/09/26: Offline File Transcription Service 4.6, Offline File Transcription Service of English 1.7锛孯eal-time Transcription Service 1.11 released锛宖ix memory leak & Support the SensevoiceSmall onnx model锛汧ile Transcription Service 2.0 GPU released, Fix GPU memory leak; ([docs](runtime/readme.md));
+- 2024/09/26: Offline File Transcription Service 4.6, Offline File Transcription Service of English 1.7, Real-time Transcription Service 1.11 released, fixed a memory leak & added support for the SensevoiceSmall ONNX model; File Transcription Service 2.0 GPU released, fixed a GPU memory leak; ([docs](runtime/readme.md));
- 2024/09/25: Keyword spotting models are newly supported, with fine-tuning and inference for four models: [fsmn_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [fsmn_kws_mt](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online), [sanm_kws](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-offline), [sanm_kws_streaming](https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online).
- 2024/07/04: [SenseVoice](https://github.com/FunAudioLLM/SenseVoice) is a speech foundation model with multiple speech understanding capabilities, including ASR, LID, SER, and AED.
- 2024/07/01: Offline File Transcription Service GPU 1.1 released, optimize BladeDISC model compatibility issues; ref to ([docs](runtime/readme.md))
- 2024/06/27: Offline File Transcription Service GPU 1.0 released, supporting dynamic batch processing and multi-threaded concurrency. On the long-audio test set, the single-thread RTF is 0.0076, and the multi-thread speedup is 1200+ (compared to 330+ on CPU); ref to ([docs](runtime/readme.md))
- 2024/05/15: Emotion recognition models are newly supported: [emotion2vec+large](https://modelscope.cn/models/iic/emotion2vec_plus_large/summary), [emotion2vec+base](https://modelscope.cn/models/iic/emotion2vec_plus_base/summary), [emotion2vec+seed](https://modelscope.cn/models/iic/emotion2vec_plus_seed/summary). Currently supported categories: 0: angry, 1: happy, 2: neutral, 3: sad, 4: unknown.
-- 2024/05/15: Offline File Transcription Service 4.5, Offline File Transcription Service of English 1.6锛孯eal-time Transcription Service 1.10 released锛宎dapting to FunASR 1.0 model structure锛�([docs](runtime/readme.md))
+- 2024/05/15: Offline File Transcription Service 4.5, Offline File Transcription Service of English 1.6, Real-time Transcription Service 1.10 released, adapting to FunASR 1.0 model structure! ([docs](runtime/readme.md))
<details><summary>Full Changelog</summary>
@@ -315,11 +315,16 @@
### Test ONNX
```python
# pip3 install -U funasr-onnx
-from funasr_onnx import Paraformer
+from pathlib import Path
+from runtime.python.onnxruntime.funasr_onnx.paraformer_bin import Paraformer
+
+
+home_dir = Path.home()
+
model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
model = Paraformer(model_dir, batch_size=1, quantize=True)
-wav_path = ['~/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav']
+wav_path = [f"{home_dir}/.cache/modelscope/hub/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav"]
result = model(wav_path)
print(result)
--
Gitblit v1.9.1
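A note on the `wav_path` change in the last hunk: Python's file APIs do not expand a literal `~`, so the old list entry pointed at a path that only works if something expands it first; the patch builds the path from the real home directory instead. A minimal standard-library sketch of the two equivalent approaches (the cache path here is shortened and purely illustrative):

```python
from pathlib import Path

# "~" is NOT expanded by open()/os.path.exists(); it stays a literal character.
tilde_path = "~/.cache/modelscope/hub/example/asr_example.wav"  # illustrative path

# Option 1: expand the tilde explicitly.
expanded = Path(tilde_path).expanduser()

# Option 2 (what the patched snippet does): prefix with the home directory.
built = Path.home() / ".cache/modelscope/hub/example/asr_example.wav"

# Both yield the same absolute path under the user's home directory.
print(expanded == built)  # prints True
```

Either form avoids handing an unexpanded `~` to the model loader; the patch chooses the explicit `Path.home()` prefix.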