From 9a70dac2397d5ecd805510bf7ddc467916cb962f Mon Sep 17 00:00:00 2001
From: Vignesh Skanda <agvskanda@gmail.com>
Date: Wed, 16 Oct 2024 12:49:10 +0800
Subject: [PATCH] Update README.md (#2146)

---
 docs/tutorial/README.md |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/docs/tutorial/README.md b/docs/tutorial/README.md
index f87d5fa..9b68e73 100644
--- a/docs/tutorial/README.md
+++ b/docs/tutorial/README.md
@@ -95,7 +95,9 @@
 
 When you input long audio and encounter Out Of Memory (OOM) issues, since memory usage tends to increase quadratically with audio length, consider the following three scenarios:
 a) At the beginning of inference, memory usage primarily depends on `batch_size_s`. Appropriately reducing this value can decrease memory usage.
+
 b) During the middle of inference, when encountering long audio segments cut by VAD and the total token count is less than `batch_size_s`, yet still facing OOM, you can appropriately reduce `batch_size_threshold_s`. If the threshold is exceeded, the batch size is forced to 1.
+
 c) Towards the end of inference, if long audio segments cut by VAD have a total token count less than `batch_size_s` and exceed the `threshold` batch_size_threshold_s, forcing the batch size to 1 and still facing OOM, you may reduce `max_single_segment_time` to shorten the VAD audio segment length.
 
 #### Speech Recognition (Streaming)
@@ -421,4 +423,4 @@
 print(result)
 ```
 
-More examples ref to [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/onnxruntime)
\ No newline at end of file
+More examples ref to [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/onnxruntime)
-- 
Gitblit v1.9.1