python/FunASR-XL.git

			@@ -95,7 +95,9 @@
			If long audio input causes Out Of Memory (OOM) errors, keep in mind that memory usage tends to grow quadratically with audio length. There are three scenarios to consider:

			a) At the beginning of inference, memory usage primarily depends on `batch_size_s`. Appropriately reducing this value can decrease memory usage.

			b) In the middle of inference, if a long audio segment cut by VAD still causes OOM even though the total token count is less than `batch_size_s`, you can reduce `batch_size_threshold_s`. Once a segment exceeds this threshold, the batch size is forced to 1.

			c) Towards the end of inference, if a long audio segment cut by VAD has a total token count less than `batch_size_s` and exceeds `batch_size_threshold_s`, so that the batch size is already forced to 1, and OOM still occurs, you can reduce `max_single_segment_time` to shorten the audio segments that VAD produces.
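
The interaction of the three parameters above can be sketched in plain Python. This is a simplified model for illustration only, not FunASR's actual batching code: the helpers `cap_segments` and `plan_batches` are hypothetical, segment durations are given in seconds, and VAD is assumed to cut over-long audio into equal-sized pieces, which real VAD does not do.

```python
from typing import List


def cap_segments(segment_secs: List[float], max_single_segment_s: float) -> List[float]:
    """Hypothetical stand-in for the effect of `max_single_segment_time`:
    no segment handed to the recognizer is longer than the cap."""
    capped = []
    for dur in segment_secs:
        while dur > max_single_segment_s:
            capped.append(max_single_segment_s)
            dur -= max_single_segment_s
        if dur > 0:
            capped.append(dur)
    return capped


def plan_batches(
    segment_secs: List[float],
    batch_size_s: float = 300.0,
    batch_size_threshold_s: float = 60.0,
) -> List[List[float]]:
    """Group segments into batches under two assumed rules:
    - a segment longer than `batch_size_threshold_s` runs alone (batch size 1);
    - otherwise segments are packed while the padded batch duration
      (longest segment * number of segments) stays within `batch_size_s`."""
    batches: List[List[float]] = []
    current: List[float] = []
    for dur in sorted(segment_secs):
        if dur > batch_size_threshold_s:
            # Over the threshold: flush the pending batch and run this one alone.
            if current:
                batches.append(current)
                current = []
            batches.append([dur])
            continue
        # Segments are sorted ascending, so `dur` is the longest in the candidate batch.
        padded = dur * (len(current) + 1)
        if current and padded > batch_size_s:
            batches.append(current)
            current = []
        current.append(dur)
    if current:
        batches.append(current)
    return batches


segments = [40, 50, 55, 120]
print(plan_batches(segments, batch_size_s=300, batch_size_threshold_s=60))
print(plan_batches(segments, batch_size_s=300, batch_size_threshold_s=45))
print(plan_batches(cap_segments(segments, 60), batch_size_s=300, batch_size_threshold_s=45))
```

In this model, lowering `batch_size_threshold_s` pushes more of the long segments into batches of one (scenario b), while lowering the segment cap shortens the longest single input that has to fit in memory (scenario c).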

			#### Speech Recognition (Streaming)
			@@ -130,7 +132,7 @@
			from funasr import AutoModel

			model = AutoModel(model="fsmn-vad")
			wav_file = f"{model.model_path}/example/asr_example.wav"
			wav_file = f"{model.model_path}/example/vad_example.wav"
			res = model.generate(input=wav_file)
			print(res)
			```
			@@ -221,7 +223,7 @@
			++train_conf.validate_interval=2000 \
			++train_conf.save_checkpoint_interval=2000 \
			++train_conf.keep_nbest_models=20 \
			++train_conf.avg_nbest_model=5 \
			++train_conf.avg_nbest_model=10 \
			++optim_conf.lr=0.0002 \
			++output_dir="${output_dir}" &> ${log_file}
			```
			@@ -421,4 +423,4 @@
			print(result)
			```

			For more examples, see the [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/runtime/python/onnxruntime)