From 4e0aae556bbfb81f765ddb3e247f34441c607c5e Mon Sep 17 00:00:00 2001
From: 游雁 <zhifu.gzf@alibaba-inc.com>
Date: Fri, 21 Apr 2023 10:45:16 +0800
Subject: [PATCH] docs

---
 docs_m2met2/Baseline.md |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/docs_m2met2/Baseline.md b/docs_m2met2/Baseline.md
index 0bd09b3..6f9609b 100644
--- a/docs_m2met2/Baseline.md
+++ b/docs_m2met2/Baseline.md
@@ -1,6 +1,6 @@
 # Baseline
 ## Overview
-We provide an end-to-end sa-asr baseline conducted on [FunASR](https://github.com/alibaba-damo-academy/FunASR) as a receipe. The model architecture is shown in Figure 3. The SpeakerEncoder is initialized with a pre-trained [speaker verification model](https://modelscope.cn/models/damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch/summary) from [ModelScope](https://modelscope.cn/home). This speaker verification model is also be used to extract the speaker embedding in the speaker profile. 
+We will release an E2E SA-ASR~\cite{kanda21b_interspeech} baseline built on [FunASR](https://github.com/alibaba-damo-academy/FunASR) according to the timeline. The model architecture is shown in Figure 3. The SpeakerEncoder is initialized with a pre-trained speaker verification model from ModelScope. This speaker verification model is also used to extract the speaker embedding in the speaker profile.
 
 ![model archietecture](images/sa_asr_arch.png)
 
@@ -9,4 +9,5 @@
 
 ## Baseline results
 The results of the baseline system are shown in Table 3. The speaker profile adopts the oracle speaker embedding during training. However, due to the lack of oracle speaker label during evaluation, the speaker profile provided by an additional spectral clustering is used. Meanwhile, the results of using the oracle speaker profile on Eval and Test Set are also provided to show the impact of speaker profile accuracy. 
+
 ![baseline result](images/baseline_result.png)
\ No newline at end of file

--
Gitblit v1.9.1