From 40f1f529b0ebc7a571af6be3ca9df67cd5480d66 Mon Sep 17 00:00:00 2001
From: yhliang <429259365@qq.com>
Date: 星期四, 20 四月 2023 16:48:26 +0800
Subject: [PATCH] add m2met2 registration form
---
docs_m2met2/Dataset.md | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/docs_m2met2/Dataset.md b/docs_m2met2/Dataset.md
index c987aca..a897d67 100644
--- a/docs_m2met2/Dataset.md
+++ b/docs_m2met2/Dataset.md
@@ -2,7 +2,7 @@
## Overview of training data
In the fixed training condition, the training dataset is restricted to three publicly available corpora, namely, AliMeeting, AISHELL-4, and CN-Celeb. To evaluate the performance of the models trained on these datasets, we will release a new Test set called Test-2023 for scoring and ranking. We will describe the AliMeeting dataset and the Test-2023 set in detail.
## Detail of AliMeeting corpus
-AliMeeting contains 118.75 hours of speech data in total. The dataset is divided into 104.75 hours for training (Train), 4 hours for evaluation (Eval) and 10 hours as test set (Test) for scoring and ranking. Specifically, the Train and Eval sets contain 212 and 8 sessions, respectively. Each session consists of a 15 to 30-minute discussion by a group of participants. The total number of participants in Train and Eval sets is 456 and 25, respectively, with balanced gender coverage.
+AliMeeting contains 118.75 hours of speech data in total. The dataset is divided into 104.75 hours for training (Train), 4 hours for evaluation (Eval) and 10 hours as test set (Test) for scoring and ranking. Specifically, the Train, Eval and Test sets contain 212, 8 and 20 sessions, respectively. Each session consists of a 15 to 30-minute discussion by a group of participants. The total number of participants in Train, Eval and Test sets is 456, 25 and 60, respectively, with balanced gender coverage.
The dataset is collected in 13 meeting venues, which are categorized into three types: small, medium, and large rooms with sizes ranging from 8 m$^{2}$ to 55 m$^{2}$. Different rooms give us a variety of acoustic properties and layouts. The detailed parameters of each meeting venue will be released together with the Train data. The type of wall material of the meeting venues covers cement, glass, etc. Other furnishings in meeting venues include sofa, TV, blackboard, fan, air conditioner, plants, etc. During recording, the participants of the meeting sit around the microphone array which is placed on the table and conduct a natural conversation. The microphone-speaker distance ranges from 0.3 m to 5.0 m. All participants are native Chinese speakers speaking Mandarin without strong accents. During the meeting, various kinds of indoor noise including but not limited to clicking, keyboard, door opening/closing, fan, bubble noise, etc., are made naturally. For both Train and Eval sets, the participants are required to remain in the same position during recording. There is no speaker overlap between the Train and Eval set. An example of the recording venue from the Train set is shown in Fig 1.
--
Gitblit v1.9.1