| | |
| | | # Get Started |
| | | Speaker-Attributed Automatic Speech Recognition (SA-ASR) is a task that addresses the question of "who spoke what". The goal of SA-ASR is not only to transcribe multi-speaker speech, but also to identify the speaker of each utterance. The method used in this example follows the paper [End-to-End Speaker-Attributed ASR with Transformer](https://www.isca-speech.org/archive/pdfs/interspeech_2021/kanda21b_interspeech.pdf). |
| | | To run this recipe, you first need to install FunASR and ModelScope (see the [installation guide](https://alibaba-damo-academy.github.io/FunASR/en/installation.html)). |
| | | There are two startup scripts: `run.sh`, for training and evaluating on the old eval and test sets, and `run_m2met_2023_infer.sh`, for inference on the new test set of the Multi-Channel Multi-Party Meeting Transcription 2.0 ([M2MeT2.0](https://alibaba-damo-academy.github.io/FunASR/m2met2/index.html)) Challenge. |
| | | Before running `run.sh`, you must manually download and unpack the [AliMeeting](http://www.openslr.org/119/) corpus and place it in the `./dataset` directory: |
| | | ```shell |
| | | dataset |
| | |
| | | </tr> |
| | | <tr> |
| | | <td>oracle profile</td> |
| | | <td>31.93</td> |
| | | <td>32.75</td> |
| | | <td>48.56</td> |
| | | <td>53.33</td> |
| | | <td>32.05</td> |
| | | <td>32.70</td> |
| | | <td>47.40</td> |
| | | <td>52.57</td> |
| | | </tr> |
| | | <tr> |
| | | <td>cluster profile</td> |
| | | <td>31.94</td> |
| | | <td>32.77</td> |
| | | <td>55.49</td> |
| | | <td>58.17</td> |
| | | <td>32.05</td> |
| | | <td>32.70</td> |
| | | <td>53.76</td> |
| | | <td>55.95</td> |
| | | </tr> |
| | | </table> |
| | | |