Papers

FunASR implements the methods described in the following papers.

Speech Recognition Models

  • Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition, INTERSPEECH 2022
  • Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model, arXiv preprint arXiv:2010.14099, 2020.
  • SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition, INTERSPEECH 2020
  • Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition, INTERSPEECH 2020
  • Conformer: Convolution-augmented Transformer for Speech Recognition, INTERSPEECH 2020
  • Sequence-to-sequence learning with Transducers, NIPS 2016

Multi-talker Speech Recognition Models

  • MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario, ICASSP 2022

Voice Activity Detection

  • Deep-FSMN for Large Vocabulary Continuous Speech Recognition, ICASSP 2018

Punctuation Restoration

  • CT-Transformer: Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection, ICASSP 2020

Language Models

  • Attention Is All You Need, NIPS 2017

Speaker Verification

  • X-Vectors: Robust DNN Embeddings for Speaker Recognition, ICASSP 2018

Speaker Diarization

  • Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis, EMNLP 2022

Timestamp Prediction

  • Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model, arXiv:2301.12343
