游雁
2024-10-16 98e2c546a08917f450d32d63968affd5b975ad2a
funasr tables
2 files changed, 167 lines changed
docs/tutorial/README_zh.md 21
docs/tutorial/Tables_zh.md 146
docs/tutorial/README_zh.md
@@ -442,22 +442,21 @@
### Viewing the registry
```python
```plaintext
from funasr.register import tables
tables.print()
```
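You can also view the registry for a specific type: `tables.print("model")`
You can also view the registry for a specific type: \`tables.print("model")\`
### Registering a new model
### Registering a model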
```python
from funasr.register import tables
@tables.register("model_classes", "MinMo_S2T")
class MinMo_S2T(nn.Module):
@tables.register("model_classes", "SenseVoiceSmall")
class SenseVoiceSmall(nn.Module):
  def __init__(*args, **kwargs):
    ...
@@ -479,10 +478,14 @@
```
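Then specify the newly registered model in config.yaml
To register a class, add `@tables.register("model_classes","SenseVoiceSmall")` immediately before the class definition. The class must implement the `__init__`, `forward`, and `inference` methods.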
```yaml
model: MinMo_S2T
Full code: [https://github.com/modelscope/FunASR/blob/main/funasr/models/sense\_voice/model.py#L443](https://github.com/modelscope/FunASR/blob/main/funasr/models/sense_voice/model.py#L443)
After registration, specify the newly registered model in config.yaml to define the model:
```python
model: SenseVoiceSmall
model_conf:
  ...
```
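The decorator-based registration described in this hunk can be illustrated with a minimal, self-contained sketch. It is not FunASR's implementation: `ToyTables`, `toy_tables`, and `ToyModel` are hypothetical names chosen for illustration, and the real `tables.register` may work differently internally.

```python
from collections import defaultdict


class ToyTables:
    """Hypothetical stand-in for a registry; not FunASR's `tables` object."""

    def __init__(self):
        # Maps a table name (e.g. "model_classes") to {registered name: class}.
        self._tables = defaultdict(dict)

    def register(self, table, key):
        """Return a class decorator that stores the class under table/key."""
        def decorator(cls):
            self._tables[table][key] = cls
            return cls  # the decorated class is returned unchanged
        return decorator

    def get(self, table, key):
        """Look up a previously registered class."""
        return self._tables[table][key]


toy_tables = ToyTables()


@toy_tables.register("model_classes", "ToyModel")
class ToyModel:
    # The three methods the tutorial asks a registered class to provide.
    def __init__(self, **kwargs):
        self.conf = kwargs

    def forward(self, x):
        return x

    def inference(self, x):
        return self.forward(x)
```

Because the decorator returns the class unchanged, registration is a side effect of importing the module that defines the class.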
docs/tutorial/Tables_zh.md
@@ -1,4 +1,4 @@
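# FunASR-1.x.x Model Registration Tutorial
# FunASR-1.x.x Model Registration Tutorial
Version 1.0 was designed to **make model integration simpler**. Its core features are the registry and AutoModel: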
@@ -11,7 +11,7 @@
*   Unified inference and training scripts for academic and industrial models;
    
![image](https://alidocs.oss-cn-zhangjiakou.aliyuncs.com/a/6Ea1DxkZVte8y0g2/b78f122bd40b485687e5e13faa78ae850521.png)
![image](https://alidocs.oss-cn-zhangjiakou.aliyuncs.com/a/6Ea1DxkZVte8y0g2/150e0eafd1c34f2dbb9360ccb5db4dc40521.png)
# Quick start
@@ -89,14 +89,14 @@
res = model.generate(input=[str], output_dir=[str])
```
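*   Path to a wav file, e.g. asr\_example.wav
*   Path to a pcm file, e.g. asr\_example.pcm; in this case the audio sampling rate fs must be specified (default: 16000)
*   Audio byte stream, e.g. byte data from a microphone
*   wav.scp, a Kaldi-style wav list (`wav_id \t wav_path`), e.g.:
*   *   Path to a wav file, e.g. asr\_example.wav
    *   Path to a pcm file, e.g. asr\_example.pcm; in this case the audio sampling rate fs must be specified (default: 16000)
    *   Audio byte stream, e.g. byte data from a microphone
    *   wav.scp, a Kaldi-style wav list (`wav_id \t wav_path`), e.g.: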
```plaintext
asr_example1  ./audios/asr_example1.wav
@@ -121,76 +121,82 @@
## Model resource directory
![image.png](https://alidocs.oss-cn-zhangjiakou.aliyuncs.com/res/8oLl9y628rBNlapY/img/f16961f1-bdfb-4638-83d5-e4cb13a5a4a4.png)
![image.png](https://alidocs.oss-cn-zhangjiakou.aliyuncs.com/res/8oLl9y628rBNlapY/img/cab7f215-787f-4407-885a-14dc89ae9e02.png)
**Model link:** [https://www.modelscope.cn/models/iic/SenseVoiceSmall/files](https://www.modelscope.cn/models/iic/SenseVoiceSmall/files)
**Configuration file**: config.yaml
```yaml
model: SenseVoiceLarge
model_conf:
  lsm_weight: 0.1
  length_normalized_loss: true
  activation_checkpoint: true
  sos: <|startoftranscript|>
  eos: <|endoftext|>
  downsample_rate: 4
  use_padmask: true
encoder: SenseVoiceEncoder
encoder: SenseVoiceEncoderSmall
encoder_conf:
  input_size: 128
  attention_heads: 20
  linear_units: 1280
  num_blocks: 32
  dropout_rate: 0.1
  positional_dropout_rate: 0.1
  attention_dropout_rate: 0.1
  kernel_size: 31
  sanm_shfit: 0
  att_type: self_att_fsmn_sdpa
  downsample_rate: 4
  use_padmask: true
  max_position_embeddings: 2048
  rope_theta: 10000
frontend: WhisperFrontend
frontend_conf:
  fs: 16000
  n_mels: 128
  do_pad_trim: false
  filters_path: null
    output_size: 512
    attention_heads: 4
    linear_units: 2048
    num_blocks: 50
    tp_blocks: 20
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: pe
    pos_enc_class: SinusoidalPositionEncoder
    normalize_before: true
    kernel_size: 11
    sanm_shfit: 0
    selfattention_layer_type: sanm
tokenizer: SenseVoiceTokenizer
model: SenseVoiceSmall
model_conf:
    length_normalized_loss: true
    sos: 1
    eos: 2
    ignore_id: -1
tokenizer: SentencepiecesTokenizer
tokenizer_conf:
  vocab_path: null
  is_multilingual: true
  num_languages: 8749
  bpemodel: null
  unk_symbol: <unk>
  split_with_space: true
dataset: SenseVoiceDataset
frontend: WavFrontend
frontend_conf:
    fs: 16000
    window: hamming
    n_mels: 80
    frame_length: 25
    frame_shift: 10
    lfr_m: 7
    lfr_n: 6
    cmvn_file: null
dataset: SenseVoiceCTCDataset
dataset_conf:
  index_ds: IndexDSJsonl
  batch_sampler: BatchSampler
  batch_sampler: EspnetStyleBatchSampler
  data_split_num: 32
  batch_type: token
  batch_size: 12000
  sort_size: 64
  batch_size: 14000
  max_token_length: 2000
  min_token_length: 60
  max_source_length: 2000
  min_source_length: 60
  max_target_length: 150
  max_target_length: 200
  min_target_length: 0
  shuffle: true
  num_workers: 4
  sos: ${model_conf.sos}
  eos: ${model_conf.eos}
  IndexDSJsonl: IndexDSJsonl
  retry: 20
train_conf:
  accum_grad: 1
  grad_clip: 5
  max_epoch: 5
  keep_nbest_models: 200
  avg_nbest_model: 200
  max_epoch: 20
  keep_nbest_models: 10
  avg_nbest_model: 10
  log_interval: 100
  resume: true
  validate_interval: 10000
@@ -198,11 +204,10 @@
optim: adamw
optim_conf:
  lr: 2.5e-05
  lr: 0.00002
scheduler: warmuplr
scheduler_conf:
  warmup_steps: 20000
  warmup_steps: 25000
```
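To give a feel for what `scheduler: warmuplr` with `warmup_steps` does to the learning rate, here is a small sketch. It assumes the Noam-style warm-up schedule used by ESPnet's `WarmupLR` (linear ramp to the base rate, then inverse-square-root decay). Whether FunASR's `warmuplr` matches this formula exactly is an assumption, and `warmup_lr` is a hypothetical helper name. The values `lr: 0.00002` and `warmup_steps: 25000` are taken from the config above.

```python
def warmup_lr(step, base_lr, warmup_steps):
    """Noam-style schedule: linear ramp up to base_lr, then inverse-sqrt decay."""
    step = max(step, 1)  # avoid raising 0 to a negative power at step 0
    return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


base_lr, warmup_steps = 2e-5, 25000

halfway = warmup_lr(warmup_steps // 2, base_lr, warmup_steps)  # still ramping up
peak = warmup_lr(warmup_steps, base_lr, warmup_steps)          # reaches base_lr
later = warmup_lr(4 * warmup_steps, base_lr, warmup_steps)     # decayed again
```

Under this assumption the learning rate peaks at `base_lr` exactly when `step == warmup_steps`, so raising `warmup_steps` delays the peak and slows the later decay.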
@@ -222,8 +227,8 @@
  "file_path_metas": {
    "init_param":"model.pt", 
    "config":"config.yaml",
    "tokenizer_conf": {"vocab_path": "tokens.tiktoken"},
    "frontend_conf":{"filters_path": "mel_filters.npz"}}
    "tokenizer_conf": {"bpemodel": "chn_jpn_yue_eng_ko_spectok.bpe.model"},
    "frontend_conf":{"cmvn_file": "am.mvn"}}
}
```
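The `file_path_metas` entries appear to map configuration keys to files in the model resource directory. A minimal sketch of how such relative names could be turned into full paths follows. `resolve_paths` and the directory `/path/to/model_dir` are hypothetical and only illustrate the idea; this is not FunASR's loading code.

```python
import os


def resolve_paths(model_dir, metas):
    """Recursively prefix every string leaf in `metas` with `model_dir`."""
    resolved = {}
    for key, value in metas.items():
        if isinstance(value, dict):
            # Nested sections such as tokenizer_conf keep their structure.
            resolved[key] = resolve_paths(model_dir, value)
        else:
            resolved[key] = os.path.join(model_dir, value)
    return resolved


file_path_metas = {
    "init_param": "model.pt",
    "config": "config.yaml",
    "tokenizer_conf": {"bpemodel": "chn_jpn_yue_eng_ko_spectok.bpe.model"},
    "frontend_conf": {"cmvn_file": "am.mvn"},
}

# Hypothetical local directory holding the downloaded model files.
paths = resolve_paths("/path/to/model_dir", file_path_metas)
```

Keeping the nested layout means the resolved values can be merged straight into the matching `tokenizer_conf` and `frontend_conf` sections of the configuration.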
@@ -231,22 +236,21 @@
### Viewing the registry
```python
```plaintext
from funasr.register import tables
tables.print()
```
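You can also view the registry for a specific type: `tables.print("model")`
You can also view the registry for a specific type: \`tables.print("model")\`
### New registration
### Registering a model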
```python
from funasr.register import tables
@tables.register("model_classes", "MinMo_S2T")
class MinMo_S2T(nn.Module):
@tables.register("model_classes", "SenseVoiceSmall")
class SenseVoiceSmall(nn.Module):
  def __init__(*args, **kwargs):
    ...
@@ -268,10 +272,14 @@
```
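Specify the newly registered model in config.yaml
To register a class, add `@tables.register("model_classes","SenseVoiceSmall")` immediately before the class definition. The class must implement the `__init__`, `forward`, and `inference` methods.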
```yaml
model: MinMo_S2T
Full code: [https://github.com/modelscope/FunASR/blob/main/funasr/models/sense\_voice/model.py#L443](https://github.com/modelscope/FunASR/blob/main/funasr/models/sense_voice/model.py#L443)
After registration, specify the newly registered model in config.yaml to define the model:
```python
model: SenseVoiceSmall
model_conf:
  ...
```
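The step from a `model:` entry in config.yaml to a constructed object can be sketched as a name lookup followed by a call with `model_conf` as keyword arguments. The sketch below uses a plain dictionary in place of a parsed YAML file. `MODEL_CLASSES`, `register_model`, `build_model`, `ToyModel`, and `hidden_size` are hypothetical names, not part of FunASR.

```python
# Hypothetical registry: maps the name used in config.yaml to a class.
MODEL_CLASSES = {}


def register_model(name):
    """Class decorator that records the class under `name`."""
    def decorator(cls):
        MODEL_CLASSES[name] = cls
        return cls
    return decorator


@register_model("ToyModel")
class ToyModel:
    def __init__(self, hidden_size=4, **kwargs):
        self.hidden_size = hidden_size


def build_model(config):
    """Instantiate the class named by config["model"] with config["model_conf"]."""
    name = config["model"]
    if name not in MODEL_CLASSES:
        raise KeyError(f"model '{name}' is not registered")
    return MODEL_CLASSES[name](**config.get("model_conf", {}))


# Equivalent of a parsed config.yaml with `model:` and `model_conf:` keys.
config = {"model": "ToyModel", "model_conf": {"hidden_size": 8}}
model = build_model(config)
```

With this layout, switching models only requires changing the `model` name in the configuration, provided the module that registers the class has been imported.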