NSF-HiFiGAN with 44.1 kHz sampling rate
This is the first formal public release of the DiffSinger Community Vocoder Project. It includes:
- A pretrained model for inference
- A pretrained model for fine-tuning
- An ONNX model for lightweight and portable deployment
Overview
Architecture: NSF-HiFiGAN
Training data: ~93 hours of singing voice
Training steps: over 1 million
Sampling rate: 44100 Hz
Number of mel bins: 128
Hop size: 512 samples
Window size: 2048 samples
Mel frequency (input): 40-16000 Hz
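The parameters above fix the time resolution of the mel-spectrogram the vocoder expects. As a rough sketch, the derived quantities can be worked out as follows. The `VocoderConfig` class and its field names are hypothetical, chosen for illustration only, and are not taken from the project's code or config files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocoderConfig:
    """Hypothetical container for the values listed in the overview."""
    sampling_rate: int = 44100   # Hz
    num_mels: int = 128          # mel bins per frame
    hop_size: int = 512          # samples between consecutive frames
    win_size: int = 2048         # samples per analysis window
    fmin: float = 40.0           # Hz, lower edge of the mel range
    fmax: float = 16000.0        # Hz, upper edge of the mel range

    @property
    def frames_per_second(self) -> float:
        # One mel frame is produced every hop_size samples.
        return self.sampling_rate / self.hop_size

    @property
    def frame_duration_ms(self) -> float:
        # Time advanced by one mel frame.
        return 1000.0 * self.hop_size / self.sampling_rate

    @property
    def window_duration_ms(self) -> float:
        # Time span covered by one analysis window.
        return 1000.0 * self.win_size / self.sampling_rate

    def samples_for_frames(self, num_frames: int) -> int:
        # Nominal audio length for a given number of mel frames,
        # ignoring any padding an actual implementation may apply.
        return num_frames * self.hop_size


cfg = VocoderConfig()
print(f"{cfg.frames_per_second:.2f} frames/s")
print(f"{cfg.frame_duration_ms:.2f} ms per frame")
print(f"{cfg.window_duration_ms:.2f} ms per window")
print(f"{cfg.samples_for_frames(100)} samples for 100 frames")
```

With these values, one mel frame covers about 11.6 ms of audio (roughly 86 frames per second), and the upper mel edge of 16000 Hz stays below the 22050 Hz Nyquist limit of 44.1 kHz audio. Mel-spectrograms fed to the vocoder would need to be extracted with matching settings.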
Notice
Pretrained models are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. If you want to redistribute these pretrained models, please read the notice in the folder first.