I converted the model to half precision and packaged it as a Cog model so it can be commercialized on Replicate more easily.
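As a rough illustration of what half precision means for storage and accuracy, here is a minimal sketch using only the Python standard library. It is not the conversion code from this repo; the helper name `to_half_and_back` is hypothetical and exists only for this example.

```python
import struct


def to_half_and_back(x: float) -> float:
    """Round x to IEEE 754 half precision (float16) and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


value = 0.1
single = struct.pack("<f", value)  # single precision: 4 bytes per number
half = struct.pack("<e", value)    # half precision: 2 bytes per number

print(len(single), len(half))      # storage size per number, in bytes
print(to_half_and_back(value))     # close to 0.1, with a small rounding error
```

Each weight takes half the space of a single-precision value, at the cost of a small rounding error per number.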
sudo cog predict -i prompt="hi hows it going" -i voice="A Well spoken english male clear voice no background noise"
virtualenv .env
source .env/bin/activate
pip install -r requirements.txt
See the original Parler paper.
mkdir models
cd models
git clone [email protected]:spaces/parler-tts/parler_tts_mini
To convert the model to half precision yourself, uncomment the conversion code in predict.py, run it, and then copy any missing files over from the old full-precision model folder.
I have already done that; that's what this repo is.
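The "copy missing files over" step can be sketched as below. This is an assumption about how one might do it, not code from this repo: the function name `copy_missing_files` and the folder paths in the usage comment are hypothetical. Files that already exist in the new folder are left untouched.

```python
import shutil
from pathlib import Path
from typing import List


def copy_missing_files(old_dir: Path, new_dir: Path) -> List[str]:
    """Copy files present in old_dir but absent from new_dir.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src in sorted(old_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(old_dir)
        dst = new_dir / rel
        if dst.exists():
            continue  # keep whatever the half-precision export already wrote
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


# Example call with hypothetical folder names:
# copy_missing_files(Path("models/full_precision"), Path("models/half_precision"))
```

Because the check is by file name only, a weight file that exists in both folders is never overwritten. If the half-precision export uses different weight file names than the original, the old full-precision weights would be copied too, so review the returned list afterwards.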
See predict.py.
cog push
pytest .
flake8 predict.py
- Use an efficient output format instead of WAV
- Support for more expressive and emotive voices
- Support for more languages
- Support for more voices
- Support for more accents
- ElevenLabs-style voice cloning
- Voice style transfer
- SunoAI-style audio inpainting and general audio generation
- Music style transfer
Let me know if these are of interest. If any of them have already been done, please link them!
See text to speech models on Text-generator.io https://text-generator.io/
AI Chat characters https://netwrck.com
AI Art Generation https://aiart-generator.io