UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Audio Samples
Speech Tokenization & Resynthesis
| System | Bitrate |
|---|---|
| Ground Truth | 256 kbps |
| SpeechTokenizer | 500 bps |
| HuBERT + Unit-HiFiGAN | 500 bps |
| UniWav | 500 bps |
| SpeechTokenizer | 1 kbps |
| UniWav | 1 kbps |
In-context Text-to-Speech
| Text | UniWav | Ground Truth |
|---|---|---|
| on arriving at home at my own residence i found that our salon was filled with a brilliant company | ||
| at the inception of plural marriage among the latter day saints there was no law national or state against its practise | ||
| we are losing time and the fact is i have not come all this way to take a little sail upon a pond on a raft | ||
| it was the first great sorrow of his life it was not so much the loss of the cotton itself but the fantasy the hopes the dreams built around it | ||
| for some years it was not found feasible to operate motors on alternating current circuits and that reason was often urged against it seriously | ||