E3 TTS
The E3 TTS model performs end-to-end, diffusion-based text-to-speech generation, producing highly realistic speech directly from text without relying on any intermediate representations.