Zero-shot Text to Audio Retrieval on AudioCaps ...
The current state-of-the-art on Clotho is LanguageBind(FT). See a full comparison of 6 papers with code.
Zero-shot Text to Audio Retrieval. 5 papers with code • 2 benchmarks • 2 datasets.
The current state-of-the-art on Clotho is InternVideo2-6B. See a full comparison of 7 papers with code.
The current state-of-the-art on AudioCaps is WavCaps. See a full comparison of 5 papers with code.
It contains two collections of datasets: unlabelled audio recordings of radio news and talk show programs (160 hours) and labelled data (over 80 hours) ...
Oct 23, 2019 · We investigate multi-speaker modeling for end-to-end text-to-speech synthesis and study the effects of different types of state-of-the-art ...
Nov 21, 2023 · The paper proposes a new model that uses a generative text-based LLM and neural audio codec to perform large-scale, zero-shot text-to-speech.