Unsupervised cross-domain image generation Y Taigman, A Polyak, L Wolf arXiv preprint arXiv:1611.02200, 2016 | 1194 | 2016 |
Make-A-Video: Text-to-video generation without text-video data U Singer, A Polyak, T Hayes, X Yin, J An, S Zhang, Q Hu, H Yang, ... arXiv preprint arXiv:2209.14792, 2022 | 657 | 2022 |
Make-A-Scene: Scene-based text-to-image generation with human priors O Gafni, A Polyak, O Ashual, S Sheynin, D Parikh, Y Taigman European Conference on Computer Vision, 89-106, 2022 | 335 | 2022 |
On generative spoken language modeling from raw audio K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ... Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021 | 245 | 2021 |
Speech resynthesis from discrete disentangled self-supervised representations A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ... arXiv preprint arXiv:2104.00355, 2021 | 218 | 2021 |
VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop Y Taigman, L Wolf, A Polyak, E Nachmani 6th International Conference on Learning Representations, 2017 | 209* | 2017 |
Channel-level acceleration of deep face representations A Polyak, L Wolf IEEE Access 3, 2163-2175, 2015 | 206 | 2015 |
AudioGen: Textually guided audio generation F Kreuk, G Synnaeve, A Polyak, U Singer, A Défossez, J Copet, D Parikh, ... arXiv preprint arXiv:2209.15352, 2022 | 163 | 2022 |
A Universal Music Translation Network N Mor, L Wolf, A Polyak, Y Taigman 7th International Conference on Learning Representations, 2019 | 152 | 2019 |
Direct speech-to-speech translation with discrete units A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ... arXiv preprint arXiv:2107.05604, 2021 | 113 | 2021 |
Fitting New Speakers Based on a Short Untranscribed Sample E Nachmani, A Polyak, Y Taigman, L Wolf Proceedings of the 35th International Conference on Machine Learning, 2018 | 100 | 2018 |
Pick-a-Pic: An open dataset of user preferences for text-to-image generation Y Kirstain, A Polyak, U Singer, S Matiana, J Penna, O Levy Advances in Neural Information Processing Systems 36, 2024 | 93 | 2024 |
KNN-Diffusion: Image generation via large-scale retrieval S Sheynin, O Ashual, A Polyak, U Singer, O Gafni, E Nachmani, ... arXiv preprint arXiv:2204.02849, 2022 | 88 | 2022 |
Text-free prosody-aware generative spoken language modeling E Kharitonov, A Lee, A Polyak, Y Adi, J Copet, K Lakhotia, TA Nguyen, ... arXiv preprint arXiv:2109.03264, 2021 | 82 | 2021 |
Text-to-4d dynamic scene generation U Singer, S Sheynin, A Polyak, O Ashual, I Makarov, F Kokkinos, N Goyal, ... arXiv preprint arXiv:2301.11280, 2023 | 70 | 2023 |
Scaling autoregressive multi-modal models: Pretraining and instruction tuning L Yu, B Shi, R Pasunuru, B Muller, O Golovneva, T Wang, A Babu, B Tang, ... arXiv preprint arXiv:2309.02591, 2023 | 52 | 2023 |
Unsupervised creation of parameterized avatars L Wolf, Y Taigman, A Polyak Proceedings of the IEEE International Conference on Computer Vision, 1530-1538, 2017 | 52 | 2017 |
Textless speech emotion conversion using discrete and decomposed representations F Kreuk, A Polyak, J Copet, E Kharitonov, TA Nguyen, M Rivière, WN Hsu, ... arXiv preprint arXiv:2111.07402, 2021 | 50 | 2021 |
Unsupervised cross-domain singing voice conversion A Polyak, L Wolf, Y Adi, Y Taigman arXiv preprint arXiv:2008.02830, 2020 | 46 | 2020 |
TTS skins: Speaker conversion via ASR A Polyak, L Wolf, Y Taigman arXiv preprint arXiv:1904.08983, 2019 | 33 | 2019 |