Zhe Gan
Research Scientist, Apple
Verified email at apple.com
Title
Cited by
Year
UNITER: Universal image-text representation learning
YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu
European Conference on Computer Vision, 104-120, 2020
Cited by 2669*, Year 2020
AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks
T Xu, P Zhang, Q Huang, H Zhang, Z Gan, X Huang, X He
Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by 2136, Year 2018
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Y Pu, Z Gan, R Henao, X Yuan, C Li, A Stevens, L Carin
NIPS, 2016
Cited by 1004, Year 2016
Patient knowledge distillation for BERT model compression
S Sun, Y Cheng, Z Gan, J Liu
arXiv preprint arXiv:1908.09355, 2019
Cited by 908, Year 2019
Less is more: ClipBERT for video-and-language learning via sparse sampling
J Lei, L Li, L Zhou, Z Gan, TL Berg, M Bansal, J Liu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by 714, Year 2021
HERO: Hierarchical encoder for video+language omni-representation pre-training
L Li, YC Chen, Y Cheng, Z Gan, L Yu, J Liu
arXiv preprint arXiv:2005.00200, 2020
Cited by 548, Year 2020
Semantic compositional networks for visual captioning
Z Gan, C Gan, X He, Y Pu, K Tran, J Gao, L Carin, L Deng
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by 543, Year 2017
Large-scale adversarial training for vision-and-language representation learning
Z Gan, YC Chen, L Li, C Zhu, Y Cheng, J Liu
Advances in Neural Information Processing Systems 33, 6616-6628, 2020
Cited by 537, Year 2020
FreeLB: Enhanced adversarial training for natural language understanding
C Zhu, Y Cheng, Z Gan, S Sun, T Goldstein, J Liu
International Conference on Learning Representations, 2020
Cited by 533, Year 2020
GIT: A generative image-to-text transformer for vision and language
J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang
arXiv preprint arXiv:2205.14100, 2022
Cited by 523, Year 2022
Adversarial feature matching for text generation
Y Zhang, Z Gan, K Fan, Z Chen, R Henao, D Shen, L Carin
International conference on machine learning, 4006-4015, 2017
Cited by 472, Year 2017
Relation-aware graph attention network for visual question answering
L Li, Z Gan, Y Cheng, J Liu
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 422, Year 2019
An empirical study of GPT-3 for few-shot knowledge-based VQA
Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang
Proceedings of the AAAI conference on artificial intelligence 36 (3), 3081-3089, 2022
Cited by 400, Year 2022
An empirical study of training end-to-end vision-and-language transformers
ZY Dou, Y Xu, Z Gan, J Wang, S Wang, L Wang, C Zhu, P Zhang, L Yuan, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 383, Year 2022
StyleNet: Generating attractive visual captions with styles
C Gan, Z Gan, X He, J Gao, L Deng
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by 383, Year 2017
CLUB: A contrastive log-ratio upper bound of mutual information
P Cheng, W Hao, S Dai, J Liu, Z Gan, L Carin
International conference on machine learning, 1779-1788, 2020
Cited by 356, Year 2020
Discourse-aware neural extractive text summarization
J Xu, Z Gan, Y Cheng, J Liu
arXiv preprint arXiv:1910.14142, 2019
Cited by 347, Year 2019
Generating informative and diverse conversational responses via adversarial information maximization
Y Zhang, M Galley, J Gao, Z Gan, X Li, C Brockett, B Dolan
Advances in Neural Information Processing Systems 31, 2018
Cited by 315, Year 2018
Scaling up vision-language pre-training for image captioning
X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by 286, Year 2022
SwinBERT: End-to-end transformers with sparse attention for video captioning
K Lin, L Li, CC Lin, F Ahmed, Z Gan, Z Liu, Y Lu, L Wang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 272, Year 2022