Heiga Zen

Cited by

	All	Since 2019
Citations	21848	13518
h-index	53	36
i10-index	102	56

Citations per year

Year	Citations
2006	104
2007	144
2008	233
2009	323
2010	548
2011	412
2012	466
2013	639
2014	740
2015	763
2016	961
2017	1180
2018	1494
2019	1984
2020	2129
2021	2507
2022	2524
2023	2743
2024	1615

Public access (based on funding mandates)

Available	1 article
Not available	0 articles

Co-authors

Keiichi Tokuda	Nagoya Institute of Technology	Verified email at nitech.ac.jp
Yoshihiko Nankaku	Nagoya Institute of Technology	Verified email at sp-nitech.org
Yu Zhang	OpenAI	Verified email at csail.mit.edu
Tomoki Toda	Nagoya University	Verified email at icts.nagoya-u.ac.jp
Andrew Senior	Google DeepMind	Verified email at google.com
Junichi Yamagishi	National Institute of Informatics, Tokyo, Japan	Verified email at nii.ac.jp
Aäron van den Oord	Google DeepMind	Verified email at google.com
Alan W Black	Professor, Language Technologies Institute, Carnegie Mellon University	Verified email at cs.cmu.edu
Oriol Vinyals	Research Scientist at Google DeepMind	Verified email at google.com
Yonghui Wu	Google Brain	Verified email at google.com
Ye Jia	Meta	Verified email at google.com
Karen Simonyan	Chief Scientist, Microsoft AI	Verified email at microsoft.com
Alex Graves	University of Toronto	Verified email at cs.toronto.edu
Koray Kavukcuoglu	DeepMind	Verified email at kavukcuoglu.org
Sander Dieleman	Research Scientist, DeepMind	Verified email at google.com
Nal Kalchbrenner	Google DeepMind	Verified email at google.com
Ron J Weiss	Google	Verified email at google.com
Takashi Masuko	Preferred Networks, Inc.	Verified email at ieee.org
Akinobu Lee	Professor, Nagoya Institute of Technology	Verified email at nitech.ac.jp
Rob Clark	Google	Verified email at google.com

Heiga Zen

Principal Scientist (Director), Google DeepMind

Verified email at google.com

Speech Synthesis


Title	Authors	Venue	Cited by	Year
WaveNet: A generative model for raw audio	A van den Oord, S Dieleman, H Zen, K Simonyan, O Vinyals, A Graves, ...	arXiv preprint arXiv:1609.03499, 2016	7049	2016
Statistical parametric speech synthesis	H Zen, K Tokuda, AW Black	Speech Communication 51 (11), 1039-1064, 2009	1619	2009
Statistical parametric speech synthesis using deep neural networks	H Zen, A Senior, M Schuster	IEEE International Conference on Acoustics, Speech, and Signal Processing …, 2013	1141	2013
Parallel WaveNet: Fast high-fidelity speech synthesis	A Oord, Y Li, I Babuschkin, K Simonyan, O Vinyals, K Kavukcuoglu, ...	arXiv preprint arXiv:1711.10433, 2017	964	2017
LibriTTS: A corpus derived from LibriSpeech for text-to-speech	H Zen, V Dang, R Clark, Y Zhang, RJ Weiss, Y Jia, Z Chen, Y Wu	Interspeech, 1526-1530, 2019	787	2019
The HMM-based speech synthesis system (HTS) version 2.0	H Zen, T Nose, J Yamagishi, S Sako, T Masuko, AW Black, K Tokuda	Sixth ISCA Workshop on Speech Synthesis (SSW6), 294-299, 2007	706	2007
WaveGrad: Estimating gradients for waveform generation	N Chen, Y Zhang, H Zen, RJ Weiss, M Norouzi, W Chan	arXiv preprint arXiv:2009.00713, 2020	678	2020
Speech synthesis based on hidden Markov models	K Tokuda, Y Nankaku, T Toda, H Zen, J Yamagishi, K Oura	Proceedings of the IEEE 101 (5), 1234-1252, 2013	577	2013
An HMM-based speech synthesis system applied to English	K Tokuda, H Zen, AW Black	IEEE Workshop on Speech Synthesis, 227-230, 2002	517	2002
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis	H Zen, H Sak	2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015	379	2015
Statistical parametric speech synthesis	AW Black, H Zen, K Tokuda	IEEE International Conference on Acoustics, Speech, and Signal Processing …, 2007	311	2007
A hidden semi-Markov model-based speech synthesis system	H Zen, K Tokuda, T Masuko, T Kobayashi, T Kitamura	IEICE TRANSACTIONS on Information and Systems 90 (5), 825-834, 2007	308	2007
Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005	H Zen, T Toda, M Nakamura, K Tokuda	IEICE TRANSACTIONS on Information and Systems 90 (1), 325-333, 2007	308	2007
Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends	ZH Ling, SY Kang, H Zen, A Senior, M Schuster, XJ Qian, HM Meng, ...	IEEE Signal Processing Magazine 32 (3), 35-52, 2015	295	2015
Hierarchical generative modeling for controllable speech synthesis	WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...	arXiv preprint arXiv:1810.07217, 2018	270	2018
Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis	H Zen, A Senior	2014 IEEE international conference on acoustics, speech and signal …, 2014	270	2014
Robust speaker-adaptive HMM-based text-to-speech synthesis	J Yamagishi, T Nose, H Zen, ZH Ling, T Toda, K Tokuda, S King, S Renals	IEEE Transactions on Audio, Speech, and Language Processing 17 (6), 1208-1230, 2009	247	2009
Lingvo: a modular and scalable framework for sequence-to-sequence modeling	J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...	arXiv preprint arXiv:1902.08295, 2019	203	2019
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...	arXiv preprint arXiv:2403.05530, 2024	196	2024
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences	H Zen, K Tokuda, T Kitamura	Computer Speech & Language 21 (1), 153-173, 2007	195	2007
