Dong Yang

I am a second-year Ph.D. student at the Saruwatari & Saito Lab, the University of Tokyo, Japan. I am supervised by Prof. Hiroshi Saruwatari and academically supervised by Dr. Yuki Saito.
My research focuses on Text-to-Speech (TTS) and language modeling. I am interested in applying language models to TTS systems, including better utilization of language model features and exploration of discrete token-based TTS models. I am also interested in enhancing current TTS models by integrating traditional signal processing methods.

Email (personal): ydqmkkx [at] gmail.com
Email (university): yangdong [at] g.ecc.u-tokyo.ac.jp
Address: Room #141, Engineering bldg. #6, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Google Scholar | GitHub | Hugging Face

Education
  • Shanghai Jiao Tong University, China

    B.S. in Mechanical Engineering Sept. 2015 - Jun. 2020

  • Waseda University, Japan

    Exchange student Apr. 2019 - Aug. 2019

  • The University of Tokyo, Japan

    M.S. in Information Science and Technology Apr. 2021 - Mar. 2023

  • The University of Tokyo, Japan

    Ph.D. in Information Science and Technology Apr. 2023 - Present


Experience
  • Zhejiang Lab, China

    Research Intern in the Institute of Artificial Intelligence Oct. 2020 - Feb. 2021

    Conducted research on resource scheduling optimization and reinforcement learning.

  • CyberAgent, Inc., Japan

    Research Intern in the Audio Team, AI Lab Sept. 2023 - Dec. 2023

    Conducted research on text-to-speech synthesis.


Awards & Honors
  • INTERSPEECH 2024 Best Student Paper Nominee [Shortlist]


Scholarship