Dong Yang

I am a Ph.D. candidate at Saruwatari & Saito Lab, the University of Tokyo, Japan. I am supervised by Prof. Hiroshi Saruwatari and academically supervised by Dr. Yuki Saito.
My research focuses on text-to-speech (TTS) synthesis, particularly flow matching-based and discrete token-based models. I am also interested in enhancing TTS systems by incorporating techniques from traditional signal processing.

Email (personal): ydqmkkx [at] gmail.com
Email (university): yangdong [at] g.ecc.u-tokyo.ac.jp
Address: Room #141, Engineering bldg. #6, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Google Scholar | GitHub | Hugging Face

Education
  • Shanghai Jiao Tong University, China

    B.S. in Mechanical Engineering Sept. 2015 - Jun. 2020

  • Waseda University, Japan

    Exchange student Apr. 2019 - Aug. 2019

  • The University of Tokyo, Japan

    M.S. in Information Science and Technology Apr. 2021 - Mar. 2023

  • The University of Tokyo, Japan

    Ph.D. in Information Science and Technology Apr. 2023 - Present


Experience
  • Zhejiang Lab, China

    Research Intern in the Institute of Artificial Intelligence Oct. 2020 - Feb. 2021

    Conducted research on resource scheduling optimization and reinforcement learning.

  • CyberAgent, Inc., Japan

    Research Intern in the Audio Team, AI Lab Sept. 2023 - Dec. 2023

    Conducted research on text-to-speech synthesis.


Awards & Honors
  • Shortlisted for the ISCA Best Student Paper Award 2024 (at INTERSPEECH 2024)

    [Shortlist] (original) [Shortlist] (Wayback Machine link) Aug. 2024

  • Winners of the Discrete Speech Challenge (TTS Track) (at INTERSPEECH 2024)

    [Certificate] Sept. 2024


Scholarship