Bio
Alexander Haojan (浩然) Liu is a 4th-year Ph.D. student in Computer Science at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). He is a member of the Spoken Language Systems (SLS) Group led by Dr. James Glass. His research interests are in machine learning, natural language processing (speech processing in particular), and computer vision. Currently, his work focuses on self-supervised learning of audio and its applications.
Prior to joining MIT, Alex received his M.S. and B.S. degrees in Computer Science & Information Engineering (CSIE) from National Taiwan University (NTU). He was a member of the Speech Processing Lab working with Lin-shan Lee and Prof. Hung-yi Lee in the area of machine learning and speech processing. During his undergraduate years, he worked with Yu-Chiang Frank Wang in computer vision and representation learning. Besides academic labs, he also spent time working at Meta AI (formerly known as Facebook AI Research) as a research intern.
News
Inspired by Wei-Chiu Ma, I would like to commit 1-2 hours per week to providing suggestions and/or mentorship to junior students in need, especially those from underrepresented groups. Please fill out this form if you are interested.
I’m actively looking for research collaborations with students outside of MIT on machine learning for audio and multi-modal data. Please feel free to drop me an email if you find my recent work interesting!
Selected Publications
Generative Pre-training for Speech with Flow Matching
Alexander H. Liu, Matthew Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu
International Conference on Learning Representations (ICLR) 2024
[ paper ]
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective
Alexander H. Liu (co-first), Sung-Lin Yeh (co-first), James Glass
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024
[ paper ]
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Alexander H. Liu, Heng-Jui Chang, Michael Auli (co-last), Wei-Ning Hsu (co-last), James Glass (co-last)
In Advances in Neural Information Processing Systems (NeurIPS) 2023
[ paper | code ]
Joint Audio and Speech Understanding
Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James Glass
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
[ paper | interactive demo ]
Listen, Think, and Understand
Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James Glass
International Conference on Learning Representations (ICLR) 2024
[ paper | interactive demo ]
Contrastive Audio-Visual Masked Autoencoder
Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James Glass
International Conference on Learning Representations (ICLR) 2023
[ paper | code ]
Simple and Effective Unsupervised Speech Synthesis
Alexander H. Liu (co-first), Cheng-I Jeff Lai (co-first), Wei-Ning Hsu, Michael Auli, Alexei Baevski, James Glass
InterSpeech 2022
[ paper | demo ]
Towards End-to-end Unsupervised Speech Recognition
Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski
Spoken Language Technology Workshop (SLT) 2022
[ paper | code ]
Cross-Modal Discrete Representation Learning
Alexander H. Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass
Annual Meeting of the Association for Computational Linguistics (ACL) 2022
[ paper ]
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
Mathew Monfort (co-first), SouYoung Jin (co-first), Alexander H. Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[ paper | dataset ]
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Alexander H. Liu, Yu-An Chung, James Glass
InterSpeech 2021
[ paper | code ]
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Alexander H. Liu (co-first), Tao Tu (co-first), Hung-yi Lee, Lin-shan Lee
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
[ paper | demo ]
Towards Scene Understanding: Unsupervised Monocular Depth Estimation with Semantic-Aware Representation
Alexander H. Liu (co-first), Po-Yi Chen (co-first), Yen-Cheng Liu, Yu-Chiang Frank Wang
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
[ paper | oral | supplementary ]
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation
Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang
In Advances in Neural Information Processing Systems (NeurIPS) 2018
[ paper | code | supplementary & reviews ]
For the complete list, please visit Google Scholar.
Teaching
TA of Natural Language Processing, MIT, Fall 2021
TA of Fundamentals of Speech Signal Processing, NTU CSIE, Fall 2018 & Spring 2019
TA of Deep Learning for Human Language Processing, NTU EE, Fall 2018
TA of Machine Learning and Having It Deep and Structured, NTU EE, Spring 2018
TA of Deep Learning for Computer Vision, NTU GICE, Fall 2018
TA of Advanced Deep Learning, NTU CSIE, Spring 2018
Honors
Advanced Speech Technologies Scholarship, NTU EECS, 2019
Verizon Media AI Scholarship, Verizon Media, Taiwan, 2019
Best Student Speaker Award, 3rd AII Workshop, 2019
1st Prize, Formosa Spoken QA Challenge, Ministry of Science and Technology, Taiwan, 2019
Excellent Teaching Assistant Award, NTU CSIE Dept., 2019
Technology Scholarship, Foxconn Education Foundation, 2019
Presidential Awards, NTU CSIE, 2017/2018