
Yichen (Eason) Lu
About Me
Hello! I am currently a Research Scientist at Anuttacon, where I work on full-duplex speech-to-speech conversational models and lead the research and development of general audio understanding models, with a focus on Large Audio Language Model (LALM) architectures. My work involves continual audio-text pretraining and post-training for comprehensive audio, speech, and music understanding.
I hold a B.S. in Computer Science from UIUC and an M.S. in Artificial Intelligence from CMU. At CMU, I was actively involved in research at WAVLab under the supervision of Prof. Shinji Watanabe.
My core research interests include general speech/audio language models, audio-visual fusion, and multimodal language models.
During my undergraduate studies, I was fortunate to work with Prof. Tarek Abdelzaher, Prof. Kris Hauser, and Prof. Yuxiong Wang, which greatly shaped my research journey.
I'm happy to collaborate on interesting projects. If you'd like to chat, just drop me an email ✨
News
- [Sep 2025] EMNLP 2025 System Demo accepted: ViDove (multimodal translation agent).
- [Aug 2025] We released Whispers from the Star; I am responsible for the audio understanding part.
- [Jan 2025] Started as a Research Scientist at Anuttacon.
- [2025] One paper accepted to Interspeech 2025: GALAXY (multimodal learning dataset).
- [2024] Oral presentation at EMNLP 2024: FastAdaSP - Multitask-Adapted Efficient Inference for Large Speech LM.
Experience
- Research Scientist, Anuttacon – Mountain View, CA
  Jan 2025 – Present
- Research Assistant, CMU (WAVLab) – Pittsburgh, PA
  July 2023 – Jan 2025 | Contributed to ESPnet; researched efficient inference and unified audio-visual language understanding
- MLE Intern, TrovaAI – Champaign, IL
  May 2023 – Aug 2023 | Designed a RAG pipeline and an AI agent platform using LangChain
- MLE Intern, VMware, Inc. – Beijing, China
  July 2021 – Jan 2022 | Built an automatic data analysis system for machine translation
Selected Publications
Education
- Carnegie Mellon University – M.S. in Artificial Intelligence
  Aug 2023 – Dec 2024 | GPA: 3.76/4.00 | Coursework: Multimodal ML, LLM System, System Tool Chains for AI, Speech Recognition and Understanding
- University of Illinois at Urbana-Champaign – B.S. in Statistics & Computer Science
  Aug 2019 – May 2023 | GPA: 3.93/4.00 (Highest Honor; Dean's List) | Coursework: Computer Vision, Autonomous Vehicle, Applied ML, OS, Linear Algebra, DB Design, Web Dev
Notable Projects
Developed an end-to-end multimodal translation agent for domain-specific video subtitle generation. Managed the open-source repository with 10+ technical members and directed the marketing team.
Impact: Officially partnered with the StarCraft 2 World Team League (one of the biggest SC2 esports tournaments in the world) to help translate its tournament content.
Led a team of 30 members to launch the largest official Minecraft tournament in China, attracting over 1,000 contestants and over 1 million online viewers annually.
Established long-term partnerships with major internet companies in China, including NetEase, Qihoo360, Tencent, and Youku; secured project investment from NetEase and JoyMe.com. Presented at ChinaJoy 2016 with more than 325,000 entries.
Recent Blog Posts
- [Nov 6, 2023] Fall 2023 (Placeholder)
Academic Services
Program Committee: IEEE ASRU 2025, IEEE T-ASLP 2025, AAAI 2026, ICASSP
Misc.
- 🐱 I have a tuxedo cat named Brann
- 🎭 My favorite musical is Hamilton
- 🎬 I used to be an AMVer (Anime Music Video creator)