Yichen (Eason) Lu

Research Scientist @ Anuttacon · Multimodal & Audio Large Language Models

[Google Scholar] [GitHub] [Instagram] [LinkedIn] [CV/Resume]

📍 Mountain View, CA | 📧 mythsama@outlook.com

About Me

Hello! I am currently a Research Scientist at Anuttacon, where I work on full-duplex speech-to-speech conversational models and lead the research and development of general audio understanding models, with a focus on Large Audio Language Model (LALM) architectures. My work involves continual audio-text pretraining and post-training for comprehensive audio, speech, and music understanding.

I hold a B.S. in Computer Science from UIUC and an M.S. in Artificial Intelligence from CMU. At CMU, I did research at WAVLab under the supervision of Prof. Shinji Watanabe.

My core research interests include general speech/audio language models, audio-visual fusion, and multimodal language models.

During my undergraduate studies, I was fortunate to work with Prof. Tarek Abdelzaher, Prof. Kris Hauser, and Prof. Yuxiong Wang, which greatly shaped my research journey.

I'm happy to collaborate on interesting projects. If you'd like to chat, just drop me an email ✨

News

[Sep 2025] EMNLP 2025 System Demo accepted: ViDove (multimodal translation agent).
[Aug 2025] We released Whispers from the Star; I was responsible for the audio understanding part.
[Jan 2025] Started as Research Scientist at Anuttacon.
[2025] One paper accepted to Interspeech 2025: GALAXY (multimodal learning dataset).
[2024] Oral presentation at EMNLP 2024: FastAdaSP - Multitask-Adapted Efficient Inference for Large Speech LM.

Experience

Research Scientist, Anuttacon – Mountain View, CA
Jan 2025 – Present
Research Assistant, CMU (WAVLab) – Pittsburgh, PA
July 2023 – Jan 2025 | Contributed to ESPnet; researched efficient inference and unified audio-visual language understanding
MLE Intern, TrovaAI – Champaign, IL
May 2023 – Aug 2023 | Designed a RAG pipeline and an AI agent platform using LangChain
MLE Intern, VMware, Inc. – Beijing, China
July 2021 – Jan 2022 | Built an automatic data analysis system for machine translation

Selected Publications

FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model

Yichen Lu*, J Song*, CH Yang, Shinji Watanabe. EMNLP 2024 (Oral)

[Paper] [Code]

ViDove: A Translation Agent System with Multimodal Context and Memory-Augmented Reasoning

Yichen Lu*, Wei Dai*, Jiaen Liu*, et al. EMNLP 2025 System Demonstrations

[Paper] [Code]

SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation

Yichen Lu*, J Song*, X Chang, H Bian, S Maiti, Shinji Watanabe. Interspeech 2024 Workshop

[Paper]

→ View all 9 publications

Education

Carnegie Mellon University – M.S. in Artificial Intelligence
Aug 2023 – Dec 2024 | GPA: 3.76/4.00 | Coursework: Multimodal ML, LLM System, System Tool Chains for AI, Speech Recognition and Understanding
University of Illinois at Urbana-Champaign – B.S. in Statistics & Computer Science
Aug 2019 – May 2023 | GPA: 3.93/4.00 (Highest Honor; Dean's List) | Coursework: Computer Vision, Autonomous Vehicle, Applied ML, OS, Linear Algebra, DB Design, Web Dev

Notable Projects

ViDove - Multimodal Translation Agent

Founder & Researcher

Developed an end-to-end multimodal translation agent that generates video subtitles for specialized domains. Managed the open-source repository with 10+ technical contributors and directed the marketing team.

Impact: Official partner of the StarCraft 2 World Team League (one of the biggest SC2 esports tournaments in the world), translating its tournament content.

[GitHub Repo]

SUGAR MASSES CREATIVE - China Minecraft Construction Summit

Founder, Developer & Director (Feb 2014 – Dec 2018)

Led a team of 30 to launch the largest official Minecraft tournament in China, attracting over 1,000 contestants and over 1 million online viewers annually.

Established long-term partnerships with major Chinese internet companies, including NetEase, Qihoo 360, Tencent, and Youku; secured project investment from NetEase and JoyMe.com. Presented at ChinaJoy 2016, which drew more than 325,000 entries.

Recent Blog Posts

[May 28, 2026] Some Thoughts on Agents: Model as Harness System
[Nov 1, 2025] Notes, 2025.11.01: On Growth, Inner Coherence, and Passion

→ View all blog posts

Academic Services

Program Committee: IEEE ASRU 2025, IEEE T-ASLP 2025, AAAI 2026, ICASSP

Misc.

🐱 I have a tuxedo cat named Brann
🎭 My favorite musical is Hamilton
🎬 I used to be an AMVer (Anime Music Video creator)
