Yougen Yuan - AI Researcher & Engineer

About Me

I am a Senior AI Algorithm Engineer at Tencent, specializing in Large Language Models (LLMs), Vision-Language Models (VLLMs), speech processing, multimodal learning, clustering, and retrieval systems. My research focuses on developing safe and secure AI technologies that ensure content safety across multimodal data.

Professional Background

Current Position: Senior AI Algorithm Engineer at Tencent
Education: Ph.D. in Speech Processing, Northwestern Polytechnical University (2014-2019)
Research Interests: LLMs/VLLMs, Speech Processing, Multimodal Learning, Clustering, Retrieval

Key Expertise

Core Research Areas

Large Language Models: Development and optimization of text-based and multimodal LLMs
Speech Processing: Audio signal processing, speech recognition, and synthesis
Multimodal Learning: Integration of visual, textual, and audio modalities
Clustering & Retrieval: Advanced algorithms for information organization and search

Technical Skills

Programming Languages: Python, C++, Bash
Frameworks & Libraries: PyTorch, Transformers, OpenCV
Tools: Git, Hugging Face, LLaMA-Factory, EasyR1, ms-swift, Venus, Taiji
Cloud Platforms: Tencent Cloud

Recent Focus

I am currently working on multimodal AI systems that process and understand information across text, images, and audio, with the goal of building more intelligent, context-aware applications.

Academic Contributions

I have published multiple papers on speech processing and multimodal learning in top-tier conferences and journals.

Contact Information

Feel free to reach out for collaborations, research discussions, or professional opportunities:

Email: yougenyuan@gmail.com
GitHub: ygyuan
LinkedIn: Yougen Yuan
Google Scholar: Yougen Yuan
ORCID: 0000-0002-2490-566X

Last updated: March 2026