Yougen Yuan - AI Researcher & Engineer
About Me
I am a Senior AI Algorithm Engineer at Tencent, specializing in Large Language Models (LLMs), Vision-Language Models (VLLMs), speech processing, multimodal learning, clustering, and retrieval systems. My research focuses on developing safe and secure AI technologies that ensure the safety of multimodal content.
Professional Background
- Current Position: Senior AI Algorithm Engineer at Tencent
- Education: Ph.D. in Speech Processing, Northwestern Polytechnical University (2014-2019)
- Research Interests: LLMs/VLLMs, Speech Processing, Multimodal Learning, Clustering, Retrieval
Key Expertise
Core Research Areas
- Large Language Models: Development and optimization of text-based and multimodal LLMs
- Speech Processing: Audio signal processing, speech recognition, and synthesis
- Multimodal Learning: Integration of visual, textual, and audio modalities
- Clustering & Retrieval: Advanced algorithms for information organization and search
Technical Skills
- Programming Languages: Python, C++, Bash
- Frameworks & Libraries: PyTorch, Transformers, OpenCV
- Tools: Git, Hugging Face, LLaMA-Factory, EasyR1, ms-swift, Venus, Taiji
- Cloud Platforms: Tencent Cloud
Recent Focus
I am currently working on multimodal AI systems that process and understand information across text, images, and audio, with the goal of building more intelligent, context-aware applications.
Academic Contributions
I have published multiple papers in top-tier conferences and journals, contributing to the advancement of AI technologies in speech processing and multimodal learning.
Contact Information
Feel free to reach out for collaborations, research discussions, or professional opportunities:
- Email: yougenyuan@gmail.com
- GitHub: ygyuan
- LinkedIn: Yougen Yuan
- Google Scholar: Yougen Yuan
- ORCID: 0000-0002-2490-566X
Last updated: March 2026
