Xuehai Pan (/ʃwɛˈhaɪ pæn/, 潘学海 in Mandarin, [email protected]) is a final-year Ph.D. student in Applied Computer Science at Peking University. His research interests lie at the intersection of Reinforcement Learning, Multi-Agent Systems, and Distributed Computing, with a focus on developing scalable, automated algorithms and studying both their theoretical and practical aspects. He has a solid background in both research and engineering: before pursuing his Ph.D., he earned a B.S. in Physics with honors and a B.S. in Computer Science (double major) from Peking University. During high school, he won gold medals in the Chinese Physics Olympiad (CPhO) and the Asian Physics Olympiad (APhO).
Xuehai currently works on Large Language Models (LLMs), with an emphasis on aligning them with human intentions and values through AI Alignment techniques (essentially, balancing helpfulness and harmlessness). Specifically, he is exploring automated data synthesis, red teaming, and evolutionary training via multi-agent interaction and self-play. The ultimate goal is to build a scalable, fully automated system that covers training, evaluation, inference, and governance.
Beyond academia, Xuehai is an open-source enthusiast and an active contributor to influential projects such as PyTorch, CPython, Ray, Transformers, DeepSpeed, Gymnasium (formerly OpenAI Gym), and Homebrew. He enjoys spending his spare time helping people and sharing knowledge in the community.