Hongrui Zhan

Computer Science

Hefei, AH

Summary

Skilled in C++ and Python, I have built ML solutions in Megatron-LM that improve MoE model efficiency and training speed. I have worked on introducing expert parallelism into the vLLM framework and adapted the corresponding CUDA kernels. I have a strong passion for learning in the fields of AI and HPC, and I can quickly master the techniques and principles involved.

Overview

year of professional experience

Work History

Adding Expert Parallelism in vLLM

02.2024 - Current

Develop tensor parallelism and expert parallelism for the MoE layers in vLLM, independent of the non-MoE parts.
Modify CUDA kernels for MoE operators in vLLM to support flexible expert parallelism.
Identify performance optimization opportunities for MoE model inference.

LazyCopy for MoE All-to-All Communication

10.2024 - 12.2024

Identify redundancy in MoE model training; design and implement a redundancy-free All-to-All method in the Megatron-LM framework.
Develop simple CUDA kernels as a torch extension to speed up tensor index operations.
Evaluate the training speedup of the LazyCopy All-to-All method.

Communication-Friendly MoE Structure

05.2024 - 08.2024

Design a communication-efficient MoE structure (called BigMac) and implement it in the Megatron-LM framework.
Develop tensor parallelism for the BigMac structure in the Megatron-LM framework.
Evaluate the training speedup of the new model and confirm that its accuracy loss relative to the baseline is acceptable.

Education

Master of Science - Computer Science

University of Science and Technology of China

Hefei, Anhui

04.2001 -

Bachelor of Science - Computer Science

Shandong University

Qingdao

04.2001 -

Skills

Proficient in C and Python development for ML training and inference frameworks

Timeline

LazyCopy for MoE All-to-All Communication

10.2024 - 12.2024

Communication-Friendly MoE Structure

05.2024 - 08.2024

Adding Expert Parallelism in vLLM

02.2024 - Current

Master of Science - Computer Science

University of Science and Technology of China

04.2001 -

Bachelor of Science - Computer Science

Shandong University

04.2001 -
