

Skilled in C++ and Python, I have delivered ML solutions in Megatron-LM that improve MoE model efficiency and training speed. I have also worked on introducing expert parallelism into the vLLM framework, including the corresponding CUDA kernel adaptations. I have a strong passion for learning in the fields of AI and HPC, and I quickly master the techniques and principles involved.
Proficient in C++ and Python development for ML training and inference frameworks
Familiar with CUDA development
Experienced in distributed deployment of LLM training and inference tasks
Quick to master new technologies in the LLM field, with a strong interest in AI infrastructure