
During July 2025, Zhihe Wang added Ascend NPU training support for the Qwen3-8B and Qwen3-14B models in the volcengine/verl repository, with a focus on scalable large-model training. Wang set up DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization) training for both models on Ascend hardware. He wrote shell scripts that automate training job configuration, including model paths, data inputs, and hardware-specific parameters, and documented the setup in RST. By validating the workflow end to end and documenting it, Wang improved reproducibility and onboarding, and gave the team a starting point for further model iteration on NPUs.
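The kind of job-configuration script described above can be sketched as follows. This is a minimal, hypothetical illustration and not the script from the repository: the paths are placeholders, the variable names (`MODEL_PATH`, `TRAIN_FILE`, `VAL_FILE`, `NNODES`, `NPUS_PER_NODE`) and the `build_cmd` helper are invented for the example, and the entry point and override keys are assumptions that should be checked against the verl version in use. The script only prints the assembled command and does not launch training.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: assemble a DAPO training command for an Ascend NPU node.
# All paths and defaults are placeholders, not values from the actual scripts.
set -euo pipefail

MODEL_PATH="${MODEL_PATH:-/path/to/Qwen3-8B}"        # placeholder model checkpoint
TRAIN_FILE="${TRAIN_FILE:-/path/to/train.parquet}"   # placeholder training data
VAL_FILE="${VAL_FILE:-/path/to/val.parquet}"         # placeholder validation data
NNODES="${NNODES:-1}"                                # number of machines
NPUS_PER_NODE="${NPUS_PER_NODE:-8}"                  # accelerators per machine

build_cmd() {
  # Key=value overrides; the entry point and key names are assumed here and
  # may differ between verl versions.
  local args=(
    python3 -m recipe.dapo.main_dapo
    "data.train_files=${TRAIN_FILE}"
    "data.val_files=${VAL_FILE}"
    "actor_rollout_ref.model.path=${MODEL_PATH}"
    "trainer.nnodes=${NNODES}"
    "trainer.n_gpus_per_node=${NPUS_PER_NODE}"
    "trainer.device=npu"
  )
  # Join the pieces into a single line so the command can be inspected or logged.
  printf '%s\n' "${args[*]}"
}

# Dry run: show the command instead of executing it.
build_cmd
```

Keeping the model path, data files, and device counts in environment variables with defaults lets the same script serve both model sizes, for example by running it with `MODEL_PATH=/path/to/Qwen3-14B` set.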

Monthly summary for 2025-07, volcengine/verl: Delivered Ascend NPU training support for Qwen3-8B and Qwen3-14B using DAPO, along with automation scripts that streamline the training workflow. The work extends large-model training in verl to Ascend hardware and gives the team tuned, hardware-specific settings to build on in later iterations.