
Mingxiao Li enhanced the Tencent/digitalhuman repository by extending the Llava model's forward method to accept labels and precomputed text embeddings, enabling more flexible model training and evaluation. Using Python and deep learning frameworks, Mingxiao introduced a dynamic loss-control mechanism that switches between loss-calculation strategies through a new mode attribute and its setter function. To improve integration reliability, Mingxiao corrected import paths for the Llama components, ensuring the right modules are loaded, and resolved loss-tensor dtype mismatches that had destabilized training in the DeepSpeed environment. The work demonstrated strong debugging and model-development skills, resulting in a more robust and adaptable machine learning pipeline.
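The forward-method extension described above follows a common pattern: accept either token IDs or precomputed embeddings, and compute a loss only when labels are provided. The sketch below illustrates that pattern with illustrative names (`forward`, `inputs_embeds`, `embed_fn`); it is not the repository's actual API, and a stand-in arithmetic step replaces the real transformer stack.

```python
# Hypothetical sketch of extending forward() to accept optional labels and
# precomputed text embeddings. Names and logic are illustrative only; the
# real Llava forward in the repo operates on tensors, not Python lists.

def forward(input_ids=None, inputs_embeds=None, labels=None, embed_fn=None):
    if inputs_embeds is None:
        if input_ids is None:
            raise ValueError("either input_ids or inputs_embeds is required")
        # Fall back to the token embedder when raw IDs are given.
        inputs_embeds = embed_fn(input_ids)

    # Stand-in for the model body: one "logit" per embedding vector.
    logits = [sum(vec) for vec in inputs_embeds]

    # Compute a training loss only when labels are supplied, so the same
    # forward serves both training and inference.
    loss = None
    if labels is not None:
        loss = sum((l - y) ** 2 for l, y in zip(logits, labels)) / len(labels)
    return logits, loss
```

Keeping `labels` optional lets evaluation code call the same entry point without paying for a loss computation.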

Monthly work summary for 2025-05 for Tencent/digitalhuman, focused on delivering core model enhancements, stabilizing training, and improving integration reliability. Key features delivered include extending the Llava model's forward to support labels and text embeddings, and introducing a dynamic loss-control mechanism for switching loss strategies. Major bug fixes corrected import paths for Llama components and ensured loss dtype integrity during training, improving stability in the DeepSpeed environment.
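The dynamic loss-control mechanism mentioned above can be sketched as a small controller whose setter validates and stores the active strategy. All names here (`LossController`, `set_loss_mode`, the two modes) are assumptions for illustration; the repository's attribute and setter are not specified in this summary.

```python
import math

# Hypothetical sketch of switching loss strategies through a mode
# attribute plus a setter, assuming two simple strategies.
class LossController:
    VALID_MODES = ("cross_entropy", "mse")

    def __init__(self, mode="cross_entropy"):
        self.set_loss_mode(mode)

    def set_loss_mode(self, mode):
        # Validate up front so a typo fails loudly, not mid-training.
        if mode not in self.VALID_MODES:
            raise ValueError(f"unknown loss mode: {mode!r}")
        self._mode = mode

    @property
    def loss_mode(self):
        return self._mode

    def compute(self, predictions, targets):
        if self._mode == "cross_entropy":
            # Negative log-likelihood of the probabilities at target positions.
            hits = [math.log(p) for p, t in zip(predictions, targets) if t == 1]
            return -sum(hits) / max(len(hits), 1)
        # Mean squared error.
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)
```

Routing every strategy through one setter keeps the switch in a single place, so training scripts can change loss behavior without touching the model body.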