Tencent

Large Model Algorithm Researcher

Singapore - CapitaSky | Full-time

Business Unit

Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers; it also provides users with a full range of customer services. As the operator of the largest network, device, and data center infrastructure in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open-source collaboration, constructing new platforms, and supporting business innovation.

What the Role Entails

1. Responsible for the core technology development in the Post-Training phase of large language models, building and optimizing a high-quality Reward System. Continuously enhance the model's capabilities in complex instruction adherence, logical reasoning, and value alignment through Reward Modeling (RM) and Reinforcement Learning (RL) algorithms.
2. Conduct in-depth research and optimization of post-training algorithms such as RLHF to improve model training stability and final outcomes.
3. Manage and synthesize data in the post-training phase, design an efficient data feedback loop mechanism, utilize techniques like SFT and Self-Instruct to generate high-quality training data, and establish a closed-loop signal modeling system from user feedback to model iteration.
4. Perform comprehensive evaluation and analysis of post-training models, develop scientific evaluation metrics, and keep up with cutting-edge technology trends, quickly translating the latest research results into business value.

Who We Look For

1. Master's degree or higher in Computer Science, Software Engineering, Artificial Intelligence, or related fields.
2. Deep understanding of the Transformer architecture and the principles of large language model training, with substantial research and practical experience in one of the post-training areas such as LLM Alignment, RLHF, or Reward Modeling.
3. Solid foundation in algorithms and engineering implementation capabilities, proficient in Python, and familiar with deep learning frameworks such as PyTorch or TensorFlow.
4. Practical experience in distributed training, familiar with large-scale training and inference frameworks like Megatron-LM, DeepSpeed, and vLLM. Experience in training or tuning models with billions or hundreds of billions of parameters is preferred.
5. Excellent research skills, with a record of high-quality publications (NeurIPS, ICLR, ICML, ACL, EMNLP, etc.) or contributions to high-impact projects in the open-source community (e.g., HuggingFace) preferred.
6. Strong technical enthusiasm and self-motivation, adept at analyzing and solving complex problems, with good teamwork and communication skills.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.