Responsibilities
Team Introduction: The ByteDance Doubao Large Model Team was established in 2023 and is committed to developing the industry's most advanced AI large model technology, becoming a world-class research team, and contributing to the progress of technology and society. The team has a long-term vision and commitment in the field of AI, with research spanning NLP, CV, speech, and more, and laboratories and research positions in China, Singapore, the United States, and elsewhere. Drawing on the platform's abundant data and computing resources, the team has invested continuously in these areas and has launched a self-developed general-purpose large model with multimodal capabilities. The model supports more than 50 downstream businesses, including Doubao, Kouzi, and Jimeng, and is available to enterprise customers through Volcano Engine. The Doubao APP is currently the AIGC application with the largest user base in the Chinese market.
1. Own the code pre-training data pipeline, including synthesis, cleaning, weight allocation, and source expansion, and continuously improve the quality of code pre-training and mid-training data; explore the relationship between the mixing ratio of small pre-training domains and final model performance; build data-synthesis pipelines to solve key problems in code models.
2. Explore deep reasoning techniques and the scaling laws relating test-time compute to model performance; participate in the optimization of post-training reward models and reinforcement learning algorithms; explore the data flywheel from online code-completion data to the RL process.
3. Focus on optimizing and innovating the reward model for code reinforcement learning, including working with the SFT stage to address scenarios with poor discrimination ability, exploring synthetic data for reward-model pre-training, organizing annotators to label code reward-model data, frontier exploration of critic models, and quality filtering and expansion of executable code and unit tests in the RL process.
Qualifications
1. Bachelor's degree or above in computer science, physics, mathematics, neuroscience, or a related field.
2. Solid computer science fundamentals and programming skills; familiarity with common algorithms and data structures; good programming habits.
3. Familiarity with the core technology and model architectures of language models, and conviction and enthusiasm about the future of AI.
4. Conscientious and meticulous, with strong planning skills and a spirit of inquiry; committed to seeking truth from facts in R&D work and to delivering real results; does not chase novel methods blindly, and takes strong ownership of the work.
Bonus points:
1. NOI or ACM competition experience is preferred; excellent data-driven work experience in the recommendation or search fields is also preferred.
2. Familiarity with reinforcement learning techniques and their details, with deep involvement in reinforcement learning or language model projects.
3. Strong engineering skills, the ability to quickly become familiar with ByteDance's internal and external platform tools, and a proactive mindset toward improving efficiency.