Responsibilities
Responsible for the research and development of DSA AI accelerators in the machine learning system. Working from the machine learning platform/system side, the role focuses on better evaluating, introducing, and using DSA AI accelerators to support business AI model training and inference at better cost-performance. The main directions of work include:
1. Evaluate and introduce DSA accelerators that meet the requirements of the machine learning platform/system and the business.
2. Develop the workflow for using DSA accelerators in the machine learning platform/system, supporting business algorithms above while shielding them from hardware details below.
3. Research and introduce forward-looking technologies in the DSA field, such as the latest DSA architectures, parallel computing models, heterogeneous computing systems, and compilation technology.
4. Work closely with business departments to jointly optimize performance and precision.
Qualifications
1. Familiar with C/C++ and Python in a Linux environment.
2. Solid foundation in computer science and strong programming skills; familiar with common algorithms and data structures, with good programming habits.
3. Proficient in at least one mainstream machine learning framework (TensorFlow, PyTorch, etc.) and familiar with the framework's internal implementation.
4. Familiar with at least one classic deep learning model and its application scenarios, such as ResNet or BERT.
5. Good documentation habits; writes and updates workflow and technical documents in a timely manner as required.
Bonus qualifications:
1. Understand AI DSA architecture, common AI chip architectures and their advantages and disadvantages, and common AI compiler solutions (such as XLA, TVM, MLIR) and their advantages and disadvantages.
2. Understand GPU hardware architecture and the GPU software stack (CUDA, cuDNN), with the ability to analyze GPU performance.
3. Understand common parallel computing models and algorithms, as well as various parallel programming models and their advantages and disadvantages.
4. Understand the principles of model pruning, quantization, and other optimization methods, with relevant experience in model optimization.