Intel

AI GPU Arch Perf Optimization Intern

PRC, Shanghai

Full time

Job Details:

Job Description:

The Role and Impact

As an AI Architecture and Performance Optimization Graduate Intern, you will join Intel's GPU Compute Architecture team and contribute to core GPU kernel optimization and GPU IP validation using real AI workloads. Your work will directly support hardware/software co-design and help shape the performance of next-generation Intel GPU and AI accelerator platforms, while giving you hands-on exposure to GPU architecture and low-level performance engineering.

Key Responsibilities

Analyze and optimize core GPU compute kernels for AI and numerical workloads (e.g., GEMM, Attention, operator fusion).

Reproduce representative AI inference and training workloads for GPU IP validation.

Perform GPU performance profiling and analysis to identify compute, memory, and pipeline bottlenecks.

Build performance profiles and models to understand architecture-level performance behavior.

Provide workload- and kernel-level insights to support GPU architecture design and HW/SW co-design efforts.

Qualifications:

Minimum Qualifications

Currently pursuing a Bachelor's, Master's, or PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field.

Proficiency in Python for analysis, experimentation, or tooling.

Solid understanding of AI fundamentals, including common models and algorithms.

Strong interest in GPU architecture, GPU programming, parallel computing, and performance optimization.

Basic knowledge of computer systems, such as CPU/GPU architecture, memory systems, and performance analysis.

Preferred Qualifications

Experience with GPU kernels or programming models (e.g., CUDA, OpenCL, SYCL, Triton).

Exposure to performance optimization, compiler, or parallel computing coursework, research, or internships.

Strong analytical and problem-solving skills, with the ability to reason from profiling data.

Interest in AI systems and infrastructure, beyond model-level development.

Ability to work effectively in a collaborative, cross-functional engineering environment.

Job Type:

Student / Intern

Shift:

Shift 1 (China)

Primary Location:

PRC, Shanghai

Additional Locations:

PRC, Beijing

Business group:

At the Data Center Group (DCG), we're committed to delivering exceptional products and delighting our customers. We offer both broad-market Xeon-based solutions and custom x86-based products, ensuring tailored innovation for diverse needs across general-purpose compute, web services, HPC, and AI-accelerated systems. Our charter encompasses defining business strategy and roadmaps, product management, developing ecosystems and business opportunities, delivering strong financial performance, and reinvigorating x86 leadership. Join us as we transform the data center segment through workload-driven leadership products and close collaboration with our partners.

Posting Statement:

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Position of Trust

N/A

Work Model for this Role

This role will require an on-site presence.*

* Job posting details (such as work model, location or time type) are subject to change.

ADDITIONAL INFORMATION: Intel is committed to Responsible Business Alliance (RBA) compliance and ethical hiring practices. We do not charge any fees during our hiring process. Candidates should never be required to pay recruitment fees, medical examination fees, or any other charges as a condition of employment. If you are asked to pay any fees during our hiring process, please report this immediately to your recruiter.