Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

At Tenstorrent, we believe the future of computing must be open, which is why our interns don’t just watch from the sidelines - they help build the core of it. We provide a "code-to-career" pipeline where students collaborate with industry experts to solve high-stakes problems in RISC-V and AI hardware-software co-design. By joining us, you are taking on an internship that helps democratize high-performance computing and make it accessible to everyone.

In this role, you will implement state-of-the-art ML models on Tenstorrent hardware using Python and C++, focusing on pushing both accuracy and inference speed. You will work hands-on with Tenstorrent’s open-source software stack (tt-metalium, tt-nn, tt-llk), taking models from framework to silicon and iterating on performance. You will own a well-defined engineering project under the guidance of a dedicated mentor, with direct impact on how real workloads run on our chips. We are looking for a commitment of at least 3 months for this role, with the potential for extension to 6 months.

This role is onsite, based in our Belgrade office.

Who You Are

Enrolled in the final year of BSc or MSc studies in Computer Science, Computer Engineering, Software Engineering, Electronics, Math, or a related field.
Solid coding skills in Python and C++, with a basic understanding of machine learning concepts and frameworks.
You have a passion for programming, are eager to learn, and enjoy solving complex performance and optimization problems.
You are collaborative, open to feedback, and excited to work closely with experienced engineers and a dedicated mentor.

What We Need

Implement functional ML models on Tenstorrent hardware using Python and popular ML frameworks like PyTorch.
Benchmark, analyze, and optimize the inference performance of the implemented models using existing tools and coding in C++ and Python.
Collaborate with experienced engineers to validate the accuracy of implemented models and iterate on improvements.
Contribute to performance optimization efforts where success is measured by achieving both high accuracy and fast execution (inference) of ML models on Tenstorrent hardware.

What You Will Learn

How to implement state-of-the-art ML models on Tenstorrent hardware using Python, C++, and popular ML frameworks like PyTorch.
Techniques for benchmarking, analyzing, and optimizing the performance of ML model inference using existing tools and code in C++ and Python.
How to use (and potentially debug and fix) Tenstorrent’s open-source software libraries, such as tt-metalium, tt-nn, and tt-llk.
How to collaborate with experienced engineers, apply various problem-solving techniques, and drive a well-defined engineering project under the guidance of a dedicated mentor.

Hiring Timelines

This internship is offered in each of our 3 terms, with the following recruitment cycles:

Winter Term: Mar–May work term, Nov–Jan recruit.
Summer Term: Jul/Aug–Sep work term, Jan–Apr/May recruit.
Fall Term: Oct–Dec work term, Apr–May recruit.

Please note these timelines are for reference only. Actual timelines may vary.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

ML Models Implementation & Performance Optimization, Intern (Serbia)
