Zoox’s internship program provides hands-on experience with state-of-the-art technology, mentorship from some of the industry's brightest minds, and the opportunity to play a part in our success. Internships at Zoox are reserved for those who demonstrate outstanding academic performance, involvement in activities outside their coursework, aptitude, curiosity, and a passion for Zoox's mission.
Perception at Zoox is the "Retina of Zoox": the system responsible for understanding the world around the autonomous vehicle.
As an MLE intern working on Perception, you may be assigned to one of the following teams:
On the Offline Driving Intelligence team, you will develop advanced multimodal large language models that enhance scenario understanding and driving. You'll train and fine-tune these models on driving data so that they can efficiently identify hazards, interpret driving restrictions, drive, and answer questions about the scenario. Working alongside world-class engineers and researchers, you'll leverage premium sensor data and cutting-edge infrastructure to validate your algorithms in real-world conditions, directly impacting the productivity, safety, and capability of Zoox's autonomous system.
On the Perception Attributes team, you will collect and generate datasets for specialized vehicle classification and semantic enrichment, design and frame machine learning problems for real-world autonomous driving scenarios, and train and evaluate state-of-the-art machine learning models with a focus on computer vision. You will also collaborate with engineers to deploy models for real-time inference on our vehicles, and contribute to improving our vehicles' ability to recognize and respond to emergency vehicles, school buses, construction vehicles, and other specialized road actors.
On the Perception Scene Understanding team, you will develop advanced ML models that perceive our vehicle's surroundings to identify hazards and driving restrictions. You will use vision-language models to detect rare events and ensure safe driving in these situations. You'll work with state-of-the-art machine learning models that operate in real time on our robotaxi platform with minimal latency. Collaborating with world-class engineers and researchers across sensors, planning, and other teams, you'll have access to premium sensor data and cutting-edge infrastructure to validate your algorithms in real-world conditions.
On the Occupancy and Rare Events team, you will develop multimodal foundation models that serve as the common backbone for on-vehicle perception, enhancing the system's ability to detect long-tail events and generalize to new geofences. In this role, you will develop effective tokenization techniques for the vision, lidar, and radar modalities, and leverage LLM techniques to align token embeddings across modalities into a common feature space that supports various 3D tasks (detection, segmentation, tracking, feature matching, dense depth). You'll collaborate with top-notch engineers across the PCP, MLInfra, and Offboard Driving Intelligence teams, using Zoox's large-scale dataset to train and evaluate models that directly impact the autonomous system's real-world performance.
On the Perception Optimization team, you will build optimized inference pipelines for on-bot algorithms. A major focus of optimization is ML models: techniques such as quantization and pruning, along with transformer-specific optimizations such as token pruning, token merging, and layer pruning, are used to deploy large models on the bot so that they run in real time. In this role, you will experiment with optimizing SOTA large ML models to fit within on-bot compute, using both post-training optimization (e.g., quantization) and architectural approaches (e.g., token merging).