Senior Perception Engineer, Obstacle Foundation Models - Autonomous Vehicles
NVIDIA
Intelligent machines powered by artificial intelligence—computers that can learn, reason, and interact with people—are transforming every industry. GPU-accelerated deep learning provides the foundation for machines to perceive, reason, and solve complex problems. NVIDIA GPUs run deep learning algorithms that simulate aspects of human intelligence, acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world.
We are seeking an exceptional Senior Perception Engineer to help design and productize NVIDIA’s next-generation autonomous driving perception stack. You will work on the core 3D obstacle perception pipeline, contribute to architecture and algorithm design, and remain deeply hands-on with implementation, including modern transformer-based, multi-modal, and vision-language techniques where they add real value.
What you’ll be doing:
Develop and improve the technical design, architecture, and roadmap for 3D obstacle perception to support end-to-end autonomous driving functionality, leveraging state-of-the-art CNN and transformer-based architectures where appropriate.
Design and implement advanced 3D perception models using multi-camera inputs and/or multi-sensor fusion (camera, radar, lidar) for obstacle detection and tracking, including opportunities to explore BEV and transformer-based 3D perception.
Build efficient, production-grade deep learning models: define objectives with the team, select and prototype architectures, run experiments, and follow best practices for training and evaluation, using techniques such as large-scale pretraining, distillation, and parameter-efficient fine-tuning (e.g., LoRA).
Help define and maintain KPI frameworks to quantify perception performance; analyze large-scale real and synthetic datasets to identify failure modes and systematically improve accuracy, robustness, and efficiency, incorporating approaches like self-supervised and representation learning when beneficial.
Contribute to the data strategy for perception: specify data and labeling requirements, help prioritize data collection and annotation, and collaborate with data and ground-truth teams, including model-assisted workflows (e.g., active learning, auto-labeling, vision-language models (VLMs)) and model-in-the-loop tooling.
Collaborate with safety, systems, and software teams to ensure perception solutions meet product requirements for safety, latency, resource usage, and software robustness, and are ready for deployment at scale.
What we need to see:
8+ years of hands-on experience developing deep learning–based perception or closely related systems for complex real-world problems, with strong proficiency in frameworks such as PyTorch and a track record of taking models from prototype to production.
Proven experience in data-driven development, including close collaboration with data, labeling, and ground-truth teams on data strategy, labeling quality, and iterative model improvement.
Strong programming skills in Python and/or C++, with experience building reliable, high-performance, production-quality software.
Excellent communication and collaboration skills, with the ability to work effectively across multidisciplinary teams.
BS/MS/PhD in Computer Science, Electrical Engineering, or related fields (or equivalent experience).
Ways to stand out from the crowd:
Experience designing and deploying perception solutions for autonomous driving or robotics using camera-based deep learning at scale.
Hands-on experience architecting and deploying DNN-based perception pipelines on embedded or real-time platforms, including optimization for latency, memory, and compute constraints; experience with modern architectures such as CNNs and transformers; and familiarity with techniques like large-scale pretraining, parameter-efficient fine-tuning (e.g., LoRA), or vision-language models (VLMs).
Strong publication record or recognized contributions in deep learning, computer vision, or autonomous systems at leading conferences/journals (e.g., CVPR, ICCV, NeurIPS, IROS).
Deep understanding of 3D computer vision fundamentals, including camera modeling and calibration (intrinsic and extrinsic), multi-view geometry, and 3D representations, ideally with experience applying these concepts in transformer-based 3D or BEV perception pipelines.
Experience with CUDA development and optimizing training or inference pipelines through custom CUDA kernels or other GPU-accelerated components.
You will also be eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.