Senior Software Engineer, Metropolis Vision AI
NVIDIA
NVIDIA's technology is at the heart of the AI revolution, touching people across the planet by powering everything from self-driving cars and robotics to co-pilots and more. Join us at the forefront of technological advancement in intelligent assistants and information retrieval. Metropolis is transforming how the physical world is perceived and understood using advanced computer vision and deep learning. Our team builds large-scale distributed Vision AI platforms that power intelligent spaces, smart cities, retail analytics, and digital twins. This role offers the opportunity to contribute to core components of a strategic platform with high visibility and real-world impact.
As a Senior Software Engineer for Metropolis Vision AI, you will develop and optimize high-performance vision systems that turn massive streams of video, image, and 3D data into actionable insights. You will collaborate with specialists in perception, simulation, and large models to bring research into production at scale.
What you’ll be doing:
Implementing and optimizing high-performance Metropolis Vision AI pipelines for real-time and streaming scenarios using computer vision and deep learning models.
Developing and refining large-scale distributed services responsible for processing video, image, and 3D data in both edge and cloud settings.
Contributing to multi-modal perception capabilities that combine 2D, 3D, and temporal information to understand complex real-world scenes.
Using simulation and synthetic data tools to build, test, and validate perception algorithms at scale.
Profiling and tuning GPU-accelerated inference pipelines to meet strict latency, efficiency, and reliability targets.
Collaborating with partner teams across product, research, and platform to translate requirements into clear technical designs and robust implementations.
Participating in technical reviews and contributing to guidelines for code quality and testing.
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
8+ years of professional software development experience using modern C++ (14/17/20) and Python on Linux.
Strong computer science fundamentals, including algorithms, data structures, concurrency, and distributed systems concepts.
Experience in computer vision and deep learning, with a history of deploying production systems in these fields.
Experience building and debugging high-performance, concurrent systems, including multi-threading, asynchronous I/O, and efficient memory management.
Proficiency working in Linux-based environments with containers and microservices, integrating AI components into scalable back-end services.
Ability to rapidly prototype vision models and pipelines, then evolve them into production-quality services.
Practical experience with PyTorch in training, fine-tuning, and deploying models for vision tasks.
Strong analytical and problem-solving skills, with a data-driven approach to performance optimization and system design.
Excellent written and verbal English communication skills, with demonstrated success collaborating across time zones and functions.
Ways to stand out from the crowd:
Practical experience delivering end-to-end computer vision applications in production, such as video analytics, smart cities, autonomous systems, retail analytics, industrial inspection, or digital twins.
Practical experience with GPU acceleration (such as CUDA, TensorRT, or comparable technologies) and low-level optimization for inference and pre/post-processing.
Experience with simulation and synthetic data generation using tools such as Omniverse, Unreal Engine, Unity, or similar digital-twin platforms.
Background in vision-language models or related multi-modal AI, including integrating these models into real products.
Background in multimedia, including video-centric processing and delivery (such as codecs, video pipelines, or media frameworks) and integrating vision models into multimedia workflows.