AnitaB.org Talent Network

Connecting women in tech with the best professional opportunities!

Master Principal Cloud Engineer - GPU & AI Infrastructure

Oracle

Software Engineering, Other Engineering, Data Science
Shanghai, China · Beijing, China · Guangzhou, Guangdong, China · Shenzhen, Guangdong, China · China
Posted on Jan 27, 2026

Position Overview

As a GPU Specialist Cloud Engineer (CE) within the Oracle Cloud Infrastructure (OCI) Pre-Sales organization, you will serve as the primary technical authority for high-performance computing (HPC) and Artificial Intelligence infrastructure. You are not just a generalist; you are the bridge between complex silicon capabilities and transformative business outcomes.

You will partner with Enterprise Sales teams to lead the technical discovery, architectural design, and proof-of-concept (PoC) execution for customers building the next generation of Large Language Models (LLMs), generative AI applications, and computationally intensive simulations. This role requires a deep understanding of NVIDIA/AMD hardware stacks, RDMA networking, and the software orchestration layers that make massive-scale GPU clusters hum.

Core Responsibilities

1. Strategic Technical Advisory

  • Architectural Design: Design end-to-end AI infrastructure solutions on OCI, focusing on Superclusters that leverage NVIDIA H200/B300/GB300 or AMD Instinct™ accelerators.
  • Optimization: Advise customers on right-sizing GPU shapes based on workload requirements (e.g., training vs. inference, FP8 vs. FP16 precision); a short precision sketch follows this list.
  • Networking Excellence: Design high-throughput, low-latency interconnect fabrics using RoCE v2 (RDMA over Converged Ethernet) and OCI’s non-blocking leaf-spine architecture.
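
To make the precision point concrete, here is a minimal PyTorch sketch contrasting an FP16 training-style pass with a BF16 inference-style pass. The model and tensor shapes are illustrative placeholders, not a recommended configuration; FP8 additionally requires hardware and library support (for example, NVIDIA's Transformer Engine), so it is not shown.

```python
# Minimal sketch, assuming a CUDA device with Tensor Cores; the model,
# sizes, and loss are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096)).cuda()
x = torch.randn(64, 4096, device="cuda")

# Training-style pass: FP16 autocast plus loss scaling to guard against
# gradient underflow (optimizer and scaler.step() are omitted here).
scaler = torch.cuda.amp.GradScaler()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = model(x).square().mean()
scaler.scale(loss).backward()

# Inference-style pass: BF16 keeps FP32's exponent range, so no scaler
# is needed.
with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)
```

The trade-off in one line: FP16 needs loss scaling because its narrow exponent range underflows small gradients, while BF16 keeps FP32's range at the cost of mantissa precision.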

2. Hands-on Execution & Validation

  • Proof of Concept (PoC): Lead deep-dive technical evaluations, demonstrating OCI’s superior price-performance ratios for model training and fine-tuning.
  • Stack Integration: Assist customers in deploying and optimizing the NVIDIA AI Enterprise stack, Triton Inference Server, and NeMo Framework on OCI.
  • Performance Tuning: Work directly with engineering teams to troubleshoot bottlenecks, whether they reside in the kernel, the NCCL (NVIDIA Collective Communications Library) configuration, or storage IOPS; a micro-benchmark sketch follows this list.
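
The micro-benchmark referenced above, as a hedged sketch: time NCCL all_reduce across ranks to see whether the interconnect, rather than compute or storage, is the limiting factor. The payload size, iteration counts, and launch command are illustrative; setting NCCL_DEBUG=INFO in the environment additionally surfaces NCCL's transport and topology choices.

```python
# Minimal sketch: timing NCCL all_reduce to isolate interconnect bottlenecks.
# Illustrative launch (script name assumed): torchrun --nproc_per_node=8 bench.py
import os
import time
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")  # RANK/WORLD_SIZE come from torchrun
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

tensor = torch.ones(64 * 1024 * 1024, device="cuda")  # 64Mi floats = 256 MiB

# Warm up so CUDA context and NCCL channel setup don't skew the timing.
for _ in range(5):
    dist.all_reduce(tensor)
torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(20):
    dist.all_reduce(tensor)
torch.cuda.synchronize()
elapsed = (time.perf_counter() - start) / 20

if dist.get_rank() == 0:
    # Ring all-reduce bus bandwidth estimate: 2*(n-1)/n * bytes / time.
    n = dist.get_world_size()
    busbw = 2 * (n - 1) / n * tensor.numel() * 4 / elapsed / 1e9
    print(f"all_reduce avg {elapsed * 1e3:.2f} ms, ~{busbw:.1f} GB/s bus bandwidth")

dist.destroy_process_group()
```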

3. Thought Leadership & Enablement

  • Content Creation: Develop whitepapers, reference architectures, and blog posts detailing OCI’s competitive advantages in the AI sovereign cloud and private AI spaces.
  • Market Intelligence: Stay ahead of the evolving landscape of AI accelerators, interconnects (InfiniBand vs. Ethernet), and distributed training frameworks (PyTorch, JAX, DeepSpeed); a minimal framework sketch follows.
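
As a reference point for the frameworks listed above, a minimal PyTorch DistributedDataParallel (DDP) loop, the data-parallel baseline that libraries such as DeepSpeed extend. The model, batch shape, and hyperparameters are placeholders; it assumes a torchrun launch so LOCAL_RANK is set.

```python
# Minimal DDP sketch; launch with torchrun so the process group env vars exist.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda()       # placeholder model
ddp_model = DDP(model, device_ids=[local_rank])  # grads sync via NCCL all-reduce
optimizer = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)

for step in range(10):
    x = torch.randn(32, 1024, device="cuda")     # placeholder batch
    loss = ddp_model(x).square().mean()
    optimizer.zero_grad()
    loss.backward()   # DDP overlaps gradient all-reduce with backprop
    optimizer.step()

dist.destroy_process_group()
```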

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.


Career Level - IC5


Required Technical Competencies

  • GPU Architecture: Deep knowledge of CUDA cores, Tensor Cores, HBM3 memory, and NVLink/NVSwitch topologies (an introspection sketch follows this list).
  • Networking: Mastery of RDMA, RoCE, and high-speed fabric management for multi-node distributed training.
  • Storage: Experience with high-performance parallel file systems such as Lustre, Weka, or OCI’s High-Performance Storage for feeding data to GPUs at scale.
  • Orchestration: Proficiency in Kubernetes (OKE) for AI, Slurm for batch job scheduling, and the NVIDIA GPU Operator.
  • AI Frameworks: Hands-on experience with PyTorch, TensorFlow, and libraries for distributed computing such as Megatron-LM.
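
A small introspection sketch for the GPU Architecture item above, using only torch.cuda calls from PyTorch's public API. It assumes a machine with at least one CUDA device; peer access is a coarse signal and does not by itself distinguish NVLink from PCIe P2P.

```python
# Minimal sketch: inspect device memory, SM count, and peer reachability
# before sizing a workload. Assumes at least one CUDA device is present.
import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, "
          f"{props.total_memory / 2**30:.0f} GiB device memory, "
          f"{props.multi_processor_count} SMs, "
          f"compute capability {props.major}.{props.minor}")

# Peer access indicates a direct GPU-to-GPU path (NVLink or PCIe P2P).
if torch.cuda.device_count() > 1:
    print("GPU0 -> GPU1 peer access:", torch.cuda.can_device_access_peer(0, 1))
```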

Candidate Qualifications

  • Education: Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related quantitative field.
  • Experience: 10+ years in Pre-Sales Engineering, Systems Architecture, or HPC. At least 3 years specifically focused on GPU-accelerated computing.
  • The OCI edge: Familiarity with OCI’s off-box virtualization and how it enables bare-metal performance in a cloud environment.
  • Communication: Able to explain the difference between latency and throughput to a CTO, and to debug a Python script alongside a Data Scientist; a toy sketch of that trade-off follows.
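
On the latency-versus-throughput point, a toy, CPU-only sketch of the trade-off: batching raises throughput (items per second) but each request waits longer. The synthetic workload and constants stand in for a real inference call.

```python
# Toy sketch: larger batches improve throughput but increase per-request latency.
import time

def process(batch):
    # Stand-in for an inference call whose cost grows with batch size.
    total = 0.0
    for item in batch:
        for _ in range(20_000):
            total += item * 1.0000001
    return total

for batch_size in (1, 8, 64):
    batch = list(range(batch_size))
    start = time.perf_counter()
    process(batch)
    latency = time.perf_counter() - start
    print(f"batch={batch_size:3d}  latency={latency * 1e3:7.2f} ms  "
          f"throughput={batch_size / latency:8.1f} items/s")
```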