Software Engineer — LLM Systems, Generative AI Infrastructure & Agentic Platforms
Apple
Software Engineering, Other Engineering, Data Science
Cupertino, CA, USA
USD 147,400-272,100 / year + Equity
Posted on Apr 7, 2026
The Intelligence Platform team builds scalable, production-grade systems that power high-quality, user-centric intelligence across Apple’s operating systems. We focus on designing and operating large-scale ML systems leveraging Generative AI, Large Language Models, RAG architectures, and emerging agentic AI patterns. Our goal is to deliver reliable, low-latency, and privacy-preserving AI capabilities at scale.
We are looking for a Software Engineer with strong systems and engineering expertise to build and scale LLM-powered systems in production. This role focuses on designing robust infrastructure for LLM serving, tool-use orchestration, and agentic workflows. You will work at the intersection of ML and systems engineering—translating advanced AI capabilities into efficient, scalable, and reliable systems. You will play a key role in shaping system architecture, optimizing performance, and ensuring production readiness of LLM-driven features across Apple platforms.
- Design and build scalable systems for LLM inference, orchestration, and agentic workflows (e.g., tool-use pipelines, multi-step reasoning systems).
- Productionize LLM-based solutions with a focus on latency, throughput, reliability, and scalability.
- Architect and maintain infrastructure for model serving, batching, caching, and context management.
- Develop and optimize pipelines for RAG systems, retrieval infrastructure, and data flow across components.
- Partner with modeling teams to integrate models into production systems, ensuring alignment with performance and product requirements.
- Build monitoring, evaluation, and feedback systems to ensure high-quality and robust model behavior in production.
- Drive system-level optimizations across the stack, including distributed systems, concurrency, and resource management.
- Strong software engineering background with experience building distributed systems or large-scale production services.
- Experience deploying and operating ML/LLM systems in production environments.
- Solid understanding of systems design, performance optimization, and scalability trade-offs.
- Proficiency in programming and building reliable backend systems.
- Familiarity with LLM architectures and inference workflows.
- Experience with LLM serving systems, inference optimization, batching strategies, or caching (KV/prefix).
- Experience designing agentic systems, tool orchestration frameworks, or multi-turn pipelines.
- Familiarity with RAG systems, retrieval infrastructure, and vector databases.
- Experience with on-device / hybrid ML systems and constraints (latency, memory, privacy).
- Ability to lead system design discussions and influence architecture decisions across teams.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.