AI/ML Research Engineer - Camera & Photos
Apple
Software Engineering, Data Science
Cupertino, CA, USA
USD 147,400-272,100 / year + Equity
Posted on Mar 17, 2026
Apple devices capture and edit trillions of photos and videos every year. A new team within Camera and Photos Software is being formed to push the boundaries of how visual intelligence is built, evaluated, and delivered across Apple's imaging stack. You'll work with modern deep learning, from vision-language models and generative approaches to robust evaluation methods, owning models end-to-end from research through handoff. This is a founding role for a hands-on engineer who wants to shape a team's technical direction and ship ML that impacts billions of users. Join us in building the next generation of visual intelligence at Apple, from perception to generation, research to product, at a scale few teams in the world can match.
As a machine learning research engineer on this team, you will work across the full ML lifecycle, from problem formulation and data strategy to model development and rigorous evaluation. You'll build and adapt deep networks including vision-language models, generative models, and learned perceptual representations to solve challenging problems in Apple's camera and photos ecosystem. This is a cross-functional role. You will collaborate with teams across imaging algorithms, ML frameworks, and system performance to define problems, align on metrics, and ship solutions that work on real-world content at Apple scale.
Responsibilities
- Research and develop ML models for visual understanding, generation, and evaluation, spanning architectures such as vision-language models, diffusion models, and learned perceptual representations.
- Own data strategy end-to-end: define data needs, drive collection and labeling, and build the datasets that make models succeed.
- Prototype rapidly, iterate on ideas, and translate successful research into reliable code, tools, and pipelines.
- Explore agentic and automated workflows that accelerate model development, evaluation, and integration.
- Collaborate cross-functionally with imaging, frameworks, and product teams to define problems and ship solutions.
- Improve model robustness by identifying failure modes, designing targeted evaluations, and closing gaps in real-world performance.
Minimum Qualifications
- Strong fundamentals in machine learning and deep learning.
- Demonstrated ability to develop and evaluate ML models for vision problems, whether in computer vision, computational imaging, generative modeling, or multimodal learning.
- Hands-on experience training or fine-tuning deep networks, including data preparation, experimentation, and rigorous evaluation.
- Proficiency in Python and experience with PyTorch or similar deep learning frameworks.
- Master's or Ph.D. in Computer Science, Electrical Engineering, Applied Mathematics, or a related field.
Preferred Qualifications
- Experience with vision-language models, multimodal foundation models, or diffusion-based generative models, including fine-tuning, evaluation, or adaptation for real-world tasks.
- Familiarity with classical computer vision, image processing, or camera/imaging pipelines.
- Experience with perceptual quality modeling or learned evaluation metrics is a plus.
- Experience working with large-scale image or video datasets and performance-sensitive systems.
- Interest in agentic AI systems and automated ML workflows.
- Curious, inventive, and energized by hard, ambiguous problems.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.