Software Engineer (AI/ML)
Microsoft
Multiple Locations, United States
Overview
The Worldwide Fleet Resources Lifecycle Management team is dedicated to transforming how Microsoft oversees and optimizes its global fleet of hardware resources. This includes enhancing operational efficiency, minimizing costs, and advancing sustainability initiatives across the organization. The team plays a critical role in automating the verification, management, and delivery of new hardware to Microsoft datacenters, which support services such as Azure, High Performance Computing (HPC), Microsoft Office, and Edge Computing.
By enabling seamless capacity expansion across Microsoft’s cloud services, the team contributes to the integration of advanced hardware platforms into the cloud infrastructure. Leveraging intelligent technologies and data-driven insights, they are redefining standards in fleet resource management and infrastructure scalability. Their work ensures that Microsoft can meet growing demand while maintaining reliability and performance across its global operations.
As a Software Engineer on the Worldwide Fleet Resources Lifecycle Management team, you will support the onboarding of new hardware into the Azure cloud and contribute to the integration of intelligence into tools, processes, and systems across the organization. You will be involved in gathering requirements, designing solutions, and implementing features that enable new technologies. This role offers the opportunity to grow your skills in both software and hardware, collaborate with various Azure teams, and work with emerging technologies to drive meaningful impact.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science, or related technical discipline with proven experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.
- OR equivalent experience.
- Experience in a technical role applying machine learning or mathematical optimization techniques to real-world problems.
- Experience building or integrating solutions using large language models (LLMs) or AI agents.
Other Requirements:
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 1+ year(s) technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR Master's Degree in Computer Science or related technical field with proven experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- Experience working independently and collaboratively within cross-functional teams, with a strong understanding of software engineering fundamentals including data structures, algorithms, testing methodologies, and design patterns.
- Commitment to continuous personal and team development through a growth mindset.
- Experience applying machine learning principles, including theoretical foundations such as algorithmic behavior, model architectures, optimization techniques, and statistical learning.
- Experience developing cloud-based solutions and implementing Machine Learning Operations (MLOps) strategies for model deployment, monitoring, and governance.
Software Engineering IC2 - The typical base pay range for this role across the U.S. is USD $84,200 - $165,200 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area, where the base pay range for this role is USD $109,000 - $180,400 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications for the role until October 15, 2025.
Responsibilities
- Collaborates with stakeholders to identify user requirements, create design documents, and develop scalable systems and services. Works with product managers, engineers, and infrastructure teams to deliver impactful solutions.
- Utilizes strong software engineering fundamentals, including clean architecture, modular design, thorough testing, and peer reviews for reliable codebases. Develops and optimizes code to enhance performance, maintainability, effectiveness, and return on investment (ROI).
- Develops and deploys scalable AI-driven tools, algorithms, and machine learning (ML) models to enhance efficiency, reliability, and productivity. Collaborates with data scientists and product teams to align solutions with business objectives and deliver measurable value. Optimizes AI/ML models for performance and ensures seamless production integration.
- Breaks down larger features into work items and supports planning, ensuring alignment with business priorities. Estimates engineering effort and tracks progress to ensure that all tasks are completed efficiently and effectively.
- Serves as the Designated Responsible Individual (DRI) for monitoring, troubleshooting, and restoring production systems during on-call rotations. Leads live-site incident response, conducts root cause analysis, and implements long-term improvements to enhance system reliability and operational readiness.
- Demonstrates a commitment to continuous learning, staying up to date with evolving technologies and best practices. Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns to improve product availability, reliability, efficiency, observability, and performance. Actively shares knowledge and contributes to a team culture that values technical excellence and growth.
- Follows organizational policies to ensure security, privacy, safety, and accessibility standards. Demonstrates ownership and promotes a learning-oriented, inclusive team environment. Practices secure coding, data governance, and respectful collaboration within a mission-driven workplace.