Software Reliability Engineer II

Microsoft

Posted on Oct 29, 2025

Bangalore, Karnataka, India

Date posted: Oct 29, 2025
Job number: 1903144
Work site: 3 days / week in-office
Travel: 0-25%
Role type: Individual Contributor
Profession: Software Engineering
Discipline: Software Engineering
Employment type: Full-Time

Overview

Are you looking for an opportunity to work with the latest Azure offerings and push the limits of cloud computing? Do you want to solve real-life challenges in intelligent cloud and enable customers to achieve more? If so, we have the perfect role for you!

We are part of the Microsoft Specialized Cloud organization, delivering Azure to customers on their premises. Our team is responsible for building innovative product and service ecosystems that bring Azure Edge computing to the locations where customers run their business.

As a Software Reliability Engineer in the Microsoft Specialized Cloud team, you will leverage end-to-end technical expertise in large-scale distributed systems' infrastructure, code, inter- and intra-service dependencies, and operations to proactively and continuously improve the reliability, performance, efficiency, latency, and scalability of Edge services and products operating at scale. You will partner with software engineering product teams by suggesting scalable ways to optimize code, sharing expertise and insights drawn from working across related services or products, and participating in incident response throughout development and operations lifecycles. You will develop code, scripts, systems, and/or tools that reduce operational burden by automating complex and repetitive tasks, enable product engineering teams to increase the velocity at which they can safely deploy changes to production, and monitor the effects of changes across systems, services, and/or products. You will analyse telemetry data to identify patterns and trends that drive continuous improvement, and highlight opportunities to improve the quality and reliability of our products and services. You will participate in on-call rotations to resolve live site incidents, minimize customer impact, and document solutions and insights that inform ongoing improvements to infrastructure, code, tools, and/or processes that prevent the recurrence of similar issues.

If you're ready to take on the challenge of working with highly motivated engineers and the latest Azure offerings, and of taking those offerings to new heights, then we'd love to hear from you! Please apply today and let's build the future together!

Qualifications

  • 4+ years of experience in Software Development/SRE
  • Bachelor's or master's degree, or equivalent, in Computer Science or a related field is required
  • A strong Computer Science background with solid programming skills in C#, Java, C/C++ (mostly for scripting and automation)
  • Debugging skills are highly desired
  • Experience with AI/ML and LLMs is highly preferred
  • Knowledge of Microsoft Azure, AWS or similar cloud computing platforms is preferred
  • Strong skills in Networking, Storage, and Virtualization
  • Prior experience working with hyperconverged infrastructure
  • Prior experience working with Fortune 500 customers

Responsibilities

  • Acts as a Designated Responsible Individual (DRI) working on call to monitor service for degradation, downtime, or interruptions. Alerts stakeholders as to the status and gains approval to restore system/product/service for simple problems. Responds within Service Level Agreement (SLA) timeframe. Escalates issues to appropriate owners.
  • Contributes to efforts to collect, classify, and analyze data with little oversight on a range of metrics (e.g., health of the system, where bugs might be occurring). Contributes to the refinement of product features by escalating findings from analyses to inform decisions regarding the engineering of products.
  • Contributes to the development of automation within production and deployment of a complex product feature. Runs code in simulated or other non-production environments to confirm functionality and error-free runtime for products with little to no oversight.
  • Maintains communication with key partners across the Microsoft ecosystem of engineers. Considers partners across teams and their end goals for products to drive desirable user experiences and meet the dynamic needs of partners/customers through product development.
  • Maintains operations of live service as issues arise on a rotational, on-call basis. Implements solutions and mitigations to more complex issues impacting performance or functionality of Live Site service and escalates as necessary. Reviews and writes issue postmortems and shares insights with the team.
  • Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions. Alerts stakeholders as to status and initiates actions to restore system/product/service for simple problems and complex problems when appropriate. Responds within Service Level Agreement (SLA) timeframe. Drives efforts to reduce incident volume, looking globally at incidents and providing broad resolutions. Escalates issues to appropriate owners.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.