Sr Software Engineer - Site Reliability - DevOps

Workday

Software Engineering
Prague, Czechia
Posted on Oct 21, 2025

Your work days are brighter here.

We’re obsessed with making hard work pay off, for our people, our customers, and the world around us. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we’re shaping the future of work so teams can reach their potential and focus on what matters most. The minute you join, you’ll feel it. Not just in the products we build, but in how we show up for each other. Our culture is rooted in integrity, empathy, and shared enthusiasm. We’re in this together, tackling big challenges with bold ideas and genuine care. We look for curious minds and courageous collaborators who bring sun-drenched optimism and drive. Whether you're building smarter solutions, supporting customers, or creating a space where everyone belongs, you’ll do meaningful work with Workmates who’ve got your back. In return, we’ll give you the trust to take risks, the tools to grow, the skills to develop and the support of a company invested in you for the long haul. So, if you want to inspire a brighter work day for everyone, including yourself, you’ve found a match in Workday, and we hope to be a match for you too.

About the Team

We’re a small SRE team based in Prague (Masaryčka office), supporting the People Analytics product. We work with modern technologies such as Docker, Kubernetes, AWS, GCP, and Jenkins, and we rely on strong alerting and monitoring practices in our daily operations. Our team collaborates closely with colleagues in the USA and India to ensure reliability, scalability, and performance across a global environment.

About the Role

Site Reliability Engineer (SRE)

We are looking for an experienced Site Reliability Engineer to join our team and help us build, scale, and maintain robust systems that enable high availability, reliability, and performance across our infrastructure.

Basic Qualifications:

  • 8+ years of experience as a Site Reliability Engineer (SRE), designing and implementing resilient infrastructure and proactively identifying and mitigating potential issues.

  • Hands-on experience with Docker, Kubernetes, and public cloud platforms (AWS, GCP).

  • Strong experience with CI/CD systems (e.g., Jenkins) and familiarity with the Continuous Integration & Delivery ecosystem in the CNCF Landscape.

  • Proficiency in Bash and Python scripting for automation and tooling.

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Preferred Qualifications:

  • Deep understanding of Design for Maintainability principles and their application in scalable system architectures.

  • Proven experience in designing and implementing process automations using advanced scripting and tooling to improve reliability and efficiency.

  • Strong knowledge of Reliability Analysis and Root Cause Analysis (RCA) methodologies to identify and address critical failure modes.

  • Solid grasp of Reliability Engineering principles to drive system availability and performance improvements.

  • Experience defining, monitoring, and alerting on Service-Level Objectives (SLOs) and related metrics.

  • Strong software development background with experience building SRE tools and automation frameworks in languages such as Python or Go.

  • Demonstrated ability to collaborate effectively with Development, Operations, and Security teams to drive reliability-focused initiatives.

  • Excellent technical documentation skills, with experience creating detailed architectural and operational documentation.

  • Proven ability to troubleshoot complex distributed systems, leveraging observability and diagnostics tools.

About You

  • You communicate clearly and effectively, both verbally and in writing.

  • You thrive in a fast-paced, global environment, collaborating with teams across the USA and India.

  • You enjoy solving complex problems and thinking critically to improve reliability and performance.

  • You take ownership of monitoring, alerting, and automation, ensuring systems run smoothly.

  • You work seamlessly with engineering, operations, and product teams to drive continuous improvement.

  • You understand SLOs, observability, and incident response principles and apply them in practice.



Our Approach to Flexible Work

With Flex Work, we’re combining the best of both worlds: in-person time and remote. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. We know that flexibility can take shape in many ways, so rather than a number of required days in-office each week, we simply spend at least half (50%) of our time each quarter in the office or in the field with our customers, prospects, and partners (depending on role). This means you'll have the freedom to create a flexible schedule that caters to your business, team, and personal needs, while being intentional to make the most of time spent together. Those in our remote "home office" roles also have the opportunity to come together in our offices for important moments that matter.

Are you being referred to one of our roles? If so, ask your connection at Workday about our Employee Referral process!

At Workday, we value our candidates’ privacy and data security. Workday will never ask candidates to apply to jobs through websites that are not Workday Careers.

Please be aware of sites that may ask you to input your data in connection with a job posting that appears to be from Workday but is not.

In addition, Workday will never ask candidates to pay a recruiting fee, or pay for consulting or coaching services, in order to apply for a job at Workday.