AnitaB.org Talent Network

Connecting women in tech with the best professional opportunities!

Systems Administrator, Global Operations Support Engineering

Amazon

IT, Operations, Customer Service
Herndon, VA, USA
Posted on Mar 11, 2026

Description

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we're looking for talented people who want to help.

You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

The AWS Global Operations Support Engineering (GOSE) team is seeking a System Engineer to build and maintain business automation infrastructure and support the development of AI-driven operational intelligence platforms. This role will implement production systems that transform manual operational processes into automated, intelligent workflows that improve efficiency and reliability across AWS's global data center portfolio.

As a System Engineer, you will build AWS infrastructure for automation tools, implement integrations with internal systems, and support the deployment of AI-driven operational capabilities. You will work hands-on with Lambda functions, AgentCore, Bedrock, API integrations, and infrastructure-as-code to create scalable solutions that enable thousands of data center engineers to work more efficiently.

Key job responsibilities
- Build and maintain AWS infrastructure for business automation solutions, including AgentCore deployments, API integrations, MCP server deployments, IAM roles, CloudWatch monitoring, and dedicated AWS accounts with appropriate security controls

- Implement infrastructure-as-code using CDK/CloudFormation to enable repeatable, version-controlled deployments and establish CI/CD pipelines for automation script testing and production deployment

- Develop usage logging, database tracking, authentication systems, and API integrations with internal systems to automate ticket creation, data retrieval, and workflow orchestration

- Support the productionalization of AI proof-of-concepts by implementing infrastructure components, deployment pipelines, operational monitoring, and integration layers between AI agents and internal systems

- Create monitoring and alerting solutions to ensure high availability of automation infrastructure, and troubleshoot system issues across AWS services and integration points

- Collaborate with other Systems Engineers, Business Intelligence Engineers, TPMs, and Data Engineers to implement technical solutions while participating in code reviews and design reviews

- Implement automated testing for infrastructure changes, establish logging and observability for automation systems, and contribute to team documentation and best practice guides

- Continuously learn new AWS services, automation techniques, and AI/ML capabilities to improve technical skills and identify improvements to system reliability and performance

About the team
The Global Operations Support Engineering (GOSE) team is focused on maximizing AWS data center infrastructure availability and operational excellence. We achieve this by optimizing labor utilization, conducting deep-dive event and incident analysis, developing data engineering and business intelligence solutions, deploying business automation, and managing global operational improvement initiatives.

We transform critical infrastructure data into actionable intelligence that enables the Data Center Community (DCC) organization to prevent customer impact, reduce operational burden, focus on highest-impact activities, and continuously improve fleet-wide reliability and productivity. Through our comprehensive monitoring, analysis, reporting, and program/project management, we serve as the analytical backbone that drives continuous improvement in operational excellence across the global data center portfolio.

The team operates at the intersection of infrastructure operations, data engineering, and artificial intelligence—building systems that fundamentally change how AWS manages its global infrastructure at scale.