ITS System Admin/Engr II (CIS), Corporate Infrastructure Product Management

Amazon

Other Engineering, Product, IT

Bengaluru, Karnataka, India

Posted on May 9, 2026

Description

We are looking for a Systems Engineer to join the CIPM Compute team and own the technical execution layer of our fleet lifecycle program. This is an embedded engineering role within a product management team; you will be the technical hands and eyes of the compute space, working alongside the Technical Program Manager (TPM) who coordinates and owns the overall program.
You will lead lab testing and validation, execute and support pilot deployments, create and maintain technical documentation, and supervise the health and compliance of the production compute fleet. You will work closely with partner engineering teams as a technical peer, consumer, and validator, ensuring that hardware platforms, deployment processes, and lifecycle procedures meet CIPM's standards before handover to delivery teams.
This role is ideal for an engineer who thrives at the intersection of hands-on technical work and structured program execution, and who wants to grow their impact in a team that directly shapes infrastructure strategy at Amazon scale.


Key job responsibilities
Fleet Lifecycle & Health Supervision
● Monitor and review the health of the production compute fleet (CPU, memory, storage, firmware compliance) and proactively identify risks including end-of-life hardware, unresolved vulnerabilities, and capacity gaps
● Coordinate firmware patching cycles and validate remediation outcomes across the fleet
● Provide Tier-3 support for technically complex post-deployment issues, maintaining a structured support queue and hosting regular open-office hours for stakeholders
● Join quarterly reviews with hardware vendors to track lifecycle status and emerging platform developments

Lab Testing & Pilot Execution
● Execute lab testing sessions for new hardware platforms, firmware releases, and deployment automation, documenting findings and providing structured feedback
● Validate deployment and migration runbooks in lab environments before production rollout
● Support and lead pilot deployments at corporate office sites: shadow initial pilots, lead reverse-shadow pilots, and document issues and resolutions throughout
● Test new compute products and configurations against hosted service requirements

Technical Documentation
● Create, maintain, and continuously improve hardware deployment runbooks, standard operating procedures, and configuration guides (server provisioning, VM deployment, performance benchmarking, troubleshooting procedures)
● Validate deployment artifacts produced by partner engineering teams
● Maintain documentation currency through a structured feedback collection framework, incorporating learnings from pilots, deployments, and support cases
● Contribute to the consolidation of deployment documentation into a single source

Technical Design & Standards Ownership
● Define and validate technical designs for compute infrastructure deployments
● Validate Bill of Materials (BoM) specifications against site and service requirements
● Help define hardware configuration tiers to serve multiple customer profiles and budget constraints
● Support vendor evaluation and platform certification efforts, including technical validation of alternative compute platforms

Automation & Tooling
● Test and validate deployment automation scripts and tools developed by partner engineering teams, providing actionable bug reports and improvement feedback
● Maintain core fleet automation tasks such as password rotation, patching workflows, and firmware testing pipelines
● Build lightweight scripts or tooling as needed to address immediate operational gaps, with the ability to read, troubleshoot, and suggest fixes to existing code

Post-Deployment Supervision & Knowledge Transfer
● Validate and supervise compute deployments at warranty sites, logging defects, monitoring resource utilization, and confirming that deployed products meet expected performance baselines
● Define and deliver training sessions, shadowing programs, and Q&A sessions for delivery teams and stakeholders throughout onboarding and pilot phases
● Complete deployment checklists and formal handoff documentation to ensure smooth transitions to operational teams


About the team
Corporate Infrastructure Product Management (CIPM) is responsible for the hardware fleet lifecycle of Amazon's corporate compute infrastructure, encompassing vendor and platform strategy, firmware and patching compliance, fleet inventory and lifecycle management, and hardware capacity planning. CIPM serves as both product owner and strategic driver of the compute hardware roadmap, collaborating closely with partner engineering and deployment teams across the organization.
The team manages a global fleet of enterprise compute servers, mostly Cisco UCS, across corporate office sites worldwide, supporting a range of critical hosted services and infrastructure functions.