Cloud Solution Architect - Cloud & AI Platforms
Microsoft
Responsibilities
- Responsible for delivering Support Mission Critical Service offerings, collaborating with CSU (including CSAs, CSAMs), CSS, Engineering, and other teams as needed. This role ensures a cohesive, cross-delivery organizational experience for customers on their critical workloads, while showcasing progress, evolution, and improvements as outcomes.
- Directly accountable for leading the Proactive Resiliency efforts and for coordinating with other teams on the Accelerated Incident Resolution and Monitoring & Observability features of an offering.
- Proactive Resiliency: Lead technical engagement with specific workloads that prioritizes Reliability, Security, Supportability, Manageability, and Monitoring and Observability.
- Coordinate the onboarding phase, which includes the Consolidated Assessment Week Delivery.
- Remediate the proactive recommendations identified for the specified workloads.
- Plan and implement both a Workload-Specific Service Improvement Plan and a Customer Success Plan
- Accelerated Incident Resolution: Maintain awareness of and visibility into critical incidents to ensure RCAs and recommendations are captured and linked to Proactive Resiliency efforts.
- Monitoring & Observability: Collaborate with relevant resources when engaged to help onboard the customer efficiently and effectively, prioritizing customer experience and effort, and drive customer-owned monitoring to enable and improve the customer’s observability capabilities.
- Cross-Team Leadership: Build a partnership with the CSAM to ensure roles are clearly understood and responsibilities are established, maintaining that partnership throughout the contract and relying on the CSAM for account escalation. Coordinate with the leads of the Accelerated Incident Resolution work stream and, when required, the Proactive Monitoring work stream with our internal partners.
- Collaborate with support and stakeholders to ensure there is a comprehensive, up-to-date KnowMe available across various teams including CSS.
- Work with internal teams to request RCAs, augment them with KnowMe, and share them with the customer.
Qualifications
Minimum qualifications
- Bachelor's Degree in Computer Science, Information Technology, Engineering, Business, Liberal Arts, or related field
- 4+ years of experience in technical projects, specifically in cloud/infrastructure technologies, information technology (IT) consulting/support, systems administration, network operations, software development/support, technology solutions, practice development, architecture, and/or consulting, OR equivalent experience.
- Technical Certification in Cloud (e.g., Azure, Amazon Web Services, Google, security certifications).
- Proven experience in Cloud Solutions Architecture or Mission Critical Support for enterprise customers.
- Technical Expertise:
- Deep knowledge of Azure infrastructure services (Compute, Storage, Networking), Container services (such as Azure Kubernetes Service) and Platform-as-a-Service (PaaS) offerings.
- Strong troubleshooting skills across distributed systems and mission-critical workloads.
- Familiarity with performance optimization, high availability, and disaster recovery strategies.
- Customer Engagement
- Demonstrated ability to manage high-severity incidents and provide rapid mitigation strategies.
- Experience working with financial services customers or other highly regulated industries.
- Communication & Collaboration
- Excellent verbal and written communication skills for executive-level updates and technical deep dives.
- Ability to collaborate across engineering, product groups, and global support teams.
- Advanced Technical Skills
- Expertise in virtualization, VM performance tuning, and cache optimization.
- Knowledge of observability tools, telemetry, and proactive monitoring solutions.
- Site reliability or operational troubleshooting experience in large Infrastructure-as-a-Service (IaaS) or other infrastructure environments.
- Industry Experience
- Prior experience supporting large-scale financial platforms (e.g., trading systems, risk management platforms).
- Familiarity with regulatory compliance and data security standards in financial services.
- Leadership & Influence
- Ability to lead operational reviews, drive post-incident analysis, and influence engineering roadmaps.
- Experience in stakeholder management across global time zones.
- Certification - Microsoft Certified: Azure Solutions Architect Expert, ITIL or similar certification for service management.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.