Senior Data Engineer, WW FBA Central Analytics
Amazon
Description
Worldwide Fulfillment by Amazon (WW FBA) empowers millions of sellers to scale globally through Amazon's leading fulfillment network. FBA sellers deliver fast, reliable Prime-eligible shipping and hassle-free returns to customers worldwide—enabling them to focus exclusively on business growth while Amazon handles operational logistics.
The WW FBA Central Analytics team architects and maintains data infrastructure that delivers critical insights to WW FBA leadership. This team forms strategic partnerships across global product, program, and technology teams to unify datasets, implement self-service analytics platforms, and develop AI capabilities that transform raw data into actionable insights.
We are looking for a Senior Data Engineer who thrives on solving hard problems, shaping new capabilities, and delivering high-quality results in a fast-paced environment. You will be at the forefront of integrating LLM-powered solutions with robust backend systems, ensuring they scale securely and reliably to serve global customers.
This role sits at the intersection of data engineering and AI: you will own the data foundation that determines whether GenAI-powered insights are trustworthy, fast, and scalable. You will work directly on an executive-level initiative to deliver proactive, AI-generated insights across FBA metrics to business leadership worldwide.
Key job responsibilities
- Architect and implement a scalable, cost-optimized S3-based Data Lakehouse that unifies structured and unstructured data from disparate sources across 8 WW FBA metrics domains.
- Lead the strategic migration from a Redshift-centric architecture to a flexible lakehouse model, targeting a reduction in query latency from 60–300 seconds to under 10 seconds.
- Establish metadata management with automated data classification and lineage tracking.
- Design and enforce standardized data ingestion patterns with built-in quality controls and validation gates.
- Architect a centralized metrics repository that becomes the single source of truth for all FBA metrics across various time grains.
- Implement robust data quality frameworks with staging-first policies and automated validation pipelines.
- Design extensible metrics schemas that support complex analytical queries while optimizing for AI retrieval patterns, including multi-dimensional drill-down across Time → Geography → Category.
- Develop intelligent orchestration for metrics generation workflows with comprehensive audit trails.
- Lead the design of semantic data models that balance analytical performance with AI retrieval requirements for LLM-powered insight generation.
- Implement cross-domain federated query capabilities with sophisticated query optimization techniques.
- Architect vector database infrastructure capable of managing large-scale embeddings with consistent low-latency retrieval.
- Integrate schema definitions through MCP service calls to enable automated, AI-accessible data contracts.
- Build and own monitoring and alerting frameworks for all data pipelines, ensuring proactive failure detection and rapid resolution.
- Establish runbooks, schema change management processes, and data quality SLAs that move the team from reactive data consumers to proactive insight generators.