Data Engineer II, ARTS Data Engineering Team
Software Engineering, Data Science
Bengaluru, Karnataka, India
Description
The Amazon RoW Central Data Engineering (ARTS DE) team builds and operates the central data infrastructure backbone for Amazon's Rest-of-World (RoW) business operations, serving 5,000+ daily users across 14,000+ dashboards and 70,000+ daily data job runs. We run mission-critical systems including a centralized Amazon Redshift cluster, Aurora RDS real-time applications, Tableau Server, and a suite of automated data pipelines that ingest from EDX, SNS, Andes, APIs, and S3.
We are looking for a Professional Data Engineer who takes ownership seriously, thinks clearly under ambiguity, and brings strong technical depth across databases, cloud infrastructure, and data pipeline engineering. You will work alongside senior engineers and technical managers to design, build, and maintain production-grade data systems that directly enable operational decision-making across RoW countries, India, and other emerging markets. If you enjoy untangling complex data problems and taking end-to-end ownership of your solutions, this role is for you.
Key job responsibilities
Design and build data pipelines — architect and implement robust batch, intraday, and near-real-time ETL/ELT pipelines ingesting data from diverse sources including EDX datasets, SNS events, Andes tables, REST APIs, and S3, landing data reliably into Redshift and Aurora.
Own Redshift infrastructure — write, optimize, and tune SQL and ETL workloads on Amazon Redshift; manage WLM queues, distribution/sort keys, materialized views, and query performance; proactively identify and resolve performance bottlenecks.
Manage big data lifecycle — design data models and schemas (star/snowflake), enforce data partitioning and retention policies, implement data quality checks, and ensure data accuracy and freshness SLAs are consistently met.
Deliver on ambiguous requirements — independently break down loosely defined business asks into concrete technical deliverables with clear scope, milestones, and acceptance criteria; drive from requirement to production with minimal hand-holding.
Build automation and self-healing systems — reduce manual toil through automation (Lambda, ECS, Step Functions, CloudWatch alarms); contribute to the team's Server Auto Maintenance Program and ETL cleanup initiatives.
AWS cloud engineering — use AWS services (Redshift, Aurora RDS, S3, Lambda, ECS Fargate, SQS, SNS, CDK/CloudFormation, Secrets Manager, EventBridge) to build scalable, cost-efficient, and maintainable data infrastructure.
Drive cost optimization — proactively identify inefficiencies in SQL workloads, cluster utilization, and pipeline design; propose and implement optimizations that reduce AWS spend without compromising reliability.
Support stakeholders and data consumers — partner with Business Analysts, BI Engineers, Data Scientists, and PMs to understand data needs; deliver clean, documented, raw data pipelines; maintain clear boundaries around pipeline ownership and scope.
Maintain operational excellence — participate in the on-call rotation, respond to production incidents with urgency and structured root cause analysis, and implement permanent fixes rather than workarounds.
A day in the life
A typical day for a DE on ARTS DE looks like:
On-call: Review pipeline, infrastructure, and service run status on dashboards; triage any failed jobs or data freshness alerts; provide ETAs and updates to stakeholders as needed.
Core hours: Work on active sprint deliverables — this may include writing CDK infrastructure code, developing Redshift SQL models, building a new EDX ingestion pipeline, or debugging a WLM contention issue on the central cluster.
Collaboration: Join a sync with stakeholders to understand a new data onboarding request; push back clearly when scope creep or non-standard pipeline patterns are introduced; document the agreed design in the team wiki.
Deep work: Independent heads-down time on complex tasks, such as performance-tuning a slow Redshift query, upgrading infrastructure, refactoring a Lambda trigger handler, or writing a CDK stack for a new ECS data job.
Wrap-up: Update task statuses in SIM tickets; review a peer's PR; document any patterns or learnings in the team's internal knowledge base.
There is no one telling you exactly what to do each hour — you are expected to manage your own task queue, surface blockers early, and keep work moving forward with accountability.
About the team
ARTS DE is a small, high-impact engineering team established in 2020. Our mission is to provide a unified, highly available, and scalable data infrastructure for Amazon's Rest-of-World operations, covering India, Japan, and emerging markets that collectively represent a major and fast-growing segment of Amazon's global business.
We operate at significant scale:
Central Redshift Cluster — 5,000+ daily active users, 70,000+ daily job runs, 14,000+ dashboards on Tableau, QuickSight, and Fusion
Tableau Server — 144 developers, 8,000+ dashboards, 20,000+ daily visits
Aurora RDS — real-time operational applications (capacity alerts, reactive scheduling tools)
GenAI Platform — MyUniverse, our internal data hub, now integrated with Stella 3.0, a digital AI agent that automates 90%+ of routine operational tasks across teams.
We are a lean team that punches above our weight. Every engineer owns a broad surface area, ships real systems used by thousands of people daily, and is expected to continuously raise the bar, both on the technical side and on how we serve our stakeholders. We value clarity of thought, ownership without ego, and building things that last.