AnitaB.org Talent Network

Senior Consultant - Tech Consulting - FS - CNS - TC - Platforms - Hyderabad

EY

IT
Hyderabad, Telangana, India
Posted on Mar 6, 2026

APAC Data Engineering Team Skill Requirements

Data Engineer

Role

Build and maintain the future of Vanguard’s data platform on Databricks by designing and developing robust, scalable pipelines. Contribute to a validated, self-serve data ecosystem that supports analytics and business insights.

Base Skillsets

  • Hands-on experience with Databricks (Spark, Delta Lake).
  • Strong Python, PySpark, and SQL skills.
  • Building and maintaining scalable ETL/ELT pipelines in AWS.
  • Understanding of Medallion Architecture and data validation.
  • Familiarity with orchestration tools (e.g., Airflow, Databricks Workflows).
  • Experience and comfort with Kimball-style dimensional modelling (star schemas).
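
To illustrate the kind of work the skillsets above describe, here is a minimal sketch of a medallion-style "bronze to silver" validation step in plain Python. On Databricks this would typically be a PySpark job writing Delta tables; the function and field names here are hypothetical.

```python
# Hypothetical sketch of a medallion-architecture bronze -> silver step:
# validate raw (bronze) records and emit only cleaned (silver) records.

def to_silver(bronze_rows):
    """Validate bronze records; keep rows with an id and a numeric amount."""
    silver = []
    for row in bronze_rows:
        # Basic validation: required key present and amount is numeric.
        if row.get("id") is None or not isinstance(row.get("amount"), (int, float)):
            continue  # invalid records are skipped (a real job might quarantine them)
        silver.append({"id": row["id"], "amount": float(row["amount"])})
    return silver

bronze = [
    {"id": 1, "amount": 10},
    {"id": None, "amount": 5},    # rejected: missing id
    {"id": 2, "amount": "oops"},  # rejected: non-numeric amount
]
print(to_silver(bronze))  # → [{'id': 1, 'amount': 10.0}]
```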

Preferable Skillsets

  • Experience with CI/CD and infrastructure-as-code (Terraform).
  • Exposure to Delta Live Tables and Unity Catalog.
  • Knowledge of DataOps practices and data quality frameworks.
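
As a rough sketch of the data-quality frameworks mentioned above, the snippet below shows a tiny, hypothetical expectation-style check in pure Python. Real projects would more likely use Delta Live Tables expectations or a dedicated library; the `expect` helper and check names here are illustrative only.

```python
# Hypothetical DataOps-style data-quality check: run named predicates over
# rows and report (check_name, passed, failure_count) for each.

def expect(name, predicate, rows):
    """Count rows failing the predicate; the check passes if none fail."""
    failures = sum(1 for r in rows if not predicate(r))
    return (name, failures == 0, failures)

rows = [{"price": 9.5}, {"price": -1.0}, {"price": 3.0}]
report = [
    expect("price_non_negative", lambda r: r["price"] >= 0, rows),
    expect("price_present", lambda r: "price" in r, rows),
]
print(report)
```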

Responsibilities

  • Design and deploy Databricks infrastructure using approved patterns.
  • Develop and maintain ingestion pipelines via the Common Ingestion Framework.
  • Collaborate with Product Managers and Business Analysts to validate data quality and ensure integrity.
  • Define, document, and build tests within the program's Testing Framework, covering both the Enterprise solution and the local, notebook-based process.
  • Implement CI/CD for data pipelines and enforce best practices in testing and documentation.
  • Monitor pipeline performance, troubleshoot issues, and optimize for cost and scalability.
  • Document solutions and share knowledge with the engineering team.
  • Support migration from existing infrastructure to Databricks.
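
The testing responsibility above might look like the following in a local, notebook-style workflow: a small transform under test plus an in-notebook assertion. The Testing Framework and Common Ingestion Framework named in the posting are program-specific; this `transform` and its test are purely illustrative.

```python
# Hypothetical notebook-style test for a pipeline step: deduplicate rows
# by id, keeping the latest occurrence of each id.

def transform(rows):
    """Deduplicate by id; later rows overwrite earlier ones."""
    latest = {}
    for row in rows:
        latest[row["id"]] = row
    return list(latest.values())

def test_transform_deduplicates():
    rows = [{"id": 1, "v": "old"}, {"id": 1, "v": "new"}, {"id": 2, "v": "x"}]
    out = transform(rows)
    assert len(out) == 2
    assert {"id": 1, "v": "new"} in out  # latest value for id 1 wins

test_transform_deduplicates()
print("ok")
```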

Requisition Id : 1637235

Experience level: 5 years (Hyderabad)