Spark Developer - C10 - Chennai
Citi
The ETL Developer will be responsible for designing, implementing, and optimizing distributed data processing jobs that handle large-scale data in the Hadoop Distributed File System (HDFS) using Apache Spark and Python. This role requires a deep understanding of data engineering principles, proficiency in Python, and hands-on experience with the Spark and Hadoop ecosystems. The developer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
Responsibilities:
Data Processing and Transformation:
Design and implement Spark applications to process and transform large datasets in HDFS.
Develop ETL pipelines in Spark using Python for data ingestion, cleaning, aggregation, and transformation.
Performance Optimization:
Optimize Spark jobs for efficiency, reducing run time and resource usage.
Fine-tune memory management, caching, and partitioning strategies for optimal performance.
Data Engineering with Hadoop and Spark:
Load data from different sources into HDFS, ensuring data accuracy and integrity.
Integrate Spark applications with Hadoop frameworks such as Hive and Sqoop.
Testing and Debugging:
Troubleshoot and debug Spark job failures, and monitor job logs and the Spark UI to identify issues.
Qualifications:
- 2-5 years of relevant experience
- Experience in programming/debugging used in business applications
- Working knowledge of industry practice and standards
- Comprehensive knowledge of specific business area for application development
- Working knowledge of programming languages
- Consistently demonstrates clear and concise written and verbal communication
- Expertise in handling complex, large-scale warehouse environments
- Hands-on experience writing complex SQL queries and exporting and importing large amounts of data using utilities
Education:
- Bachelor's degree in a quantitative field (such as Engineering, Computer Science) or equivalent experience
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Applications Development
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.