Data Pipeline Engineer - School of Computer Science - MLD

Carnegie Mellon University

Remote

Posted on May 28, 2026

Carnegie Mellon University is a private, global research university that stands among the world’s most renowned educational institutions. With ground-breaking brain science, path-breaking performances, creative start-ups, big data, big ambitions, hands-on learning, and a whole lot of robots, CMU doesn’t imagine the future, we invent it. If you’re passionate about joining a community that challenges the curious to deliver work that matters, your journey starts here!

The Machine Learning Department (MLD) at Carnegie Mellon University is a leading hub for research and education in artificial intelligence and machine learning. It focuses on developing innovative algorithms and models to address complex problems in diverse fields such as robotics, healthcare, and finance. The department offers a range of undergraduate and graduate programs, fostering a collaborative environment that bridges theoretical research and practical applications. Faculty and students frequently collaborate with industry and other academic disciplines to push the boundaries of what is possible with machine learning.

We are seeking a Data Pipeline Engineer to join the team! As a Data Pipeline Engineer, your role is vital in ensuring the integrity and reliability of our data pipelines. The position covers monitoring, troubleshooting, and root cause analysis of data quality issues within our pipelines. As a part-time team member, you will consult and assist in these areas rather than lead them. Your contributions are crucial to maintaining the high standard of our epidemiological tracking and forecasting tools. This role will report directly to the Delphi Engineering Manager.

Core Responsibilities

  • Monitor and maintain the health and efficiency of data pipelines.
  • Troubleshoot and perform root cause analysis for data discrepancies and pipeline issues.
  • Communicate with data providers to understand data discrepancies and manage changes in data delivery.
  • Implement fixes and enhancements to improve data quality and pipeline performance.
  • Collaborate with data scientists and analysts to understand data needs and implement effective data solutions.
  • Develop strategies for data validation and quality assurance.
  • Optimize data flow and collection to improve system efficiency.
  • Document and manage data pipeline architectures, including maintenance and update protocols.
  • Use tools such as SQL, version control and CI/CD, containerization, task schedulers, Python frameworks, and cloud services for data pipeline management.
  • Ensure compliance with data governance and security standards.

Adaptability, excellence, and passion are vital qualities within Carnegie Mellon University. We are in search of a team member who can effectively interact with a varied population of internal and external partners at a high level of integrity. We are looking for someone who shares our values and who will support the mission of the university through their work.

Qualifications:

  • Bachelor’s Degree required.
  • Minimum one year of research computing experience required.
  • Basic Linux use and administration: system layout, file permissions, shell, utilities (syslog, cron), diagnostic tools (ps, htop, grep, lsof)
  • Experience with Apache Airflow, preferably version 3.0
  • Basic database use, especially in Postgres
  • Basic scripting (Python, bash)
  • Team software development (git/GitHub, Jira, code reviews, agile methodologies)
  • Data analysis: diagnosing and fixing runtime errors and logic bugs; performing basic growth projections to predict future problems; communicating results
  • Required technologies: Python, MySQL/Postgres, Linux, git & GitHub, Apache Airflow
  • A combination of education and proven experience from which comparable knowledge is demonstrated may be considered.

Preferred Technologies and Languages:

  • Linux, Ubuntu, Bash, Make
  • Apache Airflow
  • Python, pandas, Flask, PyPI publishing
  • SQL, Postgres
  • git, GitHub, GitHub Actions, GitHub Issues
  • Docker, Docker Compose
  • Elastic, Kibana, FileBeat
  • G Suite (Calendar, Mail, Docs, Sheets, Slides, Forms, AppsScript, Groups)
  • Jira Software

Requirements:

  • Successful completion of a pre-employment background check

Joining the CMU team opens the door to an array of exceptional benefits.

Benefits-eligible employees enjoy comprehensive medical, prescription, dental, and vision insurance as well as a generous retirement savings program with employer contributions. Unlock your potential with tuition benefits, take well-deserved breaks with ample paid time off and observed holidays, and rest easy with life and accidental death and disability insurance.

Additional perks include a free Pittsburgh Regional Transit bus pass, access to our Family Concierge Team to help navigate childcare needs, fitness center access, and much more!

For a comprehensive overview of the benefits available, explore our Benefits page.

At Carnegie Mellon, we value the whole package when extending offers of employment. Beyond credentials, we evaluate the role and responsibilities, your valuable work experience, and the knowledge gained through education and training. We appreciate your unique skills and the perspective you bring. Your journey with us is about more than just a job; it’s about finding the perfect fit for your professional growth and personal aspirations.

Are you interested in an exciting opportunity with an exceptional organization?! Apply today!

Location

Remote

Job Function

Software/Applications Development/Engineering

Position Type

Staff – Fixed Term (Fixed Term)

Full Time/Part time

Part time

Pay Basis

Hourly

More Information:

  • Please visit Why Carnegie Mellon to learn more about becoming part of an institution inspiring innovations that change the world.

  • Click here to view a listing of employee benefits

  • Carnegie Mellon University is an Equal Opportunity Employer/Disability/Veteran.

  • Statement of Assurance