Senior AI & Data Engineering Lead - Senior Vice President

Citi

Software Engineering, Data Science

Jersey City, NJ, USA

USD 176,720-265,080 / year

Posted on Jun 5, 2026

This job description outlines a senior-level role for a data architect or lead data engineer within a Data Services team. The position is centered on building and managing the data infrastructure required to support large-scale Generative AI and Machine Learning initiatives. Below is a detailed breakdown of the responsibilities and the skills required for such a role.

Expanded Responsibilities

This role combines deep technical expertise in data engineering with strategic thinking and leadership. The core responsibilities can be broken down into three main pillars:

1. Strategic AI Enablement

This goes beyond just building databases; it's about designing the entire data foundation for the company's AI strategy.

  • Data Ecosystem Architecture: You will be responsible for the high-level design of the data platform. This includes:

    • Data Lake/Lakehouse Design: Implementing a central repository to store vast amounts of structured, semi-structured, and unstructured data from various sources. This could involve technologies like AWS S3, Azure Data Lake Storage, or Google Cloud Storage.
    • Federated Querying: Leveraging technologies like Starburst (commercial Trino) to create a virtual data warehouse. This allows data consumers (analysts, data scientists, AI models) to query data across different sources (e.g., data lakes, relational databases, NoSQL databases) with a single SQL query, without needing to move or copy the data.
    • Scalability and Performance: Ensuring the architecture can scale horizontally to handle petabytes of data and a high volume of concurrent queries, which is critical for pre-training large language models (LLMs).
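
To make the federated querying idea above concrete, here is a minimal sketch of one SQL statement joining tables that live in two separate stores. It deliberately swaps in SQLite's ATTACH DATABASE from the Python standard library as a stand-in for Starburst/Trino catalogs, so it runs without a cluster. The file names, table names, and sample rows are all hypothetical and are not taken from the posting.

```python
import os
import sqlite3
import tempfile


def _create(path, ddl, insert_sql, rows):
    """Create one standalone SQLite file that plays the role of a data source."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(ddl)
        conn.executemany(insert_sql, rows)
        conn.commit()
    finally:
        conn.close()


def totals_by_region(accounts_path, events_path):
    """Join across two separate database files with a single SQL query."""
    conn = sqlite3.connect(accounts_path)
    try:
        # Attach the second source under its own name, loosely analogous to
        # addressing a second catalog in a federated engine (assumption).
        conn.execute("ATTACH DATABASE ? AS events_src", (events_path,))
        return conn.execute(
            """
            SELECT a.region, SUM(e.amount)
            FROM main.accounts AS a
            JOIN events_src.events AS e ON e.account_id = a.account_id
            GROUP BY a.region
            ORDER BY a.region
            """
        ).fetchall()
    finally:
        conn.close()


def demo():
    """Build two hypothetical sources in a temp directory and query both."""
    with tempfile.TemporaryDirectory() as directory:
        accounts_path = os.path.join(directory, "accounts.db")
        events_path = os.path.join(directory, "events.db")
        _create(
            accounts_path,
            "CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, region TEXT)",
            "INSERT INTO accounts VALUES (?, ?)",
            [(1, "NJ"), (2, "NY")],
        )
        _create(
            events_path,
            "CREATE TABLE events (account_id INTEGER, amount REAL)",
            "INSERT INTO events VALUES (?, ?)",
            [(1, 10.0), (1, 5.0), (2, 7.5)],
        )
        return totals_by_region(accounts_path, events_path)


if __name__ == "__main__":
    print(demo())
```

Neither table is copied into the other store; the join is resolved at query time. A production federated engine would add connectors, pushdown, and distributed execution, none of which this sketch attempts.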

2. Advanced AI Ops & Data Pipelines

This is the hands-on engineering aspect of the role, focused on the movement and processing of data.

  • High-Throughput Data Pipelines: You will lead the development of the data "plumbing" that powers the AI systems. This includes:
    • Batch Processing: Using Apache Spark for large-scale data transformation, cleaning, and feature engineering on historical data.
    • Real-time Stream Processing: Using Apache Kafka as a messaging bus to ingest real-time data from sources like application logs, IoT devices, or clickstreams. Apache Flink would be used for complex event processing on these streams (e.g., fraud detection, real-time recommendations).
  • Optimization and Reliability: Your pipelines must be not only fast but also resilient. This involves:
    • Low Latency: Tuning jobs and infrastructure to minimize the time it takes for data to travel from source to destination.
    • High Availability: Implementing failover mechanisms, monitoring, and alerting to ensure the data pipelines are always running and the AI models have uninterrupted access to fresh data.
    • CI/CD for Data: Implementing DevOps and AI Ops best practices for data pipelines, including automated testing, deployment, and data quality checks.
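
The stream-processing item above mentions complex event processing such as fraud detection. As a rough illustration only, the sketch below counts events per key in fixed (tumbling) time windows and flags keys that reach a threshold. It is plain Python over an in-memory list, not Flink or Kafka code, and it ignores late data, watermarks, and durable state. The function name, window size, threshold, and sample events are hypothetical.

```python
from collections import defaultdict


def flag_bursts(events, window_seconds=60, threshold=3):
    """Count events per (window, key) and return the windows at or over threshold.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns a sorted list of (window_start, key, count) tuples.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of its tumbling window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return sorted(
        (window_start, key, count)
        for (window_start, key), count in counts.items()
        if count >= threshold
    )


# Hypothetical card-activity events: (seconds since start, card id).
sample_events = [
    (0, "card-A"),
    (10, "card-A"),
    (59, "card-A"),
    (61, "card-A"),
    (5, "card-B"),
    (70, "card-B"),
]

if __name__ == "__main__":
    print(flag_bursts(sample_events))
```

With these sample events, only card-A in the window starting at second 0 has three events, so it is the only entry returned. A real streaming job would emit results incrementally as each window closes rather than after reading the whole input.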

3. AI Governance & Leadership

This pillar focuses on the "people" and "process" aspects of the role, ensuring data is used responsibly and effectively.

  • Data Governance for AI: As AI systems become more critical, the data they use must be trustworthy. You will establish frameworks for:
    • Data Quality: Implementing automated checks and monitoring to ensure data is accurate, complete, and consistent.
    • Data Provenance & Lineage: Creating systems to track where data comes from, how it has been transformed, and how it is used. This is crucial for debugging models and for regulatory compliance.
    • Data Security: Working with security teams to implement access controls, data masking, and encryption to protect sensitive information, especially in the context of training AI models.
  • Team Leadership and Mentorship: This is a leadership role where you will be expected to:
    • Mentor Data Engineers: Guide junior and mid-level engineers, conduct code reviews, and establish best practices for the team.
    • Foster Innovation: Stay up-to-date with the latest technologies and methodologies in the data and AI space and encourage a culture of experimentation and continuous improvement.
    • Cross-functional Collaboration: Work closely with data scientists, ML engineers, platform engineers, and business stakeholders to understand their needs and deliver effective data solutions.
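
The data quality and data security items under Data Governance for AI can be sketched in a few lines. The example below is an assumption-laden illustration rather than any specific framework: it checks rows for completeness, duplicate identifiers, and negative amounts, and masks an account number down to its last four characters. The field names (txn_id, account, amount) and the rules themselves are hypothetical.

```python
def mask_account(value, visible=4):
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def check_rows(rows, required=("txn_id", "account", "amount")):
    """Return (row_index, problem) tuples; an empty list means all checks passed."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        # Completeness: every required field must be present and non-empty.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((index, f"missing {field}"))
        # Consistency: transaction ids should be unique.
        txn_id = row.get("txn_id")
        if txn_id is not None:
            if txn_id in seen_ids:
                problems.append((index, "duplicate txn_id"))
            seen_ids.add(txn_id)
        # Accuracy (hypothetical rule): amounts must not be negative.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            problems.append((index, "negative amount"))
    return problems


# Hypothetical sample rows, with two deliberately bad records.
sample_rows = [
    {"txn_id": 1, "account": "1234567890", "amount": 10.0},
    {"txn_id": 1, "account": "1234567890", "amount": -5.0},
    {"txn_id": 2, "account": "", "amount": 3.0},
]

if __name__ == "__main__":
    print(check_rows(sample_rows))
    print(mask_account("1234567890"))
```

Checks like these are the kind of thing that could run as an automated gate in a pipeline's CI/CD stage. Simple character masking is shown only for illustration and is not a substitute for encryption or access controls.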

Qualifications:

  • 10+ years of relevant experience
  • Experience in implementing projects
  • Experience in systems analysis and programming of software applications
  • Demonstrated Subject Matter Expert (SME) in area(s) of Applications Development
  • Demonstrated knowledge of client core business functions
  • Demonstrated leadership, project management, and development skills
  • Relationship and consensus building skills

Education:

  • Bachelor’s degree/University degree or equivalent experience
  • Master’s degree preferred

Required Skills

To succeed in this role, a candidate would need a blend of technical depth, strategic vision, and leadership qualities.

Big Data Technologies:

- Processing Frameworks: Expert-level knowledge of Apache Spark. Strong experience with Apache Flink and Apache Kafka.
- Query Engines: Deep understanding and hands-on experience with Trino (Starburst).
- Orchestration: Experience with workflow management tools like Airflow or Prefect.

Data Architecture:

- Data Modeling: Strong understanding of data modeling concepts for both analytical and operational systems.
- Platform Design: Proven experience designing and building scalable data lakes, data warehouses, and lakehouse architectures.
- Cloud Expertise: Proficiency with at least one major cloud provider (AWS, GCP, Azure) and their data services (e.g., S3, Glue, EMR, BigQuery, Databricks).

Governance & Security:

- Data Governance: Experience implementing data quality frameworks, data lineage solutions, and data cataloging tools.
- Security: Knowledge of data security best practices, including encryption, masking, and role-based access control (RBAC).

Programming:

- Python: Expert-level proficiency.
- SQL: Expert-level proficiency for complex analytical queries.
- Scala/Java: Often beneficial for deep work in Spark or Flink.

Soft Skills:

- Leadership: Proven ability to lead complex technical projects and mentor engineers.
- Strategic Thinking: Ability to connect data strategy to broader business and technology objectives.
- Communication: Excellent verbal and written communication skills to articulate complex technical concepts to both technical and non-technical audiences.
- Problem-Solving: Strong analytical and troubleshooting skills.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Development

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Primary Location:

Jersey City, New Jersey, United States

------------------------------------------------------

Primary Location Full Time Salary Range:

$176,720.00 - $265,080.00


In addition to salary, Citi’s offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Anticipated Posting Close Date:

Jun 11, 2026

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.