Senior Data Engineer (AI, ML)
Oracle
Essential Skills
- Proficiency in Python (Java a plus) with hands-on experience in modern ML frameworks such as PyTorch and TensorFlow, plus a solid foundation in statistics and data modeling.
- Experience building end-to-end ML and GenAI pipelines, including data preprocessing, feature engineering, model training, validation, and production deployment.
- Practical expertise in Generative AI and RAG systems, including embeddings, chunking strategies, hybrid retrieval, reranking, and evaluation techniques.
- Hands-on experience with agentic AI workflows, including prompt engineering, intent routing, tool orchestration, function calling, and safe tool-use with guardrails.
- Experience with enterprise software development and cloud-native architectures, including REST APIs, microservices, containerization, CI/CD, and platforms such as AWS, Azure, GCP, or Oracle Cloud.
- Strong problem-solving skills, with the ability to translate business requirements into scalable, reliable, and cost-effective AI solutions.
- Excellent written and verbal communication skills, with the ability to work effectively in a collaborative, cross-functional, and global team environment.
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.
We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
Senior AI/ML Engineer
Career Level - IC3
AI/ML
- Design, train, and optimize machine learning models for real-world applications.
- Build end-to-end ML pipelines, including data preprocessing, feature engineering, model training, validation, and deployment.
- Collaborate with data engineers and software developers to integrate ML models into production systems.
- Monitor model performance, detect data drift, and retrain models for continuous improvement.
GenAI
- Agentic Solution Design & Orchestration
  - Architect LLM-powered applications, including intent routing across tools and skills.
  - Implement agentic workflows using frameworks such as LangGraph or equivalents: decompose tasks, manage tool invocation, and enforce determinism and guardrails.
  - Integrate MCP-compatible tools and services to extend system capabilities.
- Retrieval & Embeddings
  - Build effective RAG systems: chunking strategies, embedding model selection, vector indexing, reranking, and grounding to authoritative data.
  - Optimize vector stores and search using ANN, hybrid retrieval, filters, and metadata schemas.
- Prompting & Model Strategy
  - Develop robust prompting patterns and templates; structure prompts for tool use and function calling.
  - Compare generic vs. fine-tuned LLMs for intent routing; make data-driven choices on cost, latency, accuracy, and maintainability.
- Data & Integrations
  - Implement NL2SQL patterns with guarded SQL execution; connect to microservices and enterprise systems via secure APIs.
  - Define and enforce data schemas, metadata, and lineage for reliable retrieval.
- Production Readiness
  - Establish evaluation datasets and automated regression tests for RAG and agent systems.
  - Monitor quality (precision/recall, hallucination rate), latency, cost, and safety.
  - Apply guardrails, PII handling, access controls, and policy enforcement end-to-end.
MLOps / LangOps
- Version prompts, models, embeddings, and pipelines; manage A/B tests and rollout strategies.
- Instrument tracing and telemetry for agent steps and tool calls; implement fallback, timeout, and retry policies.
Core Qualifications
- Programming
  - Strong proficiency in Python (NumPy, Pandas, Scikit-learn); experience with ML frameworks such as TensorFlow and PyTorch.
- Machine Learning & Deep Learning
  - Hands-on experience with supervised, unsupervised, and reinforcement learning techniques.
- Mathematics & Statistics
  - Solid foundation in linear algebra, probability, optimization, and statistical modeling.
- Data Handling
  - Experience with SQL and NoSQL databases, data preprocessing, and feature engineering.
- GenAI Expertise
  - Strong understanding of vector embeddings and similarity search (cosine, inner product, L2), chunking strategies, and reranking.
  - Hands-on experience building RAG pipelines (indexing, metadata, hybrid search, evaluators).
  - Practical prompt engineering for tool use, function calling, and agent planning.
  - Experience with agentic frameworks (e.g., LangGraph or similar) and orchestration of tools and services; familiarity with MCP and tool-integration patterns.
  - Knowledge of NL2SQL techniques, SQL safety (schema constraints, query sandboxes), and microservice integration.
  - Ability to evaluate tradeoffs between generic/base LLMs and fine-tuned/task-specific models (accuracy, drift, data/ops burden, latency, and cost).
  - Proficiency with Python, common LLM/RAG libraries, containerization, and CI/CD.
  - Understanding of enterprise security, privacy, and compliance; RBAC/ABAC for data access, logging, and auditability.
MLOps & Deployment
- Familiarity with model deployment frameworks (MLflow, Kubeflow, SageMaker, Vertex AI), CI/CD pipelines, and containerization using Docker and Kubernetes.
Preferred Experience
- Hands-on experience with at least one major cloud provider (AWS, Azure, GCP, or OCI).
- Experience with large-scale distributed systems and big data frameworks (Spark, Hadoop).
- Retrieval optimization using hybrid lexical + vector search, metadata filtering, and learned rerankers.
- Model fine-tuning (SFT, DPO), adapter methods such as LoRA, and evaluation of tuned models.
- Observability stacks for LLM applications (tracing, evaluation dashboards, cost/latency SLOs).
- Document AI (OCR, layout parsing) and schema construction for unstructured data.
- Caching, batching, and KV-cache optimization for throughput and cost efficiency.
- Safe tool-use patterns, including constrained decoding, JSON schemas, and policy checks.
How We’ll Assess
- Portfolio or walkthrough of a production RAG or agent system: objectives, architecture, evaluations, and outcomes.
- Hands-on exercise: design an intent router, justify model choice (generic vs. fine-tuned), propose chunking and metadata strategy, and define evaluation metrics.
- Discussion of failure modes (hallucinations, tool errors, SQL risk) and mitigation strategies.
- Approach to governance: access controls, PII handling, audit logging, and red-teaming.