Staff, Data Scientist

Walmart

Data Science

Bentonville, AR, USA · Sunnyvale, CA, USA

Posted on May 29, 2026

Position Summary...

Role summary
Join Merchandising Decision Sciences (MDS) as the founding Staff Data Scientist for our new External Data and Analytics Products team. You will design, build, deploy, scale, and monitor the ML systems that power Walmart’s view of ROM (the Rest of Market): the slice of retail that doesn’t ring at our own registers but shapes every category decision we make. You will own three model families end-to-end: embedding-driven hierarchy classification, GMV distribution normalization and projection, and causal impact modeling of market share. You will be the only data scientist on the program at the start, so we need someone who can architect for scale on Databricks (on GCP) from day one, ship to production, set up the MLOps foundations, and hand a healthy, well-instrumented platform to the ML engineering team that grows in behind you. This is a builder’s role with a clear runway: get the first models live, prove the lift, and shape the team that scales them.

Have you ever wondered how Walmart sees the Rest of Market — the part of retail we don’t ring ourselves — and decides where to grow share next? Do you get a thrill from being the first scientist on a program: the one who picks the stack, ships the first model, sets the bar, and watches the platform you built fill up with scientists behind you? We’d love to put your end-to-end ML skills to work on one of retail’s hardest measurement problems.

About the team
External Data and Analytics Products is a brand-new subteam within Merchandising Decision Sciences. We acquire, model, and productize syndicated and external data — NielsenIQ, Circana, GS1, and the rest — into analytics and ML services that merchants and systems use to make sharper, faster decisions. Our charter is to turn the noisy, fragmented view of the outside world into a calibrated signal Walmart can plan against. We work as a full-stack team and we hold ourselves to engineering-level rigor: every model we ship has an owner, a monitor, and a runbook.

What you'll do...

  • Design, build, deploy, and monitor embedding-based classification models that align external product signals to the Walmart merchandising hierarchy — from candidate generation and ANN retrieval through fine-tuned classifiers and human-in-the-loop feedback for long-tail nodes.
  • Develop GMV distribution normalization and projection models that reconcile heterogeneous internal and external GMV signals across categories, time, and geography — and produce projections business partners can plan against.
  • Build causal impact models that quantify market-share movement from merchandising actions (assortment, pricing, promo, distribution) using methods such as difference-in-differences, synthetic control, Bayesian structural time series, and uplift modeling — and clearly communicate assumptions, sensitivity, and confidence to non-technical leaders.
  • Engineer for production from day one on Databricks (on GCP): PySpark + Delta for distributed training and inference, MLflow for tracking and registry, Unity Catalog for governance, Databricks Model Serving and Jobs for deployment, and BigQuery, Dataproc, and Vertex AI where they fit best.
  • Establish the MLOps foundations the ROM platform will live on: CI/CD for models, feature management, drift and quality monitoring, retraining triggers, shadow deployments, model cards, and on-call runbooks — so the ML engineers who join behind you can scale the platform without re-platforming it.
  • Own the end-to-end ML lifecycle for every model you put in production — problem framing, data contracts, training, evaluation, deployment, monitoring, retraining, and incident response.
What you’ll bring
  • Extensive industry experience as a hands-on data scientist who has personally taken ML systems from notebook to production at scale and stayed on them through monitoring, drift, and retraining.
  • Deep, hands-on experience shipping and scaling ML on Databricks — PySpark, Delta, MLflow (tracking and registry), Unity Catalog, Databricks Jobs and Workflows, and Databricks Model Serving. You know where Databricks shines and where to reach for something else.
  • Strong production fluency with GCP — BigQuery, GCS, Vertex AI, Cloud Run, Composer/Airflow — and the ability to wire Databricks and GCP services together cleanly.
  • Proven expertise with vector embeddings: training, fine-tuning, and evaluating embedding models for retail/product data; pairing embeddings with classifiers; ANN retrieval and vector indexing at catalog scale; choosing the right embedding model for the right job.
  • Deep expertise in supervised classification at scale, including tree ensembles (XGBoost / LightGBM), embedding-based classifiers, and transformer fine-tuning; comfort with severe class imbalance, noisy labels, hierarchy-aware loss design, and long-tail evaluation.
  • Strong command of forecasting and distribution modeling — hierarchical and Bayesian methods, reconciliation across hierarchies, calibrated probabilistic projections, and normalization across heterogeneous data sources.
  • Solid causal inference chops for observational retail/commercial data — difference-in-differences, synthetic control, propensity methods, Bayesian structural time series (e.g., CausalImpact), and uplift / heterogeneous treatment effects.
  • Strong MLE instincts: containerization, CI/CD for models, infrastructure-as-code where it matters, observability for ML systems, and a healthy respect for production discipline. You write code that another engineer can read, test, and extend.
  • Expert-level Python and SQL; comfortable in distributed compute (PySpark) and able to optimize a stubborn job.
  • Excellent written and verbal communication — you can explain an embedding loss to an MLE and causal estimates to a merchant in the same afternoon.
  • Familiarity with syndicated external datasets (e.g., NielsenIQ, Circana) and publicly available data and frameworks (e.g., GS1) is a strong plus.
You’ll sweep us off our feet if…
  • You’ve been the founding or solo data scientist on a program before — you’ve picked the first tools, written the first design docs, shipped the first model, and handed a healthy codebase to the team that came after you.
  • You’ve stood up the MLOps house on Databricks from scratch — MLflow registry, Unity Catalog, model serving, monitoring — and can show the pull requests to prove it.
  • You’ve scaled an embedding-driven classifier on a real retail or e-commerce catalog (millions of SKUs, long-tail nodes, drift over time) and you know exactly where it tends to break.
  • You operate with engineering-level rigor — your notebooks turn into modules, your modules turn into services, and your services have tests, alerts, and runbooks.
  • You’re a strong storyteller who can move a room of merchants and execs by connecting a model output to a market-share point — and equally comfortable in a code review defending an evaluation choice.
  • You think like a product owner: you know which model is worth shipping, which one is worth killing, and which problem isn’t a modeling problem at all.
  • You’ve felt the pain of a model in production at 2 a.m. and built the kind of monitoring and guardrails that mean you don’t feel it twice.

At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more. You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable. For information about PTO, see https://one.walmart.com/notices. Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.
For information about benefits and eligibility, see One.Walmart.
Bentonville, Arkansas US-30001: The annual salary range for this position is $110,000.00 - $220,000.00
Sunnyvale, California US-11789: The annual salary range for this position is $143,000.00 - $286,000.00
Additional compensation includes annual or quarterly performance bonuses. Additional compensation for certain positions may also include:
- Stock

Minimum Qualifications...

Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.

Option 1: Bachelor’s degree in Computer Science and 5 years’ experience in software engineering or related field.
Option 2: 7 years’ experience in software engineering or related field.
Option 3: Master’s degree in Computer Science and 3 years’ experience in software engineering or related field.
4 years' experience in data engineering, database engineering, business intelligence, or business analytics.
1 year’s supervisory experience.

Preferred Qualifications...

Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.

  • Data science, machine learning, optimization models
  • PhD in Machine Learning, Computer Science, Information Technology, Operations Research, Statistics, Applied Mathematics, Econometrics
  • Successful completion of one or more assessments in Python, Spark, Scala, or R
  • Using open source frameworks (for example, scikit-learn, TensorFlow, PyTorch)

We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.

Primary Location...

601 Respect Dr, Bentonville, AR 72716, United States of America

Walmart and its subsidiaries are committed to maintaining a drug-free workplace and have a no-tolerance policy regarding the use of illegal drugs and alcohol on the job. This policy applies to all employees and aims to create a safe and productive work environment.