Staff, Software Engineer
Walmart
Software Engineering
United States · California, USA · Texas, USA · Tampa, FL, USA · Sunnyvale, CA, USA · Remote
USD 143k-286k / year + Equity
Posted on Mar 27, 2026
Position Summary...
What you'll do...
About the Team
At Walmart Global Tech, we build highly scalable and reliable backend platforms that power the online marketplace of the world’s largest retail ecosystem. Our systems process massive volumes of real-time and batch data across the Walmart marketplace. We are looking for a Staff Software Engineer with deep expertise in Java and Spring Boot, strong hands-on experience in Apache Kafka and Apache Spark, and a proven track record of building distributed systems at scale.
This is an onsite role in Sunnyvale, CA, and candidates must have valid U.S. work authorization as visa sponsorship is not available.
Role Overview
As a Staff Engineer, you will act as a hands-on technical leader and system architect, responsible for designing and delivering large-scale backend platforms and data processing systems. You will work cross-functionally to solve complex engineering challenges, influence platform architecture, and mentor senior engineers. This role requires strong ownership, deep system thinking, and the ability to design for high throughput, low latency, and extreme reliability.
Key Responsibilities
- Design and build highly scalable backend microservices using Java and Spring Boot.
- Architect and implement real-time event-driven systems using Apache Kafka.
- Develop and optimize large-scale batch and streaming data pipelines using Apache Spark.
- Drive architecture decisions around scalability, resiliency, observability, and cost efficiency.
- Lead system design reviews and define engineering best practices for distributed systems.
- Work closely with Product, Data Science, Platform, and Infrastructure teams to deliver business impact.
- Optimize system performance through partitioning strategies, caching, async processing, and concurrency tuning.
- Mentor engineers and act as a technical multiplier across multiple teams.
- Participate in production incident reviews and drive long-term platform reliability improvements.
Preferred Skills
Orchestration Ecosystem: Direct experience building or deeply customizing platforms like Temporal.io, Cadence, Apache Airflow, or Argo Workflows.
1. Distributed State Management & Durable Execution
- Deep State Knowledge: Experience managing the state of long-running processes that must survive infrastructure failures, network partitions, and deployments.
- Event Sourcing & CQRS: Familiarity with using event-sourcing patterns to rebuild the state of a workflow by replaying history.
- Transactions: Understanding of the Saga Pattern for managing distributed transactions and implementing compensations (rollbacks) across microservices.
2. Fault Tolerance & High Availability
- Idempotency Mastery: Expertise in designing systems where tasks can be retried indefinitely without duplicate side effects, a critical requirement for any orchestration engine.
- Advanced Retry Policies: Knowledge of jitter, exponential backoff, and circuit breakers to prevent "thundering herd" problems when a downstream service fails.
- Rate Limiting & Quotas: Experience building multi-tenant throttling mechanisms to ensure one massive workflow doesn't starve others of resources.
3. Developer Experience (DevX) & DSLs
- DSL Design: Experience designing Domain-Specific Languages (YAML, JSON, or Python-based) that allow users to define complex logic simply.
- SDK Development: Ability to build client-side libraries that abstract away the complexity of the underlying orchestration engine for other developers.
4. High-Throughput Messaging & Queuing
- Message Brokers: Professional experience with Kafka, Pulsar, or RabbitMQ specifically used as a task distribution layer.
- Priority Queuing: Implementing logic to handle "hot" tasks vs. background tasks efficiently.
5. Ecosystem Familiarity
- Hands-on experience with existing orchestrators such as Temporal.io, Cadence, Apache Airflow, Argo Workflows, or AWS Step Functions.
- An understanding of why these tools succeed (or fail) in specific use cases.
Required Qualifications
- 12+ years of experience in backend and distributed systems engineering.
- Must have strong hands-on experience in Java and Spring Boot for building production-grade microservices.
- Deep expertise in Apache Kafka:
  - Topic design and partitioning
  - Consumer group scaling and offset management
  - Delivery semantics (at-least-once / exactly-once)
  - Stream processing patterns and performance tuning
- Strong hands-on experience with Apache Spark:
  - Batch and Structured Streaming workloads
  - Job optimization (shuffle tuning, memory tuning, skew handling)
  - Working with large-scale datasets
- Proven experience building systems operating at large scale (millions–billions of events / high TPS platforms).
- Experience designing event-driven microservices architectures.
- Strong understanding of distributed systems fundamentals:
  - Fault tolerance
  - Back-pressure
  - Idempotency
  - Consistency trade-offs
- Experience with cloud-native deployments (Kubernetes, Docker, AWS/GCP/Azure).
- Experience with NoSQL / analytical data stores such as Cassandra, BigQuery, HBase, or similar.
- Strong production debugging and performance tuning skills.
Preferred Qualifications
- Experience in retail, supply chain, pricing, ads, or e-commerce platforms.
- Exposure to real-time analytics, recommendation engines, or fraud detection systems.
- Experience driving cross-team technical initiatives and platform modernization efforts.
- Familiarity with CI/CD pipelines, observability (metrics/logging/tracing), and infrastructure as code.
- Experience contributing to internal frameworks or platform engineering efforts.
- Provide technical direction across teams and influence architectural decisions.
- Raise the engineering bar through mentorship, design rigor, and operational excellence.
- Balance hands-on coding with strategic technical leadership.
- Drive initiatives that improve developer productivity and platform scalability.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.
For information about benefits and eligibility, see One.Walmart.
The annual salary range for this position is $143,000.00 - $286,000.00. Additional compensation includes annual or quarterly performance bonuses. Additional compensation for certain positions may also include:
- Stock
Minimum Qualifications...
Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.
Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 4 years’ experience in software engineering or related area.
Option 2: 6 years’ experience in software engineering or related area.
Preferred Qualifications...
Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.
Master’s degree in Computer Science, Computer Engineering, Computer Information Systems, Software Engineering, or related area and 2 years' experience in software engineering or related area.
We value candidates with a background in creating inclusive digital experiences who demonstrate knowledge of implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, working with assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.