Walmart
Senior, Data Engineer, Big Data
Walmart, California, Missouri, United States, 65018
Position Summary:
At Walmart, we help people save money, so they can live better. This mission serves as the foundation for every decision we make and drives us to create the future of retail. We can’t do that without the best talent – talent that is innovative, curious, and driven to create exceptional experiences for our customers.
Do you have boundless energy and passion for engineering data used to solve dynamic problems that will shape the future of retail? With the sheer scale of Walmart’s environment comes the biggest of big data sets.
As a Walmart Data Engineer in Marketplace, you will dig into our mammoth scale of data to help unleash the power of retail data science by imagining, developing, and maintaining data pipelines that our Data Scientists and Analysts can rely on. You will be responsible for contributing to an orchestration layer of complex data transformations, refining raw data from source into targeted, valuable data assets for consumption in a governed way. You will partner with Data Scientists, Analysts, other engineers and business stakeholders to solve complex and exciting challenges so that we can build out capabilities that evolve the retail business model while making a positive impact on our customers’ and sellers’ lives.
What you'll do:
You will use cutting-edge data engineering techniques to create critical datasets and dig into our mammoth scale of data to help unleash the power of data science by imagining, developing, and maintaining data pipelines that our Data Scientists and Analysts can rely on.
You will be responsible for contributing to an orchestration layer of complex data transformations, refining raw data from source into targeted, valuable data assets for consumption in a governed way.
You will partner with Data Scientists, Analysts, other engineers, and business stakeholders to solve complex and exciting challenges so that we can build out capabilities that evolve the marketplace business model while making a positive impact on our customers' and sellers’ lives.
You will participate with limited help in small to large projects by reviewing project requirements; gathering requested information; writing and developing code; conducting unit testing; communicating status and issues to team members and stakeholders; collaborating with the project team and cross-functional teams; troubleshooting open issues and bug fixes; and ensuring on-time delivery and hand-offs.
You will design, develop, and maintain highly scalable and fault-tolerant real-time, near-real-time, and batch data systems/pipelines that process, store, and serve large volumes of data with optimal performance.
You will ensure data ingested and processed is accurate and of high quality by implementing data quality checks, data validation, and data cleaning processes.
You will identify possible options to address business problems within your discipline through analytics, big data analytics, and automation.
You will build business domain knowledge to support the data needs of product teams, analytics, data scientists, and other data consumers.
What you'll bring:
At least 4 years of experience developing big data technologies/data pipelines.
Experience in managing and manipulating huge datasets on the order of terabytes (TB) is essential.
Experience with big data technologies like Hadoop, Apache Spark (Scala preferred), Apache Hive, or similar frameworks on the cloud (GCP preferred; AWS, Azure, etc.) to build batch data pipelines with a strong focus on optimization, SLA adherence, and fault tolerance.
Experience in building idempotent workflows using orchestrators like Automic, Airflow, Luigi, etc.
Experience in writing SQL to analyze, optimize, and profile data, preferably in BigQuery or Spark SQL.
Strong data modeling skills are necessary for designing a schema that can accommodate the evolution of data sources and facilitate seamless data joins across various datasets.
Ability to work directly with stakeholders to understand data requirements and translate them into pipeline development / data solution work.
Strong analytical and problem-solving skills are crucial for identifying and resolving issues that may arise during the data integration and schema evolution process.
The ability to move at a rapid pace without sacrificing quality, and to start delivering with minimal ramp-up time, is crucial to success in this initiative.
Effective communication and collaboration skills are necessary for working in a team environment and coordinating efforts between different stakeholders involved in the project.
Nice to have from you:
Experience building complex near-real-time (NRT) streaming data pipelines using Apache Kafka, Spark Streaming, and Kafka Connect with a strong focus on stability, scalability, and SLA adherence.
Good understanding of REST APIs and working knowledge of Apache Druid, Redis, Elasticsearch, GraphQL, or similar technologies. Understanding of API contracts, building telemetry, stress testing, etc.
Exposure to developing reports/dashboards using Looker/Tableau.
Experience in eCommerce domain preferred.
Minimum Qualifications:
Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.
Option 1: Bachelor’s degree in Computer Science and 3 years' experience in software engineering or related field.
Option 2: 5 years’ experience in software engineering or related field.
Option 3: Master's degree in Computer Science and 1 year’s experience in software engineering or related field.
2 years' experience in data engineering, database engineering, business intelligence, or business analytics.
Preferred Qualifications:
Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.
Experience in data engineering, database engineering, business intelligence, or business analytics; experience with ETL tools and with large data sets in the cloud.
Master’s degree in Computer Science or related field and 3 years' experience in software engineering.
We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly.
The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.
Primary Location:
640 W California Avenue, Sunnyvale, CA 94086-4828, United States of America