Twitch

Senior Data Engineer - Community Discovery ML

Twitch, San Francisco, California, United States, 94199


About Us

Twitch is the world’s biggest live streaming service, with global communities built around gaming, entertainment, music, sports, cooking, and more. It is where thousands of communities come together for whatever, every day.

We’re about community, inside and out. You’ll find coworkers who are eager to team up, collaborate, and smash (or elegantly solve) problems together. We’re on a quest to empower live communities.

About the Role

The Community Discovery ML team focuses on providing personalized, relevant experiences for Twitch users through Recommendation and Search. We are looking for a senior data engineer to join us. You will be the first data engineer hired in a hybrid team of ML engineers and scientists, working on data challenges related to ML models and products. You will extend, design, and build new capabilities in our data systems to ensure fast ML model development and productionization. You will have impact across teams by defining expectations for data usage patterns and data quality.

You will report to an Engineering Manager and work in San Francisco / Bay Area.

You Will:

Oversee team data architecture to meet ML use cases in production.

Design and build scalable data pipelines to support personalization models.

Develop and maintain low-latency, large-scale streaming and batch data processing systems.

Collaborate with applied scientists and ML engineers to integrate data into production models.

Optimize data workflows for performance and cost efficiency.

Implement best practices for data governance and security.

Troubleshoot and resolve data-related issues, focusing on identifying and solving data quality problems.

Mentor others in the team in data-related solutions and skills.

You Have:

6+ years of experience as a data engineer or in a similar role.

Proficiency in SQL, Python, or Scala.

Experience with building batch and streaming data pipelines with high throughput and low latency.

Strong understanding of data architecture and data modeling principles.

Experience analyzing large datasets to identify gaps and inconsistencies, provide data insights, and promote effective product solutions.

Hands-on experience with cloud platforms (AWS, GCP, or Azure) and their data services.

Familiarity with ETL tools and data warehousing solutions.

Experience with distributed data processing technologies such as Apache Spark, Flink, and Kafka.

Experience working with cross-functional roles like ML engineers and scientists.

Bonus Points

Experience with AWS data ecosystems like Redshift, Kinesis, and Glue.

Understanding of data requirements for ML production systems.

Extensive experience with mature, large-scale production data systems, and the ability to define a strong North Star and make incremental progress toward it.

Perks

Medical, Dental, Vision & Disability Insurance

401(k)

Maternity & Parental Leave

Flexible PTO

Amazon Employee Discount

We are an equal opportunity employer and value diversity at Twitch. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Twitch values your privacy. Please consult our Candidate Privacy Notice for information about how we collect, use, and disclose personal information of our candidates.

Job ID: TW8541
