The New York Times Company

Senior Software Engineer, Machine Learning Platform

The New York Times Company, New York, New York, US, 10261


About the Role

Machine Learning (ML) at The New York Times enhances the experience of our 150 million digital readers around the globe and grows our subscriber base through content recommendations and personalization.

The Machine Learning Platform (MLP) team builds and maintains the infrastructure, including both data and compute, that hosts all of The New York Times's real-time ML inference models. On one end, our partners are data scientists who build and deploy their ML models on the ML platform. On the other end, our partners are engineering systems that call these hosted models at scale, with low latency and Service Level Agreements (SLAs) guaranteed by the MLP.

We are looking for a Senior Software Engineer with a focus on MLOps to join our Machine Learning Platform team and help solve creative challenges around Machine Learning infrastructure for The New York Times.

This role has a hybrid work schedule, is based in New York City, and reports to the engineering manager of the Machine Learning Platform team.

Responsibilities:

You will research, develop, and deploy infrastructure for the Machine Learning Platform that supports large-scale multi-tenant workloads

You will build a platform to train and test algorithms that provide real-time content recommendations and personalization to our readers

You will enhance the ML platform’s CI/CD and integration testing capabilities

You will help build a platform that empowers data scientists with self-service patterns supporting the full ML workflow, from model training and testing to production deployment at high scale and low latency across our products

Demonstrate support and understanding of our value of journalistic independence and a strong commitment to our mission to seek the truth and help people understand the world.

Basic Qualifications:

4+ years of directly relevant experience in MLOps or DevOps, including experience operating large systems in a production environment

Experience deploying and monitoring systems using cloud infrastructure (GCP or AWS)

Experience working with Kubernetes, Docker, and CI/CD (Drone, Argo, Jenkins, etc.)

Experience leading the development of large-scale, data-driven, distributed multi-tenant systems

Proficiency with at least one high-level programming language, such as Python or Go

Preferred Qualifications:

Experience with any of these technologies: Terraform, Airflow, SQL/BigQuery, BigTable or other NoSQL datastores such as Cassandra, DynamoDB, Redis

Familiarity with ML tooling such as Triton, TensorFlow, scikit-learn

Experience building the infrastructure that powers real-world machine learning applications like recommendation systems, bandits, etc.

Experience engaging with partners to understand pain points, observe patterns, and identify opportunities for improvements

This role may require limited on-call hours. An on-call schedule will be determined when you join, taking into account team size and other variables.

REQ-017430
