ByteDance
Software Engineer - Machine Learning Infrastructure
ByteDance, Seattle, Washington, US, 98127
Responsibilities:
Design and implement a global-scale machine learning system for feeds, ads, and search ranking models.
Improve the usability and flexibility of the machine learning infrastructure.
Improve the workflow of model training and serving, data pipelines, and resource management for multi-tenant machine learning systems.
Design and develop key components of the ML infrastructure, and mentor interns.
Qualifications:
Proficient in C/C++/Python, with solid programming skills.
Familiar with deep learning frameworks (TensorFlow/PyTorch).
Experience developing and deploying large-scale systems.
Ability to work independently and complete projects from beginning to end in a timely manner.
Good communication and teamwork skills, with the ability to explain technical concepts clearly to teammates.
Experience improving core machine learning infrastructure (TensorFlow, PyTorch, and JAX).
Preferred Qualifications:
Experience contributing to an open-source machine learning framework (TensorFlow/PyTorch).
Experience with big data frameworks (e.g., Spark/Hadoop/Flink), and with resource management and task scheduling for large-scale distributed systems.
Strong background in one of the following fields: Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration (e.g., GPU/TPU/RDMA), or ML for Systems.