TikTok
Software Engineer, Storage System
TikTok, San Jose, California, United States, 95199
Responsibilities
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.
Our team was established to help realize our company vision, building a global platform for creation and communication. We are doing world-class work in machine learning, computer vision, natural language processing, speech and audio, and knowledge, and transferring that work into products used by hundreds of millions of users worldwide. As vital AI infrastructure for the company, our machine learning system integrates our most up-to-date R&D results in AI algorithms and systems. Come and join us: you will get the chance to build large-scale machine learning systems and work with the best AI system and algorithm researchers and engineers.
What You’ll Be Doing
- Build a unified data storage format and query engine for different scenarios (high availability/high throughput, large volume/sequential or random access).
- Build an efficient system for model parameter management, sharding, and deduplication for LLMs.
- Develop multi-level/hierarchical storage architecture, not limited to HBM/DDR/disk.
- Optimize the training system for availability and fault tolerance; improve the data consistency and capacity of the system.
- Research and implement state-of-the-art indexing/storage structures for machine learning on the latest hardware.
Qualifications
- Proficient in the use of C++/Python in the Linux environment.
- Proficient in the design, development, maintenance and continuous optimization of large-scale distributed systems, and able to identify potential problems in complex systems.
- Have participated in optimizations for parameter-server-like systems or the indexing structures of query engines, or have experience in using/optimizing large-scale distributed storage systems such as HDFS and PFS.
- Strong communication skills and the ability to develop new solutions as issues arise.
Bonus
- Understand open source storage/engine projects such as Redis, RocksDB, Presto, etc.; understand common machine learning file storage formats such as Parquet, TFRecord, IndexRecordIO, etc.
- Familiar with one of the machine learning frameworks (TensorFlow/PyTorch/JAX).
- Have experience in one of the following fields: database systems, distributed storage, AI infrastructure, HW/SW co-design, high performance computing, ML hardware architecture (GPU, accelerators, networking), machine learning frameworks, operating systems.
- ACM/OI competitive programming experience.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach.
TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at rd.accommodations@tiktok.com.
Job Information:
The base salary range for this position in the selected city is $145,000 - $355,000 annually. Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.
Our company benefits are designed to convey company culture and values, to create an efficient and inspiring work environment, and to support our employees to give their best in both work and life. We offer the following benefits to eligible employees:
We cover 100% of the premium for employee medical insurance and approximately 75% of the premium for dependents, and offer a Health Savings Account (HSA) with a company match. We also offer Dental, Vision, Short/Long Term Disability, Basic Life, Voluntary Life and AD&D insurance plans, as well as Flexible Spending Account (FSA) options such as Health Care, Limited Purpose and Dependent Care.
Our time off and leave plans are: 10 paid holidays per year plus 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure) and 10 paid sick days per year, as well as 12 weeks of paid Parental leave and 8 weeks of paid Supplemental Disability.
We also provide generous benefits like mental and emotional health benefits through our EAP and Lyra, a 401K company match, and gym and cellphone service reimbursements. The Company reserves the right to modify or change these benefits programs at any time, with or without notice.