TikTok
Machine Learning Engineer - Data Curation - AIGC, TikTok Monetization GenAI
TikTok, San Jose, CA
Description
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible. Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

We are the Generative AI team under Monetization Technology. Our team develops cutting-edge Generative AI technologies across all modalities, including text, images, videos, and landing pages, and creates industry-leading technical solutions that improve creative efficiency for advertisers, agencies and creators. We are committed to automating creative workflows with Generative AI technologies to increase overall revenue for advertisers, agencies and creators.

We aim to drive and lead generative AI in the ad tech and creative industries, powering products and driving value for our clients, creators, and the whole ecosystem. We are looking for infrastructure engineers who are excited to grow their business understanding, build highly scalable and reliable software and infrastructure, partner across functions with global teams, and make a big impact.
If you are someone who welcomes challenges, we are eager to have you on the team!

Responsibilities:
- Collaborate with foundational model researchers, including specialists in Ads LLM, Text-to-Image, and Text-to-Video, to develop and maintain efficient, low-latency data pipelines.
- Design and implement robust, scalable systems for data curation and management, supporting the foundational training of models across various formats in distributed environments.
- Implement data insights and model evaluation pipelines to enhance user engagement and drive revenue growth.
- Develop caching mechanisms to improve data retrieval speeds and enhance model responsiveness.
- Stay abreast of the latest academic research and open-source advancements, integrating cutting-edge technologies to continuously improve our data operations and machine learning model performance.

Qualifications

Minimum Qualifications:
1. B.S./M.S./Ph.D. in Computer Science, Computer Engineering, or a related field.
2. Programming and Technical Proficiency: Expertise in Python and a strong foundation in deep learning frameworks such as PyTorch, large-model training libraries such as FSDP and DeepSpeed, and asynchronous programming with asyncio. A minimum of 3 years' experience with Linux, Docker, and Kubernetes is required.
3. Data Engineering and AI/ML Knowledge: Demonstrated capability in data curation, management, and optimization within Generative AI ecosystems, encompassing both streaming and batch data processing. This includes a thorough understanding of machine learning frameworks, parallel data processing techniques, and proficiency with large language models (e.g., Llama series), text-to-image models (e.g., Diffusion-Based Models, Diffusion Transformers), and text-to-video technologies (e.g., EMU series, MagViT).

Preferred Qualifications:
1. Advanced Technical Expertise: Experience in CUDA optimization and a deep understanding of the application of Generative AI models across multiple domains.
2. Cloud Computing and Distributed Systems: Significant experience in managing large-scale data systems, with a strong preference for candidates who have worked with vector database solutions. Proficiency in cloud services (AWS/GCP) and familiarity with machine learning training, deployment, and distributed computing frameworks like Spark.
3. Interpersonal and Problem-Solving Skills: A demonstrated passion for technology, coupled with outstanding problem-solving capabilities. Exceptional communication, teamwork, and project management skills are essential, along with a resilient character.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/cdpT2

Regular
Experienced