Senior Software Engineer - Apple ML Data Platform
Apple, Cupertino, CA, United States
Software and Services
The Apple Data Platform (ADP) group builds the data platform that enables the next generation of intelligent experiences on all Apple products and services. ADP empowers Apple engineers to deliver ML-driven products and innovations rapidly and at scale. We are looking for an experienced engineer who can bring their passion for machine learning, infrastructure, big data, and distributed systems to build world-class data and ML platforms and products at scale. You will work with many cross-functional teams and lead the planning, execution, and success of technical projects with the ultimate purpose of improving the ML experience for Apple customers. Are you passionate about building scalable, reliable, maintainable infrastructure and solving data problems at scale? Come join us and be part of the Data Infrastructure journey.
Description
The ADP ML Data Platform team enables future Apple intelligent products by providing Apple engineers with cutting-edge ML technologies and large-scale compute and data systems specifically designed for machine learning. The platform focuses on ML data management, feature engineering, and embedding management, empowering teams to efficiently build and deploy ML models at scale. As a member of the Apple ML Data Platform team, you will:
- Design, implement, and maintain distributed systems to build world-class ML platforms/products at scale.
- Build relationships with stakeholders across the organization to better understand internal customer needs and enhance our product for end users.
- Partner with ML, data engineering, and infrastructure teams to deliver end-to-end solutions.
- Optimize platform components for large-scale ML workloads across distributed systems.
- Diagnose and fix complex issues across the entire stack, and automate their remediation, to ensure maximum uptime and performance.
- Design and extend services to improve functionality and reliability of the platform.
- Monitor system performance, optimize for cost and efficiency, and resolve any issues that arise.
Minimum Qualifications
- 5+ years of experience in distributed systems with deep knowledge of computer science fundamentals.
- Experience delivering data and machine learning infrastructure in production environments.
- Experience configuring, deploying, and troubleshooting large-scale production environments.
- Experience in designing, building, and maintaining scalable, highly available systems that prioritize ease of use.
- Experience with alerting, monitoring, and remediation automation in a large-scale distributed environment.
- Extensive programming experience in Java, Python, or Go.
- Strong collaboration and communication (verbal and written) skills.
- B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or equivalent practical experience.
Preferred Qualifications
- Experience with containerization and orchestration technologies, such as Docker and Kubernetes.
- Understanding of the ML lifecycle and state-of-the-art ML Infrastructure technologies.