Acceler8 Talent

Software Engineer - AI Training Data

Acceler8 Talent, Palo Alto, CA, United States


Introduction

We are seeking a Software Engineer - AI Training Data to tackle complex challenges in data management. This role is ideal for an engineer passionate about building innovative systems and optimizing large-scale datasets for advanced AI applications.

About the Company

We are dedicated to reshaping the physical world by developing cutting-edge AI that accelerates hardware innovation. Our focus is on creating frontier models that understand the complexities of semiconductors and electronics. By assembling a talented team, we aim to push the boundaries of what's possible in technology.

About the Role

As a Software Engineer - AI Training Data, you will be responsible for building and optimizing the world’s largest semiconductor dataset. Your work will support our Machine Learning team by preparing and managing vast amounts of information across multiple modalities, including text, images, and circuits. Your expertise in software engineering and scalable infrastructure will be vital for our success.

What We Can Offer You

  • Competitive annual base salary ranging from $150,000 to $350,000, based on experience and expertise.
  • Unlimited PTO to promote work-life balance.
  • Comprehensive health coverage for you and your family.
  • Commitment to your growth through challenging projects and professional development opportunities.
  • Visa sponsorship for international candidates.

Key Responsibilities

  • Build and manage the world’s largest semiconductor dataset.
  • Develop software solutions for efficient data scraping and handling at scale.
  • Extract and clean data from various modalities, including text and images.
  • Prepare and preprocess datasets for the Machine Learning team.
  • Build systems for transferring customer data and feedback.
  • Parse documents of various formats and structures.
  • Develop software pipelines for data labelers and manage workloads across cloud compute clusters.
  • Implement systems for preprocessing datasets for AI training.

In this Software Engineer - AI Training Data role, you will need a proven track record in building scalable software solutions for data pipelines. Your expertise in PDF parsing and data extraction will be essential. Candidates should have a strong understanding of data organization across multiple sources and clouds. Familiarity with state-of-the-art techniques for preparing AI training data is a must.

If you are ready to contribute to groundbreaking work in the semiconductor and AI fields, we encourage you to apply.