FlexAI

Head of Performance Architecture

FlexAI, Santa Clara, California, US, 95053

Join FlexAI: FlexAI is at the forefront of revolutionizing AI computing by reengineering infrastructure at the system level. Our groundbreaking architecture, combined with sophisticated software intelligence, abstraction, and an orchestration layer, allows developers to leverage a diverse array of compute, resulting in efficient, more reliable computing at a fraction of the cost.

The rapid evolution of machine intelligence has created a need for a new system architecture capable of handling high memory capacity and bandwidth. These are critical bottlenecks in pushing machine intelligence to the next level, where compute demand is expected to increase up to 1000 times current levels.

We are looking for a Head of Performance Architecture who is not afraid of pushing boundaries and reimagining what’s possible. In this role, you will lead the performance architecture team, optimize system performance, and architect solutions that propel our platforms to deliver exceptional speed, efficiency, and reliability.

Position Overview: As the Head of Performance Architecture, you will oversee the analysis, design, and optimization of AI systems and infrastructure performance, essentially architecting AI efficiency. This role requires a technical understanding of system architectures and hardware acceleration and the ability to collaborate with experts across multiple disciplines to identify performance bottlenecks, improve system throughput, and ensure that AI models and workloads operate at peak performance, even at scale.

What you’ll do:

Lead and mentor the performance architecture team, fostering a culture of excellence, innovation, and collaboration.

Define and execute the overall performance strategy, ensuring alignment with business goals and technical requirements.

Oversee the analysis and optimization of AI systems' performance, including hardware and software components, to support AI workloads.

Manage system architecture, design, development, and optimization to ensure high performance, scalability, and reliability.

Identify and address performance bottlenecks across AI infrastructures, including CPUs, GPUs/TPUs, memory, storage, and networking.

Collaborate with AI researchers, data scientists, and infrastructure teams to optimize system performance across the stack.

Lead performance reviews and architecture evaluations to guide the design of new systems and ensure they meet performance requirements.

Provide guidance on best practices for AI system performance, including workload distribution, resource allocation, and hardware utilization.

Stay current with the latest advancements in AI hardware and software, integrating cutting-edge technologies to drive performance improvements.

What you’ll need to be successful:

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field. Advanced degrees are a plus.

10+ years of experience in system performance engineering focused on AI or high-performance computing (HPC), including at least five years in a leadership role.

Proven experience optimizing AI systems’ performance, including hardware and software components.

Deep knowledge of AI architectures, including GPUs, TPUs, and specialized AI accelerators.

Strong understanding of AI frameworks (e.g., TensorFlow, PyTorch) and how they interact with hardware.

Experience with performance analysis tools, including profiling, benchmarking, and monitoring tools.

Expertise in hardware acceleration technologies and techniques for optimizing AI workloads.

Ability to work with cross-functional teams to drive performance improvements.

Strong problem-solving skills and the ability to make data-driven decisions.

Model inclusive behaviors and contribute to a culture that respects different backgrounds and perspectives.

Preferred Skills:

Experience with distributed AI systems and scaling AI workloads across large-scale infrastructure.

Knowledge of cloud-based AI platforms and performance optimization in cloud environments.

Familiarity with containerized environments (e.g., Kubernetes) and AI performance in these contexts.

Experience with low-level optimization, including assembly-level tuning and compiler optimization for AI workloads.

Strong background in networking performance, including low-latency, high-throughput communication architectures.

What we offer:

A competitive salary and benefits package, tailored to recognize your dedication and contributions.

The opportunity to collaborate with leading experts in AI and cloud computing, learning from the best and the brightest.

An environment that values innovation, collaboration, and mutual respect.

Support for personal and professional development, empowering you with the tools and resources to elevate your skills.

A pivotal role in the AI revolution, shaping the technologies that power the innovations of tomorrow.

About FlexAI: Founded by Brijesh Tripathi and Dali Kilani, who bring experience from Nvidia, Apple, Tesla, Intel, Lifen, and Zoox, FlexAI is not just building a product – we’re shaping the future of AI.

Offices:

Paris - HQ

San Francisco (Bay Area) - US office

Bangalore - India office

Apply NOW! You’ve seen what this role entails. Now we want to hear from you! Does this opportunity align with your aspirations? If you’re even slightly curious, we encourage you to apply – it could be the start of something extraordinary!

At FlexAI, we believe diverse teams are the most innovative teams. We’re committed to creating an inclusive environment where everyone feels valued, and we proudly offer equal opportunities regardless of gender, sexual orientation, origin, disabilities, veteran status, or any other facets of your identity that make you uniquely you.
