ZipRecruiter
Senior Solutions Architect (AI/ML)
ZipRecruiter, Santa Clara, California, US, 95053
Job Description
We are looking for a Senior Solutions Architect to design, develop, and scale innovative AI/ML-driven solutions. You will be responsible for architecting highly scalable, low-latency distributed systems optimized for AI/ML workloads. As a key technical leader, you will solve complex challenges, influence next-generation AI/ML infrastructures, and guide cross-functional teams to deliver state-of-the-art solutions for fast-growing startups and enterprise companies.
Be at the forefront of shaping next-generation AI/ML infrastructures, driving solutions for high-impact products across diverse industries. You'll have the opportunity to influence key architectural decisions and enable real-world applications that scale globally, ensuring innovation and efficiency at every step.
Requirements
You'll be responsible for:
Driving end-to-end GenAI architecture and implementation:
- Design and deploy multi-agent systems using modern frameworks (LangGraph, CrewAI, AutoGen)
- Architect RAG solutions with advanced vector store integration (see the retrieval sketch after this list)
- Implement efficient fine-tuning strategies for foundation models
- Develop synthetic data pipelines for training and testing
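For illustration only, here is a minimal sketch of the retrieval step behind a RAG solution. It uses a brute-force in-memory index and a toy embed() function as stand-ins; in practice a vector database such as Chroma, Weaviate, Pinecone, or Milvus and a real embedding model would take their place.

    # Illustrative sketch only: a brute-force, in-memory retriever standing in for a
    # production vector database. embed() is a hypothetical stand-in for a real
    # embedding model and returns a deterministic toy vector.
    import numpy as np

    def embed(text: str, dim: int = 64) -> np.ndarray:
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        v = rng.standard_normal(dim)
        return v / np.linalg.norm(v)

    class InMemoryRAGIndex:
        def __init__(self):
            self.docs: list[str] = []
            self.vectors: list[np.ndarray] = []

        def add(self, doc: str) -> None:
            self.docs.append(doc)
            self.vectors.append(embed(doc))

        def retrieve(self, query: str, k: int = 3) -> list[str]:
            # Cosine similarity against every stored vector; a real vector store
            # would use approximate nearest-neighbor search instead.
            q = embed(query)
            scores = np.array([float(q @ v) for v in self.vectors])
            top = scores.argsort()[::-1][:k]
            return [self.docs[i] for i in top]

    index = InMemoryRAGIndex()
    for doc in ["Invoices are paid net-30.",
                "Refunds require manager approval.",
                "Support hours are 9-5 ET."]:
        index.add(doc)

    context = index.retrieve("When do we pay invoices?", k=2)
    prompt = ("Answer using only this context:\n" + "\n".join(context)
              + "\nQuestion: When do we pay invoices?")
    print(prompt)  # This augmented prompt would then be sent to the foundation model.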
Leading ML infrastructure and deployment:
- Design high-performance model serving architectures (see the serving sketch after this list)
- Implement distributed training and inference systems
- Establish MLOps practices and pipelines
- Optimize cloud resource utilization and costs
- Set up monitoring and observability solutions
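For illustration only, a minimal sketch of a model-serving endpoint using FastAPI with a stand-in model function and a hypothetical version tag. Serving platforms named later in this posting (BentoML, Ray Serve, Triton) layer batching, autoscaling, GPU scheduling, and model versioning on top of this basic request/response pattern.

    # Illustrative sketch only: a single-process inference endpoint.
    from fastapi import FastAPI
    from pydantic import BaseModel

    class PredictRequest(BaseModel):
        features: list[float]

    class PredictResponse(BaseModel):
        score: float
        model_version: str

    app = FastAPI()
    MODEL_VERSION = "demo-0.1"  # hypothetical version tag

    def run_model(features: list[float]) -> float:
        # Stand-in for a real model call (e.g. an ONNX session or a transformers pipeline).
        return sum(features) / max(len(features), 1)

    @app.post("/predict", response_model=PredictResponse)
    def predict(req: PredictRequest) -> PredictResponse:
        return PredictResponse(score=run_model(req.features), model_version=MODEL_VERSION)

    # Run locally (assuming this file is saved as serve_sketch.py):
    #   uvicorn serve_sketch:app --port 8000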
Driving technical excellence and innovation:
- Define architectural standards and best practices
- Lead technical decision-making for AI/ML initiatives
- Ensure scalability and reliability of AI systems
- Implement AI governance and security measures
- Guide teams on advanced AI concepts and implementations
Overseeing production AI systems:
- Manage model deployment and versioning
- Implement A/B testing frameworks
- Monitor system performance and model drift (see the drift-check sketch after this list)
- Optimize inference latency and throughput
- Ensure high availability and fault tolerance
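For illustration only, one simple drift check is the Population Stability Index (PSI) between a training-time baseline and recent production values of a feature or score. The sketch below uses synthetic data and the common 0.2 rule-of-thumb alert threshold; a production monitoring stack would compute this per feature on a schedule.

    # Illustrative sketch only: PSI-based input drift check on a continuous feature.
    import numpy as np

    def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
        # Bin both samples on the baseline's quantiles, then compare bin frequencies.
        edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
        edges[0], edges[-1] = -np.inf, np.inf
        e_freq = np.histogram(expected, edges)[0] / len(expected)
        a_freq = np.histogram(actual, edges)[0] / len(actual)
        e_freq = np.clip(e_freq, 1e-6, None)  # avoid log(0)
        a_freq = np.clip(a_freq, 1e-6, None)
        return float(np.sum((a_freq - e_freq) * np.log(a_freq / e_freq)))

    baseline = np.random.default_rng(0).normal(0.0, 1.0, 10_000)  # training-time values
    live = np.random.default_rng(1).normal(0.5, 1.0, 2_000)       # recent, shifted production values

    score = psi(baseline, live)
    print(f"PSI = {score:.3f}", "-> drift alert" if score > 0.2 else "-> stable")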
Fostering collaboration and growth:
- Mentor engineering teams on AI architecture
- Collaborate with stakeholders on technical strategy
- Drive innovation in AI/ML solutions
- Share knowledge through documentation and training
- Lead technical reviews and architecture discussions
You need:
- 8+ years of experience in software engineering or architecture, including:
  - 4+ years leading cross-functional GenAI/ML teams
  - Production experience with distributed AI systems
  - Enterprise-scale AI architecture implementation
- To lead and architect enterprise-scale GenAI/ML solutions, focusing on:
  - Multi-agent orchestration using LangGraph, CrewAI, and AutoGen
  - Workflow automation with LlamaIndex, LangChain, and LangFlow
  - Agent coordination using the LETTA framework
  - Integration of specialized agents for reasoning, planning, and execution
- To design and implement sophisticated AI architectures incorporating:
  - Advanced RAG systems using:
    - Vector databases (Chroma, Weaviate, Pinecone, Milvus)
    - Hybrid search with BM25 and semantic embeddings
    - Self-querying and recursive retrieval patterns
  - Fine-tuning strategies for foundation models:
    - PEFT methods (LoRA, QLoRA, Adapter-tuning)
    - Parameter-efficient training approaches
    - Instruction fine-tuning and RLHF
  - Multi-agent frameworks integrating:
    - Tool use and reasoning chains
    - Memory systems (short-term and long-term)
    - Meta-prompting and reflection mechanisms
    - Agent communication protocols
- Expertise in advanced data synthesis:
  - Synthetic data generation using Argilla and PersonaHub
  - Privacy-preserving data synthesis
  - Domain-specific data augmentation
  - Quality assessment of synthetic data
  - Data balancing and bias mitigation
- To architect high-performance ML serving infrastructure focusing on:
  - Model serving platforms (BentoML, Ray Serve, Triton)
  - Real-time processing with Ray, Kafka, and Spark Streaming
  - Distributed training using Horovod, DeepSpeed, and FSDP
  - vLLM and TGI for efficient inference
  - Integration patterns for hybrid cloud-edge deployments
- To drive cloud architecture decisions across:
  - Kubernetes orchestration with Kubeflow and KServe
  - Serverless ML with AWS Lambda, Azure Functions, and Cloud Run
  - Auto-scaling using HPA, KEDA, and custom metrics
  - Resource optimization with NVIDIA Triton and TensorRT
  - MLOps platforms (MLflow, Weights & Biases, DVC)

Benefits
Bonus points for:
- Research publications in AI/ML
- Open-source project maintenance
- Technical blog posts on AI architecture
- Conference presentations
- AI community leadership

What you get:
- Best-in-class salary: We hire only the best, and we pay accordingly.
- Proximity Talks: Meet other designers, engineers, and product geeks — and learn from experts in the field.
- Keep on learning with a world-class team: Work with the best in the field, challenge yourself constantly, and learn something new every day.

About us:
We are Proximity — a global team of coders, designers, product managers, geeks, and experts. We solve complex problems and build cutting-edge tech at scale. Here's a quick guide to getting to know us better:
- Watch our CEO, Hardik Jagda, tell you all about Proximity.
- Read about Proximity's values and meet some of our Proxonauts here.
- Explore our website, blog, and the design wing — Studio Proximity.
- Get behind the scenes with us on Instagram! Follow @ProxWrks and @H.Jagda.