Proximity Works

Senior Solutions Architect (AI/ML)

Proximity Works, Santa Clara, CA, United States


We are looking for a Senior Solutions Architect to design, develop, and scale innovative AI/ML-driven solutions. You will be responsible for architecting highly scalable, low-latency distributed systems optimized for AI/ML workloads. As a key technical leader, you will solve complex challenges, influence next-generation AI/ML infrastructure, and guide cross-functional teams to deliver state-of-the-art solutions for fast-growing startups and enterprise companies.

Be at the forefront of shaping next-generation AI/ML infrastructure, driving solutions for high-impact products across diverse industries. You'll have the opportunity to influence key architectural decisions and enable real-world applications that scale globally.

Requirements

You'll be responsible for —
Driving end-to-end GenAI architecture and implementation:
  • Design and deploy multi-agent systems using modern frameworks (LangGraph, CrewAI, AutoGen)
  • Architect RAG solutions with advanced vector store integration
  • Implement efficient fine-tuning strategies for foundation models
  • Develop synthetic data generation pipelines for training and testing
Leading ML infrastructure and deployment:
  • Design high-performance model serving architectures
  • Implement distributed training and inference systems
  • Establish MLOps practices and pipelines
  • Optimize cloud resource utilization and costs
  • Set up monitoring and observability solutions
Driving technical excellence and innovation:
  • Define architectural standards and best practices
  • Lead technical decision-making for AI/ML initiatives
  • Ensure scalability and reliability of AI systems
  • Implement AI governance and security measures
  • Guide teams on advanced AI concepts and implementations
Overseeing production AI systems:
  • Manage model deployment and versioning
  • Implement A/B testing frameworks
  • Monitor system performance and model drift
  • Optimize inference latency and throughput
  • Ensure high availability and fault tolerance
Fostering collaboration and growth:
  • Mentor engineering teams on AI architecture
  • Collaborate with stakeholders on technical strategy
  • Drive innovation in AI/ML solutions
  • Share knowledge through documentation and training
  • Lead technical reviews and architecture discussions
You need —
8+ years of experience in software engineering or architecture, including:
  • 4+ years leading cross-functional GenAI/ML teams
  • Production experience with distributed AI systems
  • Enterprise-scale AI architecture implementation
To lead and architect enterprise-scale GenAI/ML solutions, focusing on:
  • Multi-agent orchestration using LangGraph, CrewAI, and AutoGen
  • Workflow automation with LlamaIndex, LangChain, and LangFlow
  • Agent coordination using the Letta framework
  • Integration of specialized agents for reasoning, planning, and execution
To design and implement sophisticated AI architectures incorporating:
Advanced RAG systems using:
  • Vector databases (Chroma, Weaviate, Pinecone, Milvus)
  • Hybrid search with BM25 and semantic embeddings
  • Self-querying and recursive retrieval patterns
Fine-tuning strategies for foundation models:
  • PEFT methods (LoRA, QLoRA, Adapter-tuning)
  • Parameter-efficient training approaches
  • Instruction fine-tuning and RLHF
Multi-agent frameworks integrating:
  • Tool-use and reasoning chains
  • Memory systems (short-term and long-term)
  • Meta-prompting and reflection mechanisms
  • Agent communication protocols
Expertise in advanced data generation and synthesis:
  • Synthetic data generation using Argilla and PersonaHub
  • Privacy-preserving data synthesis
  • Domain-specific data augmentation
  • Quality assessment of synthetic data
  • Data balancing and bias mitigation
To architect high-performance ML serving infrastructure focusing on:
  • Model serving platforms (BentoML, Ray Serve, Triton)
  • Real-time processing with Ray, Kafka, and Spark Streaming
  • Distributed training using Horovod, DeepSpeed, and FSDP
  • vLLM and TGI for efficient inference
  • Integration patterns for hybrid cloud-edge deployments
To drive cloud architecture decisions across:
  • Kubernetes orchestration with Kubeflow and KServe
  • Serverless ML with AWS Lambda, Azure Functions, Cloud Run
  • Auto-scaling using HPA, KEDA, and custom metrics
  • Resource optimization with NVIDIA Triton and TensorRT
  • MLOps platforms (MLflow, Weights & Biases, DVC)

Benefits

Bonus points for —
  • Research publications in AI/ML
  • Open-source project maintenance
  • Technical blog posts on AI architecture
  • Conference presentations
  • AI community leadership
What you get —
  • Best-in-class salary: We hire only the best, and we pay accordingly.
  • Proximity Talks: Meet other designers, engineers, and product geeks — and learn from experts in the field.
  • Keep on learning with a world-class team: Work with the best in the field, challenge yourself constantly, and learn something new every day.
About us —

We are Proximity — a global team of coders, designers, product managers, geeks, and experts. We solve complex problems and build cutting-edge tech at scale. Here's a quick guide to getting to know us better:

  • Watch our CEO, Hardik Jagda, tell you all about Proximity.
  • Read about Proximity's values and meet some of our Proxonauts.
  • Explore our website, blog, and the design wing — Studio Proximity.
  • Get behind the scenes with us on Instagram! Follow @ProxWrks and @H.Jagda