HoneyHive
Founding Engineer, AI
HoneyHive, New York, New York, US, 10261
About HoneyHive
At HoneyHive, we are building the first enterprise platform for LLM evaluation and observability. Our platform enables leading AI startups and enterprises to deploy GenAI applications to production and make them safer, more trustworthy, and more reliable for their users.
We are backed by leading institutional investors in AI/ML who were early investors in companies like Perplexity, Hugging Face, MotherDuck, and Weights & Biases.
About the role
As the first AI Researcher at HoneyHive, you will lay the technical and cultural foundations for the company and play a pivotal role in setting the bar for future researchers who join the team.
In this role, you will:
Do work that creates long-term impact on the future of AI evaluation and alignment in a post-AGI world.
Conduct novel research on mitigating issues with production LLM applications, such as detecting hallucinations and PII leakage.
Work closely with founders on the future direction of the company and scope greenfield projects (AI is, of course, critical to everything we do).
Build and lead a team of highly accomplished researchers and engineers.
Work closely with customers to gather feedback, triage bugs, and validate key hypotheses.
Collaborate closely with founders to define our company culture and values, and represent HoneyHive at various events and conferences.
Some example projects include:
Training small evaluator models (binary classifiers, open-source LLMs, etc.) for analyzing LLM responses on various criteria (such as hallucination and toxicity), performing PII sanitization, detecting prompt injection attacks, and more (see the sketch after this list).
Implementing unsupervised techniques to cluster and analyze production embeddings for insights.
Architecting repeatable DPO/KTO pipelines and implementing the latest academic research in the LLM evaluation and alignment space within our product.
Working with our MLEs to build a highly performant MLOps pipeline to optimize and serve various language models at production scale.
Creating visualizations and researching new techniques to interpret and explain how LLMs respond and why.
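To make the first example project above concrete, here is a minimal sketch of fine-tuning a small open-source encoder as a binary hallucination classifier with Hugging Face Transformers. The model choice, toy examples, and labels are illustrative assumptions only, not HoneyHive's actual evaluator pipeline.

```python
# Minimal sketch: fine-tune a small encoder as a binary "hallucinated vs. grounded"
# evaluator. The toy data and model name below are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical labelled traces: each example pairs retrieved context with a model
# response; label 1 = hallucinated, 0 = grounded in the context.
examples = {
    "text": [
        "Context: The Eiffel Tower is in Paris. Response: The Eiffel Tower is in Berlin.",
        "Context: The Eiffel Tower is in Paris. Response: The Eiffel Tower is in Paris.",
    ],
    "label": [1, 0],
}

model_name = "distilbert-base-uncased"  # any small encoder would work here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    # Pad to a fixed length so the default collator can batch examples directly.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

train_dataset = Dataset.from_dict(examples).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hallucination-evaluator",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_dataset,
)
trainer.train()

# At inference time, model(**tokenizer(text, return_tensors="pt")).logits yields the
# grounded/hallucinated scores for a new context-response pair.
```

In practice, an evaluator like this would be trained on labelled production traces and served behind the evaluation pipeline rather than on toy data.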
A bit about our stack
We use React and Express for web development, AWS for all our cloud infrastructure needs, and are building SDKs in Python and TypeScript. For all our inference and fine-tuning needs, we use a combination of PyTorch, Hugging Face, and AWS. For all our model evaluation, prompt engineering, data curation, and labelling needs, we use HoneyHive.
About you
We think you'd be a great fit if you have:
4+ years of full-time experience as a machine learning engineer or researcher, with a proven track record of training large language models, building complex production LLM pipelines, and shipping high-quality AI applications.
Deep familiarity with the latest alignment research literature (such as RLHF, DPO, KTO, etc.).
Experience using PyTorch/TensorFlow and orchestrating data pipelines with Airflow or similar tools.
An enthusiasm for working at every level of the stack and learning new tools and frameworks.
Comfort with ambiguity and the ability to make educated assumptions in the absence of detailed product requirements or designs.
A passion for ownership and end-to-end responsibility, along with the ability to ship quality code in high-velocity environments.
An immense passion for learning. The large language model space is evolving rapidly, and a drive to keep learning new things is key.
We'd prefer if you also have:
Research experience and deep personal interest in the areas of model evaluation, AI safety, or alignment.
Experience building LLM products or contributing to open-source LLM projects.
Experience working in early-stage startup environments.
Why join
A unique opportunity to shape the future of the LLM infrastructure stack and make a material impact on the developer experience of working with LLMs.
Innovate on the bleeding edge of AI and solve hard research problems that impact the safe deployment of LLMs in production.
Exposure to various aspects of scaling an early-stage startup, including fundraising, sales, and networking, setting you up to be a potential future founder.
A supportive and fun work environment where you can learn and grow alongside like-minded people.
Our values
Move Fast, Don't Break Things: We're not afraid to move quickly, but we ensure each release meets the high standards of our customers. When in doubt, cut scope, not quality.
Customer Obsession: We're always focused on solving our customers' most important problems. Anything that helps them move faster and remove friction gets prioritized.
First Principles Thinking: AI engineering is a net-new paradigm and requires us to approach problems from a first-principles mindset.
Collective Ownership: We take complete ownership of our work and are collectively accountable for the success of our product.
Benefits
Competitive salary + meaningful equity
Health, vision, and dental benefits
Unlimited PTO
Assistance in relocating to NYC
MacBook Pro and peripherals