TensorLake Inc.

Founding Applied AI Scientist

TensorLake Inc., San Francisco, California, United States, 94199

Tensorlake is building a distributed data processing platform for developers building Generative AI applications. Our product, Indexify (https://getindexify.ai), enables building continuously evolving knowledge bases and indexes for Large Language Model applications by running structured data or embedding extraction algorithms on any unstructured data. We are building a serverless product on top of Indexify that allows users to build real-time extraction pipelines for unstructured data. The extracted data and indexes will be consumed directly by AI applications and LLMs to power business and consumer applications.

We are looking for a Founding Applied AI Research Scientist who thrives on tackling complex challenges at the intersection of document understanding, multimodal learning, and cutting-edge AI research. Working closely with the founding team, you'll influence Tensorlake’s technical strategy and contribute to advancing the capabilities of our AI products.

Responsibilities

As a Founding Applied AI Research Scientist, you will:

- Design, train, and evaluate document understanding models for extracting complex data, such as tables, forms, and structured text from documents.
- Develop and optimize multi-modal visual Q&A models, enabling our platform to understand and answer questions based on both textual and visual information.
- Collaborate with the team to integrate AI-driven features into Tensorlake’s platform, helping turn research insights into practical, real-world solutions.
- Work closely with users and customers to understand their needs, ensuring that AI solutions provide real, measurable value in business applications.

Qualifications

- 4+ years of experience working with AI/ML models, specifically in the fields of document understanding, computer vision, and multi-modal learning.
- Proven expertise in training and evaluating models for complex document extraction, including structured data like tables and forms.
- Deep NLP Expertise: Experience with transformer-based models such as BERT, LayoutLM, T5, or DocFormer.
- OCR Integration: Proficiency in integrating OCR technologies for extracting text from scanned documents and PDFs.
- Model Pretraining and Fine-tuning: Experience with pretraining large models and fine-tuning them for document understanding tasks.
- Layout Analysis: Understanding document layout and structure for effective table detection and hierarchy extraction.
- Benchmarking and Evaluation: Experience with document-specific datasets and evaluation techniques.
- Vision-Language Models: Familiarity with models that integrate visual and textual data for document understanding.
- Solid programming skills in Python and proficiency in at least one deep learning framework (e.g., TensorFlow, PyTorch).
- Ph.D. or Bachelor's degree in a quantitative field such as Computer Science or Mathematics, or equivalent industry experience.

Benefits

- Ability to save in 401(k) plans
- Comprehensive Healthcare and Dental Benefits

If you’re passionate about research in document understanding and multimodal learning, and enjoy tackling ambitious technical challenges, we’d love to hear from you. Even if you only fit some of the criteria but have relevant experience, we encourage you to apply and share a project that showcases your expertise. Bonus points if it’s open-source!
