Orion Innovation

Data Engineer

Orion Innovation, Morrisville, North Carolina, United States, 27560


Orion Innovation is a premier, award-winning, global business and technology services firm. Orion delivers game-changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide range of clients across many industries including financial services, professional services, telecommunications and media, consumer products, automotive, industrial automation, professional sports and entertainment, life sciences, ecommerce, and education.

Job Description:

We are excited to offer several data engineer positions in the Large Language Model (LLM) and multimodal LLM field, with a focus on European languages. The goal is to work with the team on the data side to help build strong multilingual AI models. In addition to English, candidates must be proficient in one or more of the following languages: German, Italian, French, and Portuguese.

Responsibilities:

Develop and maintain web scraping and data extraction processes to gather large-scale text and image data from diverse sources.
Clean, preprocess, and tag text and image data to ensure data quality and usability.
Work with different data formats such as Parquet, JSONL, and CSV, ensuring efficient data storage and retrieval.
Collaborate with data scientists and machine learning engineers to support the evaluation and improvement of large language models.
Stay up-to-date with the latest research and advancements in the field of data engineering, web scraping, and machine learning. Actively participate in academic research and reading groups.
Implement and optimize data pipelines for high-volume data processing.
Strong proficiency in Python and solid understanding of HTML, JSON, and web technologies.

Education:

Master's degree required, plus 2-4 years of experience.

Behavior Characteristics:

Willing to work hard, passionate, and motivated. Eager to learn new things.

Orion is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, citizenship status, disability status, genetic information, protected veteran status, or any other characteristic protected by law.

Candidate Privacy Policy

Orion Systems Integrators, LLC and its subsidiaries and its affiliates (collectively, "Orion," "we" or "us") are committed to protecting your privacy. This Candidate Privacy Policy (orioninc.com) ("Notice") explains:

What information we collect during our application and recruitment process and why we collect it;
How we handle that information; and
How to access and update that information.

Your use of Orion services is governed by any applicable terms in this notice and our general Privacy Policy.