Infogain Corp
Big Data Engineer (Standard)
Infogain Corp, Houston, Texas, United States, 77246
Skills: Data Engineering, Big Data, Large Language Models (LLM)
Location: Bangalore, India
Role Overview:
We are looking for a Data Engineer to manage and preprocess large datasets used to train an LLM. The engineer will be responsible for gathering and preparing domain-specific telecom data and for ensuring that the model's inputs are properly formatted and optimized.

Responsibilities:
- Collect, clean, and preprocess large datasets for use in machine learning models.
- Develop and manage data pipelines using PySpark and Python.
- Ensure seamless data integration from databases like Cloudera and Teradata.
- Collaborate with AI/ML engineers to ensure data readiness for training.
- Optimize the data flow for performance in CPU-only environments.

Qualifications:
- 5+ years of experience in data engineering and large-scale data management (Telecom preferred).
- 3+ years of experience with PySpark for distributed data processing.
- 3+ years of experience in Python for data manipulation and ETL processes.
- Strong experience with Cloudera, Teradata, or similar database technologies.
- Demonstrated experience building scalable data pipelines for ML projects.
- Knowledge of data formats and structures necessary for machine learning (e.g., CSV, JSON, Parquet).

EXPERIENCE
4.5-6 Years

SKILLS
Primary Skill: Data Engineering

ABOUT THE COMPANY
Infogain is a human-centered digital platform and software engineering company based out of Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).

Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.