Data Scientist
Celanese International Corporation, Irving, TX, United States
Responsibilities:
Celanese is a global leader in chemistry, producing specialty material solutions used across most major industries and consumer applications. Our businesses use our chemistry, technology and commercial expertise to create value for our customers, employees and shareholders. We are committed to sustainability by responsibly managing the materials we create for their entire lifecycle and are growing our portfolio of sustainable products to meet increasing customer and societal demand. We strive to make a positive impact in our communities and to foster inclusivity across our teams. Celanese is a Fortune 500 company with approximately 12,400 employees worldwide and 2023 net sales of $10.9 billion.
Onsite: 222 W. Las Colinas Blvd Suite 900N Irving, Texas 75039
The Data Scientist will support the Celanese Digitalization initiative as part of the Digital Innovation team. The ideal candidate should be experienced in collaborating with Data Engineers, Data Analysts, and Business teams to leverage large external and internal data sets for growth opportunities, optimization, insights generation, and trend identification. The candidate must have extensive experience in data mining, data analysis, using various data tools, building and implementing models, creating algorithms, and running simulations. They must be comfortable working with a wide range of stakeholders and functional teams in a global setting. A passion for uncovering solutions in large data sets, synthesizing and communicating results, and driving business outcomes with stakeholders is essential.
Responsibilities
- Works with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
- Mines and analyzes data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
- Assesses the effectiveness and accuracy of new data sources/extracts and data gathering/merging techniques.
- Develops custom data models and algorithms to apply to data sets.
- Uses predictive modeling to increase and optimize customer experiences, revenue generation, ad targeting and other business outcomes.
- Stays up to date on the latest trends and technologies in AI/ML.
- Coordinates with different functional teams to implement and maintain models.
- Develops processes and tools to test, monitor and analyze model performance.
- Programs in languages such as Python and SQL, with the ability to manipulate data and draw insights from large datasets.
- Machine learning frameworks and libraries specifically designed for working with LLMs, RAG, NLP, and text mining.
- Large Language Models (LLMs), including but not limited to models like GPT (Generative Pre-trained Transformer), BERT (Bidirectional Encoder Representations from Transformers), and their derivatives.
- RAG techniques and their applications. Leverages RAG for enhancing the performance of language models by combining the power of retrieval and generative capabilities to provide more accurate, contextually relevant, and information-rich responses.
- Practical application of linear mixed models (LMMs) in analyzing data that have multiple levels of correlation or non-constant variability.
- Apply LMMs to complex datasets to account for both fixed and random effects, ensuring accurate data interpretation and decision-making.
- Application of a wide array of machine learning techniques, including but not limited to clustering, decision tree learning, and artificial neural networks, with an understanding of their real-world advantages and limitations.
- Working with deep learning tools including computer vision.
- Advanced statistical techniques and concepts (regression analysis, distribution properties, statistical testing, etc.) and experience applying these techniques to data analysis and modeling.
- Text mining, deep neural networks, autoencoders, hybrid models, and cloud computing.
- Articulate complex concepts and findings to both technical and non-technical stakeholders. Collaborate effectively with cross-functional teams to drive projects to completion.
- Work independently with experts from different fields, including chemical engineers, process engineers, and environmental scientists, to integrate diverse data sources and insights.
- Code APIs and work with several languages such as Python, R, Java, and JavaScript.
- Use web services like Azure Cognitive Services/ML, Databricks, etc.
- Use Alteryx to analyze and manipulate data.
- Analyze data from 3rd party Market Intelligence providers.
- Use distributed data/computing tools like Spark, Hadoop, etc.
- Visualize and present data for stakeholders using Power BI, SAP Analytics Cloud, etc.
- Use SAP, Google Analytics, and Snowflake.
- B2B business primarily in specialty chemicals, manufacturing, and oil & gas.
- Use Hyperscale, Agile
Qualifications:
Required:
- Ph.D. with 3 years of Data Science experience OR Master's with 5 years of chemical experience
- Python
- SQL
- AI, Large Language Models (LLMs), including but not limited to models like GPT (Generative Pre-trained Transformer), BERT (Bidirectional Encoder Representations from Transformers), and their derivatives.
- Azure
- Snowpark
- Jupyter Notebooks
Desired:
- Chemical Engineering background
- Data Science Architecture
- Experience working with R&D (Research & Development)