Saxon Global

Sr. Data Engineer

Saxon Global, San Francisco, CA, United States


Job Description:

We are seeking an experienced and highly skilled Cloud Data Engineer to join our team in the pharmaceutical industry. The successful candidate will have a strong background in data engineering, with expertise in data ingestion, ETL, analytics, and related data services, specifically using AWS services such as Glue and Kinesis. In this role, you will play a critical part in developing and implementing data-driven solutions for our organization, ensuring adherence to cybersecurity and data privacy regulations. This position is based in South San Francisco and offers a hybrid working model.

Responsibilities:
  • Design, develop, and maintain data pipelines using AWS services such as S3, EMR, Redshift, Glue, Athena, SageMaker, DynamoDB, and Kinesis Data Streams (a minimal sketch of such a pipeline follows this list).
  • Ensure data privacy, security, and governance principles are integrated into all data engineering tasks.
  • Collaborate with cross-functional teams to define data integration requirements and develop data-driven solutions.
  • Apply your 5-7 years of hands-on experience in data engineering, with a focus on data ingestion, ETL, and analytics using AWS Glue, Kinesis Data Streams, QuickSight, and related data analytics services.
  • Work closely with data architects and scientists to optimize feature engineering and data manipulation processes.
  • Develop complex, high-quality solutions to address pharmaceutical industry-specific data challenges.
  • Provide examples of complex projects you have successfully completed and the impact of your work.
  • Stay current with industry best practices and emerging trends in data engineering and cloud technologies.
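As a rough illustration only of the pipeline work referenced in the first bullet above (not part of the role description), the sketch below shows what a minimal PySpark ETL step of that kind might look like: reading raw JSON events from S3, normalizing a timestamp, filtering bad rows, and writing partitioned Parquet back to S3. The bucket names, paths, and column names are hypothetical placeholders, and a production Glue job would typically wrap similar logic in GlueContext/DynamicFrame calls and add the governance controls this posting emphasizes.

    # Minimal, illustrative PySpark ETL step (hypothetical paths and columns).
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("raw-events-etl").getOrCreate()

    # Read raw JSON events landed in an S3 "raw" zone.
    raw = spark.read.json("s3://example-raw-bucket/events/2024-05-01/")

    cleaned = (
        raw
        # Parse the event timestamp and derive a date column for partitioning.
        .withColumn("event_ts", F.to_timestamp("event_time"))
        .withColumn("event_date", F.to_date("event_ts"))
        # Keep only rows with a usable identifier and timestamp.
        .filter(F.col("record_id").isNotNull() & F.col("event_ts").isNotNull())
        # Drop a free-text field that could carry sensitive information.
        .drop("notes")
    )

    # Write curated, partitioned Parquet to an S3 "curated" zone.
    (
        cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-curated-bucket/events/")
    )

    spark.stop()
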
Qualifications:
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Minimum 5-7 years of hands-on experience in data engineering, with a strong focus on cloud-based solutions.
  • Expertise in AWS cloud services for data storage, integration, and processing, including AWS Glue and Kinesis Data Streams.
  • Proficiency in Python and other programming languages for building data pipelines.
  • Experience with PySpark and related data processing frameworks.
  • Strong understanding of data privacy and security principles, with a specific focus on the pharmaceutical industry.
  • Excellent communication skills, including the ability to present technical information in a clear and concise manner.
  • Ability to work collaboratively within a team and contribute to complex projects.
  • Experience with Airflow for job orchestration, MLOps, and Snowflake is a plus.
  • AWS certifications (e.g., AWS Certified Data Analytics, AWS Certified Solutions Architect) are highly desirable.