Cloud Data Engineer (The Data Pipeline Architect)
Unreal Gigs, San Francisco, CA, United States
Are you passionate about building and optimizing data pipelines that drive data-driven decision-making in cloud environments? Do you have the technical expertise to design robust data architectures that manage large-scale data processing and ensure seamless data flow? If you're ready to harness the power of cloud technology to create scalable data solutions, our client has the ideal role for you. We're seeking a Cloud Data Engineer (aka The Data Pipeline Architect) to design, develop, and manage cloud-based data infrastructures that support analytical and operational needs across the organization.
As a Cloud Data Engineer at our client, you'll collaborate with data scientists, analysts, and software engineers to build data pipelines and storage solutions that are secure, efficient, and scalable. Your role will be vital in ensuring that data systems are optimized for performance and capable of supporting a range of data-driven initiatives.
Key Responsibilities:
- Design and Build Data Pipelines: Create and manage scalable data pipelines that support ETL (Extract, Transform, Load) processes. You'll develop automated solutions to handle data ingestion, transformation, and integration on cloud platforms such as AWS, GCP, or Azure.
- Architect Data Storage Solutions: Architect and maintain cloud-based data storage solutions, such as data lakes and warehouses, ensuring they align with business needs and best practices. You'll optimize data structures for performance and reliability.
- Collaborate with Cross-Functional Teams: Work closely with data scientists, business analysts, and application developers to understand data requirements and implement solutions that support analytics and machine learning models.
- Ensure Data Security and Compliance: Implement and monitor security measures that protect data privacy and comply with industry standards (e.g., GDPR, HIPAA). You'll manage data access controls and audit logs to safeguard sensitive information.
- Optimize Data Processing Performance: Improve the performance of data processing workflows by optimizing resource allocation and applying parallel processing techniques. You'll work to reduce latency and enhance the scalability of data systems.
- Automate Data Workflows: Develop automation scripts and tools using languages such as Python or Java to streamline data processes and reduce manual intervention. You'll contribute to the efficiency and reliability of data operations.
- Monitor and Troubleshoot Pipelines: Use monitoring tools and techniques to ensure the health and performance of data pipelines. You'll identify and troubleshoot issues promptly to maintain smooth data flow and system uptime.
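To give candidates a concrete sense of the ETL work described above, here is a minimal sketch of an extract-transform-load step in Python. All names and data here are illustrative only, not part of any actual pipeline at our client:

```python
# Minimal ETL sketch (illustrative names and data only).

def extract(rows):
    """Extract: yield raw records from a source (here, in-memory)."""
    yield from rows

def transform(record):
    """Transform: normalize fields and derive a cleaned value."""
    return {
        "user": record["user"].strip().lower(),
        "amount_usd": round(record["amount_cents"] / 100, 2),
    }

def load(records, sink):
    """Load: append cleaned records to the target store."""
    for r in records:
        sink.append(r)
    return sink

raw = [
    {"user": "  Alice ", "amount_cents": 1250},
    {"user": "BOB", "amount_cents": 990},
]
warehouse = load((transform(r) for r in extract(raw)), [])
```

In production, the same three stages would typically be orchestrated by a tool such as Apache Airflow and read from or write to cloud storage rather than in-memory lists.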
Required Skills:
- Cloud Data Expertise: Strong experience with cloud data platforms (AWS, GCP, Azure) and data processing tools like Apache Spark, Apache Beam, or AWS Glue. You're proficient in building scalable data solutions.
- Programming and Scripting: Proficiency in Python, Java, or Scala for data manipulation and pipeline automation. You can create scripts that handle data transformation and integration tasks.
- ETL and Data Pipeline Management: Experience in developing and managing complex ETL workflows that handle large data volumes. You're skilled at building data pipelines that support real-time and batch processing.
- Data Security and Compliance Knowledge: Familiarity with implementing data security measures and ensuring compliance with industry standards. You can configure access controls and monitor data usage.
- Collaboration and Problem-Solving: Strong ability to work with cross-functional teams and troubleshoot data-related issues. You can clearly communicate technical concepts and solutions.
Educational Requirements:
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field. Equivalent experience in data engineering and cloud environments may be considered.
- Certifications such as Google Professional Data Engineer, AWS Certified Big Data - Specialty, or Microsoft Certified: Azure Data Engineer Associate are highly desirable.
Experience Requirements:
- 5+ years of experience in data engineering, with at least 3 years focused on cloud-based data solutions.
- Hands-on experience with data pipeline orchestration tools (e.g., Apache Airflow, Dataflow, Step Functions).
- Familiarity with database technologies such as SQL, NoSQL, and data warehousing solutions (e.g., BigQuery, Redshift, Snowflake).
Benefits:
- Health and Wellness: Comprehensive medical, dental, and vision insurance plans with low co-pays and premiums.
- Paid Time Off: Competitive vacation, sick leave, and 20 paid holidays per year.
- Work-Life Balance: Flexible work schedules and telecommuting options.
- Professional Development: Opportunities for training, certification reimbursement, and career advancement programs.
- Wellness Programs: Access to wellness programs, including gym memberships, health screenings, and mental health resources.
- Life and Disability Insurance: Life insurance and short-term/long-term disability coverage.
- Employee Assistance Program (EAP): Confidential counseling and support services for personal and professional challenges.
- Tuition Reimbursement: Financial assistance for continuing education and professional development.
- Community Engagement: Opportunities to participate in community service and volunteer activities.
- Recognition Programs: Employee recognition programs to celebrate achievements and milestones.