Experis
Spark and CouchDB/Cassandra
Experis, Andover, Massachusetts, US, 05544
Job Title: Big Data Engineer (Spark, CouchDB and/or Cassandra)
Location: Hybrid/Onsite, Phoenix, AZ
Work type: Contract on W2 for a duration of 12+ months (possibility of extension)
JOB DESCRIPTION:
We are looking for a highly skilled Big Data Engineer with expertise in Apache Spark, CouchDB, and Cassandra to join our data engineering team in Phoenix. As part of the team, you will be responsible for architecting, developing, and maintaining data pipelines and storage solutions that can process and analyze large-scale data efficiently. You will work closely with data scientists, analysts, and developers to support advanced analytics and data-driven applications.

QUALIFICATIONS:
- Strong hands-on experience with Apache Spark for data processing and transformation.
- Proficiency in working with CouchDB and Cassandra databases.
- Solid understanding of distributed computing principles and database design patterns.
- Experience with data pipeline and ETL tool development.
- Familiarity with NoSQL database architecture and best practices for data modeling in Cassandra and CouchDB.
- Expertise in working with large-scale datasets and optimizing query performance.
- Knowledge of data security, governance, and compliance best practices.
- Strong problem-solving skills and the ability to work in a fast-paced environment.
- Excellent communication skills and the ability to collaborate effectively with both technical and non-technical teams.

PREFERRED QUALIFICATIONS:
- Experience with streaming data platforms like Kafka.
- Proficiency with cloud platforms such as AWS, Google Cloud, or Azure.
- Experience with other big data technologies like Hadoop, HBase, or Elasticsearch.
- Familiarity with containerization tools (e.g., Docker) and orchestration (e.g., Kubernetes).
- Understanding of machine learning and data science workflows.

KEY RESPONSIBILITIES:
- Design, develop, and optimize distributed data processing workflows using Apache Spark.
- Build and maintain scalable databases using CouchDB and Cassandra for high availability and fault tolerance.
- Develop ETL processes to move and transform data between various data sources and storage platforms.
- Implement real-time and batch data processing solutions using Spark on distributed systems.
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
- Optimize performance of Spark jobs and CouchDB/Cassandra queries to ensure fast and reliable data access.
- Maintain data security and governance practices across all data processing pipelines and storage solutions.
- Troubleshoot and resolve performance bottlenecks and other technical issues within the data ecosystem.
- Stay up to date with the latest trends and advancements in big data technologies and apply best practices.