Jobs via eFinancialCareers

Python + PySpark Developer

Jobs via eFinancialCareers, Dallas, Texas, United States, 75215


Our challenge

We are seeking an experienced Python Developer with a strong background in PySpark to join our data engineering team. The ideal candidate will have a robust understanding of big data processing, experience with Apache Spark, and a proven track record in Python programming. You will be responsible for developing scalable data processing and analytics solutions in a cloud environment.

The Role

Responsibilities:
- Design, build, and maintain scalable and efficient data processing pipelines using PySpark.
- Develop high-performance algorithms, predictive models, and proof-of-concept prototypes.
- Work closely with data scientists and analysts to transform data into actionable insights.
- Write reusable, testable, and efficient Python code.
- Optimize data retrieval, and develop dashboards and reports for business stakeholders.
- Implement data ingestion, data cleansing, deduplication, and data consolidation processes.
- Leverage cloud-based big data services and architectures (AWS, Azure, or GCP) for processing large datasets.
- Collaborate with cross-functional teams to define and refine data and analytics requirements.
- Ensure systems meet business requirements and industry practices for security and privacy.
- Stay updated with the latest innovations in big data technologies and PySpark enhancements.

Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Minimum of 3 years of experience in Python development.
- Strong experience with Apache Spark and its components (Spark SQL, Streaming, MLlib, GraphX) using PySpark.
- Demonstrated ability to write efficient, complex queries against large data sets.
- Knowledge of data warehousing principles and data modeling concepts.
- Proficient understanding of distributed computing principles.
- Experience with at least one cloud provider (AWS, Azure, GCP), including their big data processing services.
- Strong problem-solving skills and ability to work under tight deadlines.
- Excellent communication and collaboration abilities.

It would be great if you also had:
- Experience with additional big data tools like Hadoop, Kafka, or similar technologies.
- Familiarity with machine learning frameworks and libraries.
- Experience with data visualization tools and libraries.
- Knowledge of containerization and orchestration technologies (Docker, Kubernetes).
- Contributions to open-source projects or a strong GitHub portfolio showcasing relevant projects.
