Luxoft

Senior PySpark Data Engineer

Luxoft, Baltimore, Maryland, United States, 21276


Project description

Join our dynamic team working on exciting projects in the thriving Middle East region. We offer a multitude of opportunities in various domains. Our diverse team comprises skilled professionals, including front-end and back-end developers, data analysts, data scientists, architects, analysts, and project managers. Currently, we are actively seeking a talented Data Engineer with proficiency in Python programming.

Responsibilities

- Actively engage in requirements clarification and contribute to sprint planning sessions.
- Design and architect technical solutions that align with project objectives.
- Develop comprehensive unit and integration tests to ensure the robustness and reliability of the codebase.
- Provide valuable support to QA teammates during the acceptance process, addressing and resolving issues promptly.
- Continuously assess and refine best practices to optimize development processes and code quality.
- Collaborate with cross-functional teams to ensure seamless integration of components and efficient project delivery.
- Stay abreast of industry trends, emerging technologies, and best practices to contribute to ongoing process improvement initiatives.
- Contribute to documentation efforts, ensuring clear and comprehensive records of technical solutions and best practices.
- Actively participate in code reviews, providing constructive feedback and facilitating knowledge sharing within the team.

Skills

Must have

- 5+ years of relevant experience in a Senior Data Engineer role.
- Familiarity with big data technologies such as Hadoop, Apache Spark, or other distributed computing frameworks.
- Comprehensive understanding of data security principles and practices to ensure the confidentiality and integrity of sensitive information, coupled with knowledge of data governance frameworks and practices for ensuring data quality, compliance, and proper data management.
- Demonstrated strong expertise in both Python and PySpark for efficient data processing and analytics.
- Proficient in SQL with the ability to handle complex queries and database operations.
- Prior experience working with Extract, Transform, Load (ETL) processes.
- Familiarity with data cleansing, data profiling, data lineage, and adherence to best practices in data engineering.
- Some experience with various data analysis methodologies.
- Familiarity with building libraries in Python for enhanced functionality.
- Knowledge of integrating data pipelines with various APIs for seamless data exchange between systems.
- Proficiency in version control systems, such as Git, for tracking changes in code and collaborative development.
- Prior exposure to cloud technologies, particularly Azure or any leading cloud platform.
- Some exposure to data visualization tools like Tableau, Power BI, or others to create meaningful insights from data.
- Familiarity with collaboration tools such as Azure DevOps, Jira, Confluence, or others to enhance teamwork and project documentation.
- A degree in computer science, mathematics, statistics, or a related technical discipline.
- Familiarity with financial markets, portfolio theory, and risk management is a plus.

Non-technical skills:

- Strong problem-solving skills to tackle complex data engineering challenges.
- Ability to convey insights effectively through compelling data storytelling.
- Keen attention to delivering high-quality solutions within specified timelines.
- Proven ability to work collaboratively within a team, taking a proactive approach to problem resolution and process improvement.
- Excellent communication skills to articulate technical concepts clearly and concisely.

Nice to have

- Exposure to streaming data processing technologies like Apache Kafka for real-time data ingestion and processing.
- Knowledge of containerization technologies like Docker for creating, deploying, and running applications consistently across various environments.
- Extensive experience in data modeling and the evaluation of large datasets.
- Background in training, deploying, and maintaining models for effective data-driven decision-making.
- Experience in developing and implementing machine learning algorithms, Natural Language Processing (NLP), and Neural Networks.
- Proficiency in applied mathematics, including but not limited to linear algebra, probability, statistics, and distributions.
