Photon

PySpark Data Reconciliation Engineer

Photon, Jersey City, New Jersey, United States, 07390

Greetings everyone,

We hope you are staying safe. We are hiring a PySpark Data Reconciliation Engineer to join our Digital Engineering team.

Who are we?

For the past 20 years, we have powered many Digital Experiences for the Fortune 500. Since 1999, we have grown from a few people to more than 4000 team members across the globe who are engaged in various Digital Modernization projects. For a brief 1-minute video about us, you can check https://youtu.be/uJWBWQZEA6o.

What are we looking for?

PySpark Data Reconciliation Engineer

Key Responsibilities:

Data Reconciliation Development:
- Design, develop, and test PySpark-based applications to automate data reconciliation processes across various financial data sources, including relational databases, NoSQL databases, batch files, and real-time data streams.
- Implement efficient data transformation and matching algorithms (deterministic and heuristic) using PySpark and relevant big data frameworks.
- Develop robust error handling and exception management mechanisms to ensure data integrity and system resilience within Spark jobs.

Data Analysis and Matching:
- Collaborate with business analysts and data architects to understand data requirements and matching criteria.
- Analyze and interpret data structures, formats, and relationships to implement effective data matching algorithms using PySpark.
- Work with distributed datasets in Spark, ensuring optimal performance for large-scale data reconciliation.

Rules Engine Integration:
- Integrate PySpark applications with rules engines (e.g., Drools) or equivalent to implement and execute complex data matching rules.
- Develop PySpark code to interact with the rules engine, manage rule execution, and handle rule-based decision-making.

Problem Solving and Gap Analysis:
- Collaborate with cross-functional teams to identify and analyze data gaps and inconsistencies between systems.
- Design and develop PySpark-based solutions to address data integration challenges and ensure data quality.
- Contribute to the development of data governance and quality frameworks within the organization.

Qualifications and Skills:
- Bachelor's degree in computer science or a related field.
- 5+ years of hands-on experience in big data development, preferably with exposure to data-intensive applications.
- Strong understanding of data reconciliation principles, techniques, and best practices.
- Proficiency in PySpark, Apache Spark, and related big data technologies for data processing and integration.
- Experience with rules engine integration and development.
- Strong analytical and problem-solving skills, with the ability to translate business requirements into technical solutions.
- Excellent communication and collaboration skills to work effectively with business analysts, data architects, and other team members.
- Familiarity with data streaming platforms (e.g., Kafka, Kinesis) and big data technologies (e.g., Hadoop, Hive, HBase) is a plus.

Thanks & Regards
Praveen Paila
Praveen.pa@