Photon

SPARK Data Reconciliation Engineer - NJ

Photon, Trenton, New Jersey, United States

Job Title: PySpark Data Reconciliation Engineer

Summary:
We're seeking a skilled PySpark Data Reconciliation Engineer to join our team and drive the development of robust data reconciliation solutions within our financial systems. You will be responsible for designing, implementing, and maintaining PySpark-based applications to perform complex data reconciliations, identify and resolve discrepancies, and automate data matching processes. The ideal candidate possesses strong PySpark development skills, experience with data reconciliation techniques, and the ability to integrate with diverse data sources and rules engines.

Key Responsibilities:
- Design, develop, and test PySpark-based applications to automate data reconciliation processes across various financial data sources, including relational databases, NoSQL databases, batch files, and real-time data streams.
- Implement efficient data transformation and matching algorithms (deterministic and heuristic) using PySpark and relevant big data frameworks.
- Develop robust error handling and exception management mechanisms to ensure data integrity and system resilience within Spark jobs.

Data Analysis and Matching:
- Collaborate with business analysts and data architects to understand data requirements and matching criteria.
- Analyze and interpret data structures, formats, and relationships to implement effective data matching algorithms using PySpark.
- Work with distributed datasets in Spark, ensuring optimal performance for large-scale data reconciliation.
- Integrate PySpark applications with rules engines (e.g., Drools or an equivalent) to implement and execute complex data matching rules.
- Develop PySpark code to interact with the rules engine, manage rule execution, and handle rule-based decision-making.

Problem Solving and Gap Analysis:
- Collaborate with cross-functional teams to identify and analyze data gaps and inconsistencies between systems.
- Design and develop PySpark-based solutions to address data integration challenges and ensure data quality.
- Contribute to the development of data governance and quality frameworks within the organization.

Qualifications and Skills:
- Bachelor's degree in Computer Science or a related field.
- 5+ years of hands-on experience in big data development, preferably with exposure to data-intensive applications.
- Strong understanding of data reconciliation principles, techniques, and best practices.
- Proficiency in PySpark, Apache Spark, and related big data technologies for data processing and integration.
- Experience with rules engine integration and development.
- Strong analytical and problem-solving skills, with the ability to translate business requirements into technical solutions.
- Excellent communication and collaboration skills to work effectively with business analysts, data architects, and other team members.
- Familiarity with data streaming platforms (e.g., Kafka, Kinesis) and big data technologies (e.g., Hadoop, Hive, HBase) is a plus.

Job Info

Job Identification: 20420
Job Category: Development
Posting Date: 09/12/2024, 08:15 PM
Job Shift: Day
Locations: Texas Photon, Dallas, Texas, 75001, US
