Plaid

Experienced Data Engineer - Data Engineering Technical Leader

Plaid, San Francisco, California, United States, 94199

Data Engineering is a cornerstone of Plaid's data-focused vision: it fosters a data-driven culture and advances our goal of delivering trustworthy, truly differentiated data products powered by the largest network of user financial profiles. To support this vision, we need to scale our data systems while keeping our data correct and complete. We provide golden datasets and tooling to teams across engineering, product, and business, helping them explore our data quickly and safely to get the insights they need, which ultimately helps Plaid serve our customers more effectively. Plaid also cannot succeed unless we move quickly, so we build the data systems and tools that enable everyone at Plaid to be data-driven, making analytics easy, obvious, and proactive across the company.

Data Engineers heavily leverage SQL and Python to build data workflows that integrate with our Golang applications. We use tools like DBT, Airflow, Redshift, Atlan, and Retool to orchestrate data pipelines and define workflows. We work with data scientists, product analytics, business intelligence, software engineers, product managers, and many other teams to build Plaid's data strategy and a data-first mindset. Our engineering culture is IC-driven - we favor bottom-up ideation and empowerment of our incredibly talented team. We are looking for engineers and technical leaders who are motivated by creating impact for our consumers and customers, growing together as a team, shipping the MVP, and leaving things better than we found them.

Responsibilities

Serving as the primary technical DRI, defining and executing the long-term technical roadmap to build and sustain a data-driven culture at Plaid.
Working closely with senior leadership and executives to shape Plaid’s data engineering strategy and roadmap, ensuring alignment with the company’s data-focused product goals and overall vision.
Developing a deep understanding of Plaid's products and strategy in order to inform the design of Golden Datasets, set data usage principles, and ensure data is structured for maximum impact and usability.
Delivering well-documented datasets with clearly defined quality metrics, uptime guarantees, and demonstrable usefulness.
Leading critical data engineering projects that foster collaboration across teams, driving innovation and efficiency throughout the company.
Mentoring engineers, operations, and data practitioners on best practices for data organization.
Advocating for the adoption of emerging industry tools and practices, evaluating their fit and implementing them at the right time to keep Plaid at the forefront of data engineering.
Owning core SQL and Python data pipelines that power our data lake and data warehouse, ensuring their reliability, scalability, and efficiency.
Upholding Plaid’s commitment to data quality and privacy, embedding these principles into every aspect of data work.

Qualifications

10+ years of experience in data engineering, with a proven track record of building scalable data systems and pipelines.
Experience working with massive datasets (500TB to petabytes) and developing robust data models and pipelines on top of them.
Led major data modeling, schema design, and data architecture efforts in past roles.
Deep appreciation for schema design and the ability to evolve analytics schemas on top of unstructured data.
Advanced knowledge of SQL as a flexible, extensible tool and experience with modern orchestration tools like DBT, Mode, and Airflow.
Hands-on experience with performant data warehouses and lakes such as Redshift, Snowflake, and Databricks.
Experience building and maintaining both batch and real-time pipelines using technologies like Spark and Kafka.
Excited about exploring new technologies and building proof-of-concepts that balance technical innovation with user experience and adoption.
Enjoy getting into the technical details to manage, deploy, and optimize low-level data infrastructure.
Champion for data privacy and integrity who always acts in the best interest of consumers.

Target base salary for this role is between $204,120 and $360,000 per year. Additional compensation in the form(s) of equity and/or commission is dependent on the position offered. Plaid provides a comprehensive benefit plan, including medical, dental, vision, and 401(k). Pay is based on factors such as (but not limited to) scope and responsibilities of the position, candidate's work experience and skillset, and location. Pay and benefits are subject to change at any time, consistent with the terms of any applicable compensation or benefit plans.
