Harbor Capital
Data/ML Engineer
Harbor Capital, Chicago, Illinois, United States, 60290
This is a Chicago-based role that requires commuting into the office twice a week; please ensure you can meet this requirement before applying.
Company Overview
What makes Harbor unique is our commitment to partnering only with the very best asset managers globally. This focus allows us to act in the best interests of our shareholders every day and help them achieve their investment goals through active, cost-aware investing. This approach has served us well: we have become one of the largest and most highly regarded managers of managers in the industry. We take a similar approach with the people we hire, seeking the very best individuals who share our passion for putting shareholders first, and we are currently looking to grow our firm with talented, intellectually diverse people with an excellent work ethic.
If you are passionate about putting shareholders first and enjoy the unique nature of working with the very best asset managers in the world, please visit us at harborcapital.com/careers, search for this position and click on "Apply" to start your application!
Role Description
Harbor is a data-driven organization that uses data to inform corporate strategy and every operational aspect of its business. Harbor is building next-generation software, products, and services in the areas of investment, distribution, marketing, and operations. This role offers a unique opportunity to join a small team of professionals working at the intersection of technology, investments, distribution, marketing, and quantitative analytics, and it suits individuals seeking a challenging position in a nimble, fast-paced organization. The selected individual will work with a team of professionals building new applications using microservices architecture and modern data pipelines. The role calls for individuals who are unafraid to push the envelope, explore out-of-the-box thinking, and use technology to solve complex data engineering problems. The selected individual will have the opportunity to work with industry-leading experts on the Multi-Asset Solutions, Distribution, Marketing, Accounting, Investment Products, and Investment Research teams across the firm, on initiatives that help shape and deliver Harbor's vision of being a top-tier asset management firm.
Why would you want to work on our team?
This is a unique opportunity to revolutionize the asset management industry by building cutting-edge data and AI solutions with the agility of a startup. As a Data/ML Engineer, you'll work at the intersection of financial markets and advanced technology, where you'll help transform traditional asset management processes through modern data engineering and machine learning.
You'll get to work with cutting-edge technologies in the AWS ecosystem while tackling complex challenges like processing market data at scale, automating investment analytics, and building ML models for portfolio optimization and risk management. Unlike traditional financial firms, our startup approach means you'll have the freedom to innovate, experiment with new technologies, and directly influence our technical direction.
What makes this role particularly exciting is its impact on investment decisions and portfolio management. You'll collaborate with portfolio managers, quantitative analysts, and software engineers in a fast-paced environment where your solutions directly affect investment strategies and business outcomes. If you're passionate about both finance and technology and want to be at the forefront of implementing AI solutions in asset management without the constraints of legacy systems and traditional corporate hierarchies, this role offers that perfect combination.
The opportunity to blend traditional asset management expertise with startup-style innovation is rare in the financial industry. You'll be part of a team that's reinventing how investment decisions are made, all while working with the latest tools and technologies in a culture that encourages experimentation and rapid iteration.
Key Responsibilities
Design, develop, and maintain end-to-end data and ML pipelines, ensuring seamless integration of data processing and model training workflows using modern data engineering and MLOps tools.
Build scalable data ingestion processes using REST/SOAP/GraphQL APIs with third-party applications, while implementing data quality checks and validation frameworks for ML training data.
Collaborate with data scientists to productionize ML models, including feature engineering pipelines, model deployment, and monitoring systems for model performance and data drift.
Work closely with Software Engineering and Cloud Operations teams to architect and implement ML-powered applications and services, including API development for model serving.
Interface and communicate with business users and executives across the firm to understand requirements and translate them into technical solutions, balancing both data engineering and ML implementation needs.
Key Behavioral Expectations
Unleashes innovation; courageous and resilient.
Facilitates idea sharing, explores "out of the box" ideas, and pushes past the status quo.
Embraces opportunities to utilize evolving technology to replace stale systems or processes.
Drives for results.
Technical Knowledge, Skills & Abilities
Strong proficiency in Python programming, data structures, and software engineering best practices, including version control (Git), testing, and CI/CD pipelines, with proven experience building production-grade data solutions.
Expert-level experience designing, developing, and maintaining scalable ETL/ELT pipelines using AWS services, including Glue, S3, CloudFormation, Redshift, and EMR, with hands-on experience implementing data lakes and data warehouses.
Deep understanding of distributed computing frameworks such as Apache Spark (PySpark) for large-scale data processing, with a demonstrated ability to optimize performance and resource utilization.
Experience implementing and deploying machine learning models in production, including MLOps practices and model monitoring, and working with frameworks such as TensorFlow, PyTorch, or scikit-learn.
Strong SQL skills and experience working with both relational and NoSQL databases, including query optimization and database design principles.
Proficiency in data modeling, data warehousing concepts, and implementing data governance practices, including data quality monitoring, metadata management, and data security.
Experience with real-time data streaming technologies such as Apache Kafka or AWS Kinesis, and the ability to design streaming data pipelines.
Understanding of containerization (Docker) and orchestration (Kubernetes) for deploying data and ML services at scale.
Educational Qualifications & Experience
Bachelor's or Master's degree in Statistics, Mathematics, Economics, Computer Science/Engineering, Information Systems, or a related field.
3-8 years of software and/or data engineering experience.
Financial industry experience a plus.
Professional and/or technical certification a plus.
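To illustrate the kind of data quality checks and validation this role involves, here is a minimal sketch in Python. All names, fields, and thresholds are hypothetical examples, not Harbor's actual framework:

```python
# Minimal sketch of a data-quality check for ML training data.
# Function name, fields, and thresholds are illustrative only.

def validate_rows(rows, required_fields, max_null_ratio=0.05):
    """Return (clean_rows, report) after basic quality checks.

    rows: list of dicts (e.g., records ingested from a REST API).
    required_fields: fields every training record must contain.
    max_null_ratio: validation fails if more than this share of
                    rows is missing any required field.
    """
    clean, dropped = [], 0
    for row in rows:
        if all(row.get(f) is not None for f in required_fields):
            clean.append(row)
        else:
            dropped += 1
    null_ratio = dropped / len(rows) if rows else 0.0
    report = {
        "total": len(rows),
        "dropped": dropped,
        "null_ratio": null_ratio,
        "passed": null_ratio <= max_null_ratio,
    }
    return clean, report

# Example: one of three records is missing a required field,
# so the batch exceeds the 5% null threshold and fails validation.
records = [
    {"ticker": "ABC", "price": 101.5},
    {"ticker": "XYZ", "price": None},
    {"ticker": "DEF", "price": 99.0},
]
clean, report = validate_rows(records, ["ticker", "price"])
```

In a production pipeline such a check would typically run as a step before model training, with the report feeding a monitoring system.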