project44


Principal Software Engineer - Data Engineering

USA - San Francisco, CA / USA - Remote

Why project44?

At project44, we revolutionize supply chains with our High-Velocity Supply Chain Platform. As the connective tissue of the supply chain, project44 optimizes global product movement, delivering unparalleled resiliency, sustainability, and value for our customers. We operate the world's most trusted end-to-end visibility platform, tracking over 1 billion shipments annually for 1,300 leading brands across various industries, including manufacturing, automotive, retail, life sciences, food & beverage, and oil, chemical & gas. If you're eager to be part of a winning team that works together to solve some of the toughest supply chain problems every day, let's talk.

About the role:

As a Principal Software Engineer - Data Engineering at project44, you'll have opportunities to work on the latest technologies to streamline Machine Learning & AI operations, build scalable data infrastructure, and democratize data access.

What you'll be doing:

- Work on software architecture and design. Leverage and institute best practices from distributed systems, databases, data platforms, infrastructure and platform software, manageability, and observability.
- Provide guidance on new technologies and drive continuous improvement in best practices. Research, implement, and develop software development tools.
- Build systems in a multi-cloud environment - we use AWS and GCP, but value experience in other cloud environments such as Azure.
- Build complex metrics solutions with data visualization support for actionable business insights.
- Leverage expertise in the latest Gen AI tools and methodologies (RAG, vector databases, embeddings) to architect and build automated data access and interpretation solutions.
- Design and develop ETL/ELT pipelines in Python/Java against Snowflake, Postgres, and other data stores (a minimal orchestration sketch follows this list).
- Apply knowledge of data warehouse and data mart design and implementation.
- Build distributed, reusable, and efficient backend ETLs; implement security and data protection.
- Establish repeatable, automated processes to build, test, document, and deploy the application at scale.
- Work collaboratively with insights and data science teams to understand end-user requirements, provide technical solutions, and implement new features and data pipelines.
- Establish quality processes to deliver a stable and reliable solution.
- Write complex SQL and stored procedures in Snowflake, Postgres, and BigQuery.
- Prepare documentation (data mapping, technical specifications, production support, data dictionaries, test cases, etc.) for all projects.
- Coach junior team members and help your team continuously improve by contributing to tooling, documentation, and development practices.
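The ETL/ELT item above is the core of the role, so here is a minimal orchestration sketch - an illustration only, not project44's actual pipeline. It assumes Apache Airflow (named in the requirements below); the DAG name and task bodies are hypothetical, and the Postgres extract and Snowflake load are stubbed where a real pipeline would open database connections.

# A minimal, illustrative Airflow DAG. The dag_id and task names are
# hypothetical, and the extract/load bodies are stubbed.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # A real pipeline would query Postgres here (e.g. via psycopg2 or an
    # Airflow Postgres hook) instead of returning canned rows.
    rows = [{"order_id": 1, "status": "shipped"}]
    context["ti"].xcom_push(key="rows", value=rows)


def load_to_snowflake(**context):
    rows = context["ti"].xcom_pull(key="rows", task_ids="extract_orders")
    # A real pipeline would write via the Snowflake connector here,
    # e.g. cursor.executemany("INSERT INTO orders VALUES (%s, %s)", ...).
    print(f"loading {len(rows)} rows into Snowflake")


with DAG(
    dag_id="p44_orders_etl",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)
    extract >> load  # extract first, then load

A production version would replace the stubs with real connections and add the retries, monitoring, and data-quality checks called out in the duties above.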
You could be a great fit if you have:

Experience & Education

- 8+ years of experience leading Data Engineering efforts.
- 3+ years of experience with Snowflake and Oracle, plus knowledge of NoSQL databases such as MongoDB.
- 3+ years of experience in Python/Java.
- 3+ years of experience in an ETL developer role, with deep knowledge of data processing tools such as Airflow and Argo Workflows.
- 4+ years of experience with data engineering and operations, including administering production-level, always-on, high-throughput, complex OLTP RDBMSs.
- Experience delivering software solutions in the area of distributed systems.
- Experience working with neural networks and Gen AI methodologies.
- Strong experience building data warehouse solutions and data modeling.
- Strong ETL performance-tuning skills and the ability to analyze and optimize production volumes and batch schedules.
- Experience with ETL, GCP, Unix/Linux, and Helm charts, as well as Git or other version control systems.
- Experience with PII redaction in traditional ETL pipelines as well as in Gen AI solutions.
- Expertise in operational data stores and real-time data integration.
- Expert-level skill in modeling, managing, scaling, and performance-tuning high-volume transactional databases.
- Bachelor's degree in computer science, or equivalent experience.

Technical Skills

- Strong programming/scripting knowledge for building and maintaining ETL using Java, SQL, Python, Bash, and Go.
- In-depth, hands-on knowledge of public clouds - GCP (preferred) or AWS - and of PostgreSQL (version 9.6+), Elasticsearch, MongoDB, MySQL/MariaDB, Snowflake, and BigQuery.
- Willingness to participate in an on-call rotation to mitigate data pipeline failures.
- Strong experience with Kafka or equivalent event/streaming systems.
- Experience with Docker and Kubernetes.
- Experience with RAG, vector databases, embeddings, etc. (a minimal retrieval sketch appears at the end of this posting).
- Ability to develop and deploy CI/CD pipelines for Data Engineering.
- Experience optimizing database performance and capacity utilization to provide high availability and redundancy.
- Proficiency with high-volume OLTP databases and large data warehouse environments.
- Ability to work in a fast-paced, rapidly changing environment.
- Understanding of Agile and its application to data warehouse development.

Professional Skills/Competency

- Focuses on developing and improving frameworks that support repeatable and scalable solutions.
- Demonstrates excellent communication and interpersonal skills; communicates clearly and concisely.
- Takes initiative to recommend and develop innovative approaches to getting things done.
- Is a team player and encourages collaboration.

Diversity & Inclusion

At project44, we're designing the future of how the world moves and is connected through trade and global supply chains. As we work to deliver a truly world-class product and experience, we are also intentionally building teams that reflect the unique communities we serve. We're focused on creating a company where all team members can bring their authentic selves to work every day.

project44 is an equal opportunity employer seeking to enrich our work environment by creating opportunities for individuals of all backgrounds and experiences to thrive.
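To make the RAG and embeddings expectations above concrete, here is a minimal, self-contained retrieval sketch - purely illustrative, not project44 code. The toy_embed function is a deterministic stand-in for a real embedding model, so only the retrieval mechanics (unit vectors, cosine similarity, top-k selection) carry over to a real system.

# A toy retrieval step for a RAG pipeline: embed documents, then return the
# passages most similar to a query by cosine similarity.
import hashlib

import numpy as np


def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    # Deterministic stand-in for a real embedding model: hash the text into
    # a seed, then draw a random unit vector. NOT semantic - mechanics only.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    vec = np.random.default_rng(seed).standard_normal(dim)
    return vec / np.linalg.norm(vec)


def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Vectors are unit length, so the dot product is cosine similarity.
    q = toy_embed(query)
    scores = [float(q @ toy_embed(d)) for d in docs]
    best = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in best]


docs = [
    "Shipment 123 departed the Port of Oakland on Monday.",
    "Invoice terms are net 30 for all carriers.",
    "ETA for shipment 123 is Thursday at 14:00 UTC.",
]
print(top_k("When will shipment 123 arrive?", docs))

In practice the stand-in would be replaced by an embedding-model call, and document vectors would be stored in a vector database rather than recomputed per query - at which point the top results would be the semantically closest passages.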
