CoreWeave, Inc.
Senior Data Engineer - Dimensional Modeling and Metric Generation
CoreWeave, Inc., Sunnyvale, California, United States, 94087
About the Team
The Data Engineering team builds foundational datasets and analytics services that enable BI and data science across CoreWeave. We seek to democratize insights and foster a culture where data-driven decision-making thrives at every level.
About the Role
We’re seeking a skilled Senior Data Engineer to lead the development of foundational data models that empower our Business Intelligence Engineers, analysts, and data scientists to efficiently work with and gain insights from our data. This role will own the creation and maintenance of star and snowflake schemas within our lakehouse environment and set the standards for dimensional modeling best practices. The engineer will also create and optimize key datasets and metrics essential to tracking business health.
Responsibilities
Develop and maintain data models, including star and snowflake schemas, to support analytical needs across the organization (see the sketch after this list).
Establish and enforce best practices for dimensional modeling in our lakehouse.
Engineer and optimize data storage using analytical table/file formats (e.g., Iceberg, Parquet, Avro, ORC).
Partner with BI, analytics, and data science teams to design datasets that accurately reflect business metrics.
Tune and optimize datasets and queries in MPP databases such as StarRocks, Snowflake, BigQuery, or Redshift.
Collaborate on data workflows using Airflow, building and managing pipelines that power our analytical infrastructure.
Ensure efficient processing of large datasets through distributed computing frameworks like Spark or Flink.
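For illustration, the modeling work above might look like the following minimal PySpark sketch, which joins raw events to a conformed dimension and writes a date-partitioned Iceberg fact table. All catalog, table, and column names are hypothetical examples, not CoreWeave's actual models.

    # Minimal sketch: building a star-schema fact table as an Iceberg table.
    # All names below are hypothetical; assumes an Iceberg catalog named
    # "lakehouse" is already configured for the Spark cluster.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("build_fact_gpu_usage")
        .config("spark.sql.catalog.lakehouse",
                "org.apache.iceberg.spark.SparkCatalog")
        .getOrCreate()
    )

    # Raw usage events joined to a conformed customer dimension.
    events = spark.table("lakehouse.raw.usage_events")
    dim_customer = spark.table("lakehouse.marts.dim_customer")

    fact_gpu_usage = (
        events.join(dim_customer, "customer_id")
        .select(
            F.col("customer_sk"),                     # surrogate key into dim_customer
            F.to_date("event_ts").alias("date_key"),  # grain: one row per event per day
            F.col("gpu_hours"),                       # additive measure
        )
    )

    # Partitioning the fact table by date keeps large scans prunable.
    (
        fact_gpu_usage.writeTo("lakehouse.marts.fact_gpu_usage")
        .partitionedBy(F.col("date_key"))
        .createOrReplace()
    )

A snowflake variant would normalize dim_customer further into sub-dimensions; the denormalized star form above typically keeps analyst queries simpler.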
Qualifications
You thrive in a fast-paced, complex work environment and love tackling hard problems.
Hands-on experience applying Kimball modeling principles to large datasets.
Expertise in working with analytical table/file formats, including Iceberg, Parquet, Avro, and ORC.
Proven experience optimizing MPP databases (StarRocks, Snowflake, BigQuery, Redshift).
5+ years of programming experience in Python or Scala.
Advanced SQL skills, with a strong ability to write, optimize, and debug complex queries.
Hands-on experience with Airflow for batch orchestration and with distributed computing frameworks like Spark or Flink (a minimal Airflow sketch follows this list).
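As a minimal sketch of the Airflow side, the DAG below chains a Spark fact-table build to a downstream metrics refresh. The dag_id, task names, and script paths are hypothetical illustrations, and the schedule argument assumes Airflow 2.4 or later.

    # Minimal Airflow sketch: a daily batch pipeline with two dependent tasks.
    # dag_id, task_ids, and script paths are hypothetical examples.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_fact_gpu_usage",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
        catchup=False,
    ) as dag:
        # Rebuild the Iceberg fact table (see the Spark sketch above).
        build_fact = BashOperator(
            task_id="build_fact_gpu_usage",
            bash_command="spark-submit jobs/build_fact_gpu_usage.py",
        )

        # Refresh metric rollups once the fact table has landed.
        refresh_metrics = BashOperator(
            task_id="refresh_daily_metrics",
            bash_command="python jobs/refresh_daily_metrics.py",
        )

        build_fact >> refresh_metrics

In practice the build step might use a SparkSubmitOperator or a managed submission service; BashOperator keeps the sketch dependency-free.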