Bloomberg

Senior Software Engineer - Media Data Engineering

Bloomberg, New York City, NY, United States


Our Team

Bloomberg Media empowers global business leaders with breaking news, expert opinion and proprietary data distributed with global reach. With over 30 million unique visitors each month, our flagship website bloomberg.com is the go-to destination for those looking to stay ahead of the curve. As one of the top 10 most visited financial and news sites on the web, we take pride in delivering top-quality content that our audience can trust.

Media Data Engineering owns a variety of structured and unstructured data assets and the applications that enable them. We own a Data Warehouse for our structured data assets, and we leverage open-source orchestration tools and analytical processing to maintain a collection of ETL and data curation processes for those assets. These data assets empower our Data Science and Insights teams to improve the customer experience through machine learning, A/B testing, and data-driven decision making. The entire platform runs in a multi-cloud environment built from a mix of open-source and public cloud technologies.

We also own the backend for Search on bloomberg.com and the mobile app. This backend is built on a NoSQL datastore that stores, indexes, and queries unstructured data such as news articles, stock symbols, and people. In addition, we own an application that provides access to selected user attributes from our data warehouse for low-latency, high-throughput use cases, e.g., a differentiated website experience. This application uses a NoSQL datastore to store this data and serve it efficiently at web scale.

What's in it for you

As a member of the team, you will work with a wide range of stakeholders such as Product, Editorial, Ad Operations, Marketing, and other engineering teams. You will maintain, expand, and innovate our Data Engineering infrastructure by leveraging open source and cloud technologies. You’ll also work closely with analysts and data scientists to provide the data and tooling they need to generate insights that will improve the experience of Bloomberg Media customers. You will get to work with unstructured and structured data for use cases like search, big data modeling & analysis, data contracts and data governance.

You’ll have opportunities to grow your network by playing a central role in empowering other teams with a large collection of data sets that lie in a critical path for management and finance reporting for the Bloomberg Media business.

We'll trust you to

  • Develop and maintain data pipelines that process terabytes of new data every day.
  • Innovate and improve the platform by applying the best practices in the Data Engineering field, in particular focusing on improving data quality, data observability and data discovery as we look to scale up our platform both in terms of size and complexity.
  • Develop and maintain our search infrastructure, including Elasticsearch, the backend application that serves traffic, and the data ops processes that hydrate our indices.
  • Develop the domain knowledge and strategic thinking necessary to identify opportunities to grow the impact of our team through ideation, project proposals, and technical discussions.
  • Analyze raw data for insights and trends and build techniques and processes to automate these workflows.
  • Build a network with stakeholders and leverage it to maximize your and your team’s impact.

You need to have

  • 4+ years of programming experience in Python, JavaScript, Java, or another object-oriented programming language.
  • Experience developing data extraction, transformation, and load (ETL) pipelines.
  • Experience with workflow management technologies such as Airflow, Luigi, NiFi, or Oozie.
  • Experience with cloud technologies such as AWS or GCP.
  • Experience working on complex data systems from design to delivery.
  • Experience with SQL.
  • Experience with NoSQL technologies such as Elasticsearch, DynamoDB, Solr, Cassandra, or HBase.
  • Degree in Computer Science, Engineering, or a related technical field.

We'd Love to See

  • Experience with big data systems such as BigQuery or Redshift.
  • Experience supporting data scientist workflows.
  • Experience integrating with third-party APIs.
  • Experience managing compute instances, databases, etc. to drive data wrangling and delivery.