New York City, NY, United States Posted 4 days ago
The Wall Street Journal seeks a data engineer who will be responsible for developing tools to help the newsroom in its data science work. The Journal is expanding its use of data in both editorial and audience-related projects, and this engineer will be an important force in bolstering the newsroom’s data capacities.
The Journal is seeking a full-stack data engineer who will be responsible for (1) acquiring new datasets, (2) creating and maintaining data pipelines, (3) deploying data and insights to editors in the newsroom, and (4) building prototypes of tools for editors and newsroom staffers.
This role is responsible for building and maintaining data pipelines for the data sets the newsroom has, needs and wants. It is a function tied to the newsroom’s top-level strategy, working in collaboration with the Audience group, the R&D Lab and the broader newsroom. The engineer will collaborate with data scientists and work directly with a number of highly sophisticated audience and content data sets. The engineer will also help with rapid prototyping and testing of newsroom data tools and help maintain the ones that prove successful.
We are looking for someone with deep knowledge of audience behavior around common journalism types, like breaking news and enterprise journalism, as well as experience with newsroom tools and data dashboards. The data engineer should have a strong background in A/B testing as well as in managing data processes that inform content optimization. This role is suited for a talented engineer with a strong understanding of newsroom workflow and a passion for helping journalists connect with their audiences.
Experience running and supporting enterprise data platforms in production
Experience creating internal tools that combine content and audience data
Experience building the infrastructure required for optimal extraction, transformation and loading of data from various sources
Experience building data pipelines with tools and cloud-based data services such as Google’s BigQuery, Dataproc and Pub/Sub, and AWS