Principal Software Engineer - Big Data Platform

Bristol, CT, United States
Full Time Posted 26 days ago
Main Location
Burbank, CA, United States
We have created a new Big Data Platforms group within Disney’s Direct-To-Consumer and International technology organization with the skills, drive, and passion to innovate, create, and succeed in enabling a direct-to-consumer strategy for various digital products. We are here to disrupt and start a cultural revolution in the application of data and analytics across The Walt Disney Company, focused on Content Personalization/Recommendation, Deep User Understanding, Audience Segmentation for Linear to Digital Ad Sales, and Analytics.

We need an experienced Data Platform Architect who can drive multiple data initiatives, applying innovative architecture that can scale in the cloud. We’re looking for a creative and talented individual who loves designing scalable platforms, particularly at the petabyte level, and extracting value from both structured and unstructured real-time data. More specifically, we need a technology leader to build a highly scalable and extensible Big Data platform that enables collection, storage, modeling, and analysis of massive data sets from numerous channels.

Responsibilities:
  • Act as the proactive technical architect and point person for DTCI Consumer Data Platforms end to end (data collection through knowledge extraction via statistical machine learning and deep learning approaches; distribution via streams, APIs, and files; ad-hoc analysis; reporting and visualization).
  • Present the technical direction to the management team and educate them on how to maximize profitability by using the best data management technologies while reducing overall cost of operation.
  • Lead and coach other software engineers by developing re-usable frameworks. Review design and code produced by other engineers.
  • Provide expert-level advice to data scientists, data engineers, and operations to deliver high-quality analytics, built on machine learning and deep learning, through data pipelines and APIs.
  • Lead the transformation of a petabyte-scale, batch-based processing platform into a near-real-time streaming platform using technologies such as Apache Kafka, Cassandra, and Spark.
  • Design and build efficient ETL/ELT processes to move data through the data processing pipeline to meet the demands of the business use cases using Java, open source, and AWS products. Build an easy-to-reuse workflow model and lead the entire team in following that pattern for all ETL processes to improve efficiency and reduce cost.
  • Optimize and automate data ingestion, processing, and distribution for data from a variety of sources, including clickstream data, ratings data, advertising data, third-party sources, and sources not yet identified.
  • Manage complex data dependencies across datasets and incremental data loading workflows.
  • Design and build API-, stream-, and batch-based data export mechanisms to be used by other DTCI products such as AdSales, Web, and App platforms.
  • Be a fearless leader in championing smart, scalable, and flexible design.
  • Collaborate with product management and act as the bridge between product management, engineering teams, and customers to understand requirements and technical solutions.
  • Help us stay ahead of the curve by working closely with the data management team, data engineers, our DevOps team, and analysts to design systems that can scale overnight in ways that make other groups envious.

Basic Qualifications:
  • Roughly 10 years of experience building large-scale data platforms, from architecture all the way to implementation and support. The platform is expected to handle petabytes of data in a cloud environment in real time.
  • Code Ninja – Must be hands-on with the latest technologies such as Java, Scala, Apache Spark, Apache Kafka, Hadoop, API design and development, NoSQL databases such as Cassandra, OLAP columnar storage systems, and bitmap indexes to handle millions of consumers and thousands of attributes while allowing real-time querying/segmentation.
  • Visionary – Solid understanding of software development, from design through architecture, in order to build software for the future.
  • Have a data toolbox – Familiar with technologies relevant to the data and integration space including Hadoop, Spark, Apache Druid, Cassandra, Java, Python, and ML frameworks.
  • Hunger to Learn & Teach – Genuine interest in learning new cutting-edge technologies and sharing them with the rest of the engineering team to keep everyone up to date on technology trends. We love to see your public Git or similar profiles.
  • Problem solver – Enjoy new and meaningful technology or business challenges which require you to think and respond quickly
  • Passion and creativity – Are passionate about data, technology, & creative innovation
  • Open source – Prefer open source technologies and have a build-it-yourself mentality. A history of open source contributions is highly preferred.
  • Team player – Enjoy working collaboratively with a talented group of people to tackle challenging business problems so we all succeed (or fail fast) as a team

Preferred Qualifications:
  • Experience in building a large data streaming platform will be a huge plus.
  • Experience in operationalizing Machine Learning workflows to scale will be a huge plus as well.
  • Experience with Content Personalization/Recommendation, Audience Segmentation for Linear to Digital Ad Sales, and/or Analytics
  • Presence in open source projects will be a huge plus. We love to see your social profile, if any.
  • Working experience with a machine learning framework such as Apache Spark MLlib, TensorFlow, or similar.
  • 5+ years of hands-on experience in data and analytics technology, with a focus on data architecture and large-volume data processing. Experience with Java, Python, and/or SQL.
  • 5+ years of experience building, coaching and leading software professionals
  • 5+ years of experience working with relational databases, data services, big data, complex event processing and machine learning.
  • 2+ years of experience with cloud deployments; AWS experience preferred. Proficiency with Linux/Unix-based systems.

Preferred Education:
  • Master's degree in Computer Science or a similar field is preferred.
