Stash is on a mission to give financial opportunity to all; we want to build financial systems that work for everyone—not just the wealthy. But that takes more than just a mission. It takes great people and an open, inclusive, and diverse environment where innovation and quality can thrive. We are working toward a future where investors are as diverse as our world.
At Stash, data is at the core of how we make decisions and build great products for millions of users. As a Senior Data Engineer, you will join our newly formed Data Platform Team, which leads the architectural design and implementation of a modern data infrastructure at scale. As a senior engineer on our team, you will independently carry out the end-to-end product and feature development process, help shape the architectural design of our data platform, and mentor other engineers. You will build distributed services and large-scale processing systems that help teams across the company work faster and smarter. You will partner with Data Science to productionize machine learning models and algorithms into data-driven products that serve our users better.
Currently, some of the tools our team uses are Python, Scala, Hadoop, YARN, Pandas, Spark, MongoDB, AWS EMR/EC2/Lambda/Kinesis/S3/Glue, Elasticsearch, Hive, Redshift, Airflow, and Terraform.
What you'll do:
Design, develop, and deploy a scalable, distributed data platform
Productionize our machine learning models and algorithms into data-driven feature MVPs that scale
Drive data solutions and features that inform business decisions and our product roadmap
Leverage best practices in continuous integration and deployment to our cloud-based infrastructure
Optimize data access and consumption for our business and product colleagues
Develop an understanding of key product, user, and business questions
Who we’re looking for:
4+ years of professional experience working in data engineering
BS / MS in Computer Science, Engineering, Mathematics, or a related field
You have built large-scale data products and understand the tradeoffs made when building these features
You have a deep understanding of system design, data structures, and algorithms
Experience with (or a strong interest in) Python or Scala
Experience working with a cluster manager (YARN, Mesos, or Kubernetes)
Experience with distributed computing, working with Spark, Hadoop, or the MapReduce framework
Experience working on a cloud platform such as AWS
Experience building and maintaining ETL pipelines
Experience working in a product-driven environment
Experience working with Apache Airflow
Experience working with AWS Glue
Experience in machine learning and information retrieval