Job Details

There is a place for you at T. Rowe Price to grow, contribute, learn, and make a difference. We are a premier asset manager focused on delivering global investment management excellence and retirement services that investors can rely on today and in the future. The work we do matters. We invite you to explore the opportunity to join us and grow your career with us.

Overview

Developer Services is reimagining the way we tackle operational challenges through an engineering mindset. You will have a rare opportunity to be a part of a growing, global, multi-functional Site Reliability Engineering team tasked with building SRE from the ground up into a first-class organization. You will be empowered to engineer and own scalable and resilient hybrid cloud solutions (both AWS and on-prem). You will build tooling that enables developers to eliminate toil, increase observability of their services, and streamline deployments into AWS. You will also research problems, conduct installations, and evaluate new technologies.

Requires specialized knowledge and expertise in your own job discipline and deep experience in integrating related disciplinary knowledge
Leads disciplinary or multi-functional program of notable risk; uses sophisticated analytical thought to identify creative solutions
Accountable for work of yourself and others; sets standards around which others will operate
Works independently with minimal guidance
Acts as advisor to management and key external partners on broad ranging projects
Proactively identifies problems and can present and implement solutions to these problems

Role summary and job responsibilities

Be highly opinionated about automation, observability, and change velocity, and have a keen eye for engineering for resiliency and scale.
Work collaboratively across various functions and organizations including Infrastructure Engineering, Operations, Security, and Application Development teams.
Build capabilities and functions that enable the adoption of SRE across T. Rowe Price and are drivers of innovation.
Drive strategy for implementing standard methodologies in areas of observability, automation, incident management, postmortems, microservices, and cloud solutions.
Fearlessly dive in on outages to restore service of our Developer Services products, and get to the root cause with the utmost transparency through a blameless postmortem culture.
Own the performance and availability of our Developer Services products and continuously enhance and innovate our platform to mature our stack.
Engineer end-to-end automation frameworks like Ansible AWX and SaltStack in an operationally viable fashion to equip Infrastructure Engineering and Operations with the tools to eliminate toil seamlessly.
Engineer and drive adoption of robust, scalable monitoring solutions like Prometheus and Grafana that support a Service Level Objectives monitoring culture.
Engineer enhancements to Incident Management capabilities through integration of xMatters or PagerDuty to reduce Mean-Time-To-Response and Mean-Time-To-Resolution and streamline paging and critical issue activities.
Be a part of an on-call rotation, continuously enhance documentation, and mentor others on the standard methodologies of SRE to encourage adoption.

Business knowledge

Decomposes the most complex problems into discrete work units.
Identifies non-obvious relationships and anomalies often overlooked by others.
Balances strategic and pragmatic concerns when solving problems.
Articulates broader business concerns and/or regulatory landscape, including key risks and controls (e.g., GDPR, MiFID, SOX).
Makes sound decisions with limited facts or resources.
Makes decisions that are cognizant of the firm’s broader business strategy.

Requirements

Typically has 5+ years of relevant experience.
Deep knowledge of AWS resources, networking, security, services, APIs, and even billing.
Engineering SDLC pipelines, including supporting CI/CD-based cloud deployments via Terraform or AWS APIs, Git-based SCMs, and artifact repositories such as Artifactory and Docker Registry at scale (over 50TB of data at over 9000 transactions per second).
A solid core foundation in infrastructure and systems engineering, including Unix/Linux compute, networking, storage, and monitoring stacks.
Experience developing Docker-based microservices in both multi-tenant and dedicated microservice runtime environments.
Hands-on object-oriented development and/or scripting experience in languages such as Java, Python, and Go.

Preferred Experience

Contributions to Open Source projects especially related to Observability and Automation.
Thorough knowledge and experience of building highly available distributed systems, consensus protocols, service discovery, multi-tenancy paradigms, and operating in AWS resiliently at scale with low operational overhead.
A keen interest in keeping ahead of the technological advances in the SRE space and proven success at incorporating new technology into existing systems.
Previous experience mentoring and managing small teams with a desire to grow vertically with us with the potential of expanding a team of SREs reporting to you. Of course, growing as an Individual Contributor with us is equally appealing!

Commitment to Diversity, Equity, and Inclusion:

We strive for equity, equality, and opportunity for all associates. When we embrace the power of diversity and create an environment where people can bring their authentic and best selves to work, our firm is stronger, and we create greater value for our clients. Our commitment and inclusive programming aim to lift the experience for each associate and build allies for our global associate community. We know that a sense of belonging is key not only to your success at the firm, but also to your ability to bring your best each day.

Benefits: We invest in our people through a wide range of programs and benefits, including:

Competitive pay and bonuses as well as a generous retirement plan and employee stock purchase plan with matching contributions
Flexible and remote work opportunities
Health care benefits (medical, dental, vision)
Tuition assistance
Wellness programs (fitness reimbursement, Employee Assistance Program)

Our policies may change as our working lives evolve. Yet, our commitment to supporting our associates’ well-being and addressing the needs of our clients, business, and communities is unwavering.

Lead Site Reliability Engineer
