Site Reliability Engineer - Observability

San Francisco, CA, United States Posted 9 days ago

Yelp is looking for an experienced software engineer to guide our engineering teams toward a reliable and efficient future.

The Production Observability team collaborates with groups all across Yelp Engineering to improve visibility into the health of high-risk services and infrastructure, with the goal of reducing the amount of time it takes to diagnose a critical issue. We do this by collecting metrics and building tooling and visualizations that make it easy for all teams to understand how their systems behave in production.

Yelp engineering culture is driven by our values: we’re a cooperative team that values individual authenticity, and encourages “unboring” solutions to problems. New hires are expected to deploy working code their first week, and your impact will only grow from there with the support of your manager, mentor and team. At the end of the day, we are all about helping our users, growing as engineers, and having fun in a collaborative environment.

What You Will Do:
  • Design, build and deploy software systems that run 24/7 at scale.
  • Drive devops best practices by coordinating across many teams and teaching other engineers how to investigate problems.
  • Build tooling to identify hot spots and regressions across the infrastructure that put Yelp products at risk.
  • Dive deep into our large service-oriented architecture to make it transparent, measurable and tunable.
  • Optimize our workflows and products systematically through automation.

We Are Looking For:
  • An experienced software engineer, with an interest in metrics and devops.
  • Familiarity with performance analysis tools (e.g. tracers, profilers, debuggers, visualization tools).
  • Fluency in Python, C, C++, Java, or a similar language.
  • At least one year of full-time working experience (besides internships). If you don't have at least one year of experience in a similar role, please take a look at our College Engineering roles instead!
  • Experience building and supporting large-scale distributed systems that back a consumer app or website.
  • Experience exploring datasets and turning performance metrics into easily-understood data visualizations.
  • Familiarity with real-user and/or synthetic performance monitoring.
  • Experience integrating performance tools into Continuous Integration/Deployment pipelines.