Job Type
Full Time
Job Details
Job Description
Team: The Site Reliability Engineering team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow platform and infrastructure. The SRE is empowered to drive technical resolutions across the technology stack from application through to hardware and all stops in between. The ultimate goal of the SRE is to never have to escalate an issue to an engineering or development team and to completely own the resolution of incidents. They are also tasked with driving forward the operability of the platform to drive down incident numbers and to reduce MTTR. To accomplish this the team combines Software Development, Networking and Systems Engineering expertise with a strong desire to be challenged by problems of scale and complexity and to make services better for our customers.
What you get to do in this role:
As an Engineer in the SRE team you will:
- Deliver immediate relief and provide a sustainable resolution to issues within the ServiceNow platform.
- Use knowledge and experience in software development, application support, systems engineering and networking to proactively prevent issues from recurring.
- Lead internal stakeholders and partner teams to improve the reliability, scalability and performance of the infrastructure through improved system design.
- Champion and contribute to a culture of intolerance for manual activity, resulting in an automated environment that delivers repeatable and scalable responses to system issues.
- Ability to work in shifts (Sun-Wed / Wed-Sat, 7:30 to 17:30), each of which includes one weekend day.
- Knowledge of Linux systems.
- Comfortable designing, authoring, testing, and debugging code in a team setting in a language such as Python, Go, Java, or Ruby.
- Experience working with relational databases: MySQL, MariaDB, or PostgreSQL.
- Experience working with systems at scale - supporting critical services with focus on automation, observability, availability, and performance.
- Expertise in Observability and Monitoring of applications, services, and networks at scale.
- Experience with DevOps automation, CI/CD pipelines such as GitLab CI/CD, and agile methodologies.
- Experience writing test specifications and an understanding of the fundamentals of test automation.
- Experience working with Cloud technologies such as Azure and AWS.
- Experience in configuration management of infrastructure using Ansible.
- Experience with Kubernetes to orchestrate the deployment, scaling, and management of containers.
- Along with holidays, we have company-wide designated global well-being days where everyone is off and can spend time doing what matters most.
- Good working culture to support the balance you need in both work and life.
- Parental leave programs.
- Childcare and caregiving benefits.
- A learning experience platform, built using our own technology, to support your learning and development goals, as well as a tuition reimbursement program.
- A global, cross-functional mentoring program.
- We also have team building activities, various employee belonging groups, volunteering, and community outreach programs.
About the Company
ServiceNow
Santa Clara, CA, United States
At ServiceNow, our technology makes the world work for everyone, and our people make it possible. We deliver digital workflows that create great...