Posted 21 days ago

Boomi SITE RELIABILITY ENGINEER

Are you ready to work on world-changing technologies? Today, organizations need to move with increased agility and insight to grow and thrive. Boomi is one of the hottest tech companies in the SaaS/Cloud industry, named a Leader for the third year in a row in the Gartner Enterprise iPaaS Magic Quadrant and recently recognized by Inc. Magazine as one of the best workplaces. Our award-winning, patented technology is transforming the world of integration by making enterprise-class integration technology accessible and affordable to companies of all sizes.

 

Boomi provides the foundation on which your business can evolve and innovate. According to a recent survey by Vanson Bourne, connected businesses are far outpacing their competitors. We help organizations connect everything and engage everywhere across any channel, device or platform. More than 7,000 organizations are using Boomi to run better, faster and smarter.

Working at Boomi means doing what you love. We hire trailblazers with an entrepreneurial spirit who can solve challenging problems, make a real impact in technology and want to build something big. If you are passionate about solving hard problems, enjoy working with world-class people and developing cutting-edge technology, you should explore a career with Boomi. Learn more at http://www.boomi.com/ or visit Boomi Careers.

Boomi is looking for an experienced Site Reliability Engineer to augment our SRE team responsible for the overall reliability of our cloud services in production, used by over 11,000 customers globally. This position requires a mindset focused on achieving Customer Satisfaction with a passion to provide a reliable and highly available solution to consumers. This is a hands-on position and requires the ability to develop software in support of ‘as code’ automation as well as contribute to the creation of new policies and processes within the organization.


What you’ll achieve
 

As a Site Reliability Engineer, you will be responsible for the availability of Boomi's systems and services. You will work with our developers and support teams on Incident Management, automation tooling and monitoring.


You will:

  • Participate actively in detecting, remediating and reporting on Production incidents, ensuring the SLAs are met and driving Problem Management for permanent remediation.

  • Participate in on-call rotation to ensure coverage for planned/unplanned events.

  • Engage with other Engineering organizations to implement processes, identify improvements, and drive consistent results.

  • Work with your SRE and Engineering counterparts to drive game days, training and other response readiness efforts.

  • Collaborate with Service Engineering organizations to build and automate tooling and to implement best practices for observing and managing Boomi services in production, so that we consistently achieve our market-leading SLA.

  • Improve the scalability and reliability of Boomi’s systems in production.

  • Evaluate, design and implement new system architectures.

  • Automate the provisioning and maintenance of Boomi’s infrastructure.

Take the first step towards your dream career

Every Dell Technologies team member brings something unique to the table. Here’s what we are looking for with this role:

Essential Requirements

  • Expert in developing Ansible playbooks and automation for Infrastructure as code.

  • Well versed in Python for building frameworks and automation suites for underlying platforms.

  • Expert in defining, measuring, and improving reliability metrics (SLOs, SLIs, error budgets).

  • Strong in implementing observability practices (monitoring, logging, distributed tracing, etc.), preferably using Splunk and New Relic. Experience should extend beyond using dashboards to creating them from scratch.

  • Experience in conducting and automating DR exercises in the cloud to validate RPOs and RTOs.

  • Prior experience working in an on-call rotation.

  • Experience with incident management processes, practices, standards and tools.

  • Strong understanding and working experience with AWS components.


Desirable Requirements

  • Skilled in maintaining and improving overall health, availability, performance, resiliency, and capacity of Boomi's services and infrastructure.

  • A grasp of cloud-native concepts, containerization best practices and security awareness in the cloud.

  • Good understanding of troubleshooting infrastructure and application performance issues with APM tools.

Boomi is an Equal Opportunity Employer and Prohibits Discrimination and Harassment of Any Kind

Boomi is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. All employment decisions at Boomi are based on business needs, job requirements and individual qualifications, without regard to race, color, religion or belief, national, social or ethnic origin, sex (including pregnancy), age, physical, mental or sensory disability, HIV Status, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, past or present military service, family medical history or genetic information, family or parental status, or any other status protected by the laws or regulations in the locations where we operate. Boomi will not tolerate discrimination or harassment based on any of these characteristics. Boomi encourages applicants of all ages.

Boomi Site Reliability Engineer (Remote)