Posted 8 days ago

Are you looking for an opportunity to defend the productivity of millions worldwide against malicious individuals and organizations? Continually curious about innovative techniques and technologies in the rapidly developing landscape of information security? Obsessed with finding security solutions that solve hard problems and create better experiences for users and developers? So are we. M365 Security Engineering is the core security team for Microsoft 365's suite of services, including Office 365. We design, build, and operate security solutions that run at cloud scale to secure the world's largest commercial SaaS service. Our solutions enable engineers in Microsoft to build intelligent and differentiated services and products while meeting industry-leading customer commitments and building customer trust.


Our vision is that building services and products that commercial and consumer customers can trust is something that any engineer at Microsoft should be able to do in a self-service way. We invest in key areas of security and compliance centrally so that teams can focus on building the next great app, using intelligence, automation, and extensibility to scale our ability to hold teams accountable for delivering secure and compliant products and services.


Our culture is inclusive, casual, and high energy; our engineers come from diverse backgrounds, are passionate about and loyal to their coworkers and our products, and are grounded in our customers' needs. Our team has a strong sense of accountability and provides its members many opportunities for learning and career growth. Our bold mission is to defend the productivity of people and organizations at home, at school, and at work, enabling teams at Microsoft to deliver trustworthy, intelligent services that people love to use to communicate, create, learn, and work together, anywhere.


We are seeking a Site Reliability Engineer for our Security Management team. In this role you will be responsible for supporting host security agents and their associated back-end systems, and for improving their efficiency and reliability. You will also be responsible for guiding M365 services to adopt critical security tools and for improving their user experience and reliability.

  • Design and build infrastructure & systems that provide high levels of scalability, reliability, and performance for M365 Core Security’s shared security agents and services
  • Partner with deployment and component owners to codify and reliably test infrastructure which supports security agents deployed to over 2 million hosts across both Azure and internal management stacks
  • Work with the broader dev team to support escalations from other M365 services that have adopted, or are adopting, shared security tools
  • Build and contribute to logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization in a cloud-agnostic environment
  • Drive improvements to the escalation and on-call process through automation and process improvement
  • Partner with Dev and PM teams to provide guidance and best practices around scalability, reliability, and performance of security agents, infrastructure, and services
  • Deliver impact while fostering strong and inclusive collaboration with engineering teams and partner teams across Microsoft

Required Qualifications

  • 3-5 years of hands-on Windows system experience
  • Proven experience with supporting production services and monitoring infrastructures
  • Object-oriented coding experience in C#, Python, or an equivalent language

Preferred Qualifications

  • Substantial experience with CI/CD concepts and hands-on implementation experience, specifically with Git
  • A mix of coding, testing, and integration experience
  • Exposure to macOS is a bonus
  • Ability to manage and deliver multiple project phases at the same time
  • Strong analytical, problem-solving, and organizational skills
  • Excellent written and oral communication skills
  • Ability to deal with the ambiguity associated with working in a fast-paced and changing environment
  • Leadership skills: sound problem resolution, judgment, negotiation, and decision-making
  • Feedback and metrics collection techniques to expose live-site and service issues
  • Support a 24x7 live-site support model for the services the team owns


The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.


Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.  We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.


Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
