Site Reliability Engineer

Main Location
Redmond, WA, United States

This position is available anywhere in Mexico.

Intelligent Conversation and Communications Cloud (IC3) Carrier Operations Team
Intelligent Conversation and Communications Cloud (IC3) powers billions of real-time customer conversations across Microsoft’s first-party (Teams, Skype) and second-party (Dynamics) solutions. IC3 enables reliable, high-quality audio/video calling, meeting, and messaging services that work every time, from anywhere, seamlessly across all customer touchpoints. IC3 makes conversations on our platform more intelligent in real time, empowering best-in-class productivity tools for the modern workplace, where every call, meeting, or chat makes the next one better.

The mission of the IC3 Carrier Operations SRE team is to operate the IC3 PSTN services with end-to-end high availability, performance, and reliability so that customer objectives are consistently met or exceeded. To achieve this, we work closely with our product and engineering teams and use a variety of home-grown toolsets aimed at aggressive automation for reliability. We are also a service engineering-focused team that runs at scale while supporting deployments for new carriers across the globe.



• Work with a team of engineers focused on improving the reliability, scalability, latency, and efficiency of the PSTN services powering cloud communications.
• Manage problem resolution with service providers.
• Learn and enhance existing tools, and develop new tools to meet new scale and feature requirements, with the aim of reducing manual intervention and improving the prevention, detection, and mitigation of service impacts.
• Participate in the on-call rotation of the local follow-the-sun team.
• Manage incident response and perform root cause analysis investigations.
• Review existing processes and drive improvements to support the scale and excellence of PSTN services.
• Analyze data and provide operational insights into service reliability and customer experience to Design and Product teams.
• Partner with Data Scientists/ML engineers to develop proactive anomaly detection measures.
• Participate in recruiting, mentoring, and developing a team of experienced site reliability engineers.


• 2+ years of experience as a software engineer or site reliability engineer directly supporting development and quality in a product engineering team environment.
• 3+ years of experience shipping distributed systems, services, and highly available infrastructure.
• 2+ years of experience scripting/coding in one or more of the following: PowerShell, C#, Python.
• Expertise with Power BI: creating data models, writing queries, and building powerful visualizations.
• Experience with T-SQL, Kusto Query Language (KQL), Azure Log Analytics, Cosmos.

• Experience with Microsoft Azure, Azure DevOps, ServiceNow, Microsoft Dynamics, or Flow.
• Passion for Site Reliability Engineering practices.
• Knowledge/experience of cloud-based distributed systems and microservices architecture.
• Knowledge/experience of Internet network architecture and its working principles.
• Experience with Voice over IP is highly desirable.
• Experience analyzing network packet captures and signaling traces.
• Experience working with SBCs, Media Gateways, circuit-switched telephony, SS7, ISDN/ISUP.



Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.


Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Microsoft Corporation