Site Reliability Engineering IC3

Main Location
Redmond, WA, United States

Microsoft 365 (M365) Intelligent Conversation and Communications Cloud (IC3)

Intelligent Conversation and Communications Cloud (IC3) powers billions of real-time customer conversations across Microsoft’s first-party (Teams, Skype) and second-party (Dynamics) solutions. IC3 enables reliable, high-quality audio/video calling, meeting, and messaging services that work every time, from anywhere, seamlessly across all customer touchpoints. IC3 makes conversations on our platforms more intelligent in real time, empowering best-in-class productivity tools for the modern workplace, where every call, meeting, or chat makes the next one better.

About the Team

Would you like to be part of the team that supports over 100 million daily active users on Teams, and counting? Our team owns the Management Infrastructure and Data Platform charter for IC3, delivering impact across Teams and App/Bots, from controlling the chat, calling, and meetings experience to delivering key insights and analytics for all communication workloads. As a team, we have embraced microservices not only as an engineering principle but also as a culture. We own how we architect our service, our quality, and when we deploy and enable features, and we own fixing the issues that impact our customers (DevOps model). While there is plenty of work to be done in our services, our organization highly values work-life balance, and that is reflected in our culture.

Responsibilities

• Work with a team of engineers focused on improving the reliability, scalability, latency, and efficiency of the services powering cloud communications.
• Manage problem resolution with service providers.
• Learn and enhance existing tools, and develop new tools that support new scale and features, aimed at reducing manual intervention and improving prevention, detection, and mitigation of service impacts.
• Participate in the on-call rotation of the local follow-the-sun team.
• Manage incident response and perform root cause analysis investigations.
• Review existing processes and drive improvements to support the scale and excellence of our services.
• Analyze data and provide operational insights into service reliability and customer experience to Design and Product teams.
• Partner with Data Scientists/ML engineers to develop proactive anomaly detection measures.
• Participate in recruiting, mentoring, and developing a team of experienced SRE engineers.

Qualifications

Required:
• 2+ years of experience as a software engineer or site reliability engineer directly supporting development and quality in a product engineering team environment.
• 3+ years of experience shipping distributed systems, services, and highly available infrastructure.
• 2+ years of experience scripting/coding in one or more of the following: PowerShell, C#, Python.
• Expertise with Power BI: creating data models, writing queries, and building effective visualizations.
• Experience with T-SQL, Kusto Query Language (KQL), Azure Log Analytics, Cosmos.


Preferred:
• Experience with Microsoft Azure, Azure DevOps, ServiceNow, Microsoft Dynamics, or FLOW.
• Passion for Site Reliability Engineering practices.
• Knowledge/experience of cloud-based distributed systems and microservices architecture.
• Knowledge/experience of Internet network architecture and its operating principles.
• Experience analyzing network packet captures and signaling traces.

 

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

 

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Microsoft Corporation