Azure Networking is the team behind the cloud. It is responsible for delivering over 200 Microsoft web portals and Live and Online Services around the world, including infrastructure, security and compliance, operations, globalization, and manageability. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide. We are looking for a passionate, high-energy individual to help build the network that powers the world’s largest online services.
We are the WAN Engineering team within Azure Networking, and we provide the network support for Microsoft's online services. We have a well-deserved reputation for providing industry-leading operations engineering across multiple data centers on a best-in-class global backbone. The successful candidate will directly influence our top priority: highly reliable Online Services. In an always-on, always-connected world, this role offers a unique opportunity to work on a breadth of technologies found in very few other networking positions.
Network reliability and stability are our core imperatives. This position provides incident response and remediation for network incidents that are actively impacting online services, as well as light design and deployment functions. This type of SWAT team is a key requirement for highly reliable Online Services, where availability must be maximized. A successful Service Engineer will analyze, triage, and repair network issues actively impacting software service availability, and will collaborate well with peers on topics related to network design, network monitoring, and software architecture.
Core responsibilities of this job will include:
- Actively participate in on-call rotation which will include:
  - Rapid response (under 5 minutes) to emergency network escalations.
  - Immediate mitigation of network impact and restoration of service.
  - Investigation and deep-dive root cause analysis of outages.
  - Implementation of repair items and recommendations for proactive improvement of the network.
- Create and maintain operational documentation such as troubleshooting guides (TSGs), Standard Operating Procedures, and Methods of Procedure, as well as any needed post-incident review details
- Document incidents in detail to support post-incident reviews and analysis
- Participate in design and deployment discussions
- Execute on design decisions which may include:
  - Onboarding and deploying new hardware and software
  - Remediating and standardizing network infrastructure
  - Optimizing existing network designs
This role is part of the WAN Engineering team and works closely with our Tier 1 team in the Azure Network Operations Center and with other infrastructure support teams.
The successful candidate must be able to troubleshoot and triage complicated networking incidents ranging from physical layer issues up to load balancing problems at layer 4 of the networking stack. The person in this role must have a strong working knowledge of layer 2 protocols such as spanning tree, layer 3 networking with IP, and layer 4 session troubleshooting over TCP, as well as routing and label-switching technologies such as OSPF/IS-IS, BGP, and MPLS. The candidate also needs experience troubleshooting network filtering implemented through access control lists and firewall policies. These skills must be applied across a variety of networking platforms including, but not limited to, Cisco, Juniper, and Arista.
In addition to technical skills, the successful candidate must be highly effective in written and oral communication with partners located around the globe. Azure Networking provides connectivity to a very large number of online services, so the successful candidate must be a strong collaborator and an attentive listener, remain calm under pressure, earn the confidence of others, and possess a solid understanding of software architecture, with the ability to articulate software requirements to teams of Software Developers. Microsoft strives to hire the best in the industry, and the right person for this job will conduct themselves with the utmost professionalism on a team of high-caliber peers.
• 5-7 years of network operations or network engineering experience in an online services or internet service provider environment. This experience must include Layer 2 analysis and troubleshooting, TCP/IP routing, switching, and load balancing in a complex networking environment.
• CCIE or JNCIE certification is highly desired
• Experience scripting with C#, Perl, Python, Bash, PowerShell, Tcl, or Java is a plus
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.