L3 DevOps Specialist 3130388

Montreal, QC, Canada Full Time Posted 23 days ago

Company Profile 

Morgan Stanley is a global financial services firm and a market leader in investment banking, securities, investment management and wealth management services. With offices in more than 43 countries, the people of Morgan Stanley are dedicated to providing our clients the finest thinking, products and services to help them achieve even the most challenging goals. 

As a market leader, the talent and passion of our people are critical to our success. We embrace integrity, excellence, teamwork and giving back.


The Technology division partners with our business units and leading technology companies to redefine how we do business in ever more global and dynamic financial markets. 

Our sizeable investment in technology results in leading-edge tools, software, and systems. Our insights, applications, and infrastructure give a competitive edge to clients’ businesses—and to our own.

Enterprise Technology & Services (ETS) delivers shared technology services for the Firm supporting all business applications and end users. ETS provides capabilities for all stages of the Firm’s software development lifecycle, enabling productive coding, functional and integration testing, application releases, and ongoing monitoring and support for over 3,000 production applications.

ETS also delivers all workplace technologies (desktop, mobile, voice, video, productivity, intranet/internet) in integrated configurations that boost the personal productivity of our employees. Application and end user services are delivered on a scalable, secure, and reliable infrastructure composed of seamlessly integrated datacenter, network, compute, cloud, storage, and database services.

Position Description

The Site Reliability Specialist for IT Service Management role is part of Enterprise System Management in Morgan Stanley’s Application Infrastructure organization. Our global Level 3 team of subject matter experts is focused on reliability engineering and support of the firm’s strategic platforms and components, which provide technology metrics, logs, analytics, workflows, and inventory across Morgan Stanley technology business areas. The team handling IT Service Management products is expanding to meet growing demand for our solutions from internal clients.

This position specializes in IT Service Management products, including ServiceNow and proprietary Change Management tooling handling millions of critical technology workflows and transactions. The Morgan Stanley implementation of ServiceNow is a vendor-hosted solution that nevertheless relies on a number of internal technologies, including SQL databases, the proprietary ServiceNow API, Unix, AFS, SSL, SAML and other web infrastructure that enables connectivity for many thousands of our users and extends the functionality of base applications (Incident, Problem, Change, Service Requests, etc.). In addition to the ServiceNow product, the team is also responsible for a suite of change management products and tools used to ensure changes are properly documented and authorized, with audit trails and critical detection controls for compliance. These change management tools are essential for meeting regulatory requirements and protecting the firm, particularly with agile adoption driving up change volumes.

We require another Reliability Specialist who can identify efficiencies and other opportunities to improve reliability and reduce the overall cost of support. In addition, you will provide incident, problem and deployment management coverage, as well as a range of support and maintenance activities. You will integrate into the global ‘follow the sun’ support model to provide coverage.

Your primary responsibilities will be:

- Improve the environment by identifying operational risks and by suggesting and implementing efficiencies.
- Provide incident management: send prompt user notifications, manage incident conference calls, assist in troubleshooting, ensure appropriate resources are on the call, and work towards speedy resolution of incidents.
- Perform impact assessment of Infrastructure maintenance on systems and integrations. 
- Execute and manage product improvement and maintenance projects such as server migrations, failovers, upgrades and patches, SSL/TLS certificate changes, etc.
- Deployment management: manage major internal or vendor release deployments using custom deployment runbooks and communication templates, and run checkout status calls.
- Ensure all aspects of system functionality are monitored.
- Problem management: pre-emptively and reactively manage problems arising from activities such as capacity management, software defects, hygiene and incident management. Track known errors, prioritize, and work with developers, vendors or other MS teams to eradicate problems.
- Play an active part in the 24/7 on-call rotation for escalations from L1/L2 support, and learn from those escalations in order to eliminate their causes or to optimize and automate their handling.

Required Skills

- At least 5 years of relevant experience
- Good knowledge of Unix/Linux, i.e. administration and basic infrastructure
- Good troubleshooting skills in a distributed environment and knowing when to seek assistance
- Practical scripting experience in languages such as Shell, Python, or Perl
- Applying an analytical approach to resolving complex technical issues with minimal assistance (following a product familiarization phase)
- Ability to communicate clearly in English with technical clients globally, both verbally and in writing
- Ability to organize workload of multiple concurrent tasks without close supervision
- An aptitude for maintaining focus and effective collaboration during high-pressure situations such as outage handling or failed deployments
- Experience with establishing and maintaining productive, influential relationships.

Desired Skills

- ServiceNow System Administration experience
- Good knowledge of SQL, sufficient to troubleshoot and develop ad hoc scripts
- Major incident management call handling
- Experience with deployment automation tools such as Ansible.

Knowledge of French and English is required.

Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential.

