Job Details
Experience Level: Experienced Hire
Categories:
- Engineering & Technology
Location(s):
- Remote - United Kingdom (GB)
At Moody's, we unite the brightest minds to turn today’s risks into tomorrow’s opportunities. We do this by striving to create an inclusive environment where everyone feels welcome to be who they are, with the freedom to exchange ideas, think innovatively, and listen to each other and customers in meaningful ways.
If you are excited about this opportunity but do not meet every single requirement, please apply! You still may be a great fit for this role or other open roles. We are seeking candidates who model our values: invest in every relationship, lead with curiosity, champion diverse perspectives, turn inputs into actions, and uphold trust through integrity.
At Moody’s Analytics Know Your Customer (KYC), we solve problems that matter. We are a cross-functional team of sales, marketing, technology, and product professionals who are all passionate about preventing criminal infiltration of the world’s financial system and bringing transparency to global supply chains by detecting fraud, terrorism, human trafficking, and other criminal threats. We combine the agility, passion, and dynamism of a startup with the strong positioning and stability of an established institution, providing our people with dynamic career paths and mobility options across the globe, all while having access to the entire Moody’s network. We empower our people and are committed to helping them reach their goals.
The Director Manager - SRE will define and implement the SRE vision within the KYC GRID line of business. They will be responsible for contributing to the vision, long-term strategy, and roadmap of the SRE function within the organization.
They will ensure that the SRE engineering function materially accelerates the stability, performance, and operation of the production platforms.
- Facilitate platform migration from AWS EC2/ECS to AWS EKS.
- Define the SRE approach, along with quality guidelines and standards for applications, serverless functions, and infrastructure.
- Manage and mentor a global team of DevOps/SRE engineers.
- Manage on-call rotations across India, the UK, and the US, covering both direct reports and the wider infrastructure team.
- Participate in 24x7 operational support and on-call rotation shifts as an escalation point.
- Work to automate detection and resolution of recurring issues.
- Own root cause analysis investigations and prepare customer-facing RCAs.
- Ensure that system design and procedures are documented and up to date.
- Collaborate with Development/DevOps/App support teams to define architecture, optimize performance, and right-size environments.
- Build upon the current monitoring and observability stack with improved dashboards, earlier detection, and faster resolution of incidents.
- Work closely with other technical teams to optimize workflow processes.
- Performance test UAT environments to find their capacity limits, working alongside our QAS team on automated testing.
- Requires a deep knowledge of continuous deployment and configuration management tools suitable for production Kubernetes environments.
- Maintain the release process, aiming to reduce both time to deploy and the human intervention required.
- Own the testing and cadence of Disaster Recovery exercises, and the automation of their implementation.
- Expertise in multiple technical environments and knowledge of one or more business areas.
- Maintain environment and infrastructure to ensure vulnerabilities are minimized and upgrades are seamlessly deployed to meet infosec requirements.
- Further develop CD pipelines for multiple functions, improving quality and operational efficiency through all environments (QAS, UAT, PRD).
- Lead own project planning processes.
- Coordinate with vendors to resolve problems (predominantly AWS, IBM, and Moody's shared technology).
- Lead technical evaluation of new technologies and tools to aid monitoring, observability, and logging.
Qualifications:
- Comprehensive expertise (10 years) in the following areas:
- Subject matter expertise in multiple strategic areas.
- Knowledge of at least one and preferably two programming languages (e.g. Python, Java, C/C++/C#).
- Production experience of running and maintaining a large-scale Kubernetes deployment on EKS, with associated automation and deployment experience, in a mission-critical, client-facing product.
- Demonstrated ability to analyze and interpret complex problems or processes, identify and understand requirements, and develop alternative solutions.
- Terraform, Serverless, SaltStack, and Ansible knowledge is key, with a strong knowledge of Terraform modules a must.
- Recent experience of Datadog and Sumo Logic would be a plus.
- Design, implementation, and management of Continuous Integration and Deployment processes for a large organization.
- Extensive Cloud experience (Amazon AWS preferred), especially in architecting, deploying, and managing production workloads.
- Proven record of managing medium to large engineering team(s), with an expectation of up to 7 direct reports.
- Must have worked in an SRE or Production Engineering lead role, training and upskilling junior engineers in SRE and associated processes.
- BS in Computer Science (or equivalent).
- Additional coursework and/or some training certifications are preferred.
Moody’s is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status, sexual orientation, gender expression, gender identity or any other characteristic protected by law.
Candidates for Moody's Corporation may be asked to disclose securities holdings pursuant to Moody’s Policy for Securities Trading and the requirements of the position. Employment is contingent upon compliance with the Policy, including remediation of positions in those holdings as necessary.
In a world shaped by increasingly interconnected risks, Moody's helps customers develop a holistic view of these risks to advance their business...