Posted 4 months ago by

Sr Infrastructure Engineer - Application Performance Management Monitoring (Phoenix;St. Petersburg)

United States
Employment: Full Time
Experience: Senior
Looking to invest your time and energy in innovative engineering projects that make a difference for a global IT services organization? Then join the Enterprise Monitoring, Tooling and Engineering (EMTE) team at American Express. Be a part of the team responsible for introducing and supporting technology that improves the availability, performance and efficiency of American Express’ IT operations.

EMTE seeks a Senior Infrastructure Engineer with the ideas, knowledge, and strengths to help us deliver a world-class monitoring platform. This individual will be responsible for EMTE’s efforts to raise the bar on operational excellence and best practices for our Application Performance Management (APM) and Automation tools. This Senior Infrastructure Engineer will align all designs with American Express’ architectural enterprise standards and promote the adoption of monitoring/automation best practices. This individual’s performance and outcomes will be measured, in part, on the engineer’s ability to:
  • design, produce, support and continuously improve EMTE’s monitoring tools
  • increase the operational stability and efficiency of EMTE’s monitoring platforms
  • create greater visibility of AET performance and availability
  • lead and collaborate with team members, technology partners and other stakeholders to create innovative solutions that achieve personal goals and those set by organizational leaders and the team.
As a Senior Infrastructure Dev/Ops Engineer you will:
  • Lead EMTE efforts to deploy, maintain and improve upon effective use of APM tools (e.g., AppDynamics, Dynatrace, etc.)
  • Maintain and improve upon APM platform availability and performance
  • Create materials to mentor APM stakeholders in the use of APM tools and best practices
  • Train APM users to analyze application performance and availability trends and conduct root cause analysis of performance issues
  • Develop, implement and support efforts to “Monitor the Monitor” – creating greater visibility into the system and application health of EMTE monitoring tools; improving stability, alert notifications and related KPIs for MTTx
  • Manage/Perform administration for EMTE tools/platforms (e.g., APM, Enterprise Logging as a Service, system/node monitoring, Event Correlation, etc.)
  • Monitor environment and computing resources for reporting and capacity planning.
  • Evaluate changes/updates of EMTE monitoring tools to determine if they could impact availability of production systems and coordinate with all appropriate stakeholders as needed
  • Assist with the administration/support of other EMTE platforms as necessary
  • Be available to provide on-call support for monitoring and automation tools during business hours, nights and weekends
  • Conduct performance analysis of EMTE processes and workload to identify opportunities for greater efficiency
  • Research and evaluate capabilities available within existing tools or recommend new tooling required to realize greater efficiencies, performance and availability gained through automation
  • Promote automation solutions for operations within EMTE and for stakeholders
  • Contribute to the implementation of an automation framework designed to increase the operational efficiency and stability of American Express’ IT Operations processes, tools and infrastructure
  • Develop, document and implement enterprise standards and procedures for monitoring tools and processes
  • Work closely, at a deep technical level, with engineering teams to ensure solution designs are consistent with American Express Technology’s architectural vision, platform/product roadmaps, enterprise standards, guidelines and principles
  • Collaborate with delivery teams to build IT strategies in line with company and platform standards
  • Ensure compliance with security standards, and assist in audit preparations.
  • Adopt DevOps methods in support of monitoring and automation tools/services
  • Help bridge the gap between application development and infrastructure teams.
  • Troubleshoot issues that span hardware, software, applications and network services.
  • Follow Incident/Problem/Change Management, SOX and PCI processes
  • Function as an active member of an Agile team, consistently contributing to the team and its Agile practices (tools, common components, and documentation) and Scrum processes
  • Perform all activities in a timely manner, as required, to contribute toward Enterprise-level compliance of internal/external processes, standards and regulatory controls.
Qualifications:
  • 6+ years experience using and administering Application Performance Management tools (e.g., AppDynamics and/or Dynatrace)
  • Expertise Supporting Unix/Linux Systems (RHEL 6/7)
  • Expertise Supporting J2EE Applications (JBoss, WebLogic, WebSphere, etc.)
  • Expertise Supporting .NET Applications
  • Prior experience supporting Network Infrastructure (TCP/IP; Layer2/Layer3)
  • Prior experience or understanding of Data Center operations/methodologies
  • 8+ years experience in IT Architecture (e.g., Network, Server, Application, Database)
  • 8+ years of experience with systems analysis/programming, incorporating design methodology, infrastructure operations support or engineering
  • Hands-on experience with a variety of software languages, operating systems, or network protocols
  • Experience managing team/workgroup activities
  • Bachelor’s Degree or equivalent experience in related field required
  • Self-motivated leader who can effectively collaborate in team and cross-team settings. Ability to persuade and influence without direct control.
  • Able to prioritize and manage tasks, and to support the teams involved, across multiple work streams
  • Strong analytical, logical reasoning and problem solving skills
  • Strong written and verbal communication skills, with the ability to influence cross-functional teams, business and/or vendor partners, and technology leaders
  • Able to develop/make presentations, facilitate discussions and provide technical demonstrations in 1:1, small group and large group settings.
Preferred Experience:
  • Prior experience using/administering Open Source or Commercial Off-the-Shelf monitoring tools used for log monitoring, time series data, infrastructure/node monitoring or event correlation (e.g., Splunk/Elastic, ICINGA, Tivoli, BMC Patrol, etc.)
  • Ability to read and write in at least one scripting language (Perl, PowerShell, Bash, etc.)
  • Ability to read and write in at least one programming language (Python, Java, JavaScript, PHP, etc.)
  • Prior experience with at least one Version Control System (Git, Subversion, CVS, etc.)
  • Expertise with Workload Automation distributed systems
  • Expertise with and administration of Ansible/Tower, Puppet and/or Chef
  • Working knowledge of CI/CD tools (e.g., Jenkins)
  • Working knowledge of ServiceNow
  • Prior experience in an Enterprise SOA Environment
  • Prior experience with Cloud Computing Environments (EC2, OpenStack, etc.)
  • Prior experience in a DevOps or DevOps-like environment (practices that emphasize the collaboration and communication of both software developers and operations engineers)
  • Working knowledge of Application Development workflow and Agile Methods
  • Experience working with Scrum or Kanban-related tools and concepts (e.g., Jira, Rally, Epics, Stories, estimating story points, etc.)
  • Knowledge of SOX, PCI and other regulatory standards helpful
Equal Opportunity Statement

American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability status, or any other status protected by law.
