Engineering Lead Analyst, Innovation Labs

Citi


Onsite Mississauga, Canada
Posted 5 hours ago

Job Details

As an Infra & DevOps Engineer, you will join a dynamic team in the Citi Innovation Labs under the CTO organization. You will operate within NAM hours, complementing our existing team primarily based in Israel (EMEA hours). Your expertise will be vital in strengthening our infrastructure and DevOps practices, directly contributing to faster and more reliable software delivery. This role is deeply hands-on, focusing on implementing, maintaining, and optimizing critical systems that foster innovation and support our scalable, resilient, and secure infrastructure. You will be an active team player, bringing specialized technical skills to address operational challenges, implement advanced solutions, and collaborate closely to achieve our collective goals, especially within high-performance and GenAI environments.


Key Responsibilities

  • Core System Implementation: Implement and maintain essential infrastructure components, including specific configurations for on-prem GPU clusters (V100/A100/H100/H200 MIG) that underpin GenAI and high-performance workloads, ensuring operational stability.

  • CI/CD Operations & Improvement: Contribute to the efficient operation and continuous improvement of our CI/CD pipelines and automation frameworks. Leverage and contribute to our GitHub repositories to streamline development and deployment processes.

  • System Reliability & Performance: Monitor, troubleshoot, and optimize system reliability and performance across various environments. Work with the team to identify and resolve critical issues promptly, ensuring a high level of operational availability and client satisfaction.

  • Automation Development: Develop and implement automation scripts and tools to enhance operational efficiency, reduce manual effort, and improve the consistency of our infrastructure and deployment processes.

  • Emerging Technology Support: Provide hands-on support for the deployment and ongoing operation of emerging technologies relevant to GenAI, such as NIM images, MLflow 3.x, Coder, and LLMOps infrastructure. Actively contribute to the setup and maintenance of experimentation platforms like GCP Sandbox.

  • Operational Best Practices: Adhere to and actively contribute to established operational best practices, documentation, and runbooks to ensure consistency and maintainability of our systems.

  • Team Collaboration: Work seamlessly within the team, participating in discussions, sharing insights, and collaborating with colleagues and development partners to achieve shared objectives.

Skills & Experience Required

  • 6+ years of overall work experience, including 5+ years of dedicated, hands-on technical experience in Infrastructure, Site Reliability Engineering (SRE), or DevOps roles, with a proven ability to contribute significantly to complex operational environments.

  • Proven practical experience working with and optimizing GPU infrastructure for GenAI and high-performance computing is an advantage.

  • Strong practical knowledge of cloud environments, containerization technologies (Docker, Kubernetes, OpenShift), and operational aspects of serverless computing.

  • Proficiency in scripting languages (e.g., Python, Bash) for system automation, configuration, and diagnostics.

  • Demonstrated experience in implementing and operating CI/CD pipelines, infrastructure-as-code principles, and automation solutions, with solid experience using GitHub.

  • Understanding of and ability to apply enterprise security best practices, compliance standards, and data privacy considerations in daily operations.

  • Solid problem-solving skills with an ability to diagnose and resolve technical issues effectively in production environments.

  • Strong communication and interpersonal skills, fostering effective teamwork and collaboration within a diverse, global team.

  • Bachelor’s degree in computer science, engineering, or a related technical field, or equivalent practical experience.

Tech Stack Expertise

  • Cloud Platforms: AWS, GCP (Operational experience).

  • GPU Infrastructure: NVIDIA V100/A100/H100/H200 clusters, MIG (Practical operational experience).

  • Scripting & Automation: Python, Bash.

  • CI/CD Orchestration: Tekton, Harness, CI/CD for GenAI workloads.

  • Version Control & Collaboration: Git, GitHub Enterprise, Jira, Confluence.

  • Database Technologies: MongoDB/MaaS, PostgreSQL and Redis (Operational knowledge).

  • Operating Systems: Linux, Wintel (System administration experience).

  • Containerization & Orchestration: Docker, Kubernetes, OpenShift (Hands-on operational experience).

  • Networking: Load Balancers, DNS.

  • Monitoring & Observability: ELK Stack, Prometheus, Grafana, ITRS (Practical operational experience).

  • Infrastructure as Code: Terraform, Ansible (or similar) (Practical application).

  • Developer Productivity Tools: GitHub Copilot, Stack Overflow for Teams, Devin, Delphine.

  • Service Mesh: Practical operational experience.

Education:

  • Bachelor’s degree/University degree or equivalent experience

  • Master’s degree preferred

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Systems & Engineering

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Primary Location Full Time Salary Range:

$120,800.00 - $170,800.00

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Automated Processing and AI

We use automated processing, including artificial intelligence, for our legitimate business interests (or our reasonable and appropriate business purposes) to identify and align the candidate's skills and abilities with a specific job opening. Additionally, if you so choose, or consent, we can match your skills and abilities to other suitable roles at Citi.

Importantly, all our hiring processes and decisions, including determining your suitability for a role, are conducted, checked, and decided by individuals. Our automated processing and AI do not involve relying on automatic or autonomous decision-making. Please refer to any Jurisdictional Considerations, with specific provisions for your country (where relevant) for further details.

------------------------------------------------------

This job opening is for an existing job vacancy.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

 

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.
