Job Details

Job Description

This engineer is expected to lead by example through hands-on contributions, deep technical expertise, and cross-team influence, particularly in the area of infrastructure bootstrap orchestration and automation at scale.

Key Responsibilities:

Platform Ownership & Reliability:

Own the end-to-end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components, including:

Cloud infrastructure primitives
Kubernetes clusters and cluster services
Networking, ingress, and service discovery
Service Mesh and supporting data-plane components

Ensure platform components are resilient by design, applying SRE principles such as:

Fault isolation and graceful degradation
Capacity planning and saturation control
Reduced operational toil and clear failure modes

Continuously assess and mitigate reliability risks, proactively improving platform stability and operational readiness.

Infrastructure Bootstrap & Automation Leadership:

Lead the design and implementation of infrastructure bootstrap orchestration, including:

Automated cluster and environment provisioning
Deterministic, repeatable platform bring-up and teardown
Dependency-aware orchestration across cloud, network, and Kubernetes layers

Drive a strong Infrastructure-as-Code and GitOps-first approach, ensuring:

Platform components are reproducible and auditable
Changes are automated, testable, and reversible
Manual intervention is minimized or eliminated

Identify automation gaps and lead initiatives that significantly reduce human effort, onboarding time, and operational risk.

SRE Practices & Operational Excellence:

Apply and promote SRE practices across the platform, including:

Clear ownership and runbooks for platform components
Participation in on-call rotation as a platform reliability escalation point
Incident response, post-incident reviews, and problem management

Improve platform operability by:

Simplifying day-2 operations
Standardizing upgrade and rollback strategies
Reducing Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR)

Ensure platform operations align with security, compliance, and internal control requirements.

This is a hybrid position. Your hiring manager will confirm the expected number of days in the office.

Qualifications

Strong hands-on experience with:

Public Cloud platforms (AWS preferred, Azure)
Kubernetes at scale, including prior experience administering production Kubernetes environments
Service Mesh technologies (e.g., Istio preferred, App Mesh, Linkerd)

Strong understanding of:

Observability tooling and Golden Signals concepts
Incident management concepts and on-call operations
Infrastructure as Code (e.g., Terraform)
Cloud-native containerized microservices architecture

Strong collaboration and communication skills.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Company Details

VISA

Foster City, CA, United States

Sr. Site Reliability Engineer
