Job details
Agentic AI introduces autonomous AI agents capable of analyzing data, making decisions, and executing actions at scale, which requires new guardrails, real‑time cost management, and AI-centric FinOps frameworks.
Key Responsibilities
1. Cloud Architecture Optimization & Technical Advisory
- Conduct deep architectural reviews of high‑spend cloud services to identify inefficiencies.
- Recommend code‑level and infrastructure changes—including serverless patterns, right‑sizing, and storage tiering—to reduce spend.
- Ensure engineering teams adopt cost‑efficient design standards to prevent cloud and on-prem “tech debt.”
- Build cloud cost observability and on-prem analytics frameworks that provide real‑time usage and spend insights.
- Develop forecasting models, dashboards, anomaly‑detection systems, and financial models to support cloud budgeting.
- Integrate data from cloud providers, usage logs, telemetry, and AI agent activity streams.
- Develop automated governance scripts and IaC controls (Python, Bash, Elasticsearch, etc.) for proactive enforcement.
- Implement tagging standards, cost attribution, chargeback/showback frameworks, and compliance policies.
- Manage FinOps governance foundations promoting visibility, accountability, and cross‑team alignment.
Agentic AI introduces autonomous, reasoning‑capable AI agents that perform tasks, invoke APIs, spin up compute, and make resource decisions independently—requiring a new layer of FinOps oversight.
Design & Integrate Agentic AI Workflows into FinOps
- Architect and integrate Agentic AI systems that autonomously analyze cloud usage, detect inefficiencies, and propose or execute optimizations.
- Incorporate multi‑agent systems capable of proactive anomaly detection, predictive optimization, and autonomous corrective actions within the cloud and on-prem ecosystem.
- Establish per‑agent cost attribution, including owner tags, budget identifiers, and full traceability of every model invocation or API call.
- Build telemetry pipelines (e.g., OpenTelemetry with cost metadata) capturing cost_per_call, decision logs, and tool usage for all agents.
- Design dynamic and iterative budgeting models, replacing static annual budgets with daily/weekly limit enforcement for agentic workflows.
- Implement policy-driven controls (e.g., budget throttles, automated revocation, execution guardrails) to manage microtransaction-level spend driven by autonomous agents.
- Govern agent estates using enterprise-grade tooling (e.g., Microsoft’s Foundry Control Plane) to enforce identity, security, and auditability for AI agent actions.
- Leverage or build Citi AI optimization agents (e.g., Azure Copilot Optimization Agent) that automatically analyze performance, compare SKU alternatives, and generate execution-ready automation scripts.
- Oversee the safe implementation of agent-suggested optimizations by validating performance impact and compliance before execution.
- Manage the cost implications of LLM inference, multi-agent collaboration, and retrieval-augmented generation (RAG) workflows, where token usage and replication can multiply costs significantly.
- Optimize model selection, context length, inference endpoints, and caching strategies to reduce unnecessary LLM consumption.
- Partner with FinOps Champions, engineering teams, and business stakeholders to translate cloud and AI cost goals into actionable backlogs.
- Promote organizational alignment via shared ownership of cloud and on-prem AI spending across finance, engineering, and operations.
- Communicate complex on-prem, cloud, and AI cost insights clearly to executives and product teams.
- Drive ongoing cloud and agent-driven optimization initiatives to reduce waste, prevent cost overruns, and maximize ROI.
- Develop long-term cloud, AI, and automation strategy including SKU optimization, licensing, GPU provisioning, and model lifecycle cost management.
- Expertise in cloud architecture (AWS, Azure, GCP) with hands‑on cost optimization experience.
- Strong mastery of FinOps principles, cost models, and cloud financial governance.
- Experience with Python, SQL, Terraform/IaC, cloud billing datasets, and telemetry instrumentation.
- Understanding of LLMs, multi-agent architectures, RAG workflows, and AI operational cost models.
- Ability to design secure, monitored, and budget‑controlled environments for autonomous agents.
- FinOps Certified Practitioner / FinOps Certified Professional.
- Experience with AI agent platforms such as Azure Copilot Optimization Agent or enterprise agent governance systems.
- Background in MLOps, AI Systems Architecture, or autonomous AI engineering.
#LI-Hybrid
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Infrastructure
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.