PowerToFly
Big Data / PySpark Engineering Lead - Vice President
Citi


Onsite Pune, India
Posted 5 hours ago

Job Details

The Applications Development Technology Lead Analyst is a senior level position responsible for establishing and implementing new or revised application systems and programs in coordination with the Technology team. The overall objective of this role is to lead applications systems analysis and programming activities.

Key Responsibilities

Architecture & Design

  • Design and implement scalable, fault-tolerant batch and real-time data processing pipelines.
  • Develop robust data models and schema designs optimized for both performance and storage efficiency.
  • Evaluate and integrate emerging tools and frameworks (e.g., Spark, Flink, Kafka) into the existing stack.
  • Provide in-depth, interpretive analysis to define issues and develop innovative solutions.
  • Develop comprehensive knowledge of how areas of the business, such as architecture and infrastructure, integrate to accomplish business goals.

Data Modernization & Migration Leadership

  • Legacy Systems Decommissioning: Lead the strategic migration of data and logic from legacy platforms (e.g., on-premises SQL Server instances) to a modern Data Lakehouse environment.
  • ETL/ELT Transformation: Re-engineer existing stored procedures and complex legacy ETL jobs into scalable, distributed processing frameworks using Spark (Python) and Starburst/Trino.
  • Validation & Parity Testing: Design and implement automated frameworks for Data Parity Testing to ensure 100% accuracy and consistency between legacy outputs and new big data results.
  • Schema Evolution: Map and transform rigid, legacy relational schemas into flexible, high-performance formats optimized for the cloud (e.g., Parquet, Avro, or Iceberg).
  • Phased Cutover Management: Orchestrate a phased migration strategy (Parallel Run, Shadow Execution) to ensure zero downtime for downstream business applications and reporting tools.
  • Performance Benchmarking: Establish performance baselines on legacy systems and ensure the new Big Data architecture meets or exceeds those benchmarks at scale.
  • Resolve a variety of high-impact problems and projects through in-depth evaluation of complex business processes, system processes, and industry standards.

Engineering Excellence

  • Write clean, high-performance code in Python.
  • Optimize complex SQL queries and fine-tune distributed computing clusters to reduce latency and costs.
  • Ensure data integrity and security by implementing rigorous validation and encryption standards.

DevOps & Reliability

  • Build and maintain CI/CD pipelines for automated testing and deployment of data jobs.
  • Monitor system health and troubleshoot performance bottlenecks across the data lifecycle.

Leadership & Strategy

  • Provide technical mentorship and conduct code reviews for junior and mid-level engineers. Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary.
  • Translate complex business requirements into technical specifications.
  • Collaborate with Product Managers to ensure data availability for downstream analytics, business models, and users.
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals, and identify and define the system enhancements needed to deploy new products and process improvements.

Required Skills & Qualifications

  • Highly experienced and skilled technical lead with 12+ years of experience in software development and platform engineering.
  • Experience in Data Engineering, focused on Big Data ecosystems.
  • Knowledge of Hadoop, YARN, Hive, Impala, Spark, and Spark SQL, with extensive experience developing high-volume data processing pipelines. Expert-level programming skills and hands-on experience in Python.
  • Familiarity with data formats like Avro, Parquet, CSV, JSON.
  • Hands-on experience in writing SQL queries.
  • Highly experienced with Unix-based operating systems and shell scripting.
  • Experience with source code management tools such as Bitbucket and Git.
  • Proficiency and hands-on experience with big data technologies: Hadoop, Spark, Hive, Kafka, and NoSQL databases (MongoDB, HBase).
  • Experience working with query engines such as Trino, Presto, and Starburst.
  • Strong computer science fundamentals in data structures, algorithms, databases, and operating systems.
  • Reverse Engineering: Ability to read "spaghetti" SQL or old scripts and document the business logic before migrating it.
  • Data Lineage: Experience using tools (such as Collibra or Informatica) to track where data comes from and where it’s going.
  • Change Management: Experience managing the technical "shock" to the business when switching from legacy BI tools to modern query engines such as Starburst.

Preferred Qualities

  • Problem Solver: You don't just fix bugs; you identify the root cause to prevent recurrence.
  • Communicator: You can explain the "why" behind a technical decision to non-technical stakeholders.
  • Automation and AI Mindset: You believe that if a task has to be done twice, it should be automated. You are familiar with AI tools that expedite delivery.

Job Family Group: Technology

Job Family: Applications Development

Time Type: Full time

Most Relevant Skills: Please see the requirements listed above.

Other Relevant Skills: PySpark

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.

Company Details
Citi

About Citi: Working at Citi is far more than just a job. A career with us means joining a team of more than 200,000 dedicated people from around...
