Data Engineer - Senior

Cummins Inc.

Onsite Pune, India
Posted 12 hours ago

Job Details

DESCRIPTION

GPP Database Link (https://cummins365.sharepoint.com/sites/CS38534/)

Job Summary:

Leads projects for the design, development, and maintenance of a data and analytics platform. Processes, stores, and makes data available to analysts and other consumers effectively and efficiently. Works with key business stakeholders, IT experts, and subject-matter experts to plan, design, and deliver optimal analytics and data science solutions. Works on one or many product teams at a time.

Key Responsibilities:

  • Designs and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
  • Designs and implements frameworks to continuously monitor and troubleshoot data quality and data integrity issues.
  • Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
  • Designs, and provides guidance on building, reliable, efficient, scalable, quality data pipelines with monitoring and alert mechanisms that combine a variety of sources using ETL/ELT tools or scripting languages.
  • Designs and implements physical data models to define the database structure; optimizes database performance through efficient indexing and table relationships.
  • Participates in optimizing, testing, and troubleshooting data pipelines.
  • Designs, develops, and operates large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
  • Uses innovative and modern tools, techniques, and architectures to partially or completely automate the most common, repeatable, and tedious data preparation and integration tasks, in order to minimize manual and error-prone processes and improve productivity.
  • Assists with renovating the data management infrastructure to drive automation in data integration and management.
  • Ensures the timeliness and success of critical analytics initiatives by using agile development methods such as DevOps, Scrum, and Kanban.
  • Coaches and develops less experienced team members.
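Several of these responsibilities center on pipelines with built-in monitoring and alerting. As a rough illustration only, here is a minimal Python sketch of an ETL step wrapped with a quality check; the step name, the lambdas, and the row-count threshold are all hypothetical, not part of the role description:

```python
# Minimal sketch: an extract -> transform -> load step with a monitoring
# hook that records an alert when the output looks suspiciously small.
# All names and thresholds here are illustrative placeholders.
def run_step(name, extract, transform, load, alerts, min_rows=1):
    """Run one ETL step; append a message to `alerts` if too few rows survive."""
    rows = transform(extract())
    if len(rows) < min_rows:
        alerts.append(f"{name}: only {len(rows)} rows after transform")
    load(rows)
    return len(rows)
```

In a real pipeline the alert hook would feed a paging or dashboard system rather than a list, but the shape of the check is the same.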

RESPONSIBILITIES

Competencies:

System Requirements Engineering - Uses appropriate methods and tools to translate stakeholder needs into verifiable requirements to which designs are developed; establishes acceptance criteria for the system of interest through analysis, allocation and negotiation; tracks the status of requirements throughout the system lifecycle; assesses the impact of changes to system requirements on project scope, schedule, and resources; creates and maintains information linkages to related artifacts.

Collaborates - Building partnerships and working collaboratively with others to meet shared objectives.

Communicates effectively - Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences.

Customer focus - Building strong customer relationships and delivering customer-centric solutions.

Decision quality - Making good and timely decisions that keep the organization moving forward.

Data Extraction - Performs data extract-transform-load (ETL) activities from a variety of sources and transforms the data for consumption by various downstream applications and users, using appropriate tools and technologies.

Programming - Creates, writes and tests computer code, test scripts, and build scripts using algorithmic analysis and design, industry standards and tools, version control, and build and test automation to meet business, technical, security, governance and compliance requirements.

Quality Assurance Metrics - Applies the science of measurement to assess whether a solution meets its intended outcomes using the IT Operating Model (ITOM), including the SDLC standards, tools, metrics and key performance indicators, to deliver a quality product.

Solution Documentation - Documents information and solution based on knowledge gained as part of product development activities; communicates to stakeholders with the goal of enabling improved productivity and effective knowledge transfer to others who were not originally part of the initial learning.

Solution Validation Testing - Validates a configuration item change or solution using the Function's defined best practices, including the Systems Development Life Cycle (SDLC) standards, tools and metrics, to ensure that it works as designed and meets customer requirements.

Data Quality - Identifies, understands and corrects flaws in data that supports effective information governance across operational business processes and decision making.

Problem Solving - Solves problems, and may mentor others on effective problem solving, by using a systematic analysis process that leverages industry-standard methodologies to create problem traceability and protect the customer; determines the assignable cause; implements robust, data-based solutions; identifies the systemic root causes and ensures actions to prevent problem recurrence are implemented.

Values differences - Recognizing the value that different perspectives and cultures bring to an organization.

Education, Licenses, Certifications:

College, university, or equivalent degree in relevant technical discipline, or relevant equivalent experience required. This position may require licensing for compliance with export controls or sanctions regulations.

Experience:

Intermediate experience in a relevant discipline area is required. Knowledge of the latest technologies and trends in data engineering is highly preferred and includes:

  • Familiarity with analyzing complex business systems, industry requirements, and/or data regulations
  • Background in processing and managing large data sets
  • Design and development for a Big Data platform using open source and third-party tools
  • Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
  • SQL query language
  • Clustered compute cloud-based implementation experience
  • Experience developing applications requiring large file movement for a Cloud-based environment and other data extraction tools and methods from a variety of sources
  • Experience in building analytical solutions

Intermediate experience in the following is preferred:

  • Experience with IoT technology
  • Experience in Agile software development

QUALIFICATIONS

Job Summary:

This role requires strong expertise in Python, PySpark, Databricks, and Machine Learning/AI, along with hands-on experience in React, Node.js, Azure Web Applications, and OpenAI API integration. The engineer will design, build, and deploy scalable data quality and observability solutions, including AI/ML-powered validation, web-based visualization, and automated data monitoring, across the enterprise ecosystem.

Key Responsibilities:

Data Quality Engineering & Automation:

  • Design and develop data quality profiling pipelines using PySpark, Python, and Databricks to ensure data accuracy and reliability.
  • Build automated validation, cleansing, and anomaly detection frameworks leveraging ML/AI models.
  • Integrate OpenAI APIs for intelligent rule generation, data summarization, and NLP-based insights.
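The profiling idea above can be sketched independently of any particular platform. The posting names PySpark and Databricks; the stdlib sketch below only illustrates the shape of a column-level null-rate profile, and the column names and threshold are hypothetical:

```python
# Minimal sketch of column-level data quality profiling over rows
# represented as dicts. In practice this logic would run as PySpark
# aggregations on Databricks; plain Python keeps the idea self-contained.
from collections import Counter

def profile_nulls(rows, columns):
    """Return the fraction of missing (None) values per column."""
    if not rows:
        return {col: 0.0 for col in columns}
    missing = Counter()
    for row in rows:
        for col in columns:
            if row.get(col) is None:
                missing[col] += 1
    return {col: missing[col] / len(rows) for col in columns}

def failed_columns(null_rates, max_null_rate=0.1):
    """Columns whose null rate exceeds a (hypothetical) tolerance."""
    return sorted(c for c, r in null_rates.items() if r > max_null_rate)
```

A validation framework would evaluate many such rules per dataset and route failures into the alerting and cleansing steps described above.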
Full-Stack Development (Web + API):

  • Develop responsive React-based user interfaces for self-service data quality monitoring and visualization.
  • Build and maintain RESTful APIs using Node.js and Python (Flask/FastAPI) to serve ML models and DQ metrics.
  • Host and manage applications using Azure Web Apps, ensuring secure API connectivity and scalable architecture.
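In miniature, the API bullet above amounts to exposing DQ metrics over HTTP. The sketch below uses only Python's standard library rather than Flask/FastAPI or Node.js, and the `/metrics` path and metric names are hypothetical:

```python
# Minimal sketch of a read-only endpoint serving data quality metrics as
# JSON. A real service would compute these from pipeline runs; the values
# below are placeholders.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

DQ_METRICS = {  # hypothetical metrics a profiling job might publish
    "null_rate": 0.02,
    "duplicate_rate": 0.001,
    "rows_checked": 125000,
}

class DQMetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = json.dumps(DQ_METRICS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):  # keep test output quiet
        pass

def serve(port=0):
    """Start the metrics server on a background thread; return its port."""
    server = HTTPServer(("127.0.0.1", port), DQMetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.server_port
```

A React dashboard would then poll such an endpoint to render the self-service monitoring views mentioned above.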
Cloud and Data Integration:

  • Work with Azure SQL Database, Azure Storage, and Databricks for seamless data access and validation.
  • Integrate data pipelines with APIs, Dataverse, Snowflake, or other cloud data sources.
  • Support Azure Application Insights and monitoring integration for performance tracking.
ML/AI Enablement:

  • Contribute to ML projects focused on data quality improvement and predictive monitoring.
  • Apply techniques such as anomaly detection, clustering, or deep learning to identify data issues proactively.
  • Use OpenAI / LLM APIs to automate root cause analysis, data profiling, and rule suggestions.
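As one concrete form of the proactive anomaly detection mentioned above, a simple statistical rule can flag outliers in a pipeline metric such as daily row counts. This z-score sketch stands in for the ML models the posting refers to; the 3-sigma default is a common but hypothetical choice:

```python
# Minimal sketch of z-score anomaly detection on a sequence of pipeline
# metrics (e.g. daily row counts). Values far from the historical mean,
# measured in standard deviations, are flagged for investigation.
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```

Production systems typically use rolling windows and more robust statistics, but the flag-and-investigate loop is the same.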
Collaboration and Documentation:

  • Partner with data engineers, the architecture team, the cloud support team, and business stakeholders to translate requirements into technical solutions.
  • Document DQ frameworks, ML workflows, APIs, and integration processes for reusability and transparency.

Education:

  • Bachelor’s or Master’s in Computer Science, Data Science, Engineering, or a related field.

Experience:

  • 4+ years in data engineering / data quality / ML pipeline roles
  • Strong exposure to web-based data platforms (React + Node.js + Azure)
  • Proven record of implementing AI-augmented data quality or observability solutions
  • Experience in OpenAI API integration or similar LLM-based applications preferred.

Technical Skills (Preference will be given to):

  • Programming - Python, PySpark, SQL, Node.js, JavaScript
  • Framework / Tools - Databricks, React, FastAPI/Flask, scikit-learn
  • Cloud platforms - Azure (Web Apps, Application Insights, Storage, SQL Database)
  • AI/ML & OpenAI - NLP, LLM API integration, AI-driven anomaly detection
  • Data Quality - Profiling, cleansing, validation frameworks
  • APIs - RESTful API design and integration
  • Visualization - Power BI, React-based dashboards
  • DevOps / Infra - GitHub, CI/CD pipelines, Azure DevOps

Soft Skills:

  • Analytical mindset with high attention to detail.
  • Strong communication and collaboration across technical and business teams.
  • Innovative problem-solver with a passion for automation and data integrity.

Job: Systems/Information Technology

Organization: Cummins Inc.

Role Category: On-site with Flexibility

Job Type:

ReqID: 2423078

100% On-Site: No

Company Details

Cummins Inc.
Columbus, IN, United States

Cummins Inc. is a global power leader with complementary business segments that design, manufacture, distribute and service a broad portfolio of...