This role has been designed as 'Onsite' with an expectation that you will primarily work from an HPE office.
Who We Are:
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Job Description:
HPE Financial Services is where we help organizations create the investment they need for digital transformation, in an innovative and sustainable way. We partner with customers across their entire IT asset portfolio, from edge to cloud to end-user. Tailored to each client’s aspirations and size, our financial and asset management solutions are anchored by best-in-class tech upcycling services. Join us in redefining what’s next for you.
Role Summary
The Senior Data Scientist role is an individual contributor role that acts as the technical subject matter expert across the full data platform stack - data architecture, transformation design, data quality frameworks, and governance fit for both analytical and AI consumption. The role will work hand in glove with the AI engineers and will play a pivotal role in enabling high-quality, accurate data for AI products. This role will also act as a technical mentor for the data engineering team and a trusted partner to the Senior AI & ML Engineer — jointly ensuring that governed, high-quality data reliably powers both reporting and AI use cases.
What you'll do:
Technical Leadership & Data Architecture
Serve as the data engineering SME — the escalation point for complex platform, pipeline, and governance decisions across the team and organization.
Architect end-to-end data solutions: design reusable data components, Lakehouse structures, data vault patterns, semantic layers, and integration architectures across AI, analytics, and automation platforms.
Define and enforce data engineering standards, pipeline design patterns, naming conventions, and coding best practices; conduct architecture and code reviews.
Lead technical discovery for new data initiatives: assess feasibility, design solution approaches, and produce architecture documentation for stakeholder alignment.
Mentor and upskill the Technical Data Engineer through structured knowledge transfer, pair programming, and design reviews.
Advanced Data Transformation & Pipeline Engineering
Design and deliver complex, production-grade ELT/ETL pipelines using Databricks (Delta Live Tables, PySpark, Unity Catalog) and Microsoft Fabric (Dataflows Gen2, Notebooks, Data Factory).
Architect reusable, parameterized pipeline frameworks that the wider team can adopt — reducing one-off scripting and increasing delivery velocity.
Define and implement advanced transformation patterns: multi-hop Delta Lake pipelines, SCD Type 2/6, event-driven streaming ingestion, and late-arriving data handling.
Optimize pipeline performance at scale — partitioning strategy, Z-ordering, liquid clustering, broadcast joins, and cost-based query planning in Spark.
Data Quality Strategy & Governance Leadership
Define the data quality framework for AI products — establish DQ dimensions, thresholds, escalation paths, and remediation SLAs across all critical datasets.
Drive business glossary completeness, lineage documentation, data stewardship workflow design, and policy management.
Implement automated data quality validation at scale. Integrate DQ gates into CI/CD pipeline deployments.
Act as the data quality escalation authority — triage complex DQ incidents, perform root-cause analysis, and drive permanent fixes rather than tactical workarounds.
Power BI & Advanced Analytics Reporting
Lead the Power BI governance model: certified dataset strategy, endorsement policies, deployment pipelines, and report lifecycle management.
Design and review complex semantic models for performance, correctness, and scalability; define DAX patterns and measure libraries for team-wide reuse.
Translate complex analytical requirements from senior stakeholders into governed, self-service-ready data products.
Drive adoption of Copilot-assisted authoring in Power BI and Fabric; evaluate AI-assisted analytics features and recommend adoption roadmaps.
AI Data Enablement & Cross-functional Partnership
Partner with the Senior AI & ML Engineer to architect data foundations for LLM and ML use cases — defining feature stores, embedding pipelines, and vector-ready data products.
Ensure Gold layer datasets are structured, documented, and versioned in ways that make them immediately consumable by agentic AI and RAG pipelines.
Participate in cross-functional AI and data initiatives at a strategic level — representing the data engineering perspective in architecture forums and leadership reviews.
Define and lead master data management (MDM) initiatives to establish single sources of truth for key business entities.
Innovation, Standards & Thought Leadership
Maintain a technology radar for the data engineering practice; evaluate emerging tools and run experiments and pilots to assess their capabilities.
Present data architecture proposals and governance roadmaps to senior stakeholders with clarity and business context.
Foster a culture of data quality ownership across the organization — run knowledge sessions, author internal engineering guides, and build data literacy.
What you need to bring:
Qualifications
Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, Mathematics, or a related technical discipline.
7–10 years of hands-on experience in data engineering, analytics engineering, or data architecture roles.
Proven track record of delivering production-grade data platforms — not just pipelines — with measurable business impact.
Deep, demonstrable expertise with Databricks (including Unity Catalog and Delta Live Tables) in a production environment.
Strong hands-on experience with Microsoft Fabric or its predecessor Azure Synapse Analytics.
Expert Power BI development skills including semantic model governance and deployment pipeline management.
Significant experience with Collibra or an equivalent enterprise data governance platform.
Advanced Python and PySpark skills; expert-level SQL.
Demonstrated experience leading cross-functional data initiatives and mentoring junior data engineers.
Technical Skill Requirements
Data Transformation - PySpark (advanced), SQL (expert), Azure Data Factory, Fabric Data Factory; Medallion Architecture, data vault modelling, SCD types 1–6, streaming ingestion, late data handling
Data Quality & Governance - stewardship, policy management, lineage, business glossary, DQ framework design
Data Modelling - Star schema, snowflake schema, data vault 2.0, dimensional modelling, entity-relationship design; semantic layer governance; master data management (MDM) concepts
Reporting & BI - Power BI (expert DAX, certified datasets, deployment pipelines, RLS, gateway management); Fabric Power BI integration; Copilot in Power BI; self-service analytics governance
Programming - Python (advanced), SQL (expert), PySpark (advanced); REST API design; JSON, Parquet, Delta, Avro formats; YAML-based pipeline configuration
Cloud & DevOps - Azure (ADF, ADLS Gen2, Azure ML, Synapse, Key Vault, APIM); CI/CD for data pipelines (GitHub Actions, Azure DevOps); Infrastructure-as-Code basics (Terraform, Bicep); cost optimization and performance tuning
Performance Optimization - Spark query optimization, partitioning, Z-ordering, liquid clustering, broadcast joins, caching strategies; Power BI data model compression and DAX query optimization
AI & Data Enablement - Feature store design, embedding pipeline architecture, vector-ready data product design; Microsoft Copilot in Fabric and Power BI; understanding of LLM data requirements for RAG and fine-tuning
Data Governance & Security - Unity Catalog access control, row/column level security, data classification (PII, confidential), GDPR / data residency considerations, audit logging, data retention policies
Leadership & Standards - Architecture review, data engineering standards definition, technology radar management, cross-functional stakeholder communication, mentoring and upskilling junior engineers
#Financialservices
Additional Skills:
What We Can Offer You:
Health & Wellbeing
We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.
Personal & Professional Development
We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.
Unconditional Inclusion
We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.
Let's Stay Connected:
Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.
Job:
Engineering
Job Level:
TCP_04
HPE is an Equal Employment Opportunity/ Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.
Hewlett Packard Enterprise is EEO Protected Veteran/ Individual with Disabilities.
HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.
No Fees Notice & Recruitment Fraud Disclaimer
It has come to HPE’s attention that there has been an increase in recruitment fraud whereby scammers impersonate HPE or HPE-authorized recruiting agencies and offer fake employment opportunities to candidates. These scammers often seek to obtain personal information or money from candidates.
Please note that Hewlett Packard Enterprise (HPE), its direct and indirect subsidiaries and affiliated companies, and its authorized recruitment agencies/vendors will never charge any candidate a registration fee, hiring fee, or any other fee in connection with its recruitment and hiring process. The credentials of any hiring agency that claims to be working with HPE for recruitment of talent should be verified by candidates and candidates shall be solely responsible to conduct such verification. Any candidate/individual who relies on the erroneous representations made by fraudulent employment agencies does so at their own risk, and HPE disclaims liability for any damages or claims that may result from any such communication.