Line of Service
Advisory
Industry/Sector
Not Applicable
Specialism
Data, Analytics & AI
Management Level
Senior Associate
Job Description & Summary
At PwC, our people in data and analytics engineering focus on leveraging advanced technologies and techniques to design and develop robust data solutions for clients. They play a crucial role in transforming raw data into actionable insights, enabling informed decision-making and driving business growth.
Job Description
We are seeking a talented and experienced GCP Data Engineer to design, build, and maintain scalable and reliable data pipelines and infrastructure on the Google Cloud Platform. In this role, you will play a crucial part in transforming raw data into actionable insights and enabling data-driven decision-making across the organization, working closely with data scientists, analysts, and other stakeholders.
· Design, develop, and maintain high-performance ETL/ELT pipelines using PySpark, Python, and SQL (see the illustrative sketch after this list).
· Build and optimize distributed data processing workflows on cloud platforms (GCP or Azure).
· Develop and maintain batch and real-time ingestion pipelines, including integration with Kafka.
· Ensure data quality and metadata management across data pipelines.
· Monitor and tune data systems and queries for optimal performance, cost-efficiency, and reliability.
· Automate data workflows and processes using tools like Cloud Composer (Apache Airflow) and leverage Cloud Monitoring/Logging for troubleshooting and operational efficiency.
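A minimal sketch of the kind of batch ETL pipeline described in the first bullet above, assuming a Dataproc-style Spark environment with the spark-bigquery connector on the classpath; the bucket, project, dataset, and column names are hypothetical placeholders, not actual resources.

```python
# Illustrative batch ETL sketch: read raw CSVs from Cloud Storage, apply a
# simple transformation, and write to BigQuery via the spark-bigquery connector.
# Bucket, dataset, table, and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

raw = (
    spark.read.option("header", "true")
    .csv("gs://example-bucket/landing/orders/*.csv")  # placeholder path
)

orders = (
    raw.withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
    .filter(F.col("amount") > 0)
)

(
    orders.write.format("bigquery")
    .option("table", "example-project.analytics.orders")   # placeholder table
    .option("temporaryGcsBucket", "example-temp-bucket")    # required for indirect writes
    .mode("overwrite")
    .save()
)
```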
Mandatory Skill Sets:
· 4-8 years of data engineering experience with strong proficiency in PySpark, Python, and SQL.
· Hands-on experience with GCP services such as BigQuery, Dataproc, Cloud Storage, Cloud Composer, and Dataflow.
· Strong understanding of data warehousing concepts, data modelling, and ETL/ELT processes, with expertise in data warehouse / data lake / lakehouse architectures.
· Familiarity with big data processing frameworks like Apache Spark, and experience with Apache Kafka (see the streaming sketch after this list).
· Experience with version control tools like Git and CI/CD pipelines.
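For the Kafka-based real-time ingestion requirement above, a minimal Spark Structured Streaming sketch; it assumes the Kafka connector package (spark-sql-kafka) is available on the cluster, and the broker address, topic, schema, and Cloud Storage paths are hypothetical placeholders.

```python
# Illustrative streaming ingestion sketch: consume JSON events from Kafka and
# land them in Cloud Storage as Parquet with checkpointing.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Placeholder event schema.
event_schema = (
    StructType()
    .add("order_id", StringType())
    .add("amount", DoubleType())
    .add("event_ts", TimestampType())
)

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")                     # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse the JSON body.
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "gs://example-bucket/landing/orders/")        # placeholder
    .option("checkpointLocation", "gs://example-bucket/chk/orders/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```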
Good to have:
· Experience with DBT for building models, testing, and deployments.
· Knowledge of data modelling.
· Exposure to Docker and deployments on GCP.
· Hands-on experience with Pub/Sub and Cloud Run (see the Pub/Sub sketch after this list).
· Exposure to streaming workloads.
· Hands-on exposure to core Java.
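As a reference point for the Pub/Sub item above, a minimal publish sketch using the google-cloud-pubsub client; the project ID, topic name, and message attribute are hypothetical placeholders.

```python
# Illustrative Pub/Sub publish sketch using the google-cloud-pubsub client.
# Project ID, topic name, and attributes are placeholders.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-project", "order-events")  # placeholders

payload = json.dumps({"order_id": "A-1001", "amount": 42.5}).encode("utf-8")

# publish() returns a future; result() blocks until the message ID is returned.
future = publisher.publish(topic_path, data=payload, source="checkout")
print(future.result())
```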
Preferred Skill Sets
· Analytical and problem-solving skills
· Ability to work in an agile environment
· Communication and stakeholder management skills
· Accountability & ownership
DE - Cortex
Data Engineer with hands-on expertise in the Google Cloud Cortex Framework, focusing on data integration, analytics, and AI/ML solutions using SAP data on Google Cloud Platform (GCP).
Responsibilities
· Design, build, and deploy enterprise-grade data solutions that bridge SAP and Google Cloud environments using the Cortex Framework's reference architecture and deployment accelerators.
· Utilize tools like SAP SLT Replication Server, the BigQuery Connector for SAP, and Dataflow pipelines to ingest, transform, and load (ETL/ELT) SAP data into BigQuery.
· Leverage predefined data models, operational data marts in BigQuery, and tools like Looker to create dashboards and deliver actionable business insights from unified SAP and non-SAP data (see the BigQuery sketch after this list).
· Implement machine learning templates and integrate AI models (potentially using Vertex AI) to optimize business outcomes and enable advanced analytics.
· Work with business stakeholders, engineering teams, and partners to ensure solutions meet business needs and are scalable, cost-effective, and compliant with best practices.
· Monitor system performance, troubleshoot data pipeline issues, and implement best practices for data governance and security within the GCP environment.
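A minimal sketch of the insight-delivery step above: querying SAP data that has already landed in BigQuery with the google-cloud-bigquery client. The project, dataset, view, and column names are hypothetical placeholders, not actual Cortex Framework objects.

```python
# Illustrative sketch: aggregate recent sales data from a reporting view in
# BigQuery. All identifiers below are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # placeholder project

sql = """
    SELECT SalesOrder, SUM(NetAmount) AS total_net
    FROM `example-project.cortex_reporting.sales_orders`  -- placeholder view
    WHERE OrderDate >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY SalesOrder
    ORDER BY total_net DESC
    LIMIT 10
"""

# query() submits the job; result() waits for it and yields rows.
for row in client.query(sql).result():
    print(row.SalesOrder, row.total_net)
```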
Mandatory Skill Sets
· Strong hands-on experience with Google Cloud Platform (GCP) services, especially BigQuery, GCS, Dataflow, Cloud Composer, and Vertex AI.
· Proficiency in SAP systems (SAP ECC or S/4HANA) and SAP data extraction methods.
· Expertise in SQL and Python programming languages.
· Hands-on expertise with Looker (LookML).
· Familiarity with data governance, data modeling, and security principles.
· Specific knowledge of the Google Cloud Cortex Framework for SAP integration and analytics is a mandatory skill.
· Experience deploying Cortex Framework components from the official GitHub repository.
Good to have:
· Knowledge of data modelling.
· Exposure to Docker and deployments on GCP.
· Hands-on experience with Pub/Sub and Cloud Run.
· Exposure to streaming workloads.
· Hands-on exposure to core Java.
Preferred Skill Sets
· Analytical and problem-solving skills
· Ability to work in an agile environment
· Communication and stakeholder management skills
· Accountability & ownership
· Design and Development: Design, build, and maintain robust, scalable ETL/ELT data pipelines and data architectures on GCP.
· Data Processing: Implement batch and real-time data processing solutions using GCP-native services such as BigQuery, Dataflow, Dataproc, and Pub/Sub.
· Data Storage and Management: Select, manage, and optimize appropriate data storage solutions (e.g., Cloud Storage, BigQuery, Cloud SQL, Bigtable) based on performance, scalability, and cost.
· Data Quality and Governance: Implement best practices for data quality, integrity, and security across all data systems, ensuring compliance with relevant regulations (e.g., GDPR, HIPAA).
· Performance Optimization: Monitor and tune data systems and queries for optimal performance, cost-efficiency, and reliability.
· Automation and Monitoring: Automate data workflows and processes using tools like Cloud Composer (Apache Airflow) and leverage Cloud Monitoring/Logging for troubleshooting and operational efficiency (see the DAG sketch after this list).
· Collaboration: Partner with data scientists to operationalize machine learning models and collaborate with business analysts to understand data requirements and deliver tailored solutions.
· Documentation: Create and maintain clear documentation for data architectures, pipelines, and processes.
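A minimal Cloud Composer (Apache Airflow) DAG sketch for the automation responsibility above, using the Google provider operators; the bucket, dataset, table, and stored-procedure names are hypothetical placeholders.

```python
# Illustrative Airflow DAG: load Parquet files from GCS into a raw BigQuery
# table, then run a downstream transformation job. Names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_orders_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",  # daily at 02:00
    catchup=False,
) as dag:
    load = GCSToBigQueryOperator(
        task_id="load_raw_orders",
        bucket="example-bucket",                                   # placeholder
        source_objects=["landing/orders/*.parquet"],
        destination_project_dataset_table="example-project.raw.orders",
        source_format="PARQUET",
        write_disposition="WRITE_TRUNCATE",
    )

    transform = BigQueryInsertJobOperator(
        task_id="build_orders_mart",
        configuration={
            "query": {
                "query": "CALL `example-project.marts.sp_build_orders`()",  # placeholder procedure
                "useLegacySql": False,
            }
        },
    )

    load >> transform
```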
Required Qualifications and Skills
· Education: Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
· Experience: Proven experience (typically 3+ years) as a Data Engineer, with a strong focus on Google Cloud Platform (GCP).
· Technical Skills:
o Proficiency in programming languages, particularly Python and SQL.
o Hands-on experience with core GCP data services: BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
o Strong understanding of data warehousing concepts, data modeling (e.g., schema design, dimensional modeling), and ETL/ELT processes.
o Familiarity with big data processing frameworks like Apache Spark or Apache Beam (see the Beam sketch after this section).
o Experience with version control tools like Git and CI/CD pipelines.
· Soft Skills:
o Excellent problem-solving and analytical abilities with keen attention to detail.
o Strong communication and collaboration skills to work effectively with cross-functional teams.
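A minimal Apache Beam sketch illustrating the kind of pipeline referenced under Technical Skills; it runs on the DirectRunner by default and would target the DataflowRunner on GCP, and it assumes the apache-beam[gcp] extra for Cloud Storage I/O. Paths and the record layout are hypothetical placeholders.

```python
# Illustrative Beam pipeline: read CSV lines, sum amounts per order, write results.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_line(line: str):
    # Expect "order_id,amount" lines; skip malformed rows.
    parts = line.split(",")
    if len(parts) != 2:
        return []
    try:
        return [(parts[0], float(parts[1]))]
    except ValueError:
        return []


with beam.Pipeline(options=PipelineOptions()) as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://example-bucket/orders/*.csv")      # placeholder
        | "Parse" >> beam.FlatMap(parse_line)
        | "SumPerOrder" >> beam.CombinePerKey(sum)
        | "Format" >> beam.MapTuple(lambda order_id, total: f"{order_id},{total}")
        | "Write" >> beam.io.WriteToText("gs://example-bucket/output/order_totals")  # placeholder
    )
```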
Looking for a Senior Looker resource with GraphQL exposure to design and implement advanced BI solutions that unify relational and graph data models with generative AI capabilities.
· Develop complex LookML models that interface with SpannerGraph and Neo4j to visualize highly connected data (see the graph query sketch at the end of this section).
· Design interactive dashboards in Looker and Power BI that translate multi-dimensional graph relationships into actionable business intelligence.
· Lead the architectural design of a unified semantic layer that bridges traditional SQL databases and graph-based insights.
· Build and deploy Agentic AI systems using the Google Cloud Agent Development Kit (ADK) to automate complex data reasoning tasks.
· Use Dataplex to manage data quality, governance, and metadata across distributed data silos.
· Good experience with Looker and LookML.
o Hands-on experience with graph databases (Neo4j, SpannerGraph) and GQL.
o Multi-platform BI proficiency (Looker plus Power BI preferred).
· Good to Have:
o Developing multi-agent systems using Google ADK and Vertex AI Agent Builder.
o Proficiency in Vertex AI Studio, Dataplex, and embedding models.
o Advanced SQL, Python and orchestration of GCP workloads.
· Other Expectations:
o Analytical and problem-solving skills
o Ability to work in an agile environment
o Communication and stakeholder management skills
o Accountability & ownership
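A minimal sketch of querying highly connected data with the official Neo4j Python driver, the kind of graph aggregate a LookML or Power BI model might later expose; the connection URI, credentials, node labels, and relationship types are hypothetical placeholders.

```python
# Illustrative sketch: aggregate orders per customer segment and product
# category from a graph database. All identifiers are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))  # placeholders

CYPHER = """
MATCH (c:Customer)-[:PLACED]->(o:Order)-[:CONTAINS]->(p:Product)
RETURN c.segment AS segment, p.category AS category, count(o) AS orders
ORDER BY orders DESC
LIMIT 20
"""

with driver.session() as session:
    for record in session.run(CYPHER):
        print(record["segment"], record["category"], record["orders"])

driver.close()
```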
Years of Experience required: 4 to 8 years
Education Qualification: BE, B.Tech, MCA
Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required: Bachelor of Engineering
Degrees/Field of Study preferred:
Certifications (if blank, certifications not specified)
Required Skills
GCP Dataflow
Optional Skills
Accepting Feedback, Active Listening, Agile Methodology, Alteryx (Automation Platform), Analytical Thinking, Automation, Automation Framework Design and Development, Automation Programming, Automation Solutions, Automation System Efficiency, Business Analysis, Business Performance Management, Business Process Automation (BPA), Business Transformation, C++ Programming Language, Communication, Configuration Management (CM), Continuous Process Improvement, Creativity, Daily Scrum, Data Analytics, Data Architecture, Data-Driven Insights, Data Ingestion {+ 34 more}
Desired Languages (If blank, desired languages not specified)
Travel Requirements
Available for Work Visa Sponsorship?
Government Clearance Required?
Job Posting End Date