CI&T

[Job-25823] Data Tech Lead - Databricks, Brazil

Brazil | Full Time
We are tech transformation specialists, uniting human expertise with AI to create scalable tech solutions.
With over 7,400 CI&Ters around the world, we’ve built partnerships with more than 1,000 clients during our 30 years of history. Artificial Intelligence is our reality.

We are looking for a highly skilled Databricks Data Architect to lead the design, development, and implementation of a modern hub-and-spoke data platform for a leading retail company. This platform will serve as the foundation for advanced analytics and real-time personalization, enabling data-driven decision-making and customer engagement.
The role involves working closely with business stakeholders, data engineers, and product teams to design scalable, secure, and efficient data solutions on Databricks and the modern data stack. The architect will also drive the adoption of DataOps practices such as CI/CD pipelines, data contracts, quality checks, and observability.


Position Overview
Architect and Design a hub-and-spoke data platform on Databricks to integrate structured, semi-structured, and unstructured data from multiple sources.
Enable Advanced Analytics & Real-Time Use Cases such as personalization, customer 360, recommendation systems, and marketing attribution.
Implement DataOps Best Practices, including CI/CD pipelines, version-controlled workflows, automated testing, data quality monitoring, and observability.
Define and Enforce Data Contracts to ensure schema consistency and reliable data exchange between producers and consumers.
Optimize Data Processing using Databricks Delta Lake, Spark, and streaming technologies for both batch and real-time pipelines.
Integrate with Cloud-Native Ecosystems (AWS/Azure/GCP) for scalable storage, compute, security, and orchestration.
Collaborate Across Teams including business stakeholders, analytics, ML, and engineering teams to align data architecture with business goals.
Ensure Security and Compliance with industry standards (e.g., GDPR, CCPA) for sensitive retail and customer data.


Required Skills and Qualifications:


Must-have Skills:
Expertise in Databricks (Delta Lake, Spark, Databricks SQL, Unity Catalog, MLflow).
Strong background in data architecture and design of large-scale, distributed systems.
Proven experience building data pipelines (batch and streaming) with technologies like Delta Live Tables, Spark Structured Streaming, Kafka, Event Hubs, or Kinesis.
Solid understanding of DataOps practices: CI/CD for data workflows, testing frameworks, monitoring, and observability tools.
Proficiency in SQL and Python (PySpark).
Hands-on experience with cloud platforms (AWS, Azure, or GCP) including storage, networking, IAM, and security best practices.
Knowledge of data governance, lineage, and cataloging solutions (e.g., Unity Catalog, Collibra, Alation).
Ability to integrate APIs, third-party systems, and retail platforms into the data ecosystem.
Advanced English is essential.
End-to-end understanding of the business side, with the ability to translate business needs into data architecture decisions.
Desired Qualifications:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
Retail industry experience with customer data, personalization, and omnichannel analytics use cases.
Experience implementing real-time decisioning systems or personalization engines.
Familiarity with modern orchestration tools (Airflow, dbt, Dagster, Prefect).
Experience with data quality frameworks (Great Expectations, Deequ, Soda).
Exposure to machine learning pipelines and MLOps best practices.
Excellent communication and stakeholder management skills.
Certifications in Databricks, Azure Data Engineer, AWS Big Data, or GCP Data Engineering.

#LI-BM2