CI&T

[Job - 25641] Senior Data Platform Engineer - Real-Time Data Streaming, Colombia

Colombia | Full Time
We are tech transformation specialists, uniting human expertise with AI to create scalable tech solutions.
With over 7,400 CI&Ters around the world, we’ve built partnerships with more than 1,000 clients during our 30 years of history. Artificial Intelligence is our reality. 

Location: Colombia (Remote)
Industry: Financial Services / Mortgage Lending
Project Type: Real-Time Data Platform Modernization

The Mission:

Build and operate a cutting-edge, near-real-time data platform that synchronizes data across enterprise systems. You'll be hands-on with modern streaming technologies, solving complex data engineering challenges that enable real-time decision-making, eliminate data silos, and modernize how the organization leverages data.

What You'll Do

Real-Time Data Pipeline Development (40%)
Design and deploy real-time data pipelines using Kafka Connect and Debezium CDC to stream changes from operational databases to Snowflake (see the sketch after this list)
Develop CI/CD pipelines to automate data pipeline deployments
Build ETL/ELT processes for both batch and streaming scenarios
Implement data quality validation and monitoring across streaming pipelines
Troubleshoot and optimize pipeline performance for reliability and efficiency
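
For a flavor of what the connector work looks like in practice, here is a minimal Python sketch that registers a hypothetical Debezium PostgreSQL source connector through the Kafka Connect REST API. The endpoint, database coordinates, table name, and topic prefix are illustrative placeholders, not details of the actual project.

```python
import json
import requests  # third-party HTTP client, assumed available

# Hypothetical Kafka Connect endpoint; replace with your environment's URL.
CONNECT_URL = "http://kafka-connect.data-platform.svc:8083"

# Illustrative Debezium PostgreSQL source connector configuration.
connector = {
    "name": "orders-cdc-source",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "orders-db.internal",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "REPLACE_ME",  # in practice, use a Connect config provider / secret store
        "database.dbname": "orders",
        "topic.prefix": "orders",
        "table.include.list": "public.loan_applications",
        "snapshot.mode": "initial",
    },
}

# Create the connector; Connect returns 201 on success, 409 if it already exists.
resp = requests.post(
    f"{CONNECT_URL}/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

A sink connector (for example, Snowflake's Kafka connector) would be registered the same way to land the change events in Snowflake.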

Kubernetes Infrastructure Management (30%)
Deploy and manage data infrastructure on Kubernetes clusters
Monitor connector health and troubleshoot issues in production environments (example after this list)
Maintain containerized applications and ensure high availability
Optimize resource allocation and implement scaling strategies
Handle production support and incident resolution
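
As a loose illustration of the monitoring side, the sketch below uses the official Kubernetes Python client to flag pods that are not running or that have restarted, in a hypothetical data-platform namespace. The namespace name and the health criteria are assumptions for the example, not project specifics.

```python
from kubernetes import client, config  # official Kubernetes Python client

NAMESPACE = "data-platform"  # hypothetical namespace for Kafka Connect / streaming workloads

# Load credentials from the local kubeconfig; inside a cluster use load_incluster_config().
config.load_kube_config()
v1 = client.CoreV1Api()

# Report pods that are not Running or that have restarted at least once.
for pod in v1.list_namespaced_pod(namespace=NAMESPACE).items:
    restarts = sum(cs.restart_count for cs in (pod.status.container_statuses or []))
    if pod.status.phase != "Running" or restarts > 0:
        print(f"{pod.metadata.name}: phase={pod.status.phase}, restarts={restarts}")
```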

Data Warehouse & Modeling (20%)
Build and optimize Snowflake data models that power analytics and business intelligence
Design dimensional models (star/snowflake schemas) for analytics use cases (see the example after this list)
Write complex SQL queries and optimize performance and costs
Implement data governance and cataloging using DataHub - building data catalogs, lineage tracking, and discovery tools
Collaborate with data analysts and business stakeholders to understand requirements
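
To give a sense of the modeling work, here is a minimal sketch that queries a hypothetical star schema (a loan-originations fact table joined to date and branch dimensions) through the Snowflake Python connector. All table names, columns, and connection parameters are invented for illustration.

```python
import snowflake.connector  # snowflake-connector-python, assumed installed

# Placeholder connection details; real credentials belong in a secret manager.
conn = snowflake.connector.connect(
    account="myorg-myaccount",
    user="ANALYTICS_SVC",
    password="REPLACE_ME",
    warehouse="ANALYTICS_WH",
    database="ANALYTICS",
    schema="MARTS",
)

# Star-schema query: one fact table joined to two conformed dimensions.
QUERY = """
SELECT d.calendar_month,
       b.branch_name,
       SUM(f.loan_amount) AS total_originations
FROM   fct_loan_originations f
JOIN   dim_date   d ON f.date_key   = d.date_key
JOIN   dim_branch b ON f.branch_key = b.branch_key
GROUP BY d.calendar_month, b.branch_name
ORDER BY d.calendar_month
"""

cur = conn.cursor()
try:
    cur.execute(QUERY)
    for row in cur.fetchall():
        print(row)
finally:
    cur.close()
    conn.close()
```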

Collaboration & Documentation (10%)
Work with data analysts, engineers, and business stakeholders across teams
Document data flows, transformations, and technical decisions
Participate in architecture discussions and technical reviews
Support knowledge sharing and best practices across the team

What You Need to Succeed

Core Requirements:
1. Advanced English Communication (Non-Negotiable)
Daily communication with US-based client teams
Professional written and verbal English for meetings, documentation, and collaboration
2. Snowflake Production Experience (Non-Negotiable)
You've built data warehouses in Snowflake production environments
Written complex SQL and optimized performance and costs
Experience with modern data warehouse architecture patterns
3. Real-Time Streaming Expertise
Hands-on production experience with Kafka and Kafka Connect
Understanding of CDC (Change Data Capture) patterns
Debezium experience is a strong plus
Stream processing and real-time data architecture knowledge
4. Kubernetes Operational Experience
You've deployed and maintained containerized applications in production
Experience troubleshooting and monitoring Kubernetes workloads
Understanding of container orchestration and cloud-native patterns
Important Data Engineering Skills:
5. Data Pipeline Development
ETL/ELT design and implementation for batch and streaming
Version control and CI/CD for data pipelines
Automated testing and deployment practices
6. Data Modeling
Dimensional modeling (star/snowflake schemas) for analytics
Data Lake/Warehouse architecture
Understanding of modern data stack patterns
7. Problem-Solving & Independence
Ability to debug complex data pipeline issues independently
Experience providing production support under pressure
Comfort navigating ambiguity and finding solutions

Highly Valuable (Nice-to-Have):
1. Data Governance & Cataloging
DataHub, Collibra, or Alation experience
Data lineage tracking and metadata management
Data discovery and cataloging best practices
2. Advanced Stream Processing
Apache Flink or Flink SQL
Complex event processing patterns
State management in streaming applications
3. Modern Data Stack Tools
Airbyte, Fivetran, or similar ingestion tools
dbt (data build tool) for transformations
Data quality frameworks and practices
4. Open Table Formats
Apache Iceberg or Delta Lake experience
Understanding of lakehouse architectures
5. Cloud Platforms & FinOps
AWS or Azure experience
Cloud cost optimization practices
Infrastructure as Code (Terraform, CloudFormation)
6. Financial Services Domain
Mortgage or lending industry knowledge
Understanding of regulatory/compliance requirements
Financial data modeling experience

Why This Role?

Modern Tech Stack:
Work with the latest in streaming data and cloud data platforms
Hands-on with Kafka, Kubernetes, Snowflake, and DataHub
Real-time data architecture and event-driven systems

High Impact:
Your solutions will directly enable business decisions and operational efficiency
Eliminate data silos across the organization
Enable real-time insights for a major financial services company

Learning Opportunities:
Gain deep expertise in real-time data architecture
Advanced Kubernetes and cloud-native development
Data governance and modern data stack technologies

CI&T Culture:
Training budget for professional development and certifications
Access to learning platforms and technical courses
Career development programs and mentorship opportunities
Collaborative environment with talented engineers
Health and wellness benefits

The Ideal Candidate

You're likely a great fit if you can say:

✅ "I've built production data pipelines that process data in Snowflake"
✅ "I've worked with Kafka or similar streaming platforms in production"
✅ "I've deployed and managed applications on Kubernetes"
✅ "I'm comfortable with SQL, Python, and data modeling"
✅ "I communicate effectively in English with international teams"
✅ "I can work independently and debug complex issues"
✅ "I understand dimensional modeling and data warehouse design"

This might NOT be the right fit if:

❌ You don't have production Snowflake experience (it's non-negotiable here)
❌ Kafka and streaming technologies are completely new to you
❌ You're uncomfortable with Kubernetes and containerized environments
❌ English communication is a challenge for daily collaboration
❌ You prefer batch-only data processing over real-time streaming

Common Questions

"Do I need to have ALL the technologies listed?"
No - strong fundamentals in Snowflake, Kafka, and Kubernetes are essential. If you have those core skills and are eager to learn tools like DataHub or Flink, we encourage you to apply. The "nice-to-have" section represents growth opportunities, not mandatory requirements.

"Is Debezium experience mandatory?"
It's a strong plus but not mandatory. If you understand CDC patterns and have worked with Kafka Connect, you can learn Debezium on the job. However, general CDC and streaming experience is important.

"How much of this role is infrastructure vs. data engineering?"
About 40% data pipeline/streaming development, 30% Kubernetes/infrastructure management, 20% data modeling/Snowflake work, and 10% collaboration and documentation. You'll need to be comfortable spanning infrastructure and data engineering responsibilities.

"What level of English is required?"
You'll need to communicate clearly in written and verbal English - daily meetings with US-based client teams, technical documentation, and async communication via chat/email. Your English doesn't need to be perfect, but professional communication is essential.

"What's the most challenging part of this project?"
Building reliable, near-real-time data pipelines that handle high volumes of change data capture events, ensuring data quality across streaming flows, and managing complex Kubernetes deployments - all while maintaining performance and cost efficiency.

"Will there be opportunities to work with new technologies?"
Yes! The modern data stack is constantly evolving. You'll have opportunities to explore tools like Apache Flink, Iceberg, and advanced data governance platforms as the project grows.

"Is this remote work or hybrid?"
Fully remote within Colombia. You'll collaborate with distributed teams across time zones, primarily aligning with US business hours for key meetings.

#LI-LO1