It's fun to work in a company where people truly BELIEVE in what they are doing!
We're committed to bringing passion and customer focus to the business.
About Fractal
Founded in 2000, Fractal Analytics (www.fractal.ai) is a strategic analytics partner to the most admired Fortune 500 companies globally and helps them power every human decision in the enterprise. Fractal currently has 5500+ employees across 18 global locations, including the United States, UK, Ukraine, India, Singapore, and Australia. Fractal has been recognized as a 'Great Workplace' and one of 'India's Best Workplaces for Women' in the top 100 (large) category by The Great Place to Work® Institute; featured as a leader in the Customer Analytics Service Providers Wave™ 2021, Computer Vision Consultancies Wave™ 2020, and Specialized Insights Service Providers Wave™ 2020 by Forrester Research Inc.; named a leader in the Analytics & AI Services Specialists Peak Matrix 2022 by Everest Group; and recognized as an 'Honorable Vendor' in the 2022 Magic Quadrant™ for data & analytics by Gartner Inc. For more information, visit https://fractal.ai/
Job Description:
We need someone with a strong Data Engineering skill set to ensure that production (operations/support) activities are delivered per SLA. The role involves working on issues/requests, bug fixes, and minor changes; coordinating with the development team when issues arise; and delivering enhancements.
Role Details
You will be part of the operations team providing L2 support to a client, working either during specified business hours or in a 24x7 support model.
Provide Level-2 (L2) technical support for data platforms and pipelines built on Azure Data Factory (ADF), Databricks, SQL, and Python. This role involves advanced troubleshooting, root cause analysis, code-level fixes, performance tuning, and collaboration with engineering teams to ensure data reliability and SLA compliance. Must adhere to ITIL processes for Incident, Problem, and Change management.
Key Responsibilities
Advanced Troubleshooting & RCA
- Investigate complex failures in ADF pipelines, Databricks jobs, and SQL processes beyond L1 scope.
- Perform root cause analysis for recurring issues, document findings, and propose permanent fixes.
- Debug Python scripts, SQL queries, and Databricks notebooks to resolve data ingestion and transformation errors.
- Analyze logs, metrics, and telemetry using Azure Monitor, Log Analytics, and Databricks cluster logs.
Code-Level Fixes & Enhancements
- Apply hotfixes for broken pipelines, scripts, or queries in non-production and coordinate controlled deployment to production.
- Optimize ADF activities, Databricks jobs, and SQL queries for performance and cost efficiency.
- Implement data quality checks, schema validation, and error handling improvements.
Incident & Problem Management
- Handle escalated incidents from L1; ensure resolution within SLA.
- Create and maintain Known Error Database (KEDB) and contribute to Problem Records.
- Participate in Major Incident calls, provide technical insights, and lead recovery efforts when required.
Monitoring & Automation
- Enhance monitoring dashboards, alerts, and auto-recovery scripts for proactive issue detection.
- Develop Python utilities or Databricks notebooks for automated validation and troubleshooting.
- Suggest improvements in observability and alert thresholds.
Governance & Compliance
- Ensure all changes follow ITIL Change Management process and are properly documented.
- Maintain secure coding practices, manage secrets via Key Vault, and comply with data privacy regulations.
Technical Skills
- Azure Data Factory (ADF): Deep understanding of pipeline orchestration, linked services, triggers, and custom activities.
- Databricks: Proficient in Spark, cluster management, job optimization, and notebook debugging.
- SQL: Advanced query tuning, stored procedures, schema evolution, and troubleshooting.
- Python: Strong scripting skills for data processing, error handling, and automation.
- Azure Services: ADLS, Key Vault, Synapse, Log Analytics, Monitor.
- Familiarity with CI/CD pipelines (Azure DevOps/GitHub Actions) for data workflows.
Non-Technical Skills
- Strong knowledge of ITIL (Incident, Problem, Change).
- Ability to lead technical bridges, communicate RCA, and propose permanent fixes.
- Excellent documentation and stakeholder communication skills.
- Drive Incident/Problem resolution by assisting the operations team with key activities around delivery, fixes, and supportability.
- Experience with ServiceNow is preferred.
- Attention to detail is a must, with a focus on quality and accuracy.
- Able to handle multiple tasks with appropriate priority and strong time management skills.
- Flexible about work content and enthusiastic to learn.
- Strong relationship skills to work with multiple stakeholders across organizational and business boundaries at all levels.
Education Qualifications and Certifications (if any)
Certifications (Preferred):
- Microsoft Certified: Azure Data Engineer Associate (DP-203)
- Databricks Certified Data Engineer Associate
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
Not the right fit? Let us know you're interested in a future opportunity by clicking Introduce Yourself in the top-right corner of the page or create an account to set up email alerts as new job postings become available that meet your interest!