Synechron

Data Engineer - Big Data/Hadoop

Mumbai | Full-time

Overall Responsibilities:

  • Design and develop large-scale, fault-tolerant data pipelines using Hadoop and related technologies
  • Optimize data workflows for performance and scalability
  • Collaborate with data scientists, analysts, and stakeholders to understand data needs
  • Maintain data quality, security, and governance standards
  • Analyze existing data architecture and recommend improvements
  • Automate data extraction, transformation, and loading processes
  • Troubleshoot and resolve data pipeline issues promptly
  • Document data systems, architecture, and processes

Software Requirements:

  • Strong proficiency in Hadoop ecosystem components such as HDFS, MapReduce, and YARN
  • Experience with Apache Spark, Hive, Pig, and optionally Kafka
  • Skilled in programming languages: Java, Scala, Python
  • Familiarity with ETL tools and pipelines
  • Knowledge of SQL and NoSQL databases (e.g., HBase, Cassandra)
  • Experience with cloud platforms (AWS, GCP, Azure) is a plus
  • Version control tools such as Git

Category-wise Technical Skills:

  • Big Data Processing: Hadoop (HDFS, MapReduce), Apache Spark, Hive, Pig
  • Programming Languages: Python, Scala, Java
  • Data Storage Solutions: HDFS, HBase, Cassandra
  • Messaging & Streaming Technologies: Kafka, Flink (optional)
  • Data Integration & ETL Tools: Apache NiFi, Airflow, Sqoop
  • Cloud & DevOps Platforms: AWS EMR, S3, Azure Data Lake, GCP Dataproc
  • Databases & Querying: SQL, NoSQL databases
  • Version Control & Collaboration: Git, Bitbucket

Experience:

  • 5+ years of professional experience in data engineering, with a proven track record of working within Hadoop ecosystems and big data platforms.

Day-to-Day Activities:

  • Building and maintaining data pipelines and workflows
  • Monitoring data pipeline performance and reliability
  • Collaborating with cross-functional teams to gather requirements and deliver solutions
  • Performing data extraction, transformation, and loading (ETL)
  • Optimizing storage and compute processes
  • Troubleshooting system issues and providing support
  • Keeping up-to-date with emerging Big Data trends and tools

Qualifications:

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
  • Certification in Big Data Technologies is a plus (e.g., Cloudera, Hortonworks)

Soft Skills:

  • Strong analytical and problem-solving skills
  • Excellent communication and collaboration abilities
  • Ability to work independently and as part of a team
  • Adaptability to fast-changing technology landscapes
  • Attention to detail and commitment to quality
  • Good time management and prioritization skills

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative 'Same Difference' is committed to fostering an inclusive culture, promoting equality and diversity, and creating an environment that is respectful to all. We strongly believe that as a global company, a diverse workforce helps us build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disability statuses to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Candidate Application Notice