We are seeking a talented and experienced Big Data Hadoop Developer to join our growing data engineering team. The ideal candidate will have 4-6 years of hands-on experience designing, developing, and optimizing big data solutions using the Hadoop ecosystem, with a strong focus on Apache Spark. You will be responsible for building and maintaining scalable data pipelines, processing large datasets, and collaborating with data scientists and analysts to deliver insights.
Responsibilities:
- Design, develop, and maintain robust and scalable ETL processes and data pipelines using Apache Hadoop and Apache Spark (an illustrative sketch follows this list).
- Write efficient, clear, and well-documented big data processing code, primarily in Scala or Python (PySpark).
- Implement data ingestion, transformation, and loading routines from various sources into Hadoop Distributed File System (HDFS) and other big data stores.
- Optimize existing Spark jobs and Hadoop ecosystem components for performance and scalability.
- Collaborate with data architects, data scientists, and other stakeholders to understand data requirements and translate them into technical solutions.
- Ensure data quality, integrity, and security across all big data platforms.
- Participate in code reviews, testing, and deployment of big data applications.
- Troubleshoot and resolve issues in big data environments.
- Stay up-to-date with the latest trends and technologies in the big data ecosystem.
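For candidates unfamiliar with this kind of work, the following is a minimal, illustrative PySpark sketch of the ingest-transform-load pattern described above. The HDFS paths, column names, and the reference table are hypothetical, and this is a sketch of the general approach, not a prescribed implementation:

# Illustrative PySpark ETL sketch; all paths, schemas, and names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("orders-daily-etl")  # hypothetical job name
    .getOrCreate()
)

# Ingest: raw CSV landed by an upstream feed (path is an assumption).
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("hdfs:///data/landing/orders/")
)

# Small reference table; broadcasting it avoids a shuffle on the join.
customers = spark.read.parquet("hdfs:///data/reference/customers/")

# Transform: basic cleansing and enrichment.
enriched = (
    orders
    .dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_ts"))
    .join(broadcast(customers), on="customer_id", how="left")
)

# Load: write partitioned Parquet back to HDFS for downstream consumers.
(
    enriched
    .repartition("order_date")
    .write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("hdfs:///data/curated/orders_enriched/")
)

spark.stop()

The broadcast join and the partitioned Parquet write hint at the kind of performance and scalability tuning this role involves.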
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
- 4-6 years of professional experience in Big Data development.
- Proven experience with the Hadoop ecosystem, including HDFS, YARN, Hive, and other related technologies.
- Hands-on experience with SQL and shell scripting (a brief Spark SQL sketch follows this list).
- Strong expertise in Apache Spark for data processing and analysis.
- Proficiency in Scala or Python, including the PySpark API.
- Experience with building and optimizing large-scale data pipelines.
- Familiarity with data warehousing concepts and ETL methodologies.
- Solid understanding of distributed computing principles.
- Excellent problem-solving skills and attention to detail.
- Ability to work independently and as part of a collaborative team.
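As a brief illustration of the SQL skills listed above, plain SQL can be run directly against Spark data through a temporary view. The view, path, and column names below are hypothetical; this is a sketch, not a prescribed approach:

# Illustrative Spark SQL sketch; view, path, and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

# Expose a curated dataset to SQL via a temporary view.
enriched = spark.read.parquet("hdfs:///data/curated/orders_enriched/")
enriched.createOrReplaceTempView("orders_enriched")

# Aggregate daily revenue per region with plain SQL.
daily_revenue = spark.sql("""
    SELECT order_date, region, SUM(amount) AS revenue
    FROM orders_enriched
    GROUP BY order_date, region
""")

daily_revenue.show(10, truncate=False)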
Preferred Qualifications:
- Experience with cloud-based big data services (e.g., AWS EMR, Azure HDInsight, Google Cloud Dataproc).
- Experience with Databricks platform.
- Knowledge of other big data tools like Kafka, HBase, Flink, or Presto.
- Experience with SQL and NoSQL databases.
- Familiarity with CI/CD practices and tools (e.g., Git, Jenkins).
- Understanding of machine learning concepts and how they apply to big data.
Education:
- Bachelor’s degree/University degree or equivalent experience
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Applications Development
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.