VCU Health

Enterprise Analytics Data Engineer II

Richmond, VA | Full time
The Data Engineer is responsible for designing, developing, and maintaining scalable data pipelines and data infrastructure that transform raw data into insights.

Drawing on experience in data engineering, data management, data operations, and governance frameworks, the Data Engineer collaborates with cross-functional teams to architect scalable and efficient data pipelines, optimize data storage and retrieval systems, and ensure data quality and integrity throughout the data lifecycle.

Essential Job Statements

  • Designs and implements scalable data pipelines for ingesting, transforming, and loading multi-format batch or streaming data from various sources into the enterprise Data Lakehouse.

  • Ensures data quality and integrity through data cleansing, validation, and transformation techniques. 

  • Automates data processing workflows using scripting languages and orchestration tools. 

  • Builds and deploys RESTful APIs and web services for data access and data requests.

  • Helps define the data management strategy and roadmap for the organization. 

  • Collaborates with data architects, data engineers, and product managers to align data solutions with business strategy. 

  • Stays current with industry trends, vendor product offerings, and evolving data technologies to ensure the organization leverages the best modern data engineering tools. 

  • Develops and maintains technical design documents, coding best practices, and standards to keep data management processes consistent and reliable.

  • Maintains legacy technologies and processes as new ones are adopted.

  • Mentors junior Data Engineers. 

Patient Population

Not applicable to this position. 

Employment Qualifications 


Required Education: 

Bachelor’s degree in Business, IT, Math, Data Analytics, Engineering, or related discipline

Combination of education and experience may be considered in lieu of a degree.

Preferred Education: 

Master’s degree in Business, IT, Math, Data Analytics, Engineering, or related discipline

Licensure/Certification Required: 

Licensure/Certification Preferred: 

Minimum Qualifications 

Years and Type of Required Experience 

5+ years of experience in Data Engineering, Data Management, or Data Warehouse/Lakehouse design and development.

3+ years of experience in Python/PySpark for building and maintaining data pipelines.  

1+ years of experience with cloud-based modern data platforms and services (e.g., Azure, GCP, or AWS).

Other Knowledge, Skills and Abilities Required: 

Strong knowledge of database systems and data modeling techniques, and proficiency in SQL.

Experience creating and enhancing ETL and ELT processes. 

Experience working with JSON, YAML, XML, Parquet, and other open data interchange and storage formats.

Experience working with big data technologies such as Spark, Kafka, NoSQL databases, and Splunk.

Experience with agile development processes and concepts. 

Knowledge of DevOps practices and CI/CD processes. 

Knowledge of microservices architecture and of Data Product and Data Mesh concepts.

Other Knowledge, Skills and Abilities Preferred: 

Displays intellectual curiosity and integrity. 

Working Conditions 

Physical Requirements   

Physical Demands:  

Work Position: Sitting

Additional Physical Requirements/ Hazards    

Physical Requirements:

Hazards:

Mental/Sensory – Emotional     

Mental / Sensory: Strong Recall, Reasoning, Problem Solving, Hearing, Speak Clearly, Write Legibly, Reading, Logical Thinking 

Emotional: Fast-paced environment, Able to Handle Multiple Priorities, Frequent and Intense Customer Interactions, Able to Adapt to Frequent Change 

EEO Employer/Disabled/Protected Veteran/41 CFR 60-1.4.