About Manifest Global
Manifest Global is building the infrastructure for global human capital mobility — connecting students, schools, universities, and employers across 50+ countries. Our portfolio spans Cialfo (AI-powered college counseling, 2,000+ schools), BridgeU (university guidance for international schools globally), Kaaiser (trusted study abroad counseling across India and Southeast Asia), and Explore (AI-powered university outreach, 1,000+ university partners). Together, we move talent across borders at scale. $80M raised. Still early.
Cialfo's University Data Engineering team is the data backbone of everything students see on the platform — university profiles, course listings, entry requirements, fees, deadlines, rankings, and scholarship information across 544 partner universities and thousands more. Every piece of that data has to be collected, validated, and kept current.
We are hiring a Data Automation Engineer to own the automation function end-to-end — building the scrapers, AI-powered workflows, and data pipelines that replace manual data collection with reliable, production-grade automation. You will report to Engineering and work alongside the University Data Engineering team as the sole owner of the technical stack.
What makes this role different from a standard data engineering role: You will not be maintaining someone else's pipelines. You are building the function from scratch. The team has deep domain knowledge — they know what correct university data looks like, and they will QC your output. You bring the technical capability they do not have. Together, you replace hours of manual work per week. Your work ships directly to a product used by hundreds of thousands of students making university decisions.
Your first 90 days have a defined backlog. The first priority is a notification classification pipeline handling 450 alerts per week, replacing 6 hours of daily manual signal-vs-noise triage across the team. Close behind it is a signal resolution workflow covering 150 signals per week, replacing 6 hours of daily core updates: research, format, verify, push. You will also build an automated quality audit agent that runs nightly across all recent updates, replacing 6–7 hours of daily manual data accuracy checks. Beyond that, you will own rankings and key-stats ingestion across 4,441 universities, replacing the full manual collection cycle for QS, THE, and US News rankings, as well as entry requirements extraction from dynamic, JavaScript-rendered pages across 150+ universities.
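To make the triage work concrete: a signal-vs-noise classifier typically pairs an LLM prompt with a strict validator, so a malformed model reply falls back to human review instead of silently corrupting data. Everything below is a hypothetical sketch, not Cialfo's actual pipeline — the function names, labels, and JSON shape are illustrative assumptions:

```python
import json

# Hypothetical triage labels; the real taxonomy is up to the team.
VALID_LABELS = {"SIGNAL", "NOISE"}

def build_triage_prompt(alert_text: str) -> str:
    """Ask the model to classify one alert and reply with strict JSON."""
    return (
        "Classify this university-data alert as SIGNAL (needs a data update) "
        "or NOISE (ignorable). Reply with JSON only: "
        '{"label": "SIGNAL" or "NOISE", "reason": "<short reason>"}\n\n'
        f"Alert: {alert_text}"
    )

def parse_triage_reply(raw: str) -> dict:
    """Validate the model's reply; route anything malformed to manual review."""
    try:
        data = json.loads(raw)
        label = data.get("label")
        if label in VALID_LABELS:
            return {"label": label, "reason": str(data.get("reason", ""))}
    except (json.JSONDecodeError, AttributeError):
        pass
    # Fail closed: an unparseable or off-schema reply becomes a human-review item.
    return {"label": "MANUAL_REVIEW", "reason": "unparseable model output"}
```

The fail-closed default is the point: at 450 alerts per week, an automation that quietly mislabels is worse than one that occasionally asks a human.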
Beyond the initial backlog, you own the full 25-task automation portfolio — maintaining what is already built, extending scrapers when source sites change structure, and designing new automations as the team's data commitments grow.
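On "extending scrapers when source sites change structure": scrapers keyed to exact DOM paths break on every redesign, and one common defense is to anchor on visible labels instead. This stdlib-only sketch illustrates the idea — function names are hypothetical, and rendering JavaScript-heavy pages with a headless browser is assumed to happen upstream of this step:

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Flatten a page to its visible text chunks, ignoring layout markup."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

def find_labelled_value(html: str, label: str):
    """Return the text chunk that follows a label, wherever it sits in the DOM."""
    parser = TextCollector()
    parser.feed(html)
    chunks = parser.chunks
    for i, chunk in enumerate(chunks):
        if label.lower() in chunk.lower() and i + 1 < len(chunks):
            return chunks[i + 1]
    return None
```

Because the lookup keys on the label text rather than a CSS path, it keeps working when a `<span>` pair becomes a table row — the kind of resilience that matters across 500 inconsistent sources.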
We have a strong preference for candidates who have used LLM APIs in production — you have shipped something where Claude or OpenAI was doing real work (classification, extraction, structured output), and you have dealt with the accuracy and reliability problems that come with it. We are also looking for 4–6 years of relevant experience, since independent ownership in messy production environments takes time to develop. Familiarity with N8N or an equivalent workflow automation tool is a plus — you should be able to read, edit, and build N8N workflows without a tutorial, though it does not need to be your primary tool. Experience with unstructured, inconsistent source data — PDFs, scraped HTML, university websites with no consistency across 500 sources — is highly relevant.
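The "accuracy and reliability problems" of LLMs in production are usually handled with a validate-and-retry loop around the model call. This is a generic sketch under stated assumptions, not our implementation: `call_model` stands in for whatever Claude or OpenAI client wrapper the stack uses, and `validate` is whatever schema check the task needs.

```python
def extract_with_retry(call_model, prompt, validate, max_attempts=3):
    """Call an LLM, validate its structured output, and retry on failure.

    call_model: callable(prompt) -> str, wrapping the real API client.
    validate:   callable(raw) -> parsed result, raising ValueError when the
                reply is malformed or off-schema.
    """
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            return validate(raw)
        except ValueError as err:
            last_error = err
            # Feed the failure back so the model can self-correct on retry.
            prompt = (f"{prompt}\n\nYour last reply was invalid ({err}). "
                      "Reply with valid JSON only.")
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Raising after the retry budget is exhausted, rather than returning a best guess, is the design choice that keeps bad extractions out of the dataset.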
EdTech or university data domain knowledge is a plus, though the team teaches this faster than any candidate will self-learn it. SQL is useful for validation queries but is not required on day one.
What success looks like: the notification classification pipeline is live and saving the team 20+ hours per week. You shipped it, it is in production, and it has monitoring. You have diagnosed and fixed at least one automation that broke in production without asking Engineering for help. The team comes to you with data collection problems, and you come back with working solutions — not questions about how to approach them. You have made at least one tooling decision that changed how the team operates, and you can explain clearly why you made it. University Data Engineering Leads trust your QC gates without checking every output. Your accuracy track record has earned that.
Cialfo serves hundreds of thousands of students making one of the most important decisions of their lives. The quality of the data on the platform directly affects what universities they see, whether application deadlines are accurate, and whether the fees they are planning around are correct.
The team has deep domain knowledge and operational discipline. What it does not have is the technical capability to automate the work that should not be manual. You are that capability — not a support function, but the reason the team can operate at a scale it currently cannot.
The work is real production automation problems: scrapers covering 4,441 universities, AI classifiers handling 450 alerts per week, quality agents running nightly across every recent data update. The team knows the domain deeply and will QC your work honestly. When something is wrong, you will hear it — that is a feature, not a bug. Every automation you ship converts manual hours into expanded data coverage — more universities, more countries, more students.