About this role
Role Overview
We’re seeking a Site Reliability Engineering (SRE) Lead to design, build, and maintain resilient, high-scale systems supporting BlackRock’s Private Markets platform. In this hands-on leadership role, you’ll apply deep engineering expertise to solve complex challenges, guide a global team, shape technical direction, and communicate effectively with senior stakeholders—ensuring the reliability of mission-critical systems that power private market investment workflows and decision-making. You will drive the adoption of AI-driven solutions to accelerate incident detection and triage, reduce toil, improve forecasting and capacity planning, and strengthen end-to-end observability and resilience.
Role Responsibilities
Take ownership of project priorities, deadlines and deliverables using Agile methodologies, with clear outcomes around reliability automation and AI-enabled operations
Understand and refine business and functional requirements, translating them into SLOs/SLIs and AI-assisted observability and support capabilities
Take a hands-on approach to getting work done: this role requires a “roll your sleeves up” mentality, including building and operationalizing reliability tooling and automation that measurably reduces toil and improves stability
Lead with vision and partner with the team in brainstorming solutions that improve productivity, efficiency, and overall engineering effectiveness
Drive priority setting for the engineering teams, balancing foundational reliability work with delivery of new product features
Improve engineering culture by encouraging continuous focus on reliability across the entire application lifecycle, and by adopting AI-enabled SRE practices (e.g., intelligent alerting, automated diagnosis, and self-healing where appropriate)
Participate proactively in architectural and design decisions, including AI-ready telemetry, data quality, and model integration patterns for operational analytics
Design and implement end-to-end monitoring solutions for application and infrastructure components, leveraging modern observability platforms plus AI/ML techniques for anomaly detection, correlation, and alert noise reduction
Drive the engineering of capacity management and demand forecasting solutions, including predictive analytics/ML approaches where they add measurable value
Act as a culture carrier and leader, passing on SRE knowledge and best practices to the engineering team
Drive detailed root cause investigations for production incidents with rigorous focus on issue avoidance, using AI-assisted correlation/analysis to accelerate time-to-insight
Create/coordinate retros for significant incidents, ensuring learnings are captured in automated/AI-assisted runbooks and embedded into prevention mechanisms
Perform additional core engineering functions, such as adding custom telemetry (metrics, logs, and traces) to the code base of in-scope applications to enable AI/ML-driven operational insights
Anticipate new opportunities to continuously evolve the resiliency profile of scoped applications and infrastructure
Skills/Qualifications
Must Have
B.S. / M.S. degree in Computer Science, Engineering or a related discipline with 10+ years of experience
Experience leading high-performing engineering/SRE teams, with a track record of driving continuous improvement through automation and AI-enabled operations
Demonstrated ability to represent engineering/SRE priorities, status, and risk to senior leadership stakeholders with clear, executive-ready communication
Hands-on experience building or operating AI-assisted capabilities (AIOps, ML-based anomaly detection, or GenAI workflows) in an engineering/production environment
A passion for providing engineering support for highly available, performant full stack applications with a “Student of Technology” attitude
Experience with relational and NoSQL databases (e.g., Redis, Apache Cassandra)
Our benefits
To help you stay energized, engaged and inspired, we offer a wide range of employee benefits including: retirement investment and tools designed to help you in building a sound financial future; access to education reimbursement; comprehensive resources to support your physical health and emotional well-being; family support programs; and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.
Our hybrid work model
BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.
About BlackRock
At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children’s educations, buying homes and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress.
This mission would not be possible without our smartest investment – the one we make in our employees. It’s why we’re dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.
For additional information on BlackRock, see Twitter: @blackrock | LinkedIn: www.linkedin.com/company/blackrock
BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, race, religion, sex, sexual orientation and other protected characteristics at law.