Site Reliability Engineering (SRE) Architect
Join us to do the best work of your career and make a profound social impact as a Site Reliability Engineering (SRE) Architect on our Site Reliability Engineering Team in Austin, Texas.
What you’ll achieve
We are seeking a highly experienced Site Reliability Engineering (SRE) Architect to lead the design, evolution, and reliability of our large-scale distributed systems. The ideal candidate will demonstrate deep expertise in Dynatrace, AIOps platforms, observability engineering, and AI-driven automation, including hands-on development with AI agents and modern coding frameworks.
This is a technical leadership role requiring architecture-level thinking, strong coding ability, and the ability to drive enterprise-wide transformation.
Take the first step towards your dream career
Every Dell Technologies team member brings something unique to the table. Here’s what we are looking for with this role:
Essential Requirements
Architecture & Reliability Engineering
Design and architect highly reliable, scalable, and self-healing systems across hybrid, multicloud, and on-prem environments
Establish reliability patterns, guardrails, and architecture standards including SLIs, SLOs, error budgets, and resiliency patterns
Lead root cause prevention strategies, chaos engineering practices, and resilience validation frameworks
Observability & Dynatrace Expertise
Own end-to-end observability strategy using Dynatrace, including:
Application Performance Monitoring (APM)
Infrastructure monitoring
Log analytics
Real-user monitoring (RUM)
Custom instrumentation and dashboards
Architect deterministic and AI-driven alerting, Davis AI configurations, and service-level dependency mapping
AIOps & Automation
Lead adoption and integration of AIOps platforms (Dynatrace Davis AI, ServiceNow AIOps, Moogsoft, or equivalent)
Build intelligent automation pipelines for:
Predictive incident detection
Auto-remediation
Noise reduction and event correlation
Operational anomaly detection
Drive automation-first operations to reduce toil and improve operational efficiency
Coding & AI Agents
Develop and integrate AI agents capable of:
Automated troubleshooting
Intelligent runbook execution
Workflow automation
LLM-driven operational insights
Write high-quality code in languages such as Python, Go, TypeScript, or Java
Build internal tools, automation frameworks, and platform APIs
Cross-Functional Leadership
Partner with SRE teams, platform engineering, application engineering, cybersecurity, and infrastructure groups
Provide architectural governance, participate in design reviews, and influence engineering standards
Mentor engineers on reliability, observability, and automation best practices
Desirable Requirements
Bachelor’s degree with 12+ years of experience, Master’s or PhD with 8+ years of experience, or an equivalent combination of education and experience
Compensation
Dell is committed to fair and equitable compensation practices. The salary range for this position is $212,500 to $275,000.
Benefits and Perks of working at Dell Technologies
Your life. Your health. Supported by your benefits. You can explore the overall benefits experience that awaits you as a Dell Technologies team member — right now at MyWellatDell.com
Who we are
We believe that each of us has the power to make an impact. That’s why we put our team members at the center of everything we do. If you’re looking for an opportunity to grow your career with some of the best minds and most advanced tech in the industry, we’re looking for you.
Dell Technologies is a unique family of businesses that helps individuals and organizations transform how they work, live and play. Join us to build a future that works for everyone because Progress Takes All of Us.
Dell Technologies is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. Read the full Equal Employment Opportunity Policy here.