T. Rowe Price

Principal Site Reliability Engineer, Infrastructure Observability

Owings Mills, MD Full time

At T. Rowe Price, we identify and actively invest in opportunities to help people thrive in an evolving world. As a premier global asset management organization with more than 85 years of experience, we provide investment solutions and a broad range of equity, fixed income, and multi-asset capabilities to individuals, advisors, institutions, and retirement plan sponsors. We take an active, independent approach to investing, offering our dynamic perspective and meaningful partnership so our clients can feel more confident. 

We believe doing the right thing for our clients and our associates is good business. With a career at the firm, you can expect opportunities to create real impact at work and in your community. You’ll enjoy resources to support your career path, as well as compensation, benefits, and flexibility to enrich your life. Here, you’ll find a collaborative culture that respects and values differences and colleagues who share a spirit of generosity.

Join us for the opportunity to grow and make a difference in ways that matter to you. 

Role Summary

As Principal Site Reliability Engineer, Infrastructure Observability, you will help form, develop, and establish a team of Site Reliability Engineers (SREs) focused on the observability, sustainability, scalability, measurability, and recoverability of T. Rowe Price’s innovative cloud and on-prem solutions by leveraging automation and best-of-breed tools. The successful candidate will have a strong operations and engineering background, be hands-on when needed, and have expertise in cloud environments (public and private), infrastructure operations, DevOps practices, CI/CD toolchains and systems, code build and deployment, incident response, and 24x7 monitoring and support.

The candidate will also have extensive experience operating in an SRE function within a complex, distributed environment. They will have a demonstrated ability to work horizontally and vertically across an organization with diverse partners and sponsor groups.

Responsibilities

  • Possesses deep expertise in own area and in-depth knowledge of the broader portfolio, with a comprehensive understanding of upstream and downstream impacts across the technology infrastructure
  • Designs technology solutions to prevent or minimize service disruptions
  • Prevents technology service disruptions through technology solution recommendations and automations
  • Fosters a culture of deep learning through blameless post-mortems to improve the shared goal of reliability across services
  • Transforms operations teams by facilitating internal change to adopt SRE standard methodologies across the organization and drives strategic growth in this area within Global Technology
  • Analyzes incidents impacting technology availability for high-level trends across the broad portfolio
  • Drives initiatives to reduce or prevent technology failures in a complex, distributed technology environment
  • Pulls together information from disconnected systems into cohesive views of the technology portfolio for identifying trends, redundancies, and risk
  • Demonstrates outstanding awareness of the complexities of the tech and asset management industries
  • May lead initiatives of varying degrees of complexity that span multi-functional areas
  • Contributes to definition of target state architecture and design of the technology environment

Qualifications
Required:

  • Bachelor's degree or the equivalent combination of education and relevant experience AND 10+ years of experience designing and operating cloud infrastructure with senior‑level impact.
  • 5+ years building and supporting solutions in Amazon Web Services (AWS)
  • 5+ years of experience building and running a DevOps and/or SRE function
  • Experience implementing and operating chaos engineering practices at scale
  • Strategic and program-level implementation experience
  • Demonstrable experience implementing new technology, tools, and platforms
  • System administration and scripting experience
  • Demonstrable experience leveraging automation to proactively prevent or quickly remediate incidents
  • Fluent in multiple programming languages (e.g., Python, Java, Go, Node.js, .NET Core)
  • Proficiency with database development (SQL Server, PostgreSQL, MySQL, etc.)
  • Proficiency with defining, right-sizing, tracking, and reporting on Service Level Objectives (SLOs), Service Level Indicators (SLIs), system availability, and the progress and outcomes related to reliability
  • Experience with implementing and managing Error Budgets
  • Proficiency with understanding and explaining incident situations and their recovery plans to prevent recurrence
  • Knowledge/experience driving dashboard standardization across the ecosystem for observability, APM and infrastructure monitoring, and application-specific logging
  • Knowledge/experience with observability tools such as New Relic, SolarWinds DPA, Elastic Stack, Prometheus, Grafana, Splunk, and cloud native tools
  • Knowledge/experience with cloud management tools such as Ansible, Terraform, Vault, and Vagrant
  • Works independently, with guidance in only the most complex situations
  • Makes sound decisions with limited facts or resources
  • Balances strategic and pragmatic concerns when solving problems
  • Adjusts communication style and materials to suit a given audience
  • Able to clearly articulate operational principles, practices, and policies
  • Stays abreast of industry trends and technologies
  • Accountable for work of self and others; sets standards around which others will operate
  • Maintains a broad internal professional network and knows when to engage/activate it
  • Develops or mentors diverse talent on the team
  • Ability to be on-call and/or work during off-hours

Preferred:

  • Cloud or SRE‑related certifications
  • Working knowledge of Azure

Applicants for employment in the US must have work authorization that does not now or in the future require sponsorship of a visa for employment authorization in the United States (e.g., H-1B visa, F-1 visa (OPT), TN visa, or any other non-immigrant work status).

FINRA Requirements  
FINRA licenses are not required and will not be supported for this role.   
 
Work Flexibility  
This role is eligible for hybrid work, with up to three days per week from home.  

Base Salary Ranges

Please review the job posting for the location of this specific opportunity.

$159,000.00 - $272,000.00 for the location of: Maryland, Colorado, Washington and remote workers
$175,000.00 - $299,000.00 for the location of: Washington, D.C.
$199,000.00 - $339,000.00 for the location of: New York, California

Placement within the range provided above is based on the individual’s relevant experience and skills for the role. Base salary is only one component of our total compensation package. Employees may be eligible for a discretionary bonus, which is determined based on company and individual performance.

Commitment to Diversity, Equity, and Inclusion

At T. Rowe Price, our associates are our greatest asset. We thrive because our company culture is built on inclusion and because we sustain a work environment where associates can bring their best selves to work every day. The backgrounds, talents, and experiences of our global associates allow us to embrace new ideas and perspectives that move our business priorities forward and enable us to deliver strong client outcomes. Here, you can expect equal opportunity and fair and consistent treatment for all. 

Benefits

We value your goals and needs, at work and in life. As an associate, you’ll be supported with resources, benefits, and work-life balance so you can thrive in ways that matter to you.   

  

Featured employee benefits to enrich your life:   

  • Competitive compensation
  • Annual bonus eligibility
  • A generous retirement plan
  • Hybrid work schedule
  • Health and wellness benefits, including online therapy
  • Paid time off for vacation, illness, medical appointments, and volunteering days
  • Family care resources, including fertility and adoption benefits

  

T. Rowe Price is an equal opportunity employer and values diversity of thought, gender, and race. We believe our continued success depends upon the equal treatment of all associates and applicants for employment without discrimination on the basis of race, religion, creed, color, national origin, sex, gender, age, mental or physical disability, marital status, sexual orientation, gender identity or expression, citizenship status, military or veteran status, pregnancy, or any other classification protected by country, federal, state, or local law.