Pacific Life

Sr. Site Reliability Engineer II

Newport Beach CA-700 Full time

Job Description:

The Site Reliability Engineering team provides leadership, direction and accountability for platform engineering, system design and end-to-end implementation to meet and exceed the product or platform non-functional requirements including quality, security, reliability, availability and performance.  SREs enable development teams to focus on releasing products with reliability and velocity. 

The Sr II Site Reliability Engineer (SRE) on our dynamic SRE team focuses on driving the SRE charter by using software engineering to enable automation and efficiency in all aspects of platform change management and operations. The main responsibilities include, but are not limited to, optimizing design and engineering for new systems and enhancements, including processes and day-to-day activities, to reliably support product rollout and operation in production. The engineer will also mentor other SRE staff to adopt and implement the DevSecOps culture throughout the enterprise.

You will have opportunities to gain experience and knowledge in different aspects of DevOps challenges and implementation to enhance developer workflow and production stability. You will collaborate with other senior team members to evangelize and drive adoption of the SRE mindset and system engineering practice in order to implement technology solutions that will maximize performance and availability in our environment.

Responsibilities 

  • Design and implement orchestration and tooling solutions to ensure that repetitive administration tasks are performed at a high level of efficiency and free of defects

  • Design and implement monitoring and recovery tools to provide for site high availability (HA) and disaster recovery (DR) 

  • Design and develop highly available infrastructure and platform components to meet the needs of our growing and evolving product lines 

  • Design and implement security engineering best practices across all our deployed platforms and environments

  • Triage alerts, diagnose and resolve critical issues, and manage the implementation of changes

  • Manage the coordination, documentation, and tracking of critical incidents and the corresponding root cause analysis, ensuring rapid and complete issue resolution and appropriate closed-loop follow-up with customers and other key stakeholders

  • Collaborate with Delivery Engineers and DevExp Engineers to enhance and implement the continuous integration/continuous deployment orchestration system to reduce friction for software delivery to production

  • Evangelize the DevSecOps culture and SRE mindset, and mentor others about reliability and best practices. 

  • Identify and work with other engineering disciplines to implement opportunities for:

      • Automation

      • Signal-to-noise reduction

      • Prevention of recurring issues, and other actions to reduce time to mitigate service-impacting events and increase the productivity of cloud operations and development resources

  • Maintain a strong understanding of IaaS, PaaS, and SaaS offerings for building and maintaining a state-of-the-art, cloud-based environment for massive-scale data processing

  • Design and implement processes, technology and automation for performance testing. 

  • Ensure that implementations and solutions are fully documented, and that solutions are deployed with fully operationalized processes to support the solution lifecycle

  • Other tasks as assigned 

 

Minimum Qualifications 

  • 8-10 years of experience in infrastructure, system engineering, and/or software engineering

  • Demonstrable experience in testing methodology and in testing automation frameworks and tools for applications and/or any-as-code (infrastructure, configuration, and development tooling such as documentation or diagrams as code)

  • A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.

  • Hands-on experience in designing, analyzing, scaling, and troubleshooting medium to large scale distributed systems. 

  • Well versed in SRE methodologies and passionate about solving operational problems through automation and software engineering.

  • Ability to communicate effectively vertically and horizontally within the organization via demonstrated written and verbal communication skills. 

  • Hands-on experience supporting and implementing at least 2 architecture development styles and their product lifecycle management, including but not limited to: Microservices, Domain-Driven, Event-Driven, Monolithic

  • Strong understanding of cloud-native architecture and microservices design and deployment patterns

  • Hands-on experience in designing and implementing application and/or platform performance, load and stress testing 

 

Skills Desired 

  • Advanced experience designing for and supporting one of the 3 major public cloud providers; AWS is a plus, but experience with any other public cloud provider will be considered

  • Full-stack software engineering experience with a solid foundation in at least 2-3 of the following frontend and/or backend technologies: ReactJS (or similar frameworks), Java, Python, SQL, RDBMS or NoSQL databases.

  • Strong hands-on experience with at least one configuration management tool such as Ansible, Salt, or Puppet, or with Kubernetes configuration tools such as Helm

  • Strong hands-on experience with performance testing tools such as LoadRunner, JMeter, BlazeMeter, Locust, or LoadNinja

  • Advanced experience with at least 1 infrastructure-as-code (IaC) tool such as Terraform/OpenTofu or Pulumi

  • Advanced knowledge of at least 1 software release tool (e.g. Jenkins or Jenkins X, Spinnaker, Harness, Azure DevOps, or another cloud-specific environment)

  • Advanced level of knowledge of Kubernetes and Docker, including experience in Docker image optimization and managing the Docker image lifecycle 

  • Strong experience with at least 2 of the following logging and monitoring tools: ELK stack, Prometheus, Grafana, Stackdriver, New Relic, Datadog, Dynatrace, Splunk, or cloud-native logging and monitoring from any of the 3 major providers

  • Advanced level of Linux/Unix/Windows OS experience.

 

Base Pay Range:

The base pay range noted represents the company’s good faith minimum and maximum range for this role at the time of posting. The actual compensation offered to a candidate will be dependent upon several factors, including but not limited to experience, qualifications and geographic location. Also, most employees are eligible for additional incentive pay.

$167,670.00 - $204,930.00

Your Benefits Start Day 1  
 

Your wellbeing is important to Pacific Life, and we’re committed to providing you with flexible benefits that you can tailor to meet your needs. Whether you are focusing on your physical, financial, emotional, or social wellbeing, we’ve got you covered.

  • Prioritization of your health and wellbeing, including Medical, Dental, Vision, and a Wellbeing Reimbursement Account that can be used for yourself or your eligible dependents

  • Generous paid time off options including: Paid Time Off, Holiday Schedules, and Financial Planning Time Off

  • Paid Parental Leave as well as an Adoption Assistance Program

  • Competitive 401k savings plan with company match and an additional contribution regardless of participation

You Can Be Who You Are

We are committed to a culture of diversity and inclusion that embraces the authenticity of all employees, partners and communities. We support all employees to thrive and achieve their fullest potential.

What’s life like at Pacific Life? Visit Instagram.com/lifeatpacificlife

EEO Statement:

Pacific Life Insurance Company is an Equal Opportunity/Affirmative Action Employer, M/F/D/V. If you are a qualified individual with a disability or a disabled veteran, you have the right to request an accommodation if you are unable or limited in your ability to use or access our career center as a result of your disability. To request an accommodation, contact a Human Resources Representative at Pacific Life Insurance Company.