AI/ML Engineer - Agentic

This role has been designed as ‘Hybrid’ with an expectation that you will work on average 2 days per week from an HPE office.

Who We Are:

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description:

Job Definition:

The AI/ML Engineer – Agentic is a senior individual contributor responsible for designing, building, and operating a production-grade agentic orchestration platform, including multi-agent workflows and MCP server–based tool infrastructure. The role focuses on enterprise-scale LLM integration, shared retrieval and memory services, and high‑performance backend systems that power agent execution. This position owns reliability, observability, and cloud-native operations for non-deterministic agentic systems in production.

Management Level Definition:

Contributions include applying developed subject matter expertise to solve common and sometimes complex technical problems and recommending alternatives where necessary. Might act as project lead and provide assistance to lower-level professionals. Exercises independent judgment and consults with others to determine the best method for accomplishing work and achieving objectives.

Responsibilities:

Design, build, and own a production-grade agentic orchestration platform, implementing scalable multi-agent workflows using frameworks such as LangGraph or equivalent.
Architect, develop, and operate the MCP server infrastructure, including inter-agent communication, tool/server registries, domain isolation, versioning, and lifecycle management.
Integrate and operate LLM services at enterprise scale, supporting streaming, structured outputs, tool/function calling, and robust error handling across agent workflows.
Build and maintain retrieval and memory services for agentic systems, including RAG pipelines, OpenSearch-backed vector stores, hybrid search, and relevance optimization.
Develop and operate high-performance backend services (FastAPI, gRPC, async systems, messaging) that power orchestration, tool execution, and agent runtime behavior.
Own observability and reliability for non-deterministic systems, delivering end-to-end tracing, monitoring, and cost/performance visibility for agent executions.
Manage cloud-native infrastructure and deployment, including Kubernetes workloads, containerized services, CI/CD pipelines, and resource optimization (CPU/memory, autoscaling).

Education and Experience Required:

Bachelor’s degree in computer science, engineering, information systems, or closely related quantitative discipline. Master’s desirable.
Typically, 4-7 years’ experience.

Knowledge and Skills:

Core Agentic/Orchestration:
- Production experience with agentic frameworks: LangGraph (preferred), Claude Agent SDK, or equivalent (not just prototypes)
- Deep understanding of multi-agent architectures: supervisor/worker patterns, hierarchical agent graphs, ReAct loops, ReWOO
- Hands-on with inter-agent communication protocols: MCP (Model Context Protocol), A2A, tool registry / server registry
LLM & ML Engineering:
- LLM API integration at scale: structured outputs, streaming, function/tool calling, error handling
- RAG pipeline design and optimization: chunking strategies, re-ranking, hybrid search; knowing which knobs to turn for which issues
- Vector store experience: OpenSearch or equivalent
- Applied ML intuition: fine-tuning concepts, prompt engineering, evaluations, QLoRA, PEFT
Infrastructure & Production Systems:
- Backend development and async system design in Python: FastAPI, gRPC, Kafka, Redis, message queues; API design (GraphQL and/or REST) at enterprise scale
- Observability and monitoring for non-deterministic systems: LangFuse, Prometheus, or equivalent
- Kubernetes: deploying, scaling, and managing workloads (Deployments, Services, ConfigMaps, Secrets)
- Container image management: building, tagging, versioning, and pushing images via Docker; familiarity with a container registry (ECR, GCR, Docker Hub)
- CI/CD pipelines for automated build and deploy (GitHub Actions, Jenkins, ArgoCD, or similar)
- Resource management: CPU/memory limits, autoscaling (HPA/VPA), health probes
Additional Preferred Skills:
- Multi-tenant architecture awareness: rate limiting, auth, tenant isolation
- Knowledge base and cost optimization experience: AWS Bedrock, OpenSearch Serverless

Additional Skills:

Cloud Architectures, Cross Domain Knowledge, Design Thinking, Development Fundamentals, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Release Management, Security-First Mindset, User Experience (UX)

What We Can Offer You:

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.

Unconditional Inclusion

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

Let's Stay Connected:

Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.

#unitedstates

#hybridcloud, #networking

Job:

Engineering

Job Level:

TCP_03

"The expected salary/wage range for this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level.
– United States of America: Annual Salary USD 136,500 - 276,500 in California
The listed salary range reflects base salary. Variable incentives may also be offered."

Information about employee benefits offered in the US can be found at https://myhperewards.com/main/new-hire-enrollment.html

HPE is an Equal Employment Opportunity/Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together.

Hewlett Packard Enterprise is EEO Protected Veteran/Individual with Disabilities.

HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.

No Fees Notice & Recruitment Fraud Disclaimer

It has come to HPE’s attention that there has been an increase in recruitment fraud whereby scammers impersonate HPE or HPE-authorized recruiting agencies and offer fake employment opportunities to candidates. These scammers often seek to obtain personal information or money from candidates.

Please note that Hewlett Packard Enterprise (HPE), its direct and indirect subsidiaries and affiliated companies, and its authorized recruitment agencies/vendors will never charge any candidate a registration fee, hiring fee, or any other fee in connection with its recruitment and hiring process. The credentials of any hiring agency that claims to be working with HPE for recruitment of talent should be verified by candidates and candidates shall be solely responsible to conduct such verification. Any candidate/individual who relies on the erroneous representations made by fraudulent employment agencies does so at their own risk, and HPE disclaims liability for any damages or claims that may result from any such communication.
