Deutsche Bank

AI/ML DevOps Engineer, AS

Pune - Business Bay Full time

Job Description:

Job Title: AI/ML DevOps Engineer

Location: Pune, India

Corporate Title: AS

Role Description

  • DB Technology is a global team of tech specialists, spread across multiple trading hubs and tech centres. We have a strong focus on promoting technical excellence – our engineers work at the forefront of financial services innovation using cutting-edge technologies.
  • Our India location is one of our major tech centres and is growing strongly. We are committed to building a diverse workforce and to creating excellent opportunities for talented engineers and technologists. Our tech teams and business units use agile ways of working to create #GlobalHausbank solutions from our home market.

Strategy and Innovation Engineering (TDI chief strategy office)

  • Deutsche Bank’s Innovation team identifies, evaluates, and incubates cutting-edge technical innovation. It is part of the Chief Strategy Office of the bank’s Technology, Data & Innovation (TDI) function and works globally with all business lines and infrastructure functions of the bank. A focus of the team is to create value for clients and the bank using Artificial Intelligence, Large Language Models (LLMs) and other advanced data-driven technologies.
  • As an L2 AI/ML DevOps Engineer in the Innovation team of the TDI Chief Strategy Office, you will primarily focus on the daily operations and real-time support of AI/ML systems in production.

What we’ll offer you

As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:

  • Best in class leave policy
  • Gender-neutral parental leave
  • 100% reimbursement under childcare assistance benefit (gender neutral)
  • Sponsorship for industry-relevant certifications and education
  • Employee Assistance Program for you and your family members
  • Comprehensive Hospitalization Insurance for you and your dependents
  • Accident and Term life Insurance
  • Complimentary health screening for ages 35 and above

Your key responsibilities

  • Manage Incident, Service, Problem and Change Management of Shared AI Platforms
  • Monitor production AI/ML models for performance, latency, accuracy, data drift and model drift, and proactively troubleshoot production issues.
  • Automate model packaging, versioning and rollbacks.
  • Monitor model inference speed, latency and accuracy.
  • Optimize resource allocation for cost-effective AI workloads.
  • Detect and mitigate data drift affecting model performance.
  • Troubleshoot model failures, latency issues and deployment errors.
  • Collaborate with L3 engineers and data scientists for escalations.
  • Utilize containerization technologies like Docker to package models and dependencies.
  • Continuous Integration/Continuous Deployment (CI/CD):
    • Develop and maintain CI/CD pipelines for automating the testing, integration, and deployment of ML models.
    • Implement version control to track changes in both code and model artifacts.
  • Monitoring and Logging:
    • Establish monitoring solutions to track the performance and health of deployed models.
    • Set up logging mechanisms to capture relevant information for debugging and auditing purposes.
  • Optimize ML infrastructure for scalability and cost-effectiveness.
  • Implement auto-scaling mechanisms to handle varying workloads efficiently.
  • Enforce security best practices to safeguard both the models and the data they process.
  • Ensure compliance with industry regulations and data protection standards.
  • Oversee the management of data pipelines and data storage systems required for model training and inference.
  • Implement data versioning and lineage tracking to maintain data integrity.
  • Collaborate with DevOps teams to align MLOps practices with broader organizational goals.
  • Continuously optimize and fine-tune ML models for better performance.
  • Identify and address bottlenecks in the system to enhance overall efficiency.
  • Maintain clear and comprehensive documentation of MLOps processes, infrastructure, and model deployment procedures.
  • Document best practices and troubleshooting guides for the team.

Your skills and experience

  • Excellent communication and presentation skills, highly organized and disciplined.
  • Experienced in working with multiple stakeholders, with the ability to build and maintain good business relationships with all of them.
  • Comfortable working in VUCA (Volatility, Uncertainty, Complexity, Ambiguity) and highly dynamic environments.
  • Expertise on the products/technologies below is required:
    • Google Cloud – GKE, Terraform, IAM, BigQuery, Cloud Shell, Cloud Storage
    • AI/ML – AI agents, AI/ML concepts, ML models, Vertex AI, AutoML, BigQuery ML
    • MLOps & CI/CD pipelines, Kubeflow, Vertex AI Pipelines
    • Proficiency in designing, deploying and managing AI agents, e.g. chatbots and virtual assistants
    • GCP Networking, Networking protocols, Security concepts, VPC, Load balancers
  • Basic administration of Unix servers
  • Python, Shell Scripting, SQL
  • Familiarity with fine-tuning and deploying large language models on GCP.
  • Understanding of security best practices, including data governance, encryption, and compliance with AI-related regulations.
  • GCP – Cloud Logging, Cloud Monitoring and AI model performance tracking.
  • 4+ years of work experience in IT (AVP: 6+ years; Associate: 4+ years)
  • Strong problem-solving skills and a passion for AI research
  • Good interpersonal skills, with the ability to cooperate and collaborate with other teams

Educational Qualifications:

  • B.E. / B.Tech. / Master’s degree in Computer Science or equivalent
  • Added advantage:
    • GCP certifications
    • Kubernetes certifications
    • AI/ML educational background, certifications or higher qualifications

How we’ll support you

  • Training and development to help you excel in your career
  • Coaching and support from experts in your team
  • A culture of continuous learning to aid progression
  • A range of flexible benefits that you can tailor to suit your needs

About us and our teams

Please visit our company website for further information:

https://www.db.com/company/company.html

We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.

Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.

We welcome applications from all people and promote a positive, fair and inclusive work environment.