This is a U.S.-based position. All of the programs we support require U.S. citizenship to be eligible for employment. All work must be conducted within the continental U.S.

Who we are:

Raft (https://TeamRaft.com) is a customer-obsessed, non-traditional defense tech company dedicated to empowering U.S. military and government agencies with cutting-edge AI/ML and data solutions. We are a leader in autonomous data fusion and Agentic AI, with a purposeful focus on Distributed Data Systems, Platforms at Scale, and Complex Application Development. We are headquartered in McLean, VA, and our clients include innovative federal and public agencies leveraging design thinking, a cutting-edge tech stack, and a cloud-native ecosystem. We build digital solutions that impact the lives of millions of Americans.

About the role:

Raft is building mission-critical data platforms for the Department of War that process billions of events per day from hundreds of sensors and operational sources, delivering intelligence to operators who use it to make time-sensitive decisions. Our platform runs across multiple classification levels and deployment environments.

As a Senior DevOps Engineer at Raft, you won’t be operating in a pure infrastructure lane. You will be expected to understand the software you’re deploying, contribute to it when needed, and engage with the data pipelines flowing through the systems you manage. This is a role for someone who thinks end-to-end, from data ingest and pipeline performance through to Kubernetes-based deployment, observability, and secure operations in defense environments.

You will work across cloud and on-premises environments, partner closely with software and data engineers, and help Raft maintain the operational rigor and platform reliability that our most demanding customers depend on.

What You’ll Do

Design, implement, and maintain secure Kubernetes-based infrastructure supporting data platform workloads across cloud and on-premises environments
Build, manage, and improve CI/CD pipelines using GitLab and GitOps-based delivery patterns, enabling reliable, repeatable deployments across multiple classification levels
Develop and maintain Infrastructure as Code (IaC) using tools such as Terraform and Ansible to provision, configure, and lifecycle-manage platform infrastructure
Collaborate directly with software engineers to understand service architectures, dependencies, and runtime behavior, and contribute code-level changes where needed to improve deployability, reliability, or observability
Support and optimize data streaming and processing pipelines built on technologies such as Kafka, Kafka Streams, Flink, and Pinot, diagnosing bottlenecks, tuning configurations, and ensuring data integrity across the platform
Implement and manage platform observability using monitoring (Prometheus, Grafana), logging (Fluent Bit, Loki, Kibana), and alerting tooling to maintain operational awareness in production environments
Apply and enforce DevSecOps practices including container hardening, vulnerability scanning, software supply chain security, and compliance-driven deployment patterns in regulated government environments
Manage and debug complex Helm chart deployments, service mesh configurations (Istio), and Kubernetes networking across multi-cluster and multi-environment topologies
Support operations across multiple deployment targets, including cloud-hosted (AWS, Azure), on-premises data centers, and edge/tactical environments, adapting platform patterns to the constraints of each
Write clean, maintainable automation and tooling in Java or Go to accelerate platform operations, reduce toil, and improve developer experience across engineering teams
Engage directly with customers at the most operationally demanding locations in the Department of War

What we are looking for:

5+ years of relevant hands-on experience in DevOps or platform engineering roles
5+ years of production experience with Docker and Kubernetes, including provisioning, operating, and troubleshooting clusters in real-world environments
Strong experience building and maintaining CI/CD pipelines, with hands-on proficiency in GitLab CI, GitOps workflows (Flux, ArgoCD), and modern software delivery practices
Experience supporting data-intensive platforms using streaming technologies such as Kafka or Flink, including configuration, tuning, and operational support
Solid understanding of data engineering fundamentals, including ETL/ELT pipeline design, data storage patterns, data governance concepts, and integration with downstream consumers
Proficiency with Infrastructure as Code tooling, particularly Terraform; experience with Ansible or similar configuration management tools
Strong Helm proficiency, including authoring and maintaining charts for complex multi-service deployments
Hands-on experience with platform observability tooling: Prometheus, Grafana, Fluent Bit, Loki, or Elasticsearch/Kibana
Demonstrable software development skills in Java and/or Go; comfortable reading, modifying, and contributing to application codebases, not just deploying them
Experience with cloud infrastructure on AWS and/or Azure, including networking, IAM, storage, and managed Kubernetes services
Strong systems thinking, troubleshooting discipline, and the ability to work independently in a fast-moving environment with competing priorities
Experience applying secure and compliant deployment practices in regulated or government environments
Active Secret clearance required; must be eligible for and willing to obtain a Top Secret/SCI clearance
Ability to obtain Security+ certification within the first 90 days of employment
Ability to travel up to 25%

Highly preferred:

Experience with service mesh technologies, particularly Istio, including traffic management, mTLS, and observability integration
Familiarity with Kubernetes-based ML/AI platforms such as Kubeflow, KServe, or Ray, and experience supporting GPU-enabled workloads
Experience with software supply chain security tools including container image scanning, SBOM generation, and runtime vulnerability management
Background supporting deployments across multiple classification levels or air-gapped / disconnected environments
Experience with package and dependency management across polyglot environments (Maven, Gradle, NPM, Yarn, pip)
Familiarity with compliance frameworks relevant to DoW software deployment, including RMF, STIGs, and IL4/IL5/IL6 requirements
Contributions to or ownership of internal developer platforms, golden path tooling, or shared infrastructure services
Experience with distributed tracing and APM tooling (e.g., OpenTelemetry, Jaeger, Tempo)
Existing TS/SCI clearance strongly preferred

What Success Looks Like

Platform deployments are reliable, repeatable, and secure across every environment Raft operates in, from commercial cloud to classified on-premises
Engineering teams move faster because CI/CD workflows, infrastructure tooling, and deployment patterns are solid, well-documented, and easy to use
Data pipelines running through Raft’s platform are stable, observable, and performant, with clear ownership of issues when they arise
You’ve earned the trust of software engineers by understanding what they’ve built and engaging meaningfully in conversations about architecture, runtime behavior, and operational trade-offs
Compliance and security posture across deployment environments is continuously maintained, not bolted on

Clearance Requirements:

Minimum active Secret clearance with ability to obtain and maintain an active TS/SCI security clearance

Salary Range: $150,000.00 - $200,000.00

Work Type:

Hybrid with up to 25% travel
Active Secret clearance required to start; TS/SCI eligibility required

What we will offer you:

Highly competitive salary
Fully covered healthcare, dental, and vision coverage
401(k) and company match
Take-as-you-need PTO + 11 paid holidays
Education & training benefits
Generous Referral Bonuses
And More!

Our Vision Statement:

We bridge the gap between humans and data through radical transparency and our obsession with the mission.

Our Customer Obsession:

We will approach every deliverable like it's a product. We will adopt a customer-obsessed mentality. As we grow and our footprint becomes larger, teams and employees will treat each other not only as teammates but also as customers. We must live the customer-obsessed mindset, always. This will help us scale and it will translate to the interactions that our Rafters have with their clients and other product teams that they integrate with. Our culture will enable our success and set us apart from other companies.

How do we get there?

Public-sector modernization is critical for us to live in a better world. We, at Raft, want to innovate and solve complex problems. And, if we are successful, our generation and the ones that follow us will live in a delightful, efficient, and accessible world where out-of-the-box thinking and collaboration are the norm.

Raft’s core philosophy is Ubuntu: I Am, Because We Are. We support our “nadi” by elevating the other Rafters. We work as a hyper-collaborative team where each team member brings a unique perspective, adding value that did not exist before. People make Raft special. We celebrate each other and our cognitive and cultural diversity. We are devoted to our practice of innovation and collaboration.

We’re an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
