Securly

Senior Rust Engineer – System, Proxy Hardening & Infrastructure

Pune City, Maharashtra, India | Full Time

The Role

The Senior Rust Engineer – System, Proxy Hardening & Infrastructure will own production hardening, performance validation, and infrastructure operations for Securly's Rust web filtering proxy, establishing the reliability and scalability baseline that millions of students depend on. The role also owns the on-premise Docker proxy appliance for school network environments.

Your mission is to ensure the proxy performs, scales, and deploys reliably in production. You will own canary rollout configuration, CloudFormation-based scaling infrastructure, load testing strategy, and performance baseline definition. You will also own the on-premise Docker appliance — a lightweight Rust proxy deployed in school network environments that intercepts TCP traffic and prepends identity headers.

At L5, this is not purely an infra-execution role. You are expected to identify systemic reliability risks in the proxy stack and propose architectural mitigations, not just react to incidents. You define what 'production-ready' means and enforce that bar.

Level: L5
Experience: 8–15 Years
Location: Pune, India
Work Type: Hybrid (2 days onsite)
Reports To: Filter Engineering Manager

What It Means to Be L5 at Securly

L5 at Securly is a Staff Engineer. You are the technical owner, not just an implementer.

  • Drive technical direction for your initiative end-to-end: from architecture to production, with minimal oversight from your engineering manager.
  • Identify and resolve ambiguity in requirements, system boundaries, and design tradeoffs without waiting for a fully-formed spec.
  • Mentor L3/L4 engineers on the team: code reviews, design feedback, pairing, and raising the bar for what production-quality work looks like.
  • Partner with your L6 technical lead and the Distinguished Engineer on architectural decisions, surfacing tradeoffs clearly rather than deferring them upward.
  • Contribute to cross-team engineering standards: you are expected to influence practices beyond your immediate squad.
  • Translate technical context into clear written artifacts that non-engineers (PM, Support, Leadership) can act on.
  • Participate in the on-call rotation and own the full incident lifecycle for your system: detection, diagnosis, resolution, and retrospective.

What You'll Do

  • Define what 'production-ready' means for the Rust proxy: establish written performance baselines, reliability criteria, and the specific metrics that trigger a rollback vs. a hotfix.
  • Design and execute load tests to identify performance bottlenecks; profile memory and CPU under peak and sustained load; produce baseline documents the team references across deployments.
  • Own canary rollout configuration and CloudFormation-based ASG scaling policies — modify templates, configure health checks, tune scaling behavior; document every configuration decision with rationale.
  • Identify and propose mitigations for known proxy infrastructure risk areas including Redis behavior under peak throughput, CORS/proxy transparency edge cases, and connection pool exhaustion.
  • Maintain and harden the on-premise Docker proxy appliance — image builds, container networking, minimal image design, and distribution to school network environments.
  • Investigate and resolve production performance issues under peak load; produce written post-mortems that close the loop on root cause.
  • Build monitoring, dashboards, and alerting for proxy infrastructure (CloudWatch, Splunk); define SLO targets and ensure alerting is calibrated to them.
  • Contribute to the C++-to-Rust business logic conversion work alongside the Proxy Conversion engineer.
  • Support TCP stream handling and socket-level work for the on-premise appliance: bidirectional forwarding, port interception, identity header injection.

Skills & Requirements

Must-Have

  • Rust — same production standard as the Proxy Conversion role: ownership, lifetimes, async/await (Tokio), systems-level networking. 4+ years at production level.
  • Systems / network programming — TCP stream handling, socket programming, bidirectional forwarding; understanding what happens below the HTTP layer.
  • Docker / containerization — image building, distribution, minimal image design for appliance deployment, container networking.
  • Performance testing and profiling — load test design, memory and CPU profiling, bottleneck identification, flame graph analysis, production baseline definition.
  • AWS (CloudFormation, NLB, ASG, EC2) — owns canary rollout; must independently modify CloudFormation templates, configure health checks, and tune ASG scaling policies from day one.
  • Production ownership mindset — demonstrated ability to define reliability criteria, write post-mortems, set SLO targets, and hold a system to a documented standard. L5 engineers set the bar.

Strongly Preferred

  • C++ — required for contributing to the business logic conversion effort; must read and reason about existing proxy C++ code.
  • Redis — performance impact under high load; experience with Redis behavior under peak throughput and failure mode handling.
  • CORS and proxy transparency — experience diagnosing and resolving CORS errors and proxy authentication transparency issues.

Nice to Have

  • RADIUS protocol — the on-premise appliance integrates with RADIUS for user identity.
  • Monitoring / observability — CloudWatch metrics and dashboards, Splunk log integration, SLO/SLA definition.
  • Web filtering / content inspection domain experience — URL categorization, CIPA compliance.

Who You Are

  • You take performance personally — you do not accept 'probably fast enough.' You profile, you measure, you prove it — and then you write it down.
  • You have owned production infrastructure and understand that canary rollouts, health checks, and scaling policies are not afterthoughts — they are the job.
  • You are comfortable in the systems layer: TCP streams, socket programming, and container networking are familiar territory.
  • You identify systemic risks before they become incidents. You advocate for reliability improvements over tactical fixes.
  • You produce written artifacts — post-mortems, baseline documents, SLO definitions — that outlast any individual incident or sprint.
  • You operate well in an environment where production stability is a shared responsibility and on-call is part of the role.

About Securly

Securly processes over 1.1 billion requests and 54 TB of data per day, protecting more than 20 million students across 20,000+ schools globally. Since pioneering the first cloud-based web filter for K-12 in 2013, Securly has built one of the most trusted, high-scale platforms for student safety, wellness, and engagement. By turning data into meaningful, actionable intelligence, Securly enables schools to identify risk earlier, reduce harmful incidents, and strengthen student support.

We are proud to be consistently recognized as a Top Place to Work, named a Top 40 Most Used EdTech platform, and included on the GSV 150 list as one of the most transformational growth companies in digital learning and workforce skills.

Benefits

  • Comprehensive Health Insurance (employee, parents, spouse, children)
  • Accidental & Term Life Insurance
  • Learning & Development reimbursement
  • Paid Time Off
  • Public Holidays (10+ per year)
  • Retirement Benefits (EPF & gratuity)
  • Parental Leave (as per statutory norms)

Equal Opportunity Employer

Securly is an Equal Opportunity Employer committed to inclusion, fairness, and respect. We welcome applicants from all backgrounds, identities, and experiences.