SmarterDx

Improving clinical and financial outcomes with physician-validated AI for documentation and coding.

Staff Site Reliability Engineer

DevOps Engineer, Full Time, Remote, Team 11-50, H1B No Sponsor

Location

United States

Posted

3 days ago

Salary

$230K - $250K / year

Job Description

This description is a summary of our understanding of the job description.

Role Description

We are seeking a Staff Site Reliability Engineer (SRE) to lead the reliability, scalability, and operational excellence of our production systems. This role is responsible for defining and driving SRE practices across the organization, including:

  • SLIs/SLOs
  • Incident management
  • Capacity planning
  • Resilience engineering

You will design and implement automation that reduces toil, improve observability and performance across our Kubernetes and AWS environments, and ensure our systems are highly available and fault-tolerant.

The ideal candidate is a deeply technical engineer with strong distributed systems expertise, a passion for operational rigor, and a track record of improving reliability through thoughtful engineering, automation, and data-driven decision-making.

This role is fully remote within the US.

Qualifications

  • 10+ years of software and software reliability engineering experience, with significant time spent operating and scaling distributed systems in production environments.
  • 3+ years of hands-on experience running cloud-native infrastructure in AWS, including deep familiarity with containers, Kubernetes, monitoring, and alerting in live production systems.
  • Proven experience defining and managing SLIs/SLOs, leading incident response, and driving postmortems and systemic reliability improvements.
  • Strong expertise with Terraform and infrastructure-as-code practices for managing production infrastructure safely and reproducibly.
  • Deep experience with Kubernetes architecture and operations, including workload reliability, cluster scaling, networking, and failure modes.
  • Experience working in security-conscious, compliance-oriented environments where reliability and data protection are first-class concerns.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field — or equivalent practical experience operating large-scale systems.

Requirements

  • Reliability engineering experience with production database systems (e.g., Postgres).

Benefits

  • Medical, Dental & Vision – Comprehensive plans with leading insurance providers, covering 75% of your premiums, depending on the plan.
  • Paid Parental Leave – Generous paid leave to support families through birth or adoption: up to 12 weeks for parents.
  • Remote-First Team – Work from anywhere in the U.S.
  • Unlimited PTO & 10 Holidays – So you can relax and recharge.
  • 401(k) with Traditional & Roth Options – Tax-advantaged retirement savings through Fidelity with a 4% match.
  • Minimal Bureaucracy – A fast-moving, high-impact environment where you can focus on what matters.
  • Incredible Teammates! – Work alongside smart, supportive, and mission-driven colleagues.
