Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Team 501-1,000

Location

United States

Posted

1 day ago

Salary

$104.0K - $127.2K / year

AWS, Terraform, Kubernetes, EKS, IAM, VPC, DNS, TLS, Load Balancers, Security Groups, CI/CD, Infrastructure as Code, Terraform CDK, Networking, Distributed Systems Debugging

Job Description

Role Description

The Site Reliability Engineer builds and operates the paved roads that service teams use every day. You take shared infrastructure from idea to module to production, then you keep it boring. This is not a research role and not a hero role. It is delivery with discipline. You build with intention. You do not just make things work, you make them make sense. You challenge assumptions, question defaults, and tighten bolts others ignore. You move fast, but not recklessly. You are becoming the engineer others trust to take ownership and deliver cleanly.

This is a hands-on engineering role for someone who can work independently on well-scoped problems with guidance, follow established patterns, and improve them when the evidence supports change. You partner closely with Security, Networking, and SRE because the platform is where constraints become real. As a Site Reliability Engineer, you help determine whether the platform feels chaotic or calm to everyone else. Your work directly affects developer velocity, operational safety, and trust in the system. When the platform is boring, predictable, and resilient, it is because engineers like you did the work carefully and well.

Core Responsibilities

Cloud Foundations
- Implement cloud infrastructure in AWS using approved patterns and guardrails.
- Support EKS-based runtime foundations, including cluster add-ons and shared services.
- Build environment parity across nonprod and prod and flag any required divergence early with evidence.
- Help make cloud primitives predictable, supportable, and easy to consume.
Infrastructure Patterns and Modules
- Develop and maintain reusable platform modules and templates using Terraform or CDKTF where applicable.
- Contribute to baseline building blocks: VPC patterns, IAM primitives, EKS base clusters, ingress patterns, secrets, and shared data stores as assigned.
- Keep modules consumable through sane defaults, versioning, changelogs, and upgrade guidance.
- Reduce drift by enforcing standards through code, not documentation alone.
Automation and Delivery Enablement
- Improve CI workflows for infrastructure changes: plan and apply safety, policy checks, drift detection, and promotion across environments.
- Remove manual steps from provisioning and onboarding by turning them into pipelines and documented runbooks.
- Support internal module consumption patterns, including examples and reference implementations.
- Favor repeatability and clarity over clever one-off solutions.
Operations and Reliability
- Operate platform-owned services with an ownership mindset. Ownership is not optional.
- Participate in the on-call rotation for platform services and follow incident procedures.
- Write and maintain runbooks, dashboards, and alerts for what you ship.
- Drive post-incident follow-ups that reduce repeat failures.
Security, Compliance, and Governance
- Implement least-privilege IAM patterns and secure-by-design defaults.
- Partner with Security to integrate controls into pipelines and platform defaults.
- Treat auditability as a feature: logs, approvals, traceability, and evidence.
- Follow established governance and exception processes and document deviations.

Qualifications

3+ years of experience in platform engineering, DevOps, SRE, or infrastructure engineering.
Working experience with AWS and infrastructure as code (Terraform preferred, CDKTF acceptable).
Practical Kubernetes experience, preferably EKS (deploying, operating, debugging).
Comfort with networking fundamentals: DNS, TLS, routing, load balancers, and security groups.
Ability to debug pipelines and distributed failures without guessing.
Strong written communication: design notes, runbooks, and crisp status updates.

Benefits

Flexible Personal Time Off (Vacation time)
401K match
Competitive healthcare, dental and vision insurance plans
Paid Parental Leave (Maternity and Paternity leave)
Employee Stock Purchase Program
Free access to Amwell’s Telehealth Services, SilverCloud and The Clinic by Cleveland Clinic’s second opinion program
Free Subscription to the Calm App
Tuition Assistance Program
Pet Insurance
