CertifID

CertifID is the most secure way to send and receive wiring information.

Senior Site Reliability Engineer

DevOps Engineer, Full Time, Remote, Team 11-50, Since 2017, H1B No Sponsor

Location

Texas

Posted

33 days ago

Salary

Not specified

Bachelor Degree, 5 yrs exp, English, AWS, Azure, Distributed Systems, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Python, Terraform, Go, .NET

Job Description

  • Own and improve the reliability, availability, and performance of production systems while defining and operationalizing SLIs/SLOs and error budgets.
  • Design and implement autonomous and semi-autonomous AI agents for monitoring distributed systems and applications. Build agents capable of consuming multi-source observability data (metrics, logs, traces, etc.).
  • Participate in and help lead an on-call rotation, serving as an escalation point for major incidents and facilitating blameless postmortems.
  • Build automated workflows to eliminate manual work and design/maintain Infrastructure-as-Code with Terraform.
  • Improve metrics, logs, traces, and alerting using tools like Datadog or Prometheus to reduce noise and increase signal.
  • Partner with application teams to implement reliability best practices and mentor junior engineers to foster a culture of knowledge sharing.

Job Requirements

  • 5+ years in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
  • Proven experience supporting production SaaS systems in Azure (preferred), AWS, or GCP.
  • Strong Linux, networking, and distributed systems troubleshooting skills.
  • Strong experience with containers and orchestration (Kubernetes/EKS/AKS).
  • Expertise with Infrastructure-as-Code (Terraform strongly preferred).
  • Strong scripting/programming skills in Python, Go, Bash, or C#/.NET.
  • Hands-on experience with Datadog, Prometheus/Grafana, or OpenTelemetry.

Benefits

  • Flexible vacation
  • 12 company-paid holidays
  • 10 paid sick days
  • No work on your birthday
  • Health, dental, and vision insurance (including a $0 option)
  • 401(k) with matching, and no waiting period
  • Equity
  • Life insurance
  • Generous paid parental leave
  • Wellness reimbursement of $300/year
  • Remote worker reimbursement of $300/year
  • Professional development reimbursement
  • Competitive pay
  • An award-winning culture
