AlphaSense

The market intelligence and search platform trusted by over 3,500 leading organizations

Staff Site Reliability Engineer

Full Time · Remote · Team 1,001-5,000 · Since 2011 · H1B Sponsor

Location

United States

Posted

9 days ago

Salary

$150K - $225K / year

8 yrs exp · English · AWS · Azure · Cloud · DNS · Google Cloud Platform · Grafana · Kubernetes · Prometheus · Python · TCP/IP · Go

Job Description

  • Architect Reliability Paved Paths: Build frameworks and self-service tooling that let teams own the reliability of their services in a "You Build It, You Run It" culture.
  • Lead AI-Driven Reliability: Drive our AIOps strategy: automating diagnostics, remediation, and proactive failure prevention.
  • Champion Reliability Culture: Embed SRE practices across engineering via design reviews, production readiness, and operational standards.
  • Incident Leadership: Act as Incident Commander during critical events, modeling operational excellence and ensuring blameless postmortems lead to lasting improvements.
  • Advance Observability: Deliver end-to-end monitoring, tracing, and profiling (Prometheus, Grafana, OTEL, Continuous Profiling) to optimize performance proactively.
  • Mentor & Multiply: Elevate engineers across SRE and product teams through mentorship, technical guidance, and knowledge sharing.

Job Requirements

  • 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role, with at least 3 of those years in a Senior+ SRE position.
  • Strong background in running production SaaS systems at scale.
  • Proficiency in at least one programming/scripting language (Python, Go, or similar).
  • Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes.
  • Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing).
  • Experience with monitoring & alerting (Prometheus, Grafana, Datadog, ELK).
  • Familiarity with advanced observability (OTEL, continuous profiling).
  • Proven incident management experience, including leading high-severity incidents and postmortems.
  • Strong troubleshooting skills across the full stack.
  • Excellent communication and collaboration skills.

Benefits

  • Equity
  • Generous benefits program
