Jobgether

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Staff Software Engineer - Grafana Cloud Observability, Kubernetes Monitoring

Software Engineer | Full Time | Remote

Location

United States

Posted

4 days ago

Salary

$175.0K - $210.0K / year

Job Description

This description is a summary of our understanding of the job description.

Role Description

This role offers a unique opportunity to shape and advance cloud observability solutions for large-scale systems, focusing on metrics, logs, and traces. You will work on developing and maintaining the backend for observability services, including Kubernetes monitoring, database observability, and cloud infrastructure metrics. The position emphasizes technical leadership, cross-team collaboration, and hands-on contribution to scalable software systems. You will also engage with open-source communities, contributing to projects that enhance observability standards globally. Ideal candidates are experienced engineers who thrive in remote, autonomous environments and are passionate about building high-quality, reliable systems that help customers monitor and optimize their infrastructure. This is a chance to influence technical strategy while mentoring team members and delivering impactful solutions.

  • Design, implement, and maintain scalable integrations for metrics, logs, and traces across cloud and Kubernetes environments.
  • Build middleware, libraries, and services to simplify development and observability workflows.
  • Lead technical direction and strategic planning for observability projects.
  • Collaborate with product, support, and sales teams to ensure holistic, high-quality customer experiences.
  • Contribute to open-source projects and represent the team in relevant technical forums.
  • Mentor team members, review code, and enforce engineering best practices.
  • Take ownership of production systems, ensuring reliability, scalability, and maintainability.

Qualifications

  • 8+ years of experience in software engineering with strong programming skills (Python, Java, Go, Rust, .NET, or similar).
  • Hands-on experience operating and monitoring high-scale production systems on Kubernetes, including on-call responsibilities and incident management.
  • Familiarity with observability tooling and concepts, including Grafana, Prometheus, Loki, Tempo, and OpenTelemetry.
  • Deep understanding of distributed systems, time-series data, scalability, consistency, and high availability.
  • Proven track record in technical leadership, guiding architectural decisions, and delivering projects end-to-end.
  • Strong problem-solving, debugging, and mentoring skills.
  • Excellent communication skills and ability to thrive in a fully remote, collaborative environment.
  • Bonus: experience with Prometheus in multi-tenant environments, Kubernetes operators, CKA/CKAD certification, and open-source contributions in observability.

Benefits

  • Competitive US-based salary range: $174,986 - $209,983 USD, plus Restricted Stock Units (RSUs).
  • 100% remote work with a global, autonomous culture.
  • Significant career growth opportunities within technical leadership pathways.
  • Generous annual leave policy (30 days) including company-wide shutdown days.
  • Access to modern AI-assisted development tools and frontier models for productivity.
  • Contribution to open-source projects and engagement with a global engineering community.
  • Transparent and collaborative organizational culture with approachable leadership.
