We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Staff Software Engineer - Grafana Cloud Observability, Kubernetes Monitoring
Location
United States
Posted
4 days ago
Salary
$175.0K - $210.0K / year
Job Description
Role Description
This role offers a unique opportunity to shape and advance cloud observability solutions for large-scale systems, focusing on metrics, logs, and traces. You will work on developing and maintaining the backend for observability services, including Kubernetes monitoring, database observability, and cloud infrastructure metrics. The position emphasizes technical leadership, cross-team collaboration, and hands-on contribution to scalable software systems. You will also engage with open-source communities, contributing to projects that enhance observability standards globally. Ideal candidates are experienced engineers who thrive in remote, autonomous environments and are passionate about building high-quality, reliable systems that help customers monitor and optimize their infrastructure. This is a chance to influence technical strategy while mentoring team members and delivering impactful solutions.
- Design, implement, and maintain scalable integrations for metrics, logs, and traces across cloud and Kubernetes environments.
- Build middleware, libraries, and services to simplify development and observability workflows.
- Lead technical direction and strategic planning for observability projects.
- Collaborate with product, support, and sales teams to ensure holistic, high-quality customer experiences.
- Contribute to open-source projects and represent the team in relevant technical forums.
- Mentor team members, review code, and enforce engineering best practices.
- Take ownership of production systems, ensuring reliability, scalability, and maintainability.
Qualifications
- 8+ years of experience in software engineering with strong programming skills (Python, Java, Go, Rust, .NET, or similar).
- Hands-on experience operating and monitoring high-scale production systems on Kubernetes, including on-call responsibilities and incident management.
- Familiarity with observability tooling and concepts, including Grafana, Prometheus, Loki, Tempo, and OpenTelemetry.
- Deep understanding of distributed systems, time-series data, scalability, consistency, and high availability.
- Proven track record in technical leadership, guiding architectural decisions, and delivering projects end-to-end.
- Strong problem-solving, debugging, and mentoring skills.
- Excellent communication skills and ability to thrive in a fully remote, collaborative environment.
- Bonus: experience with Prometheus in multi-tenant environments, Kubernetes operators, CKA/CKAD certification, and open-source contributions in observability.
Benefits
- Competitive US-based salary range: $174,986 - $209,983 USD, plus Restricted Stock Units (RSUs).
- 100% remote work with a global, autonomous culture.
- Significant career growth opportunities within technical leadership pathways.
- Generous annual leave policy (30 days) including company-wide shutdown days.
- Access to modern AI-assisted development tools and frontier models for productivity.
- Contribution to open-source projects and engagement with a global engineering community.
- Transparent and collaborative organizational culture with approachable leadership.