DroneUp
Integrating advanced autonomy into the national airspace.
SRE – Platform Engineer
Location
United States
Posted
17 days ago
Salary
$125K - $150K / year
Bachelor Degree, 8 yrs exp, Experience accepted, English, AWS, Azure, Cloud, Google Cloud Platform, Grafana, Kubernetes, Linux, Mac OS, Node.js, Prometheus, Python, Terraform, Unix, Go
Job Description
• Serve as the broad-domain architect for the internal developer platform and all cloud engineering
• Drive architecture for tooling or in-house software
• Mentor other platform engineers to drive strong engineering practices
• Enable platform engineering technical capabilities in our internal client software engineering teams
• Peer with the senior architects and engineers in software engineering
• Focus architecture and engineering on the GCP environment
• Architect and oversee GKE cluster operations and workload management
• Provide feedback to others and participate in peer reviews / pair programming
• Drive the broad adoption of Test Driven Development by designing, developing, and debugging unit and integration tests for new and existing infrastructure and code
• Stay curious about existing implementations and new technologies, and share findings with the team
• Practice continuous improvement across all job areas, both personally and professionally
• Clearly communicate with platform engineering teams and other stakeholders and provide technical direction while doing so
• Stay current with platform changes and third-party libraries
• Proactively investigate better alternatives to current solutions
• Understand OpenTelemetry and true observability, and how observability differs from monitoring and logging
• Grow the engineering culture towards a high-performing team
• Practice the arts of self-service, least privilege and security by default in all solutions
• Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets
• Lead incident response, including on-call rotations, root cause analysis, and post-mortem reviews
• Implement and optimize monitoring, alerting, and observability systems for system reliability
• Collaborate on capacity planning and performance optimization to ensure high availability
• Other duties as assigned
Job Requirements
- Bachelor's degree in Computer Science, Computer Engineering, or a related field, or 8+ years of experience as a software engineer
- Proficiency in Kubernetes. Optional: CKA, CKAD
- Extensive experience in Unix / Linux
- Polyglot, with proficiency in multiple languages (ideally Golang, Node.js, Python, HCL, and more)
- Knowledge of multi-cloud environments, including GCP, AWS, and Azure (familiar with at least two of these environments)
- Experienced in using Git in trunk-based development models
- Experience using feature flags in infrastructure and runtime (k8s)
- Experience with backend database technology is a plus, including support and performance tuning
- Advanced experience working with and creating public cloud resources in Terraform or other infrastructure as code tools
- Experience participating in a 24/7 on-call schedule without supervision and successfully resolving issues without escalation
- Experience using OpenTelemetry for observability as well as other monitoring tools such as Datadog, New Relic, and others
- Good understanding of networking and routing principles
- Experience in dockerizing applications and orchestrating them with Kubernetes
- Familiarity with security configuration for web/API services (SSL, access control)
- Experience with Jira or other work tracking systems
- Ability to resolve tickets in priority order and to collaborate with the Technical Product Manager to adjust priorities
- Excellent, detailed documentation using Confluence or similar tooling, which could include support notes, runbooks, ADRs, etc.
- Familiarity with creating an end-to-end CI/CD pipeline using various tools with artifact storage
- Familiarity with macOS as a desktop environment and with predominantly CLI-based workflows
- Experience applying a “product mindset” by understanding stakeholder needs, priorities, and business value
- Experience with security compliance frameworks including FedRAMP, NIST, and SOC 2
- Proven experience in SRE practices, including incident management and reliability engineering
- Familiarity with monitoring tools like Prometheus, Grafana, or Honeycomb for observability
- Experience with chaos engineering, load testing, or reliability testing frameworks
Benefits
- Employees are expected to provide a high level of security to any personal or private information accessed as part of their work, whether at a DroneUp facility or remotely.
- Participate in security training.
- Remain sensitive to individual rights to personal privacy.
- Comply with company policies.