Humana

Humana Inc. (NYSE: HUM) is committed to putting health first – for our teammates, our customers and our company. Through our Humana insurance services and CenterWell healthcare services, we make it easier for the millions of people we serve to achieve their best health – delivering the care and service they need, when they need it. These efforts are leading to a better quality of life for people with Medicare, Medicaid, families, individuals, military service personnel, and communities at large.

Equal Opportunity Employer

It is the policy of Humana not to discriminate against any employee or applicant for employment because of race, color, religion, sex, sexual orientation, gender identity, national origin, age, marital status, genetic information, disability or protected veteran status. It is also the policy of Humana to take affirmative action, in compliance with Section 503 of the Rehabilitation Act and VEVRAA, to employ and to advance in employment individuals with disability or protected veteran status, and to base all employment decisions only on valid job requirements.

Senior Tech Lead – SRE

DevOps Engineer | Full Time | Remote | Team 10,001+ | Since 1961 | H1B Sponsor

Location

California, Illinois, Montana, South Dakota

Posted

28 days ago

Salary

$106.9K - $147K / year

Bachelor Degree | 7 yrs exp | English | AWS | Azure | Cloud | Distributed Systems | Google Cloud Platform | Kafka | Oracle | Postgres | PySpark | Python | SQL | Go

Job Description

  • Lead SRE team initiatives focused on system reliability, automation, and operational excellence.
  • Architect and implement solutions to enhance availability, performance, and scalability of cloud and on-premises services.
  • Oversee incident management processes, ensuring timely response and thorough root cause analysis.
  • Develop and refine monitoring, alerting, and reporting frameworks; ensure actionable insights for service health.
  • Guide adoption of Infrastructure as Code (IaC) and CI/CD pipelines to streamline deployments and reduce risk.
  • Collaborate with software engineering and product teams to integrate reliability requirements into design and development.
  • Mentor engineers on SRE principles, fostering a culture of continuous improvement and operational rigor.
  • Establish service level objectives (SLOs), service level indicators (SLIs), and error budgets in partnership with stakeholders.
  • Manage on-call rotations, ensuring effective coverage and knowledge sharing.
  • Document architectural decisions, operational procedures, and incident retrospectives.
  • Operational excellence for AI systems: identify AI/ML use cases, and influence and implement SRE best practices, including SLIs/SLOs for ML workloads, automated remediation, and capacity modeling.
  • Observability and monitoring for ML: define and implement monitoring strategies for model drift, data anomalies, pipeline failures, system performance, and user experience.
  • Proactively identify and mitigate risks during deployments to ensure system stability.
  • Ensure long-term stability by addressing technical debt.
  • Maintain observability and performance of critical pharmacy applications.
  • Support timely restoration of services during outages, with 24/7 coverage to meet enterprise Service Level Agreements (SLAs).
  • Drive incident response and root cause analysis to prevent recurrence and improve system resilience.

Job Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
  • 7+ years of relevant experience in SRE, DevOps, or software engineering, including 2+ years in a technical leadership role.
  • Minimum 5 years' relevant experience with Python, PySpark, Azure Databricks, Snowflake, SQL, Oracle, Postgres, file transfer, REST APIs, and Kafka.
  • Proficiency with cloud platforms (AWS, Azure, GCP), container orchestration, and automation tools.
  • Strong scripting and programming skills (e.g., Python, Go, Bash).
  • Deep understanding of distributed systems, networking, and security principles.
  • Proven experience leading large-scale incident response and postmortem processes.
  • Excellent communication and stakeholder management skills.
  • Experience building automation around CI/CD (ADO YAML pipelines) and around testing and validation.

Benefits

  • medical, dental and vision benefits
  • 401(k) retirement savings plan
  • time off (including paid time off, company and personal holidays, volunteer time off, paid parental and caregiver leave)
  • short-term and long-term disability
  • life insurance and many other opportunities
