Staff Software Engineer, ML Platform

Full-stack Engineer | Software Engineer | Full Time | Remote | Team 1-10 | H1B No Sponsor

Location

United States

Posted

61 days ago

Salary

Not specified

10 yrs exp | Experience accepted | English | AWS | Azure | Cloud | Distributed Systems | Google Cloud Platform | Kubernetes | Terraform | TypeScript | Go

Job Description

  • Build Enterprise-Scale Infrastructure
    • Leverage infrastructure-as-code to manage complex cloud environments supporting critical ML and AI initiatives.
    • Design Kubernetes-native systems, including controllers/operators where appropriate.
    • Improve platform networking, security, and observability.
  • Sustain Platform Health and Performance
    • Own critical systems in production, including reliability, scalability, security, and cost efficiency.
    • Identify and proactively address technical debt, operational risk, and platform bottlenecks.
    • "Learn by doing": quickly ramp up to a complex tech stack (Terraform, Kubernetes, Istio, Crossplane, Go, TypeScript).
  • Enable Teams and Customers to Move Faster
    • Create abstractions and tooling that make it easier for teams and customers to deploy, run, and scale AI/ML workloads.
    • Collaborate directly with customers to understand their ML infrastructure challenges and translate them into platform improvements.
    • Balance speed and rigor, shipping quickly while maintaining a high bar for quality and safety.
  • Lead Through Influence
    • Act as a technical leader and mentor across the engineering organization.
    • Write clear documentation and design proposals that align stakeholders and drive decisions.
    • Partner closely with product and leadership to shape platform direction and priorities.

Job Requirements

  • 10+ years of engineering experience, with significant time spent on infrastructure, platform, or distributed systems.
  • Deep hands-on experience with Kubernetes in production environments.
  • Strong cloud experience across AWS, GCP, and/or Azure.
  • Proven track record of building and operating secure, scalable MLOps platforms.
  • Deep understanding of infrastructure-as-code (e.g., Terraform, Pulumi, CDK).
  • Strong programming skills in at least one backend language (Go preferred; TypeScript also welcome).
  • Experience diagnosing and debugging complex production issues.
  • Familiarity with modern CI/CD, test-driven development, and DevSecOps practices.
  • Bonus: experience building Kubernetes operators and/or working with service meshes (e.g., Istio).
  • Comfortable owning large, ambiguous problems from inception to production.
  • Excellent communicator, able to clearly explain complex systems to both technical and non-technical audiences.
  • Experience working directly with customers and incorporating feedback into technical decisions.
  • Ability to operate autonomously while keeping stakeholders informed and aligned.
  • Customer-first and product-oriented.
  • Curious, adaptable, and eager to learn new systems and domains.
  • Collaborative, respectful, and willing to lean into hard conversations.
  • Energized by fast-paced environments and meaningful responsibility.

Benefits

  • Competitive cash compensation alongside above-market equity upside
  • Top-tier fully covered medical, dental, and vision insurance
  • Life insurance
  • 401k program
  • Unlimited PTO
  • Monthly half day
  • Citi Bike membership
  • Monthly wellness stipend
  • Office equipment stipend, including reimbursement for approved disability-related accommodations
  • Investment in employee learning and growth opportunities
