A distributed marketplace for compute
AI Infrastructure Engineer
Location
United States
Posted
29 days ago
Salary
$150K - $225K / year
Job Description
Job Requirements
- Bare metal Linux depth — you've administered GPU servers at the metal: driver stacks, kernel tuning, firmware, storage configuration. Not just managed K8s.
- NVIDIA GPU stack expertise — drivers, CUDA, NCCL, NVLink, nvidia-smi profiling. You understand how stack compatibility affects performance.
- Kubernetes and orchestration — production experience with K8s, SLURM, or similar. You know how to stand up clusters, not just deploy to them.
- AI Networking fundamentals — TCP/IP, VLANs, bonding, and high-speed interconnects (InfiniBand, RoCE) for distributed workloads.
- Customer-facing communication — you can work directly with engineers at AI platform companies, understand their constraints, and translate that into clear requirements for your team.
- Bias toward scalable solutions — you'd rather build a feature that helps 10 customers than a custom deployment that helps 1.
Nice to Have
- HPC or large-scale distributed training environments.
- AI workload experience (vLLM, PyTorch, inference frameworks).
- Storage systems (NVMe, distributed filesystems, Ceph, WEKA).
- IaC and provisioning tools (Terraform, Ansible, cloud-init, MAAS).
Benefits
- Competitive salary
- Equity ownership
- Healthcare — medical, dental, vision for you and your family
- Remote-first — with hubs in Phoenix, Boulder, and Miami
- Direct impact — your work shapes how GPU infrastructure gets deployed across the AI ecosystem