Vultr

Vultr is on a mission to make high-performance cloud computing easy to use, affordable, and locally accessible.

Network DevOps Engineer, RDMA Fabric Automation

DevOps Engineer | Full Time | Remote | Team 201-500 | Since 2014

Location

United States

Posted

16 days ago

Salary

$90K - $130K / year

English, Ansible, Grafana, Jenkins, Kafka, Linux, PHP, Prometheus, Python, Rust, Go

Job Description

  • Automate deployment and operations of large-scale RDMA (RoCEv2) Ethernet fabrics across Vultr data centers.
  • Build Ansible and Python-based frameworks to provision, validate, and remediate underlay and overlay networks.
  • Integrate network automation with Vultr’s source-of-truth systems (NetBox, OpsMill) for intent-driven configuration and validation.
  • Develop telemetry ingestion and correlation pipelines (gNMI, Prometheus, Kafka, custom collectors) for real-time network health and performance metrics.
  • Collaborate with platform, orchestration, and product engineering teams to optimize RDMA performance, PFC/ECN behavior, and path symmetry across fabrics.
  • Implement CI/CD workflows for network configuration changes: validation, pre-checks, and rollbacks.
  • Investigate complex network behaviors across layers: flow hashing, congestion domains, ECMP, and overlay interactions.
  • Contribute to the design of next-generation GPU and AI interconnect fabrics, ensuring seamless integration into Vultr’s global network architecture.

Job Requirements

  • Solid understanding of modern data center networking: EVPN-VXLAN, BGP, MLAG, QoS, and traffic engineering.
  • Deep familiarity with RoCEv2, RDMA transport tuning, ECN/PFC, and lossless Ethernet design.
  • Strong experience with automation frameworks such as Ansible and with languages such as Python, Golang, Rust, or PHP.
  • Comfort working with telemetry and monitoring stacks — Prometheus, Grafana, Loki, ELK, or similar.
  • Previous experience integrating with NetBox, Nautobot, OpsMill or similar for topology and configuration source-of-truth.
  • Familiarity with CI/CD systems (GitHub Actions, Jenkins, ArgoCD) for continuous delivery of network automation.
  • Strong Linux networking background, including namespaces, netlink, and system-level debugging.

Benefits

  • 100% company-paid insurance premiums for employee medical, dental and vision plans.
  • 401(k) plan that matches 100% up to 4%, with immediate vesting
  • Professional Development Reimbursement of $2,500 each year
  • 11 Holidays + Paid Time Off Accrual + Rollover Plan
  • Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
  • $500 stipend for remote office setup in first year + $400 each following year
  • Internet reimbursement up to $75 per month
  • Gym membership reimbursement up to $50 per month
  • Company paid Wellable subscription
