Senior Python Systems Engineer
Location
United States, United Kingdom, Germany, France, Estonia, Portugal, Hungary, Poland, Ukraine, Romania, Bulgaria, Czech Republic, Slovakia, Belarus, Moldova (Republic of), Sweden, Greece, Belgium, Italy, Ireland, Switzerland, Netherlands, Finland, Malta, Denmark, Lithuania, Croatia, Spain, Austria, Bosnia and Herzegovina, Iceland, Luxembourg, Macedonia (The Former Yugoslav Republic of), Montenegro, Norway, Serbia, Slovenia, Albania, Cyprus, Latvia, Monaco
Posted
8 days ago
Salary
Not specified
Job Description
Role Description
We are looking for a Senior Systems Engineer to own the execution layer of the ClearML platform. You will be responsible for some of the critical components that spin up containers, manage GPUs, and tunnel connections, so that ClearML works seamlessly across multiple environments.
This role sits at the intersection of Software Engineering and DevOps. You will write Python code that orchestrates infrastructure, manages Docker containers, interacts with the Kubernetes API, and handles low-level networking.
- Agent Development: Design and optimize the clearml-agent, a Python service responsible for pulling jobs, setting up environments, and executing ML pipelines.
- Kubernetes Integration: Write logic to interact directly with K8s APIs, manage Pod lifecycles, and handle Custom Resource Definitions (CRDs).
- Resource Management: Implement logic for dynamic resource allocation (GPU/CPU/Memory) and container orchestration.
- Systems Programming: Build robust daemons and services that interact with OS-level primitives (systemd, signals, I/O streams).
- Networking: Troubleshoot and optimize TCP/IP connections, DNS resolution, and firewall traversal to ensure seamless connectivity for users.
Qualifications
- 8+ years of development experience with a strong focus on Systems Programming.
- Kubernetes Mastery: Deep understanding of Kubernetes architecture (beyond just writing YAML). You should know how to write code that controls K8s.
- Container Internals: Extensive experience with Docker, including building and maintaining images.
- Python for Systems: Experience using Python for automation, daemons, or CLI tools (using libraries like subprocess, socket, asyncio).
- Networking Fundamentals: Strong grasp of HTTP/S, WebSockets, TCP/IP, Proxies, and Reverse Proxies.
- OS Knowledge: Strong understanding of Linux internals and shell scripting.
Requirements
- Experience with GPU hardware management (NVIDIA drivers, CUDA, NVIDIA Container Toolkit).
- Experience building Kubernetes Operators/Controllers (using Kopf or Operator SDK).
- Background in HPC (High-Performance Computing) or Slurm/MPI.
- Experience with Go (Golang) is a plus (for specific K8s components).
Benefits
- Fully Remote & Global – Work from anywhere with a distributed team of top-tier engineers.
- Engineering-First & Autonomous – High ownership, real responsibility, and freedom to design and ship impactful solutions.
- High Growth, High Impact – Your work directly affects thousands of users, from startups to large enterprises.
- Technically Deep Challenges – Build complex, performance-critical systems at the core of modern AI infrastructure.
- Fast Feedback, Real Users – See your work in production quickly and make a measurable difference.