OpenFX

Experience a better way to move money

Site Reliability Engineer – SRE

DevOps Engineer | Full Time | Remote | Team 1-10 | Since 2024 | H1B No Sponsor

Location

United States

Posted

161 days ago

Salary

Not specified

Bachelor Degree | 5 yrs exp | English | AWS | Cloud | Grafana | Kubernetes | Prometheus | Terraform

Job Description

  • Serve as first responder for production incidents during U.S. operating hours (±2h EST).
  • Lead triage during outages, analyzing logs, metrics, and traces to identify root causes.
  • Drive incident postmortems and follow-ups to prevent recurrence.
  • Communicate clearly and quickly during incidents to internal stakeholders.
  • Own reliability outcomes across all OpenFX systems, with a focus on uptime, latency, and error budgets.
  • Enhance observability through logging, metrics, alerting, and dashboards.
  • Optimize on-call processes and ensure smooth handoffs across IST, EST, and PST coverage.
  • Partner with DevOps and engineering pods to implement fixes or approve production changes.
  • Proactively identify systemic reliability risks and propose improvements.
  • Contribute automation and tooling to reduce manual incident handling.
  • Champion best practices in reliability engineering and operational excellence.

Job Requirements

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
  • Proven experience leading incident response, running postmortems, and communicating during outages.
  • Strong background with cloud infrastructure (AWS preferred), container orchestration (Kubernetes, ECS), and Infrastructure-as-Code (Terraform, CloudFormation).
  • Familiarity with observability stacks (e.g., Prometheus, Grafana, Datadog, ELK, OpenTelemetry).
  • Ability to triage errors at both the infrastructure and application level, and escalate effectively when deeper intervention is required.
  • Ownership mindset with strong communication skills in high-pressure situations.

Benefits

  • Competitive salary and benefits package.
  • Equity in a rapidly growing company.
  • Opportunity to work on mission-critical infrastructure in fintech.
  • A collaborative team culture with a bias toward ownership and outcomes.
  • The chance to make a direct impact on the resilience of global financial infrastructure.
