Senior AI Platform Engineer
Location
United States
Posted
3 days ago
Salary
$165K - $225K / year
Job Description
- Build and operate core backend services that power AI-enabled workflows (APIs, orchestration, storage, and internal integrations)
- Design scalable data models and registries for versioned artifacts and metadata, with strong traceability and auditability
- Implement secure-by-default service patterns: authN/authZ, audit logs, secrets handling, and least-privilege access
- Build reliability foundations: observability, metrics, tracing, alerting, SLOs, incident response playbooks
- Implement idempotent APIs and state-handling patterns for resilient workflows (retries, partial failure, reconciliation)
- Create integration adapters and event-driven plumbing to safely connect workflows to internal systems
- Establish release and deployment practices: CI/CD pipelines, environment promotion, rollback strategies, and safe migrations
- Partner closely with AI and application engineers to define interfaces, validation layers, and operational constraints
- Identify performance, scalability, and security risks early and ship pragmatic solutions quickly
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent practical experience)
- U.S. Citizenship and the ability to obtain and maintain a U.S. Government security clearance
- 6+ years of experience building and operating backend/platform systems in production
- 3+ years building platforms that support AI/ML systems in production (e.g., evaluation pipelines, model/app runtime infrastructure, artifact/metadata registries, AI workflow orchestration, MLOps)
- Experience operating LLM-enabled systems with production constraints (latency, cost, reliability), including monitoring quality/regressions and enforcing safe tool/data access
- Strong proficiency in one or more backend languages (e.g., Go, Java, Python, C++) and modern infrastructure practices
- Experience designing APIs, data models, and distributed systems with reliability and security best practices
- Experience with workflow/event systems (queues, pub/sub, orchestration, idempotency, state machines) used to run multi-step AI-driven pipelines
- Experience implementing observability for AI systems (metrics, tracing, logs) including quality/reliability signals beyond uptime (e.g., eval scores, rejection rates, cost/latency budgets)
- Experience with production security controls: RBAC/ABAC, audit logs, secrets management, data access boundaries
- Strong communication and documentation skills
- Experience with AI/ML infrastructure (model serving, inference gateways, feature/data pipelines, experiment tracking, artifact registries)
- Experience with GPU-aware infrastructure and/or high-throughput inference (capacity planning, batching, caching, rate limiting)
- Experience building evaluation platforms (offline/online evals, canaries, A/B testing, regression automation, dataset/version management)
- Familiarity with AI safety/security patterns (prompt injection mitigation, tool sandboxing, policy enforcement, data-loss prevention)
- Experience building internal platforms used by multiple teams with clear contracts, SLAs/SLOs, and well-managed migrations
- Experience with multi-tenant or role-based access control systems
- Remote (U.S.) with the option to be based at our headquarters in San Diego, CA. We welcome candidates who are local or open to relocating; relocation assistance is available and may be included in the offer package where appropriate
- Ability to travel up to 10%; may be required for team collaboration, field testing, or customer support
- We offer comprehensive medical, dental, and vision plans
- 401(k) Retirement Savings Plan to invest in your long-term retirement goals
- Equity grants for new hires
- Unlimited PTO
- Extremely generous company holiday calendar, including a holiday hiatus in November and December
- Generous Parental Leave
- Lifestyle Spending Account
- FSA
- DCFSA
- HSA
- Hospital Indemnity insurance
- Critical Illness insurance
- Accident insurance
- Basic Life/AD&D, short-term and long-term disability insurance, 100% covered by Firestorm, plus the option to purchase additional life insurance for you and your family
- Mental Health Resources: We provide free mental health resources 24/7 including therapy and more. Additional work-life services, such as free legal and financial support, are available to you as well
Export Control Compliance
Equal Opportunity Statement