Open Role

Machine Learning Engineer

Full Time | Remote | Team 10 | Since 2025

Location

California

Posted

24 days ago

Salary

Not specified

Bachelor Degree, 5 yrs exp, English, AI, Evaluation Frameworks, LLMs

Job Description

About HUD

HUD (YC W25) is developing agentic evals for Computer Use Agents (CUAs) that browse the web. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs.

Our Mission: People don't actually know if AI agents are working. To make AI agents work in the real world, we need detailed evals for a huge range of tasks. We're backed by Y Combinator, and work closely with frontier AI labs to provide agent evaluation infrastructure at scale.

About the role

HUD is a fast-growing startup. If you can't find a role on our job board, feel free to suggest a new role, and we'll reach out if we find a good fit. :)

Things we might hire for:

- Building new evaluations/eval environments for HUD's CUA evaluation framework
- Building out our CUA evals framework
- Conducting outbound sales, developing partnerships and improving developer experience for CUA developers
- Leading and supporting teams of research engineers as they build out our evals
- General startup operations as we scale

Experience

Strong candidates may have:

- Engagement with AI Safety and AI alignment
- Understanding of LLM evaluation frameworks, particularly multimodal and agentic evaluations
- Familiarity with using and deploying the latest AI tools for operational efficiency
- Experience in full-stack LLM deployment, particularly for multimodal and agentic AI evaluations
- Prior experience in fast-growing startup teams

Team & Company Details

Team Size: ~15 people currently, mostly full-time in-person, but some remote.

Our team: Our team includes 4 international Olympiad medallists (IOI, ILO, IPhO), serial AI startup founders, and researchers with publications at ICLR, NeurIPS, etc.

Company stage: We have received $2 million in seed funding, plus very strong demand and revenue growth beyond that. We are scaling profitably and fast to meet demand.

Logistics

Employment: Full-time preferred, but we're willing to consider internship offers.

Location: Remote-friendly, but if you’re in the San Francisco Bay Area, we do have an office you can work together in. We prioritise applicants who can show up to meetings in Pacific Time (UTC-7:00/8:00) or China/Singapore Time (UTC+8:00).

Visa Sponsorship: We provide support for relocation and visas for strong full-time candidates. For part-time/contract/internship arrangements, we'll work fully remote (which makes things simpler anyway).

Timeline: Applications are rolling. The process should involve 1-2 interviews and take less than a week.

We prioritize operational aptitude and cultural fit. Motivated candidates are encouraged to apply even if they don't meet all criteria. Due to high volume, we may not actively respond to every application, but feel free to contact us at recruiting@hud.so or elsewhere if we missed your application!

Related Job Pages

More Machine Learning Engineer Jobs

Full Time | Remote | Team 14 | Since 2024

Design and implement machine learning algorithms for robotic manipulation, focusing on RL and IL. Collaborate across disciplines to ensure integration and successful deployment.

Gazebo, GPU, Imitation Learning, MuJoCo, Python, PyTorch, Reinforcement Learning, TPU
California

Staff Engineer, Machine Learning Operations

Monogram Health

Monogram Health is a leading multispecialty provider of in-home, evidence-based care for the most complex of patients who have multiple chronic conditions. Monogram Health takes a comprehensive and personalized approach to a person’s health, treating not only a disease, but all of the chronic conditions that are present.

- Employs a robust clinical team, leveraging specialists across multiple disciplines
- Available 24 hours a day, 7 days a week, and on holidays
- Proven to dramatically improve patient outcomes and quality of life while reducing medical costs

Machine Learning Engineer | 24 days ago
Full Time | Remote | Team 414 | Since 2019

Lead and scale enterprise ML infrastructure and deployment pipelines, own end-to-end model lifecycle (development to production), drive MLOps strategy, ensure production reliability and compliance, mentor teams, and collaborate with clinical and product stakeholders.

Python, Azure, Azure ML, MLflow, Kubeflow, SageMaker, Docker, Kubernetes, Airflow, Prefect, Terraform, ARM Templates, CI/CD, GitOps, SQL, Spark, PySpark, Databricks, Feature Stores, Experiment Tracking, A/B Testing, FHIR, HL7, Claims Data
Tennessee

Machine Learning Engineer

Stripe

Help increase the GDP of the internet.

Machine Learning Engineer | 24 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2010 | H1B Sponsor

Senior Machine Learning Engineer building end-to-end ML architecture at Stripe

Distributed Systems, Python, Ruby
All locations: California, New York

Staff Machine Learning Engineer

Dragos

Dragos is on a relentless mission to defend industrial organizations that provide us with the necessities of modern civilization: running water, functioning electricity, and safe industrial working environments. As the market leader in ICS/OT Cybersecurity, we are dedicated to arming our customers with best-in-class technology, threat intelligence, and services to protect their systems as effectively and efficiently as possible. We’re a remote-first culture with operations in North America, Europe, the Middle East, and APAC. We’re looking for mission-oriented teammates who embody our core values of authenticity, transparency, and trust. Are you ready to make a difference? Come join a mission that can save the world!

Machine Learning Engineer | 24 days ago
Full Time | Remote | Team 295 | Since 2016

Design and implement production-grade ML systems for ICS/OT cybersecurity, build and optimize models (including NLP/LLMs), develop data pipelines and MLOps workflows, collaborate with data teams to deploy and monitor models in cloud and on-prem containerized environments, and troubleshoot production performance and resource issues.

Python, SQL, Go, Rust, Java, JVM, scikit-learn, PyTorch, TensorFlow, Hugging Face, LLMs, RAG, NLP, Kubernetes, Docker, CI/CD, MLOps
United States
$225K / year