Deepgram

Building foundational AI for speech transcription and understanding.

Model Evaluation QA Lead

QA Engineer | Full Time | Remote | Team 51-200 | Since 2015 | H1B Sponsor

Location

United States

Posted

32 days ago

Salary

$180K - $230K / year

5 yrs exp | English | Jenkins | NumPy | Pandas | Python

Job Description

  • Model Evaluation Automation: Design, build, and maintain automated model evaluation pipelines that run against every candidate model before release. Implement objective and subjective quality metrics (WER, SER, MOS, latency/throughput) across STT, TTS, and STS product lines.
  • Release Gate Integration: Embed model quality checkpoints into CI/CD and release pipelines. Define pass/fail criteria, build dashboards for model comparison, and own the go/no-go signal for model promotions to production.
  • Agent & Model Evaluation Frameworks: Stand up and operate evaluation tooling (Coval, Braintrust, Blue Jay, custom harnesses) for end-to-end voice agent testing, covering accuracy, latency, turn-taking, conversational quality, and custom metrics across real-world scenarios.
  • Active Learning & Data Ingestion Testing: Partner with the Active Learning team to validate data ingestion infrastructure, annotation pipelines, and retraining automation. Ensure data quality standards are met at every stage of the flywheel.
  • Industry Benchmark Automation: Automate execution and reporting of industry-standard benchmarks (e.g., LibriSpeech, CommonVoice, internal production-traffic evals). Maintain reproducible benchmark environments and publish results for internal consumption.
  • Language & Domain Validation: Build and maintain test suites for multi-language and domain-specific model validation. Design coverage matrices that ensure new languages and acoustic domains are systematically evaluated before GA.
  • Retraining Automation Support: Validate the end-to-end retraining pipeline across all data sources, from data selection and preprocessing through training, evaluation, and promotion, ensuring automation reliability and correctness.
  • Manual Test Feedback Loop: Design and operate human-in-the-loop evaluation workflows for subjective quality assessment. Build the tooling and processes that translate human feedback into actionable quality signals for the ML team.
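The evaluation and release-gate duties above center on metrics like word error rate (WER). As a minimal illustrative sketch only (the function names and the 10% threshold are assumptions for this example, not Deepgram's actual criteria or tooling), WER can be computed as word-level edit distance normalized by reference length, then compared against a promotion threshold:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard dynamic-programming edit distance over word tokens.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


def passes_release_gate(wer: float, threshold: float = 0.10) -> bool:
    """Hypothetical go/no-go signal: promote only if WER is at or below threshold."""
    return wer <= threshold
```

In practice, pipelines of this kind typically normalize text (casing, punctuation) before scoring and aggregate WER across many utterances before applying any gate.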

Job Requirements

  • 4–7 years of experience in QA engineering, ML evaluation, or a related technical role, with a focus on predictive/generative model quality and data quality.
  • Hands-on experience building automated test/evaluation pipelines for ML models and their connected software features.
  • Strong programming skills in Python; experience with ML evaluation libraries, data processing frameworks (Pandas, NumPy), and scripting for pipeline automation.
  • Familiarity with speech/audio ML concepts: WER, SER, MOS, acoustic models, language models, or similar evaluation metrics.
  • Experience with CI/CD integration for ML workflows (e.g., GitHub Actions, Jenkins, Argo, MLflow, or equivalent).
  • Ability to design and maintain reproducible benchmark environments across multiple model versions and configurations.
  • Strong communication skills—you can translate model quality metrics into actionable insights for engineering, research, and product stakeholders.
  • Detail-oriented and systematic, with a bias toward automation over manual process.

Benefits

  • Medical, dental, vision benefits
  • Annual wellness stipend
  • Mental health support
  • Life, STD, LTD Income Insurance Plans
  • Unlimited PTO
  • Generous paid parental leave
  • Flexible schedule
  • 12 Paid US company holidays
  • Quarterly personal productivity stipend
  • One-time stipend for home office upgrades
  • 401(k) plan with company match
  • Tax Savings Programs
  • Learning / Education stipend
  • Participation in talks and conferences
  • Employee Resource Groups
  • AI enablement workshops / sessions
