Deepgram

Building foundational AI for speech transcription and understanding.

Model Evaluation QA Lead

QA Engineer | Full Time | Remote | Team 51-200 | Since 2015 | H1B Sponsor

Location

United States

Posted

32 days ago

Salary

$180K - $230K / year

5 yrs exp | English | Jenkins | NumPy | Pandas | Python

Job Description

  • Model Evaluation Automation: Design, build, and maintain automated model evaluation pipelines that run against every candidate model before release. Implement objective and subjective quality metrics (WER, SER, MOS, latency/throughput) across STT, TTS, and STS product lines.
  • Release Gate Integration: Embed model quality checkpoints into CI/CD and release pipelines. Define pass/fail criteria, build dashboards for model comparison, and own the go/no-go signal for model promotions to production.
  • Agent & Model Evaluation Frameworks: Stand up and operate evaluation tooling (Coval, Braintrust, Blue Jay, custom harnesses) for end-to-end voice agent testing, covering accuracy, latency, turn-taking, conversational quality, and custom metrics across real-world scenarios.
  • Active Learning & Data Ingestion Testing: Partner with the Active Learning team to validate data ingestion infrastructure, annotation pipelines, and retraining automation. Ensure data quality standards are met at every stage of the flywheel.
  • Industry Benchmark Automation: Automate execution and reporting of industry-standard benchmarks (e.g., LibriSpeech, CommonVoice, internal production-traffic evals). Maintain reproducible benchmark environments and publish results for internal consumption.
  • Language & Domain Validation: Build and maintain test suites for multi-language and domain-specific model validation. Design coverage matrices that ensure new languages and acoustic domains are systematically evaluated before GA.
  • Retraining Automation Support: Validate the end-to-end retraining pipeline across all data sources, from data selection and preprocessing through training, evaluation, and promotion, ensuring automation reliability and correctness.
  • Manual Test Feedback Loop: Design and operate human-in-the-loop evaluation workflows for subjective quality assessment. Build the tooling and processes that translate human feedback into actionable quality signals for the ML team.
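The evaluation and release-gate duties above center on metrics like word error rate (WER). As a minimal illustrative sketch only (the function names and the 10% threshold are assumptions for this example, not Deepgram's actual criteria or tooling), WER can be computed as word-level edit distance normalized by reference length, then compared against a promotion threshold:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard dynamic-programming edit distance over word tokens.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


def passes_release_gate(wer: float, threshold: float = 0.10) -> bool:
    """Hypothetical go/no-go signal: promote only if WER is at or below threshold."""
    return wer <= threshold
```

In practice, pipelines of this kind typically normalize text (casing, punctuation) before scoring and aggregate WER across many utterances before applying any gate.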

Job Requirements

  • 4–7 years of experience in QA engineering, ML evaluation, or a related technical role, with a focus on predictive/generative model quality and data quality.
  • Hands-on experience building automated test/evaluation pipelines for ML models and their connected software features.
  • Strong programming skills in Python; experience with ML evaluation libraries, data processing frameworks (Pandas, NumPy), and scripting for pipeline automation.
  • Familiarity with speech/audio ML concepts: WER, SER, MOS, acoustic models, language models, or similar evaluation metrics.
  • Experience with CI/CD integration for ML workflows (e.g., GitHub Actions, Jenkins, Argo, MLflow, or equivalent).
  • Ability to design and maintain reproducible benchmark environments across multiple model versions and configurations.
  • Strong communication skills—you can translate model quality metrics into actionable insights for engineering, research, and product stakeholders.
  • Detail-oriented and systematic, with a bias toward automation over manual process.

Benefits

  • Medical, dental, vision benefits
  • Annual wellness stipend
  • Mental health support
  • Life, STD, LTD Income Insurance Plans
  • Unlimited PTO
  • Generous paid parental leave
  • Flexible schedule
  • 12 Paid US company holidays
  • Quarterly personal productivity stipend
  • One-time stipend for home office upgrades
  • 401(k) plan with company match
  • Tax Savings Programs
  • Learning / Education stipend
  • Participation in talks and conferences
  • Employee Resource Groups
  • AI enablement workshops / sessions
