AI Inference Engineer QVAC

AI Engineer | Machine Learning Engineer | Full Time | Remote | Team 201-500

Location: United States + 144 more

Posted: 3 days ago

Salary: Not specified

C++ | JavaScript | llama.cpp | ggml | ONNX | Deep Learning | Transformers | LLMs | Diffusion Models

Job Description

This description summarizes our understanding of the job posting. Click the 'Apply' button to find out more.

Role Description

You will own the inference backbone behind QVAC's local AI stack: the C++ systems layer that makes models run fast, reliably, and predictably on real user hardware. The role centers on engineering quality at the runtime level, including:

Startup behavior
Memory pressure
Throughput/latency balance
Long-session stability

You will define and evolve the core abstractions that inference features depend on, enabling new capabilities to be added without sacrificing performance or maintainability. This role is for someone who enjoys low-level problem solving, clear technical ownership, and building infrastructure that other teams trust in production. Your work directly enables private, on-device AI experiences and helps set the technical foundation for QVAC's next generation of peer-to-peer AI products.

Responsibilities

Deploy machine learning models to edge devices using llama.cpp, ggml, and ONNX
Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
Integrate AI features into existing products, enriching them with the latest advancements in machine learning

Qualifications

Excellent programming skills in C++; experience in JavaScript is a bonus
Strong experience with the llama.cpp and ggml inference engines, including deploying models to specific GPU architectures
Good understanding of deep learning concepts and model architectures
Experience with transformers, LLMs, and diffusion models
Demonstrated ability to rapidly assimilate new technologies and techniques
A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D

Important information for candidates

Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:

Apply only through our official channels.
We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page: https://tether.recruitee.com/
Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles.
Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS.
Double-check email addresses. All communication from us will come from emails ending in @tether.to or @tether.io.
We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.
