Specialty Solutions Engineer, AI - Modern Data Center
Location: United States
Posted: 43 days ago
Salary: $250K - $300K / year
Job Description
The AHEAD Senior Modern Datacenter Specialist Solutions Engineer (SSE) is a presales technical leader focused on designing, positioning, and enabling enterprise-scale AI infrastructure solutions. This role owns the technical sales motion from discovery through architecture, demonstration, and business case, across the full AI stack: GPU platforms (NVIDIA & AMD), interconnects, orchestration/scheduling, and AI/ML frameworks. The SSE partners closely with Client Directors, Practice Leaders, and strategic vendors to develop market strategies, drive pipeline, and win campaigns in the US enterprise market.
- Lead technical discovery and solution positioning for enterprise AI infrastructure opportunities, translating business outcomes into reference architectures and value propositions.
- Own pre-sales deliverables, including architectures, diagrams, sizing, BOMs, and proposals.
- Deliver executive and technical presentations focused on NVIDIA AI Enterprise (NVAIE), LLM training/inference, and accelerated analytics.
- Guide clients through technology selection, roadmap development, and business case creation for large-scale AI initiatives.
- Architect end-to-end AI platforms using NVIDIA DGX/HGX, Blackwell (B100/B200), Hopper (H100/H200), Grace/Grace-Hopper (GH200), L40S, NVLink/NVSwitch, InfiniBand (NVIDIA Quantum), RoCE, and DPU offload patterns.
- Design solutions leveraging AMD Instinct (MI300/MI300X) as appropriate, articulating trade-offs in CPU/GPU/DPU, interconnect topology, and cluster scale-out.
- Integrate NVIDIA AI Enterprise components (CUDA, cuDNN, TensorRT, Triton Inference Server, RAPIDS) and common ML frameworks (PyTorch, TensorFlow) with orchestration platforms.
- Integrate on-prem GPU clusters with cloud AI services (AWS SageMaker, Azure ML, GCP Vertex AI) for hybrid bursting and workload mobility.
- Advise on MLOps platforms (MLflow, Kubeflow, Weights & Biases), CI/CD, and governance for multi-tenant AI environments.
- Build and maintain relationships with NVIDIA, AMD, Run:AI, OEMs, and networking vendors, aligning campaigns with partner programs and incentives.
- Contribute feedback to vendor engineering and product teams, coordinating joint enablement and reference designs.
- Create repeatable assets such as validated designs, sizing calculators, POV guides, deployment runbooks, and competitive playbooks.
- Mentor SEs and delivery consultants, leading internal training on AI scheduling, performance tuning, and operational best practices.
- Lead proof-of-value (POV) and proof-of-concept (POC) engagements, including success criteria, benchmarking, and recommendations.
Qualifications
- Proven experience architecting and deploying NVIDIA GPU-based AI platforms (NVAIE, DGX/HGX, Blackwell, Hopper, Grace, L40S, H100/H200, B100/B200, GH200) and/or AMD Instinct MI300/MI300X.
- Experience with Run:AI, NVIDIA Base Command, Kubernetes (GPU Operator), Slurm, and/or vSphere with Tanzu for AI/ML workloads.
- Advanced knowledge of AI/ML frameworks and libraries (PyTorch, TensorFlow, RAPIDS, Triton, CUDA, cuDNN, TensorRT).
- Strong understanding of high-speed networking for AI (InfiniBand, RoCE, DPU integration, NVLink, NVSwitch).
- Experience integrating on-prem AI infrastructure with public cloud AI services (AWS SageMaker, Azure ML, GCP Vertex AI) and hybrid architectures.
- Experience leading pre-sales campaigns, POV/POC management, and executive presentations.
- Ability to identify and leverage emerging datacenter and AI technologies to drive innovative solutions.
- Strong analytical skills for troubleshooting complex environments, including storage, compute, networking, and AI workloads.
- Skilled at guiding clients through decision-making with clear, strategic recommendations.
- Proven track record of working effectively across sales, engineering, and vendor teams.
- Knowledge of datacenter security best practices and regulatory compliance.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree a plus.
- 7+ years in datacenter or cloud infrastructure, with 3+ years directly on AI/GPU platforms and orchestration.
Certifications (preferred)
- NVIDIA Certified Systems Engineer / AI Practitioner; NVAIE enablement badges
- Kubernetes CKA/CKS; GPU Operator experience
- AWS/Azure/GCP Architect; relevant AI/ML specialty
Benefits
- Medical, Dental, and Vision Insurance
- 401(k)
- Paid company holidays
- Paid time off
- Paid parental and caregiver leave
- Plus more!