Altarum
Solutions to Advance Health
Principal Data Engineer – ML Platforms
Location
Virginia
Posted
86 days ago
Salary
$144.8K - $188.0K / year
Bachelor Degree, 7 yrs exp, English, Airflow, Amazon Redshift, AWS, Azure, Cloud, Google Cloud Platform, Grafana, Kafka, Prometheus, Python, SQL, Terraform
Job Description
• Design and operate modern, cloud-agnostic lakehouse architecture using object storage, SQL/ELT engines, and dbt
• Build CI/CD pipelines for data, dbt, and model delivery (GitHub Actions, GitLab, Azure DevOps)
• Implement MLOps systems: MLflow (or equivalent), feature stores, model registry, drift detection, automated testing
• Engineer solutions in AWS and AWS GovCloud today, with portability to Azure Gov or GCP
• Use Infrastructure-as-Code (Terraform, CloudFormation, Bicep) to automate secure deployments
• Build scalable ingestion and normalization pipelines for healthcare and public health datasets
• Create reusable connectors, dbt packages, and data contracts for cross-division use
• Publish clean, conformed, metrics-ready tables for Analytics Engineering and BI teams
• Support Population Health in turning evaluation and statistical models into pipelines
• Define SLOs and alerting; instrument lineage & metadata; ensure ≥95% of data tests pass
• Perform performance and cost tuning (partitioning, storage tiers, autoscaling) with guardrails and dashboards
• Build production-grade pipelines for risk prediction, forecasting, cost/utilization models, and burden estimation
• Develop ML-ready feature engineering workflows and support time-series/outbreak detection models
• Translate R/Stata/SAS evaluation code into reusable pipelines
• Implement Model Card Protocol (MCP) and fairness/explainability tooling (SHAP, LIME)
• Ensure compliance with HIPAA, 42 CFR Part 2, IRB/DUA constraints, and NIST AI RMF standards
• Develop runbooks, architecture diagrams, repo templates, and accelerator code
• Provide technical guidance for proposals and client engagements
Job Requirements
- 7–10+ years in data engineering, ML platform engineering, or cloud data architecture
- Expert in Python, SQL, dbt, and orchestration tools (Airflow, Glue, Step Functions)
- Deep experience with AWS + AWS GovCloud
- CI/CD and IaC experience (Terraform, CloudFormation)
- Familiarity with MLOps tools (MLflow, SageMaker, Azure ML, Vertex AI)
- Ability to operate in regulated environments (HIPAA, 42 CFR Part 2, IRB)
Preferred:
- Experience with FHIR, HL7, Medicaid/Medicare claims, and/or SDOH datasets
- Databricks, Snowflake, Redshift, Synapse
- Event streaming (Kafka, Kinesis, Event Hubs)
- Feature store experience
- Observability tooling (Grafana, Prometheus, OpenTelemetry)
- Experience optimizing BI datasets for Power BI
Benefits
- Competitive Medical, Dental and Optical plans
- Generous Paid Time Off, 8 company-observed holidays plus 3 floating holidays
- Tuition Assistance
- 401(k) Plan (3% employer contribution plus opportunity for gainsharing)
- Life, AD&D & Disability coverage
- A flexible work environment