Data Engineer
Location: United States
Posted: 13 days ago
Salary: Not specified
Job Description
Role Description
This role involves delivering a phased data strategy and AI enablement engagement for a client with 30+ datasets in Databricks.
- Profile all 30+ datasets in Databricks: table structures, row counts, data types, distributions, refresh patterns
- Document schemas with inferred relationships and primary/foreign key candidates
- Assess data quality across dimensions: completeness, consistency, accuracy, freshness
- Analyze historical data behavior — determine which datasets use snapshot vs. overwrite patterns
- Support API and integration mapping (test data extraction capabilities)
- Build a standardized ingestion framework and data pipelines (Phase 2)
- Implement data quality gates with automated validation and alerting (Phase 2)
- Support workflow integration, feature engineering pipelines, and ML data products (Phases 3-4)
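For candidates wondering what the Phase 1 profiling work looks like in practice, here is a minimal sketch in plain Python. It reports a row count, per-column completeness, observed data types, and primary key candidates. The function name `profile_rows`, the sample records, and the column names are all hypothetical and do not come from the client's data. On the actual engagement this kind of check would more likely be written against Databricks tables with Spark SQL.

```python
from collections import Counter


def profile_rows(rows):
    """Profile a list of dict records (hypothetical helper, illustration only).

    Returns the row count plus, for each column, its completeness ratio,
    distinct non-null value count, observed Python types, and whether the
    column is a primary key candidate (fully populated and unique).
    Values are assumed to be hashable.
    """
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    profile = {"row_count": total, "columns": {}}
    for col in columns:
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        distinct = len(set(non_null))
        profile["columns"][col] = {
            # Share of rows where the column is populated
            "completeness": len(non_null) / total if total else 0.0,
            "distinct": distinct,
            # Count of each observed type, to surface mixed-type columns
            "types": dict(Counter(type(v).__name__ for v in non_null)),
            # Unique and fully populated: worth reviewing as a key
            "key_candidate": total > 0 and len(non_null) == total and distinct == total,
        }
    return profile


# Hypothetical sample records, not client data
sample = [
    {"loan_id": 1, "status": "open", "balance": 1000.0},
    {"loan_id": 2, "status": "open", "balance": None},
    {"loan_id": 3, "status": "closed", "balance": 0.0},
]
report = profile_rows(sample)
```

In this sample, `loan_id` is flagged as a key candidate because it is populated and unique in every row, while `balance` shows reduced completeness because one value is missing.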
Qualifications
- Strong SQL and Python skills
- Experience with Databricks (notebooks, Spark SQL, Delta Lake)
- Hands-on data profiling, data quality assessment, and technical documentation
- ETL/ELT pipeline development experience
- Comfort working in locked-down enterprise environments with restricted internet access
- Comfort with undocumented, messy data — you'll be making sense of datasets that have limited or no documentation
- Eager to learn AI tooling
Requirements
- Financial services, lending, or banking data experience (strongly preferred)
- Experience with Medallion Architecture (bronze/silver/gold patterns) (strongly preferred)
- Familiarity with Power BI as a downstream consumer (strongly preferred)
- Experience working within VDI-based access environments (strongly preferred)
- Experience with modern AI tool sets (strongly preferred)
Environment
The client's environment is managed under strict security controls. Access is through a Windows VDI, then RDP into a dedicated Databricks server. Internet access on work servers is limited. You must be comfortable working within these constraints.