Senior Data Engineer - INDIA
Location
United States
Posted
3 days ago
Salary
Not specified
Job Description
*Only Consultants local to INDIA are eligible.
- Design, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads
- Build and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency
- Design and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration
- Develop high-quality SQL code, including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns
- Create and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring
- Implement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies
- Build automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets
- Collaborate with cross-functional teams, including data architects, analysts, scientists, and business stakeholders, to understand requirements and deliver scalable, well-designed data solutions
- Lead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability
- Support cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, testing automation, and environment management
- Monitor and troubleshoot production data pipelines, proactively addressing issues, performance bottlenecks, and system failures
- Contribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency
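As a miniature illustration of the SQL responsibilities above (views, reusable aggregations, and index-backed access patterns), the sketch below uses Python's built-in sqlite3; the table, view, and column names are hypothetical and chosen only for demonstration:

```python
# Illustrative only: a small sqlite3 example of a reporting view plus an
# index supporting an efficient lookup (hypothetical schema: orders).
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
cur.executemany("INSERT INTO orders (customer, amount) VALUES (?, ?)",
                [("acme", 120.0), ("acme", 80.0), ("globex", 300.0)])

# A view encapsulates a reusable aggregation for reporting consumers.
cur.execute("""CREATE VIEW customer_totals AS
               SELECT customer, SUM(amount) AS total
               FROM orders
               GROUP BY customer""")

# An index on the filter column keeps point lookups efficient as data grows.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

total = cur.execute("SELECT total FROM customer_totals WHERE customer = ?",
                    ("acme",)).fetchone()[0]
print(total)  # 200.0
conn.close()
```

In a warehouse setting the same pattern (views for reuse, indexes or clustering keys for access paths) would be expressed in the platform's own SQL dialect rather than sqlite3.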
Requirements
- 5+ years of experience in data engineering, software engineering, or similar disciplines
- Hands-on experience with Databricks or Snowflake
- Experience with orchestration tools such as Apache Airflow
- Experience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)
- Advanced SQL skills and experience with OLTP and OLAP data modeling
- Solid understanding of modern data warehousing, data lake, and ELT/ETL design patterns
- Familiarity with data governance tools, especially Microsoft Purview
- Solid programming expertise in Python, PySpark, or similar languages
- Healthcare industry experience, including claims, clinical, FHIR, HL7, or provider data
- Experience with containerization (Docker, Kubernetes) for data workloads
- Experience supporting machine learning workflows or analytical data science pipelines
- Knowledge of distributed computing concepts and performance tuning
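The automated data-quality and validation work described above can be sketched in plain Python as below; the rule names, field names, and thresholds are hypothetical, and a production version would run the same checks at scale on PySpark or Databricks:

```python
# Minimal data-quality validation sketch (hypothetical fields: claim_id, amount).
# Each rule is a predicate that returns True when a row passes.

def not_null(field):
    """Rule: the field must be present and non-empty."""
    return lambda row: row.get(field) not in (None, "")

def in_range(field, lo, hi):
    """Rule: the numeric field must fall within [lo, hi]."""
    return lambda row: row.get(field) is not None and lo <= row[field] <= hi

def validate(rows, rules):
    """Split rows into (valid, invalid) according to all rules."""
    valid, invalid = [], []
    for row in rows:
        (valid if all(rule(row) for rule in rules) else invalid).append(row)
    return valid, invalid

rules = [not_null("claim_id"), in_range("amount", 0, 1_000_000)]
rows = [
    {"claim_id": "C1", "amount": 250.0},
    {"claim_id": "", "amount": 90.0},    # fails not_null
    {"claim_id": "C3", "amount": -5.0},  # fails in_range
]
valid, invalid = validate(rows, rules)
print(len(valid), len(invalid))  # 1 2
```

Routing failed rows to a quarantine table rather than dropping them silently is a common design choice, since it preserves evidence for debugging upstream issues.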