Posted on: 03/12/2025
Description:
- Lead the design and implementation of end-to-end data pipelines on the Databricks Lakehouse platform.
- Build metadata-driven ingestion and transformation frameworks for multimodal data.
- Enable real-time and batch data pipelines for downstream AI and analytics applications.
- Define and implement scalable data models, schemas, and lineage tracking mechanisms.
AI and Knowledge Graph Enablement:
- Architect knowledge graph pipelines for unified entity resolution and semantic enrichment.
- Partner with AI/ML engineers to design feature stores, embedding workflows, and reasoning layers.
- Integrate vector databases and graph systems (Neo4j, Neptune, etc.) to support intelligent retrieval and recommendations.
- Optimize data preparation and transformation workflows for LLM fine-tuning and inference.
Data Governance, Observability, and Reliability:
- Drive data quality, lineage, and cataloging standards.
- Implement automated validation and observability frameworks in data pipelines.
- Champion CI/CD automation and data versioning best practices using Terraform and GitHub Actions.
- Collaborate with cross-functional teams to enforce compliance, privacy, and data access controls.
Requirements:
- 9+ years of experience in data engineering, AI platform development, or enterprise data architecture.
- Expertise in Databricks, Delta Lake, PySpark, and distributed data processing.
- Advanced proficiency in Python, SQL, and Spark-based transformations.
- Experience with real-time streaming pipelines (Kafka, Debezium, etc.).
- Hands-on experience with knowledge graphs, semantic data modeling, or graph-based analytics.
- Deep understanding of data governance, metadata management, and security frameworks.
- Ability to lead technical discussions across AI, engineering, and product teams.
Preferred Qualifications:
- Exposure to LLM data workflows, embeddings, or retrieval-augmented generation (RAG).
- Familiarity with AWS, GCP, or Azure cloud ecosystems.
- Knowledge of data observability platforms and impact analysis frameworks.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1584513