Posted on: 26/11/2025
Description:
Job Overview:
We are looking for a highly skilled Databricks Gen AI Engineer who can design, build, and optimize end-to-end GenAI solutions on the Databricks Lakehouse Platform. The ideal candidate will have hands-on experience with GenAI workloads, Model Serving, Vector Search, embeddings, and RAG-based applications, along with strong data engineering expertise using Python, PySpark, and SQL. This role requires a deep understanding of distributed systems, LLMs, cloud infrastructure (preferably Azure), and CI/CD pipelines.
Key Responsibilities:
- Design and develop end-to-end GenAI applications leveraging Databricks Model Serving, Vector Search, and embedding generation workflows.
- Build and optimize RAG pipelines, agent-based architectures, and domain-specific LLM integrations (a minimal retrieval sketch follows this list).
- Work with OpenAI, Azure OpenAI, or other LLM providers to fine-tune, prompt-engineer, or deploy large language models.
- Create scalable embedding pipelines using Databricks notebooks, MLflow, and Delta Lake.
- Build and manage data pipelines using PySpark, Delta Lake, and Databricks Workflows.
- Implement clustering, schema evolution, time travel, and performance-optimized lakehouse architectures.
- Ensure robust governance using Unity Catalog, including lineage, access control, and secure data sharing.
- Optimize Spark jobs for performance and cost efficiency.
- Develop reusable frameworks for data ingestion, feature engineering, and vector indexing.
- Deploy and monitor models using Databricks Model Serving and integrate with downstream applications.
- Leverage MLflow for experiment tracking, model registry, lifecycle management, and versioning (see the MLflow tracking sketch after this list).
- Implement observability, logging, and monitoring for GenAI workflows.
- Build and optimize cloud-native solutions incorporating Databricks, Azure storage, ADF, Key Vault, AKS, etc.
- Set up secure networking, identity, and role-based access following cloud best practices.
- Use Azure OpenAI, Azure Functions, and Azure Cognitive Search (where applicable) to enhance GenAI architectures.
- Develop end-to-end CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools.
- Automate notebook deployment, model deployment, Delta Live Tables, and workflow executions.
- Apply metadata-driven development or accelerator frameworks for faster delivery.
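To make the RAG retrieval responsibility above concrete, the following is a minimal, illustrative sketch only. It assumes a Databricks Vector Search endpoint and a Delta-synced index already exist and that the databricks-vectorsearch client is installed; the endpoint name, index name, column names, and query are hypothetical, not part of this role's actual environment.

# Minimal RAG retrieval sketch (illustrative only); all names are hypothetical.
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()  # inside a Databricks notebook, workspace credentials are picked up automatically
index = vsc.get_index(
    endpoint_name="genai-endpoint",        # hypothetical Vector Search endpoint
    index_name="main.genai.docs_index",    # hypothetical Unity Catalog index name
)

# Fetch the top-k chunks most similar to the user question; these chunks are
# then passed to the LLM (e.g. via Azure OpenAI) as grounding context.
hits = index.similarity_search(
    query_text="How is Unity Catalog lineage enabled?",
    columns=["chunk_id", "chunk_text"],
    num_results=5,
)
context = "\n\n".join(row[1] for row in hits["result"]["data_array"])

With self-managed embeddings, the query would instead pass a query vector produced by the same embedding model used at ingestion time rather than raw query text.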
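For the MLflow tracking responsibility, a minimal sketch looks like the following; the experiment path, run name, parameters, and metric are purely illustrative assumptions, not values from this posting.

import mlflow

mlflow.set_experiment("/Shared/genai-rag-eval")   # hypothetical experiment path
with mlflow.start_run(run_name="prompt-v2"):      # hypothetical run name
    mlflow.log_param("chunk_size", 512)           # log pipeline configuration
    mlflow.log_param("top_k", 5)
    mlflow.log_metric("answer_relevance", 0.87)   # log an evaluation metric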
Required Skills & Qualifications:
- 4-5 years of hands-on experience in Python, PySpark, and Data Engineering.
- Deep expertise in Databricks GenAI components:
  - Model Serving
  - Vector Search
  - Embedding generation workflows
- Strong understanding of LLMs, embeddings, RAG patterns, and agent frameworks.
- Proficiency in SQL, Spark optimization, distributed computing, data partitioning, caching, and job tuning (illustrated in the PySpark sketch after this list).
- Experience with Azure cloud services and integration with Databricks.
- Solid understanding of data structures, metadata management, version control (Git), and modular coding.
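As a rough illustration of the Spark and Delta Lake skills listed above, the sketch below shows a cached, reused aggregate, a date-partitioned Delta write, and Delta time travel; the paths and column names are hypothetical.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.format("delta").load("/mnt/raw/events")   # hypothetical source table

# Cache a frequently reused aggregate, then write it back partitioned by date
# so downstream queries can prune partitions instead of scanning everything.
daily = (
    events.groupBy("event_date", "user_id")
          .agg(F.count("*").alias("event_count"))
          .cache()
)
(
    daily.write.format("delta")
         .mode("overwrite")
         .partitionBy("event_date")
         .save("/mnt/curated/daily_events")
)

# Delta time travel: read an earlier version of the table for audits or debugging.
previous = spark.read.format("delta").option("versionAsOf", 0).load("/mnt/curated/daily_events")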
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1581182