Posted on: 26/11/2025
Description:
Job Overview:
We are looking for a highly skilled Databricks Gen AI Engineer who can design, build, and optimize end-to-end GenAI solutions on the Databricks Lakehouse Platform. The ideal candidate will have hands-on experience with GenAI workloads, Model Serving, Vector Search, embeddings, and RAG-based applications, along with strong data engineering expertise using Python, PySpark, and SQL. This role requires a deep understanding of distributed systems, LLMs, cloud infrastructure (preferably Azure), and CI/CD pipelines.
Key Responsibilities:
- Design and develop end-to-end GenAI applications leveraging Databricks Model Serving, Vector Search, and embedding generation workflows.
- Build and optimize RAG pipelines, agent-based architectures, and domain-specific LLM integrations (a minimal retrieval sketch follows this list).
- Work with OpenAI, Azure OpenAI, or other LLM providers to fine-tune, prompt-engineer, or deploy large language models.
- Create scalable embedding pipelines using Databricks notebooks, MLflow, and Delta Lake.
- Build and manage data pipelines using PySpark, Delta Lake, and Databricks Workflows.
- Implement clustering, schema evolution, time travel, and performance-optimized lakehouse architectures.
- Ensure robust governance using Unity Catalog, including lineage, access control, and secure data sharing.
- Optimize Spark jobs for performance and cost efficiency.
- Develop reusable frameworks for data ingestion, feature engineering, and vector indexing.
- Deploy and monitor models using Databricks Model Serving and integrate with downstream applications.
- Leverage MLflow for experiment tracking, model registry, lifecycle management, and versioning (see the MLflow tracking sketch after this list).
- Implement observability, logging, and monitoring for GenAI workflows.
- Build and optimize cloud-native solutions incorporating Databricks, Azure storage, ADF, Key Vault, AKS, etc.
- Set up secure networking, identity, and role-based access following cloud best practices.
- Use Azure OpenAI, Azure Functions, and Azure Cognitive Search (where applicable) to enhance GenAI architectures.
- Develop end-to-end CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools.
- Automate notebook deployment, model deployment, Delta Live Tables, and workflow executions.
- Apply metadata-driven development or accelerator frameworks for faster delivery.
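To make the RAG retrieval responsibility above concrete, the following is a minimal, illustrative sketch only. It assumes a Databricks Vector Search endpoint and a Delta-synced index already exist and that the databricks-vectorsearch client is installed; the endpoint name, index name, column names, and query are hypothetical, not part of this role's actual environment.

# Minimal RAG retrieval sketch (illustrative only); all names are hypothetical.
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()  # inside a Databricks notebook, workspace credentials are picked up automatically
index = vsc.get_index(
    endpoint_name="genai-endpoint",        # hypothetical Vector Search endpoint
    index_name="main.genai.docs_index",    # hypothetical Unity Catalog index name
)

# Fetch the top-k chunks most similar to the user question; these chunks are
# then passed to the LLM (e.g. via Azure OpenAI) as grounding context.
hits = index.similarity_search(
    query_text="How is Unity Catalog lineage enabled?",
    columns=["chunk_id", "chunk_text"],
    num_results=5,
)
context = "\n\n".join(row[1] for row in hits["result"]["data_array"])

With self-managed embeddings, the query would instead pass a query vector produced by the same embedding model used at ingestion time rather than raw query text.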
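For the MLflow tracking responsibility, a minimal sketch looks like the following; the experiment path, run name, parameters, and metric are purely illustrative assumptions, not values from this posting.

import mlflow

mlflow.set_experiment("/Shared/genai-rag-eval")   # hypothetical experiment path
with mlflow.start_run(run_name="prompt-v2"):      # hypothetical run name
    mlflow.log_param("chunk_size", 512)           # log pipeline configuration
    mlflow.log_param("top_k", 5)
    mlflow.log_metric("answer_relevance", 0.87)   # log an evaluation metric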
Required Skills & Qualifications:
- 4-5 years of hands-on experience in Python, PySpark, and Data Engineering.
- Deep expertise in Databricks GenAI components:
  - Model Serving
  - Vector Search
  - Embedding generation workflows
- Strong understanding of LLMs, embeddings, RAG patterns, and agent frameworks.
- Proficiency in SQL, Spark optimization, distributed computing, data partitioning, caching, and job tuning (illustrated in the PySpark sketch after this list).
- Experience with Azure cloud services and integration with Databricks.
- Solid understanding of data structures, metadata management, version control (Git), and modular coding.
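As a rough illustration of the Spark and Delta Lake skills listed above, the sketch below shows a cached, reused aggregate, a date-partitioned Delta write, and Delta time travel; the paths and column names are hypothetical.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.format("delta").load("/mnt/raw/events")   # hypothetical source table

# Cache a frequently reused aggregate, then write it back partitioned by date
# so downstream queries can prune partitions instead of scanning everything.
daily = (
    events.groupBy("event_date", "user_id")
          .agg(F.count("*").alias("event_count"))
          .cache()
)
(
    daily.write.format("delta")
         .mode("overwrite")
         .partitionBy("event_date")
         .save("/mnt/curated/daily_events")
)

# Delta time travel: read an earlier version of the table for audits or debugging.
previous = spark.read.format("delta").option("versionAsOf", 0).load("/mnt/curated/daily_events")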
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1581182