Posted on: 13/11/2025
Location : Remote.
Skillset : Azure, Data Engineering, Python.
Job Description :
Responsibilities :
Your Day-to-Day responsibilities include :
- Design and develop systems to maintain Azure Databricks workloads, ETL processes, business intelligence, and data ingestion pipelines for AI/ML use cases.
- Build, scale, and optimize GenAI and ML workloads across Databricks and other production environments, with strong attention to cost efficiency, compliance, and robustness.
- Build ML pipelines to train, serve, and monitor reinforcement learning or supervised learning models using Databricks and MLflow.
- Create and support ETL pipelines and table schemas to accommodate new and existing data sources for the Lakehouse on Databricks.
- Maintain data governance and data privacy standards.
- Collaborate with data architects, data scientists, analysts, and other business consumers to analyze business requirements and populate a data warehouse optimized for reporting and analytics.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain technical documentation and mentor junior data engineers on best practices in data engineering and Lakehouse architecture.
- Drive innovation and contribute to the development of cutting-edge Generative AI and analytical capabilities for the client's Next-Gen research enablement platform.
Minimum Qualifications :
- 7+ years of related experience with a bachelor's degree.
- Proven experience designing and deploying applications using Generative AI and large language models (e.g., GPT-4, Claude, open-weight LLMs).
- Experience with retrieval-augmented generation (RAG), embeddings-based search, agent orchestration, or prompt chaining.
- Familiarity with modern LLM/GenAI tools such as LangChain, LlamaIndex, Hugging Face Transformers, Semantic Kernel, or LangGraph.
- Advanced SQL knowledge, including query authoring, and working experience with a variety of relational and NoSQL databases (e.g., SQL Server).
- Experience building and optimizing data pipelines on Azure Databricks.
- In-depth knowledge and hands-on experience with data engineering, machine learning, data warehousing, and Delta Lake on Databricks.
- Highly proficient in Spark, Python, and SQL.
- Working knowledge of Fivetran is a bonus.
- A successful track record of manipulating, processing, and extracting value from large, disconnected datasets.
- Exceptional stakeholder management and communication skills to effectively communicate across global teams.
Preferred Qualifications :
- Knowledge of BI tools such as Power BI.
- Experience building and deploying ML and feature engineering pipelines to production using MLflow.
- Experience building data pipelines from business applications such as Salesforce, NetSuite, etc.
- Knowledge of message queuing, stream processing, and highly scalable data stores.
- Experience working in a compliance-based environment, including building and deploying compliant software solutions throughout the software life cycle, is a nice to have.
- Familiarity with cloud-based AI/ML services and Generative AI tools.