Posted on: 13/11/2025
Location : Remote.
Skillset : Azure, Data Engineering, Python.
Job Description :
Responsibilities :
Your Day-to-Day responsibilities include :
- Design and develop systems to maintain Azure Databricks workloads, ETL processes, business intelligence, and data ingestion pipelines for AI/ML use cases.
- Build, scale, and optimize GenAI and ML workloads across Databricks and other production environments, with strong attention to cost efficiency, compliance, and robustness.
- Build ML pipelines to train, serve, and monitor reinforcement learning or supervised learning models using Databricks and MLflow.
- Create and support ETL pipelines and table schemas to accommodate new and existing data sources for the Lakehouse on Databricks.
- Maintain data governance and data privacy standards.
- Collaborate with data architects, data scientists, analysts, and other business consumers to analyze business requirements and populate a data warehouse optimized for reporting and analytics.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain technical documentation and mentor junior data engineers on best practices in data engineering and Lakehouse architecture.
- Drive innovation and contribute to the development of cutting-edge Generative AI and analytical capabilities for the client's Next-Gen research enablement platform.
Minimum Qualifications :
- 7+ years of related experience with a bachelor's degree.
- Proven experience designing and deploying applications using Generative AI and large language models (e.g., GPT-4, Claude, open-weight LLMs).
- Experience with retrieval-augmented generation (RAG), embeddings-based search, agent orchestration, or prompt chaining.
- Familiarity with modern LLM/GenAI tools such as LangChain, LlamaIndex, Hugging Face Transformers, Semantic Kernel, or LangGraph.
- Advanced SQL knowledge, including query authoring, and working experience with a variety of relational and NoSQL databases (e.g., SQL Server).
- Experience building and optimizing data pipelines on Azure Databricks.
- In-depth knowledge and hands-on experience with data engineering, machine learning, data warehousing, and Delta Lake on Databricks.
- Highly proficient in Spark, Python, and SQL.
- Working knowledge of Fivetran is a bonus.
- A successful track record of manipulating, processing, and extracting value from large, disconnected datasets.
- Exceptional stakeholder management and communication skills to effectively communicate across global teams.
Preferred Qualifications :
- Knowledge of BI tools such as Power BI.
- Experience building and deploying ML and feature engineering pipelines to production using MLflow.
- Experience building data pipelines from business applications such as Salesforce, NetSuite, etc.
- Knowledge of message queuing, stream processing, and highly scalable data stores.
- Experience working in a compliance-based environment, including building and deploying compliant software solutions throughout the software life cycle, is a nice to have.
- Familiarity with cloud-based AI/ML services and Generative AI tools.