Posted on: 12/02/2026
Description:
Data Engineering:
- Design and maintain robust ETL/ELT pipelines for ML datasets.
- Implement real-time data streaming for inference (Kafka, Flink).
- Build and optimize data lakes and warehouses for AI workloads.
- Implement partitioning, indexing, and caching strategies for high-performance data retrieval.
- Establish data validation and lineage tracking to ensure integrity and compliance.
- Apply best practices for handling sensitive data in AI contexts (GDPR, CCPA).
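The data-validation responsibility above could be approached along these lines. This is a minimal sketch, not the employer's actual pipeline; the `Record` schema and the range check are hypothetical stand-ins for whatever dataset contract applies.

```python
from dataclasses import dataclass

# Hypothetical record schema for an ML training dataset.
@dataclass
class Record:
    user_id: str
    feature_value: float

def validate(records):
    """Basic validation pass: drop rows with a missing key or an
    out-of-range feature value, and count how many were rejected."""
    valid, rejected = [], 0
    for r in records:
        if not r.user_id or not (0.0 <= r.feature_value <= 1.0):
            rejected += 1
            continue
        valid.append(r)
    return valid, rejected

clean, bad = validate([Record("u1", 0.5), Record("", 0.2), Record("u2", 3.0)])
# clean keeps only the first record; the other two are rejected
```

In a production pipeline this logic would typically live in a framework such as Great Expectations or a dbt test rather than hand-rolled code, with rejected-row counts emitted as lineage/quality metrics.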
Machine Learning Integration:
- Collaborate on feature engineering and dataset preparation.
- Implement automated data preprocessing for ML models (normalization, encoding, augmentation).
- Collaborate with data scientists to create feature stores and reusable feature pipelines.
- Deploy ML models using MLOps practices (CI/CD, model versioning).
- Maintain version control for datasets, models, and pipelines.
- Monitor model performance and automate retraining workflows.
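Two of the preprocessing steps named above (normalization and encoding) can be sketched in plain Python. These are illustrative stand-ins for what a library such as scikit-learn provides; the function names and signatures are the author's own, not part of any stated stack.

```python
import math

def z_normalize(values):
    """Z-score normalization: rescale numeric features to zero mean
    and unit variance before model training."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against constant columns
    return [(v - mean) / std for v in values]

def one_hot(categories, vocabulary):
    """One-hot encoding: map each categorical value to a 0/1 vector
    indexed by its position in the vocabulary."""
    index = {c: i for i, c in enumerate(vocabulary)}
    return [[1 if index[c] == i else 0 for i in range(len(vocabulary))]
            for c in categories]

scaled = z_normalize([1.0, 2.0, 3.0])
encoded = one_hot(["a", "b"], ["a", "b"])
```

In practice these transforms would be packaged as reusable feature-pipeline steps (e.g. in a feature store) so that training and serving apply identical preprocessing.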
Required Skills:
- 5+ years in data engineering or ML engineering roles.
- Experience building end-to-end ML pipelines.
- Familiarity with vector databases (Pinecone, Weaviate) and embedding techniques.
- Exposure to generative AI and LLM-based applications.
- Programming: Python, SQL, Spark.
- ML Frameworks: TensorFlow, PyTorch, Scikit-learn.
- Data Tools: Airflow, dbt, Kafka.
- MLOps Tools: MLflow, Kubeflow, TensorFlow Serving.
- Cloud Platforms: AWS, Azure, GCP.
Working Conditions:
- This position may require evening and weekend work for time-sensitive project implementations.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1612129