Codvo.ai - Data Scientist

Codvo.ai
Anywhere in India/Multiple Locations
5 - 10 Years

Posted on: 10/03/2026

Job Description

Data Scientist


About Us :


At Codvo, we are committed to building scalable, future-ready data platforms that power business impact. We believe in a culture of innovation, collaboration, and growth, where engineers can experiment, learn, and thrive. Join us to be part of a team that solves complex data challenges with creativity and cutting-edge technology.


Role Summary :


Owns model development, the training pipeline, and the analytics backend. Works in close coordination with the on-site Data Scientist : the on-site person provides site context and validation feedback, while the offshore person implements model improvements, retraining logic, and drift detection.


Responsibilities :


Model Development & Training :


- Maintain and improve the physics-based simulation engine : 19 equipment families, 64+ fault signatures, first-principles governing equations


- Run model training pipelines : dataset generation, feature engineering, model fitting, hyperparameter tuning, MLflow experiment tracking


- Implement model retraining triggers : drift detection (PSI-based), accuracy degradation monitoring, scheduled recalibration


- Build and maintain the champion/challenger evaluation framework : shadow scoring, A/B testing, promotion guardrails


- Develop new fault signatures as customer feedback identifies gaps
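
For candidates unfamiliar with the PSI-based drift detection mentioned above, here is a minimal sketch of a Population Stability Index check in plain Python. The function names, the quantile binning, and the 0.25 alert threshold are illustrative assumptions (0.25 is a common rule of thumb), not a description of Codvo's actual pipeline.

```python
import math
from bisect import bisect_right


def bin_fractions(values, edges, eps):
    """Share of values falling in each bin delimited by the interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    n = len(values)
    # Floor empty bins at eps so the logarithm below stays finite.
    return [max(c / n, eps) for c in counts]


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature."""
    ref_sorted = sorted(reference)
    # Interior bin edges taken at quantiles of the reference sample.
    edges = sorted({ref_sorted[(len(ref_sorted) * i) // n_bins]
                    for i in range(1, n_bins)})
    ref_frac = bin_fractions(reference, edges, eps)
    cur_frac = bin_fractions(current, edges, eps)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


def should_retrain(reference, current, threshold=0.25):
    """Hypothetical trigger: flag retraining when PSI exceeds the threshold."""
    return population_stability_index(reference, current) > threshold


# Synthetic data for illustration only.
reference = [i / 100 for i in range(1000)]
shifted = [x + 5 for x in reference]
print(should_retrain(reference, reference))  # no shift in the feature
print(should_retrain(reference, shifted))    # large shift in the feature
```

In practice a check like this would typically run per feature and per equipment family, alongside the accuracy degradation monitoring listed above.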


Analytics & Calibration :


- Implement probability calibration : Platt scaling, isotonic regression, ECE monitoring


- Build the adaptive threshold controller : feedback-driven alarm threshold adjustment based on false alarm rate and recall


- Develop the CMMS label linking pipeline : match work orders to predictions with confidence scoring


- Analyze prediction outcomes : precision, recall, F1 by equipment family, by fault type, by site


- Produce the weekly and monthly accuracy reports
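
As an illustration of the ECE monitoring mentioned above, the following is a minimal sketch of Expected Calibration Error for a binary fault classifier, using equal-width probability bins. The function name, the bin count, and the sample values are assumptions made for this example only.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE: weighted mean gap between predicted probability and observed rate.

    probs  -- predicted probabilities of the positive (fault) class, in [0, 1]
    labels -- observed outcomes, 1 for fault and 0 for healthy
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Equal-width bins; a probability of exactly 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_prob = sum(p for p, _ in members) / len(members)
        fault_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(avg_prob - fault_rate)
    return ece


# Illustrative values: ten predictions of 0.8, eight of which were real faults.
well_calibrated = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Ten predictions of 0.9 where no fault occurred.
overconfident = expected_calibration_error([0.9] * 10, [0] * 10)
```

A value near zero means the predicted fault probabilities match the observed fault rates. A rising value over successive reports would suggest that recalibration, for example with Platt scaling or isotonic regression, is due.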


Feature Engineering & Data Quality :


- Define and maintain feature sets for each equipment family : physics-informed features, rolling statistics, cross-tag correlations


- Monitor data quality metrics : null rates, stale timestamps, schema violations, sensor drift


- Build the healthy baseline update pipeline : daily computation of per-tag statistics from healthy operating data


- Implement the training data snapshot pipeline : versioned, reproducible dataset extraction with manifest tracking


Expected Background :


- 4+ years in machine learning engineering or applied data science


- Strong Python skills : pandas, scikit-learn, XGBoost/LightGBM, MLflow


- Experience with time-series data, anomaly detection, or predictive maintenance modeling


- Understanding of model deployment patterns : model registry, versioning, A/B testing, canary deployments


- Experience with statistical process control, calibration, or reliability engineering is a plus

