
Job Description

Responsibilities :

- Develop and implement machine learning models and algorithms for classification, regression, clustering, recommendation, and related tasks.

- Build and maintain data pipelines for training and inference workflows.

- Collaborate with data scientists, product managers, and software engineers to integrate AI models into production systems.

- Optimize model performance and scalability for real-time and batch processing.

- Conduct experiments, evaluate model performance, and iterate based on results.

- Stay up to date with the latest research and advancements in AI/ML and apply them to practical use cases.

- Document code, processes, and model behavior for reproducibility and compliance.

Basic Requirements :

1. Programming Languages :


Python : Core language for AI/ML development. Proficiency in libraries like :

- NumPy, Pandas for data manipulation

- Matplotlib, Seaborn, Plotly for data visualization

- Scikit-learn for classical ML algorithms

- Familiarity with R, Java, or C++ is a plus, especially for performance-critical applications.
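
As a rough illustration of the Python stack listed in item 1, the sketch below loads a bundled scikit-learn dataset into a pandas DataFrame and fits a simple classifier. It is a generic, self-contained example (no external data files), not part of any specific codebase.

```python
# Minimal sketch: pandas for data handling plus a classical scikit-learn model.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the bundled iris dataset into a DataFrame for inspection and manipulation.
iris = load_iris(as_frame=True)
df = iris.frame

X = df.drop(columns=["target"])
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a simple classifier and report held-out accuracy.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```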

2. Machine Learning & Deep Learning Frameworks :


Experience building models using the following :


- TensorFlow and Keras for deep learning

- PyTorch for research-grade and production-ready models

- XGBoost, LightGBM, or CatBoost for gradient boosting

- Understanding of model training, validation, hyperparameter tuning, and evaluation metrics (e.g., ROC-AUC, F1-score, precision/recall).
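
To make the evaluation-metrics point in item 2 concrete, here is a minimal sketch of a train/validate/evaluate loop. It uses scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost/LightGBM/CatBoost (which expose a very similar fit/predict API) so it runs with no extra installs.

```python
# Train a gradient boosting model and report ROC-AUC, F1, precision and recall
# on a held-out split, using only scikit-learn's built-in dataset and estimator.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, f1_score, precision_score, recall_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Hard predictions for threshold-based metrics, probabilities for ROC-AUC.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

print("ROC-AUC  :", roc_auc_score(y_test, y_prob))
print("F1       :", f1_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
```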

3. Natural Language Processing (NLP) :


Familiarity with :

- Text preprocessing (tokenization, stemming, lemmatization)

- Vectorization techniques (TF-IDF, Word2Vec, GloVe)

- Transformer-based models (BERT, GPT, T5) using Hugging Face Transformers

- Experience with text classification, named entity recognition (NER), question answering, or chatbot development.
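
As a small illustration of the vectorization techniques in item 3, the sketch below performs TF-IDF based text classification. The tiny inline corpus and labels are made up purely for illustration; a transformer-based approach would swap this pipeline for a Hugging Face model.

```python
# Classical text classification with TF-IDF features and a linear model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "great product, works perfectly",
    "terrible experience, would not recommend",
    "excellent support and fast delivery",
    "awful quality, broke after one day",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# TfidfVectorizer handles tokenization and TF-IDF weighting in one step.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["fast delivery and great quality"]))  # expected: [1]
```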

4. Computer Vision (CV) :


Experience with :

- Image classification, object detection, segmentation

- Libraries like OpenCV, Pillow, and Albumentations

- Pretrained models (e.g., ResNet, YOLO, EfficientNet) and transfer learning
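
The transfer-learning point in item 4 can be summarized in a short sketch: load a pretrained ResNet from torchvision (recent versions, which download ImageNet weights on first use), freeze the backbone, and train only a new classification head. The class count and random batch are placeholders for a real dataset and DataLoader.

```python
# Minimal transfer-learning sketch with a pretrained ResNet from torchvision.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # hypothetical number of target classes

# Load ImageNet-pretrained weights and freeze the backbone.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False

# Swap the classification head for the new task.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Only the new head's parameters are passed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a random batch (stand-in for a real DataLoader).
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, NUM_CLASSES, (8,))
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```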

5. Data Engineering & Pipelines :


- Ability to build and manage data ingestion and preprocessing pipelines.

- Tools : Apache Airflow, Luigi, Pandas, Dask

- Experience with structured (CSV, SQL) and unstructured (text, images, audio) data.
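
For the pipeline tooling in item 5, the sketch below shows the shape of an Apache Airflow DAG (assuming Airflow 2.4 or newer for the `schedule` argument) that wires an ingest step to a preprocessing step. The task functions and DAG id are hypothetical placeholders; a real pipeline would read from and write to actual storage.

```python
# Minimal Airflow DAG: ingest -> preprocess, scheduled daily.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    # e.g. pull raw files from object storage into a staging area
    print("ingesting raw data")


def preprocess():
    # e.g. clean and feature-engineer with pandas/Dask before training
    print("preprocessing data")


with DAG(
    dag_id="example_training_data_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    preprocess_task = PythonOperator(task_id="preprocess", python_callable=preprocess)

    ingest_task >> preprocess_task  # preprocess runs only after ingest succeeds
```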

6. Model Deployment & MLOps :


Experience deploying models as :

- REST APIs using Flask, FastAPI, or Django

- Batch jobs or real-time inference services

Familiarity with :

- Docker for containerization

- Kubernetes for orchestration

- MLflow, Kubeflow, or SageMaker for model tracking and lifecycle management
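
A minimal sketch of the REST-API deployment path in item 6, using FastAPI: the model file path ("model.joblib") and feature schema are hypothetical, and a fitted scikit-learn estimator is assumed. If saved as app.py, it could be served with `uvicorn app:app --reload` after installing fastapi, uvicorn and joblib.

```python
# Serve a pickled model behind a single /predict endpoint.
from typing import List

import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # assumed: a fitted scikit-learn estimator


class PredictRequest(BaseModel):
    features: List[float]


@app.post("/predict")
def predict(req: PredictRequest):
    # Wrap the single sample in a batch of one, as scikit-learn expects 2D input.
    prediction = model.predict([req.features])
    return {"prediction": prediction.tolist()}
```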

7. Cloud Platforms :


Hands-on experience with at least one cloud provider :

- AWS (S3, EC2, SageMaker, Lambda)

- Google Cloud (Vertex AI, BigQuery, Cloud Functions)

- Azure (Machine Learning Studio, Blob Storage)

- Understanding of cloud storage, compute services, and cost optimization.
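
As a small example of the AWS option in item 7, the sketch below moves a model artifact to and from S3 with boto3. The bucket name and file paths are hypothetical, and valid AWS credentials are assumed to be configured in the environment.

```python
# Basic S3 access with boto3: upload a local artifact, then fetch it back.
import boto3

s3 = boto3.client("s3")

s3.upload_file("model.joblib", "my-ml-artifacts-bucket", "models/model.joblib")
s3.download_file("my-ml-artifacts-bucket", "models/model.joblib", "model_copy.joblib")
```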

8. Databases & Data Access :


Proficiency in :

- SQL for querying relational databases (e.g., PostgreSQL, MySQL)

- NoSQL databases (e.g., MongoDB, Cassandra)

- Familiarity with big data tools like Apache Spark, Hadoop, or Databricks is a plus
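
To illustrate the SQL proficiency in item 8, here is a self-contained sketch that uses the standard-library sqlite3 module so it runs without a database server; against PostgreSQL or MySQL the same SQL would be issued through a driver such as psycopg2 or SQLAlchemy. The table and column names are made up for illustration.

```python
# Create, populate and query a small table entirely in memory.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE predictions (id INTEGER PRIMARY KEY, score REAL)")
cur.executemany(
    "INSERT INTO predictions (score) VALUES (?)",
    [(0.91,), (0.47,), (0.78,)],
)

# Typical analytical query: filter and aggregate model scores.
cur.execute("SELECT COUNT(*), AVG(score) FROM predictions WHERE score > 0.5")
print(cur.fetchone())
conn.close()
```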

9. Version Control & Collaboration :


- Experience with Git and platforms like GitHub, GitLab, or Bitbucket.

- Familiarity with Agile/Scrum methodologies and tools like JIRA, Trello, or Asana.

10. Testing & Debugging :


- Writing unit tests and integration tests for ML code.

- Using tools like pytest, unittest, and debuggers to ensure code reliability.
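
A minimal sketch of the pytest-style unit tests mentioned in item 10: the normalize() helper is a hypothetical preprocessing function defined inline for the example (in a real project it would live in the package under test). Run with `pytest`.

```python
# Two small unit tests for a preprocessing helper: one happy path, one error case.
import numpy as np
import pytest


def normalize(x: np.ndarray) -> np.ndarray:
    """Scale values to zero mean and unit variance."""
    std = x.std()
    if std == 0:
        raise ValueError("cannot normalize a constant array")
    return (x - x.mean()) / std


def test_normalize_has_zero_mean_and_unit_std():
    x = np.array([1.0, 2.0, 3.0, 4.0])
    z = normalize(x)
    assert z.mean() == pytest.approx(0.0)
    assert z.std() == pytest.approx(1.0)


def test_normalize_rejects_constant_input():
    with pytest.raises(ValueError):
        normalize(np.ones(5))
```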

