Job Description

- Should have worked on Data Sourcing & Data Privacy for Generative AI projects

- Strong hands-on skills in data pipeline orchestration

- Knowledge of Master Data Management concepts and techniques including modelling, data loads, Data lineage, metadata and data usage

- Experience with one or more EDM toolsets such as OpenText, IBM FileNet, Microsoft SharePoint, and Oracle WebCenter Suite

- Develop best practices, standards, and methodologies to assist in the implementation and execution of Data Governance

- Design, develop, and research Machine Learning systems, models, and schemes through Data Science

- Administer the processes for receiving, documenting, tracking and investigating all complaints regarding alleged breaches of Privacy Policies and Practices

- Identify Privacy and Data protection-related risks and drive mitigation efforts throughout the organization

- Should be able to define and implement Data Protection & Privacy strategies that protect consumer and employee data

- Familiarity with Deep Learning, Machine Learning and NLP/NLG frameworks (such as Keras, TensorFlow, or PyTorch), HuggingFace Transformers, and libraries (such as scikit-learn, spaCy, gensim, or CoreNLP)

- Should have experience in AWS services such as SageMaker, Elasticsearch, and general knowledge of AWS architecture & other services

- Solid knowledge and understanding of supervised, unsupervised, and reinforcement learning algorithms.

- Understanding of the current state of AI/ML, Large Language Models, and Generative AI techniques.

- Identify data sources, verify and validate data quality, prepare datasets, develop predictive models and algorithms, derive data insights, and test and validate algorithms and models using statistical methods and visualizations

- Hands-on experience with one or more LLMs (GPT, LLaMA, BLOOM, BERT, T5, PaLM, Meta, Google Gen AI Studio, etc.)

- Hands-on experience with AI Cloud Tools (AWS SageMaker, TensorFlow, PyTorch, MS Azure, OpenAI, Hugging Face), MLOps/AIOps (Domino, MLflow), Low Code RPA (Appian, UiPath), Big Data, Python, Java, JavaScript, Full Stack, NoSQL, API, Docker & Kubernetes

Basic Skill Set :


Data & Business Intelligence :

- Data Architecture, Data Engineering, Data Governance, Data Quality, Data Lake and Data Warehouse, Data Science, DataOps, Data Discovery, Enterprise BI, Data Visualization

Data Science Toolkit :

- IDEs, Jupyter, Data Analysis & Scientific Computation libraries (NumPy, SciPy, Pandas, SciKit, Matplotlib), Tableau, Plotly, Log Analytics

AI/ML & Analytics :

- Artificial Intelligence & Machine Learning, Model Building & Deployment, Generative AI, Architectures (Transformers/Diffusion), LLMs, Vector Database, MLOps

Cloud :

- Cloud Assessment, Enablement, Migration, Deployment (AWS, Azure & GCP), Cloud Data Warehouse

Databases :

- Teradata, Microsoft SQL Server, Oracle, NoSQL databases (MongoDB), ELK, Snowflake, Postgres, MySQL

Data Management Tools & Web Services :

- Informatica ETL, Eclipse IDE, REST/SOAP web services

AI Algorithms :

- Linear/Logistic Regression, Classification, Clustering, NLP, LSTM, Time Series Analysis, Ensemble Techniques (Decision Trees, etc.), Sentiment Analysis

People Leadership :

- Stakeholder/Team Management
