Posted on: 14/07/2025
Job Description:
We are seeking a talented, hands-on AI/ML Engineer with experience in LLM-based architectures, vector search (e.g., Pinecone), and end-to-end model deployment. You'll work closely with our product and research teams to develop scalable NLP/NLU applications, including RAG pipelines, LLM integrations, and custom model deployments.
Responsibilities:
- Integrate with OpenAI, LLaMA, and Hugging Face models to build conversational AI solutions.
- Work with vector databases (e.g., Pinecone, Weaviate, FAISS) for embedding-based retrieval.
- Fine-tune and serve LLMs (LLaMA, GPT, etc.) locally or via cloud deployments.
- Implement NLP/NLU tasks such as summarization, classification, and entity extraction.
- Build and deploy ML pipelines using TensorFlow or PyTorch (preferred but not mandatory).
- Evaluate and optimize models, and monitor their post-deployment performance.
- Collaborate with backend and DevOps teams to deploy models using Docker, FastAPI, or other modern tools.
Requirements:
- Strong experience with LLMs (e.g., GPT-4, LLaMA, Falcon).
- Hands-on experience with RAG architectures and embedding pipelines.
- Familiarity with OpenAI APIs, LangChain, or LLM tooling frameworks.
- Working knowledge of vector stores like Pinecone, FAISS, or Weaviate.
- Proficiency in Python and libraries such as transformers, scikit-learn, and spaCy.
- Exposure to model serving and deployment tools: FastAPI, Flask, Docker, TorchServe, etc.
- Familiarity with the NLP/ML lifecycle, from training to inference and monitoring.
- Experience with TensorFlow or PyTorch.
- Experience deploying LLMs locally (e.g., LLaMA with llama.cpp or Ollama).
- Experience in managing Hugging Face Spaces, datasets, or model hubs.
- MLOps experience: CI/CD, model versioning, cloud deployment.