Posted on: 28/11/2025
Description:
Responsibilities:
- Lead the design, development, and deployment of production-grade AI and LLM-powered solutions using Python and Golang.
- Architect and implement scalable backend services and APIs in Golang to serve AI/ML models with high performance and reliability.
- Build and optimize RAG (Retrieval Augmented Generation) pipelines using vector databases, embeddings, and advanced retrieval strategies.
- Design and implement sophisticated prompt engineering strategies, including few-shot learning, chain-of-thought prompting, and context optimization for LLM applications.
- Work with LLMs (GPT-4, Claude, Llama, etc.) to build intelligent systems for text generation, analysis, classification, and reasoning tasks.
- Develop microservices and data processing pipelines in Golang and Python to handle high-volume data ingestion, transformation, and model inference.
- Design and implement embedding strategies for semantic search, similarity matching, and knowledge retrieval systems.
- Build and maintain vector databases (Qdrant, Pinecone, Weaviate, etc.) for efficient similarity search and retrieval.
- Optimize model performance, latency, and cost for production deployments across cloud environments.
- Write clean, maintainable, and well-tested code with comprehensive documentation and version control practices.
- Mentor junior team members and conduct code reviews to maintain high engineering standards.
- Collaborate with cross-functional teams to translate business requirements into technical solutions.
Requirements:
- 5-6+ years of experience in software engineering, data science, or AI/ML engineering roles.
- Expert-level proficiency in Python with a deep understanding of modern libraries (Pandas, NumPy, FastAPI/Flask, asyncio).
- Strong proficiency in Golang with experience building production-grade microservices, REST APIs, and concurrent systems.
- Deep understanding of LLM architectures, capabilities, and limitations with hands-on experience using OpenAI, Anthropic, Hugging Face, or similar platforms.
- Proven expertise in prompt engineering techniques, including context management, system prompts, and optimization strategies for various use cases.
- Strong knowledge of RAG architectures, including chunking strategies, retrieval methods, and context management.
- Deep understanding of embeddings (text, multimodal), vector databases, and semantic search principles.
- Proficient in SQL with experience in database design, optimization, and working with both relational and vector databases.
- Experience with model fine-tuning, evaluation metrics, and A/B testing of AI systems.
- Strong system design skills with the ability to architect scalable, fault-tolerant AI-powered applications.
- Self-driven, with the ability to own projects end-to-end and deliver production-quality solutions.
Bonus points if you have:
- Experience building and deploying production RAG systems at scale.
- Familiarity with LangChain, LlamaIndex, or similar LLM orchestration frameworks.
- Experience with advanced prompting techniques such as ReAct, Tree of Thoughts, or multi-agent systems.
- Knowledge of model deployment and serving frameworks (vLLM, TGI, Triton).
- Experience with distributed systems and message queues (Kafka, RabbitMQ, Redis).
- Exposure to Kubernetes, Docker, and cloud platforms (AWS, GCP, Azure).
- Understanding of LLM fine-tuning techniques (LoRA, QLoRA, PEFT).
- Experience with monitoring, observability, and LLMOps tools (LangSmith, Weights & Biases).
- Contributions to open-source AI/ML projects or technical writing/speaking experience.
- Familiarity with graph databases (Neo4j) for knowledge graph applications.