Posted on: 14/03/2026
Description:
Role Overview:
The LLM / Knowledge Engineer will be responsible for building and optimizing the LLM layer and knowledge retrieval systems. The role focuses on prompt engineering, model abstraction, and designing high-performance RAG pipelines to support enterprise AI applications.
Key Responsibilities:
- Design and maintain the LLM abstraction layer (LiteLLM or AWS Bedrock) enabling model swaps without application code changes.
- Implement model routing strategies based on cost, latency, and model capabilities.
- Develop and maintain system prompts and reusable prompt libraries.
- Implement structured output parsing and schema validation for LLM responses.
- Manage context window optimization and token budgeting.
- Design and implement the end-to-end RAG pipeline including ingestion, chunking, embedding, indexing, and retrieval.
- Manage vector databases including schema design, indexing strategies, and query optimization.
- Implement hybrid retrieval approaches (dense + sparse/BM25) and reranking strategies.
- Run prompt and retrieval evaluation experiments using frameworks such as Ragas, DeepEval, or Langfuse.
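To make the routing responsibility concrete, here is a minimal sketch of cost/latency-based model selection. Model names, prices, and latency figures are invented for illustration; a production system would implement this behind LiteLLM or Bedrock rather than in application code.

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str                   # hypothetical model identifier
    cost_per_1k_tokens: float   # USD, assumed figures for illustration
    p95_latency_ms: int
    max_context: int

CATALOG = [
    ModelProfile("fast-small", 0.0002, 300, 16_000),
    ModelProfile("balanced", 0.003, 900, 128_000),
    ModelProfile("frontier", 0.015, 2_500, 200_000),
]

def route(prompt_tokens: int, latency_budget_ms: int) -> ModelProfile:
    """Pick the cheapest model whose context window fits the prompt
    and whose p95 latency meets the caller's budget."""
    candidates = [
        m for m in CATALOG
        if m.max_context >= prompt_tokens and m.p95_latency_ms <= latency_budget_ms
    ]
    if not candidates:
        # Fall back to the largest-context model when nothing qualifies.
        return max(CATALOG, key=lambda m: m.max_context)
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)
```

The same shape extends naturally to capability-based routing (e.g. preferring a stronger model for tool use or long-form reasoning) by adding fields to the profile.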
Required Skills & Experience:
- Strong knowledge of modern LLMs including Claude, GPT-4, Llama, and Mistral
- Hands-on experience with LiteLLM, AWS Bedrock, or Azure OpenAI
- Production experience implementing RAG systems using LlamaIndex or LangChain
- Experience with vector databases such as OpenSearch, Qdrant, Weaviate, or Milvus
- Experience working with embedding models (Titan, Cohere, OpenAI embeddings)
- Strong Python programming skills
- Experience designing chunking strategies and retrieval quality metrics
- Familiarity with DSPy or automated prompt optimization frameworks
- Experience with data storage frameworks like Apache Iceberg or Delta Lake
- Knowledge of knowledge graph technologies such as Neptune or Neo4j
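The hybrid retrieval skill listed above can be sketched with Reciprocal Rank Fusion (RRF), a common way to merge dense (vector) and sparse (BM25) result lists. The document IDs and hit lists below are made up for illustration; real systems would pull them from a vector store and a lexical index.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-ID lists: each document scores the sum of
    1 / (k + rank) over every list it appears in (k=60 is the
    conventional RRF constant)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc3", "doc1", "doc7"]   # e.g. top-k from a vector index
sparse_hits = ["doc1", "doc9", "doc3"]  # e.g. top-k from BM25
fused = rrf_fuse([dense_hits, sparse_hits])
```

Documents ranked well by both retrievers rise to the top, which is the practical payoff of hybrid retrieval before any neural reranker is applied.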