Job Description

What you'll do :


- Build LLM apps : Design APIs, microservices, and UIs that use function calling, tools, and streaming responses.

- RAG pipelines : Ingest/clean data, chunk and embed it, choose retrieval strategies (BM25/hybrid), and tune for relevance & latency; see the retrieval sketch after this list.

- Prompt & policy engineering : Craft prompts, guardrails, and safety checks (PII redaction, jailbreak defense); a redaction sketch also follows this list.

- Model ops : Integrate managed (Azure OpenAI) and open-source (Llama, Mistral) models; choose/optimize runtimes (vLLM/Triton).

- Evaluation & quality : Establish automatic evals (correctness, toxicity, hallucination, latency, cost/token); build golden test sets and CI gates.

- Observability : Add tracing, metrics, and logs (OpenTelemetry); set error budgets & SLOs.

- Security & compliance : Secrets/RBAC, data residency, audit trails; align to SOC2/GDPR.

- Cost control : Token budgeting, caching, batching, quantization/LoRA where appropriate.

- Collaboration : Partner with Product, SecOps, and FinOps; review PRs and mentor juniors.
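
To make the first two responsibilities concrete, here is a minimal sketch of hybrid (BM25 + embedding) retrieval feeding a streamed chat completion. It assumes the OpenAI Python SDK, rank_bm25, and numpy are installed and OPENAI_API_KEY is set; the documents, model names, and 50/50 fusion weights are illustrative placeholders, not this team's actual stack.

```python
import numpy as np
from openai import OpenAI
from rank_bm25 import BM25Okapi

client = OpenAI()

docs = [
    "Refunds are processed within 5 business days.",
    "Premium plans include priority support and a 99.9% uptime SLA.",
    "Passwords can be reset from the account settings page.",
]

# Dense scores: embed docs and query, rank by cosine similarity.
def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)
query = "How long does a refund take?"
q_vec = embed([query])[0]
dense = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))

# Sparse scores: BM25 over whitespace-tokenized docs.
bm25 = BM25Okapi([d.lower().split() for d in docs])
sparse = np.array(bm25.get_scores(query.lower().split()))

# Hybrid fusion: min-max normalize each signal, then take a weighted sum.
def norm(x):
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

hybrid = 0.5 * norm(dense) + 0.5 * norm(sparse)
top_doc = docs[int(hybrid.argmax())]

# Stream an answer grounded in the top-ranked chunk.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{top_doc}"},
        {"role": "user", "content": query},
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```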
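And a minimal sketch of the kind of pre-call guardrail the prompt & policy bullet refers to: redact obvious PII before the text reaches the model. The regex patterns and placeholder tokens are illustrative assumptions, not a complete redaction policy.

```python
import re

# Illustrative patterns only; a production policy would cover far more PII types.
PII_PATTERNS = {
    "[EMAIL]": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "[PHONE]": re.compile(r"\+?\d[\d\s\-()]{8,}\d"),
}

def redact_pii(text: str) -> str:
    """Replace matched PII spans with placeholder tokens before prompting."""
    for token, pattern in PII_PATTERNS.items():
        text = pattern.sub(token, text)
    return text

print(redact_pii("Contact Priya at priya.s@example.com or +91 98765 43210."))
# -> Contact Priya at [EMAIL] or [PHONE].
```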

Minimum qualifications :


- 4-5 years of software engineering (Python or TypeScript), shipping production services.

- Hands-on LLM experience (1-2 years) : built at least one production feature using OpenAI/Azure OpenAI/Bedrock/Vertex or OSS models.

- RAG with a vector DB (Pinecone, Redis, pgvector, Weaviate, Milvus) and embedding models.

- Solid with APIs (REST/GraphQL), Git, testing, and CI/CD (GitHub Actions/Azure DevOps).

- Cloud fundamentals on Azure/AWS/GCP, containers (Docker, Kubernetes basics).

- Clear written & verbal communication; comfort with docs and design reviews.

Nice to have :


- Agent frameworks (LangChain, LlamaIndex, Semantic Kernel, OpenAI Assistants), tools & MCPs.

- Evals frameworks (Ragas, DeepEval, Promptfoo), AB testing, offline/online metrics.

- Fine-tuning/LoRA, distillation, quantization; DSPy; retrieval re-ranking.

- Event systems (Kafka), queues (SQS), and caching layers.

- Frontend familiarity (React/Next.js) for rapid prototyping.

Tech stack (example) :


- Models : Azure OpenAI (GPT-4.x)

- Orchestration : OpenAI Assistants

- Data/RAG : Azure Cognitive Search

- Pipelines : GitHub Actions, Docker, Kubernetes/AKS, Terraform (AVM)

- Observability : OpenTelemetry, Grafana

- Testing/Evals : PyTest, SonarCloud (a golden-set eval gate is sketched below)
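
As a rough illustration of the golden-test-set CI gate mentioned under Evaluation & quality, here is how such a check might look in PyTest. The answer_question function, the golden cases, and the 0.8 threshold are hypothetical stand-ins for the team's real pipeline and datasets.

```python
import pytest


def answer_question(question: str) -> str:
    """Placeholder for the real RAG/LLM pipeline under test (hypothetical)."""
    canned = {
        "What is the refund window?": "Refunds are processed within 5 business days.",
    }
    return canned.get(question, "I don't know.")


def keyword_score(answer: str, required: list[str]) -> float:
    """Fraction of required keywords found in the answer; a crude correctness proxy."""
    hits = sum(1 for kw in required if kw.lower() in answer.lower())
    return hits / max(len(required), 1)


# Golden test set: each case pairs a question with facts a correct answer must contain.
GOLDEN_SET = [
    {"question": "What is the refund window?", "keywords": ["5 business days"]},
]


@pytest.mark.parametrize("case", GOLDEN_SET)
def test_golden_answers_meet_threshold(case):
    answer = answer_question(case["question"])
    # CI gate: fail the build if the eval score drops below the agreed threshold.
    assert keyword_score(answer, case["keywords"]) >= 0.8
```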

