Posted on: 19/09/2025
What you'll build:
Retrieval & data grounding:
- Connectors for warehouses/blobs/APIs; schema validation and PII-aware pipelines; chunking/embeddings; hybrid search with rerankers; multi-tenant index management.
Orchestration & reasoning:
- Function/tool calling with structured outputs; controller logic for agent workflows; context/prompt management with citations and provenance.
Evaluation & observability:
- Gold sets + LLM-as-judge; regression suites in CI; dataset/version tracking; traces with token/latency/cost attribution.
Safety & governance:
- Input/output filtering, policy tests, prompt hardening, auditable decisions.
Performance & efficiency:
- Streaming, caching, prompt compression, batching; adaptive routing across models/providers; fallback and circuit-breaker strategies.
Product-ready packaging:
- Versioned APIs/SDKs/CLIs, Helm/Terraform, config schemas, feature flags, progressive delivery playbooks.
How you'll work:
- Collaborate asynchronously with Research, Product, and Infra/SRE.
- Share designs via concise docs and PRs; ship behind flags; measure, iterate, and document.
- Enable product teams through well-factored packages, SDKs, and runbooks.
Tech you'll use:
LLMs & providers:
- OpenAI, Anthropic, Google, Azure OpenAI, AWS Bedrock; targeted OSS where it fits.
Orchestration/evals:
- LangChain/LlamaIndex or lightweight custom layers; test/eval harnesses.
Services & data:
- Python (primary), TypeScript; FastAPI/Flask/Express; Postgres/BigQuery; Redis; queues.
Ops:
- Docker, CI/CD, Terraform/CDK, metrics/logs/traces; deep experience in at least one of AWS/Azure/GCP.