
AI Backend Developer - Python/Golang

Hashone Careers
Multiple Locations
3 - 6 Years
4.5 | 6+ Reviews

Posted on: 25/09/2025

Job Description

About the Role :

We are looking for a highly skilled AI Backend Developer to design, build, and scale backend systems that power cutting-edge Generative AI applications. You'll play a pivotal role in integrating Large Language Models (LLMs) into intelligent, agentic workflows and building robust, production-grade infrastructure for AI-powered services.

This is an exciting opportunity to work with state-of-the-art AI models (like LLaMA, Falcon, Mistral, GPT, Claude), experiment with LLM orchestration frameworks, and contribute to real-world AI product development. You'll be part of a cross-functional team that includes ML engineers, researchers, DevOps, and frontend developers.

Key Responsibilities :

- Design and develop scalable, high-performance backend services and APIs for AI-driven applications.

- Implement microservices and event-driven architectures to support agentic workflows and LLM interactions.

- Ensure low-latency and high-availability systems capable of handling real-time AI inference and orchestration.

- Integrate LLMs into backend systems using APIs (e.g., OpenAI, HuggingFace) or self-hosted models (e.g., LLaMA, Mistral, Falcon).

- Optimize prompt workflows, manage context windows, and handle token limitations.

- Work on caching strategies, input/output preprocessing, and inference acceleration for efficient LLM use.

- Build intelligent agents using frameworks like LangChain, Haystack, AutoGen, LlamaIndex, or custom-built orchestration.

- Implement multi-agent coordination, tool usage, memory systems, and agent autonomy.

- Connect agents to external tools, APIs, databases, and user interfaces to enable decision-making and task automation.

- Serve models using tools such as FastAPI, Triton Inference Server, Ray Serve, or TorchServe (a minimal FastAPI sketch follows this list).

- Work with vector databases (Pinecone, FAISS, Weaviate, Qdrant) to implement Retrieval-Augmented Generation (RAG) pipelines.

- Collaborate with ML engineers to productionize fine-tuned or custom models.

- Write unit, integration, and load tests to validate the robustness of AI services.

- Monitor system health, latency, and usage metrics; set up alerts and dashboards.

- Maintain code quality, versioning, and CI/CD pipelines for backend components.

- Work closely with product managers, ML engineers, and design teams to scope and deliver features.

- Contribute to architectural decisions and backend standards for the AI engineering team.

- Maintain detailed documentation for APIs, internal tooling, and service architecture.
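
For illustration only, here is a minimal sketch of the model-serving piece described in the list above, assuming FastAPI and the hosted OpenAI chat API; the endpoint path, request schema, and model name are placeholders rather than anything specified for this role.

# Minimal LLM-backed service sketch (assumptions: FastAPI + openai>=1.0 client).
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI  # a self-hosted model (LLaMA, Mistral, Falcon) would swap in its own client

app = FastAPI()
client = OpenAI()  # reads OPENAI_API_KEY from the environment


class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 256


class GenerateResponse(BaseModel):
    completion: str


@app.post("/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest) -> GenerateResponse:
    # Forward the prompt to the LLM and return the completion text.
    # A production service would add retries, timeouts, caching, and token accounting.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": req.prompt}],
        max_tokens=req.max_tokens,
    )
    return GenerateResponse(completion=resp.choices[0].message.content or "")

In practice such a service would sit behind the caching, monitoring, and CI/CD practices listed above.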

Required Experience & Skills :

- 4 to 8 years of backend development experience, preferably with Python, but open to Node.js, Go, or Java-based stacks.

- Hands-on experience integrating and deploying LLMs in production (e.g., GPT-4, Claude, LLaMA, Mistral).

- Proficiency with agentic AI frameworks (e.g., LangChain, Haystack, AutoGen, LlamaIndex).

- Experience with fine-tuning open-source LLMs, including dataset preparation, training, and evaluation.

- Strong understanding of:

- Model serving and API design

- Vector databases and RAG patterns (a short sketch follows this list)

- Prompt engineering fundamentals

- Experience building scalable microservices and RESTful or gRPC APIs.
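
As a rough sketch of the vector-database and RAG pattern referenced above, the snippet below uses FAISS with the OpenAI embeddings and chat APIs purely as examples; any of the listed vector stores (Pinecone, Weaviate, Qdrant) and any embedding model could stand in, and the document set, model names, and helper functions are illustrative assumptions only.

# RAG sketch (assumptions: faiss-cpu, numpy, openai>=1.0).
import numpy as np
import faiss
from openai import OpenAI

client = OpenAI()
EMBED_MODEL = "text-embedding-3-small"  # placeholder embedding model


def embed(texts: list[str]) -> np.ndarray:
    # Embed a batch of texts as float32 vectors for the FAISS index.
    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")


# Index a toy document set; in production this would be a managed vector database.
documents = [
    "Invoices are processed within 3 business days.",
    "Refund requests must include the original order ID.",
]
doc_vectors = embed(documents)
index = faiss.IndexFlatL2(doc_vectors.shape[1])
index.add(doc_vectors)


def answer(query: str, k: int = 1) -> str:
    # Retrieve the k nearest documents and use them to ground the LLM's answer.
    _, ids = index.search(embed([query]), k)
    context = "\n".join(documents[i] for i in ids[0])
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}],
    )
    return resp.choices[0].message.content or ""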

Nice-to-Have Skills :

- Experience with cloud platforms (AWS, GCP, Azure), including services like S3, Lambda, ECS, or Vertex AI.

- Familiarity with infrastructure-as-code (Terraform, Pulumi) or container orchestration (Docker, Kubernetes).

- Previous experience contributing to AI/LLM-powered product development.

- Exposure to MLOps tools and practices: model versioning, logging, monitoring, and retraining triggers.

