hirist

Software Engineer - Large Language Models

Muoro
Bangalore
5 - 7 Years
Rating: 4.4 · 7+ Reviews

Posted on: 24/10/2025

Job Description

Description :

Role Overview :

We are seeking a highly skilled Software Engineer specializing in Large Language Models (LLMs) to design, develop, and deploy cutting-edge AI solutions leveraging state-of-the-art transformer architectures.

- The ideal candidate will have strong expertise in deep learning, NLP, and model optimization, combined with software engineering best practices for building scalable AI systems in production.

- You'll collaborate with data scientists, ML engineers, and product teams to build intelligent applications powered by advanced generative AI models such as GPT, LLaMA, Falcon, Mistral, Claude, or similar open-source and proprietary models.

Key Responsibilities :

- Design, train, fine-tune, and evaluate Large Language Models (LLMs) for specific use cases (e.g., summarization, code generation, chatbots, reasoning, and retrieval-augmented generation).

- Experiment with transformer-based architectures (e.g., GPT, T5, BERT, LLaMA, Mistral).

- Develop parameter-efficient fine-tuning (PEFT) strategies such as LoRA, QLoRA, adapters, or prompt-tuning.
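
As a hedged illustration of the core LoRA idea named in this bullet: instead of updating a full weight matrix W during fine-tuning, LoRA learns a low-rank update B @ A scaled by alpha / r, so the effective weight is W + (alpha / r) * (B @ A). The pure-Python matrices below are a toy sketch of that arithmetic only; real fine-tuning would use a library such as Hugging Face `peft`.

```python
# Toy sketch of the LoRA weight update: W_eff = W + (alpha / r) * (B @ A).
# Matrices are plain lists of lists; shapes are illustrative assumptions.

def matmul(X, Y):
    """Naive matrix multiply for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha, r):
    """Merge a rank-r LoRA adapter (B @ A) into the base weight W."""
    delta = matmul(B, A)           # low-rank update, same shape as W
    scale = alpha / r              # standard LoRA scaling factor
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 base weight with a rank-1 adapter: B is 2x1, A is 1x2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]
W_eff = lora_weight(W, A, B, alpha=2.0, r=1)
```

The point of the low-rank factorization is that only B and A (here 4 numbers instead of a full 2x2 update) need gradients, which is what makes the approach parameter-efficient at LLM scale.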

- Create and maintain high-quality datasets for pretraining, fine-tuning, and evaluation.

- Optimize model inference using techniques like quantization, distillation, and tensor parallelism for real-time or edge deployment.
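
Of the inference-optimization techniques this bullet names, quantization is the simplest to sketch. Below is a hedged, toy illustration of symmetric int8 post-training quantization: floats in [-max|x|, max|x|] are mapped onto integer codes in [-127, 127] with a single scale factor. Production systems would use library implementations (e.g. bitsandbytes or TensorRT), not hand-rolled code like this.

```python
# Toy symmetric int8 quantization: one scale factor per tensor.

def quantize_int8(values):
    """Return (int8 codes, scale) for a list of floats."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.0, 1.27]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each weight now occupies 1 byte instead of 4 (or 2), at the cost of a small, bounded rounding error per value; that memory reduction is what makes quantized LLMs viable for real-time or edge deployment.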

- Integrate LLMs into production environments using frameworks like Hugging Face Transformers, PyTorch Lightning, or DeepSpeed.

- Implement scalable model serving solutions using FastAPI, Ray Serve, Triton Inference Server, or similar frameworks.

- Build and maintain APIs or SDKs that expose LLM capabilities to other teams and products.

- Evaluate and experiment with open-source and proprietary foundation models.

- Keep up with the latest trends in Generative AI, NLP, and Transformer models.

- Perform benchmarking, ablation studies, and A/B testing to measure performance, cost, and quality improvements.

- Collaborate with ML Ops and DevOps teams to design CI/CD pipelines for model training and deployment.

- Manage and optimize GPU/TPU clusters for distributed training and inference.

- Implement robust monitoring, logging, and alerting for deployed AI systems.

- Ensure software follows clean code principles, version control, and proper documentation.

- Partner with product managers, data scientists, and UX teams to identify and translate business problems into AI-driven solutions.

- Contribute to internal research initiatives and help shape the company's AI strategy.

- Mentor junior engineers in AI model development, coding standards, and best practices.

Required Technical Skills :

Core Expertise :

- Strong proficiency in Python and deep learning frameworks (PyTorch, TensorFlow, JAX).

- Hands-on experience with transformer architectures and LLM fine-tuning.

- Deep understanding of tokenization, attention mechanisms, embeddings, and sequence modeling.
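
The attention mechanism this bullet refers to reduces to scaled dot-product attention: softmax(Q K^T / sqrt(d)) V. The pure-Python single-head sketch below is illustrative only (real implementations are batched tensor ops in PyTorch or JAX); the toy Q, K, V values are assumptions.

```python
# Toy scaled dot-product attention for one head, pure Python.
import math

def softmax(xs):
    m = max(xs)                                # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d)) V for lists-of-lists Q, K, V."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]                  # one score per key
        weights = softmax(scores)              # attention distribution
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])  # weighted sum of values
    return out

# One query attending over two key/value pairs.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
ctx = attention(Q, K, V)
```

Because the query aligns with the first key, the output weights the first value more heavily; the weights always sum to 1, which is the sense in which attention is a learned soft lookup over the sequence.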

- Experience with Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI API.

- Experience deploying models using Docker, Kubernetes, or cloud ML services (AWS Sagemaker, GCP Vertex AI, Azure ML, OCI Data Science).

- Familiarity with model optimization (quantization, pruning, distillation).

- Knowledge of retrieval-augmented generation (RAG) pipelines, vector databases (FAISS, Pinecone, Weaviate, Chroma).
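
The retrieval step of the RAG pipelines this bullet mentions can be sketched in a few lines: rank documents by cosine similarity between a query embedding and stored document embeddings, then feed the top hits to the LLM as context. The hand-made 3-dimensional vectors below stand in for real embeddings, and the document texts are invented examples; a production system would use an embedding model plus a vector database such as FAISS or Chroma.

```python
# Toy RAG retrieval: cosine-similarity ranking over (text, embedding) pairs.
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, docs, k=1):
    """Return the texts of the k documents most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

docs = [
    ("LLM fine-tuning guide", [0.9, 0.1, 0.0]),
    ("Kubernetes deployment notes", [0.0, 0.2, 0.9]),
]
top = retrieve([1.0, 0.0, 0.0], docs, k=1)
```

Vector databases exist precisely because this brute-force scan does not scale: they replace the `sorted` call with approximate nearest-neighbor indexes so retrieval stays fast over millions of embeddings.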

Additional Skills (Good to Have) :

- Experience with multi-modal models (text + image, text + code).

- Familiarity with MLOps tools like MLflow, Kubeflow, or Weights & Biases (W&B).

- Understanding of Responsible AI practices: bias mitigation, data privacy, and model explainability.

- Experience contributing to open-source AI projects.

