Posted on: 18/12/2025
Description :
Key Responsibilities :
Software Engineering & Cloud Infrastructure (Primary Focus) :
- Design, develop, and optimize cloud-native backend services using Python and/or Node.js for AI-driven applications on AWS or Azure
- Build and deploy scalable architectures using serverless computing, containers, and managed cloud services
- Develop and manage Infrastructure as Code (IaC) using Terraform, AWS CloudFormation, or Azure Resource Manager (ARM) templates to automate and standardize cloud deployments
- Implement and maintain CI/CD pipelines for AI model deployments, backend services, and automated testing using tools such as GitHub Actions, Azure DevOps, or similar
- Design and develop high-performance APIs and microservices (FastAPI, gRPC, REST) to serve AI capabilities such as LLM inference, agent workflows, and AI pipelines
- Optimize systems for security, scalability, latency, reliability, and cost efficiency
Reliability, Monitoring & Observability :
- Ensure high availability and reliability of AI systems through monitoring, logging, and alerting
- Implement observability solutions using Prometheus, CloudWatch, Azure Monitor, or equivalent tools
- Troubleshoot production issues and drive continuous improvements in system stability and performance
Collaboration & Best Practices :
- Collaborate with data scientists, ML engineers, and product teams to productionize AI models and workflows
- Participate in architectural discussions and contribute to cloud and AI platform roadmaps
- Follow best practices in software design, security, and DevOps
Required Skills & Qualifications :
- 7-10 years of overall software engineering experience
- Strong proficiency in Python and/or Node.js for backend development
- Hands-on experience with AWS or Azure cloud platforms
- Proven experience in cloud-native architectures, including serverless and container-based systems (Docker, Kubernetes)
- Expertise in Infrastructure as Code (Terraform, CloudFormation, ARM templates)
- Experience building and deploying CI/CD pipelines
- Strong experience with API and microservices development (FastAPI, gRPC, REST)
- Knowledge of monitoring, logging, and observability tools (Prometheus, CloudWatch, etc.)
- Solid understanding of security best practices in cloud environments
Preferred Qualifications :
- Experience working with Generative AI / LLM-based systems
- Familiarity with AI model serving, inference optimization, and agent frameworks
- Knowledge of MLOps practices and AI lifecycle management
- Experience with cost optimization strategies in cloud environments
- Exposure to multi-cloud or hybrid architectures