Xcelpros - DevOps Engineer - Cloud Infrastructure

Xcelpros Technologies Pvt Limited
Multiple Locations
5 - 10 Years

Posted on: 15/10/2025

Job Description

Job Title : DevOps Engineer

Experience : 5+ years

Employment type : Part Time

Timings : 3 - 4 hours daily, starting from 7 pm onwards

Key Responsibilities :

Cloud Infrastructure Management (Azure & AWS) :

- Manage and scale cloud-based environments, ensuring high availability, fault tolerance, and security.

- Implement and maintain multi-tenant architectures in cloud platforms (Azure and AWS), ensuring resource isolation and efficient cost management.

- Configure and optimize resources across both Azure and AWS platforms, leveraging best practices for cloud security, networking, and performance.

CI/CD Pipeline Maintenance :

- Design, implement, and optimize continuous integration and continuous deployment (CI/CD) pipelines across development, staging, and production environments.

- Ensure high availability and minimal downtime for production releases by automating rollbacks, canary deployments, and blue-green deployments.

- Integrate monitoring and alerting into the pipelines to catch issues before they reach production.

AI/ML Ops Integration :

- Work closely with the AI/ML teams to deploy and monitor machine learning models in production environments, ensuring smooth integration with CI/CD pipelines.

- Implement and maintain automated workflows for ML model training, testing, and deployment.

- Use tools like Azure ML, SageMaker, or other relevant services to enable ML lifecycle management.

Test-Driven Development (TDD) Integration :

- Promote and implement Test-Driven Development (TDD) practices within CI/CD pipelines to improve software quality and reduce defects in production.

- Maintain code quality by integrating automated unit and integration tests into the pipelines and ensuring sufficient test coverage.

Automation & Scripting :

- Develop automation scripts and tools to streamline repetitive tasks across various environments.

- Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or ARM Templates.

- Create and manage cloud-native services such as Kubernetes (AKS, EKS), containerized applications, and serverless architectures.

Monitoring & Performance Tuning :

- Set up and manage monitoring systems (e.g., Azure Monitor, CloudWatch, Prometheus, Grafana) for tracking system performance, cost, and security metrics.

- Implement logging and alerting systems to ensure rapid detection of issues in production.

- Continuously improve infrastructure performance by analyzing bottlenecks and resource optimization opportunities.

Collaboration & Mentoring :

- Collaborate with software engineers to ensure smooth integration of new code into the pipelines, supporting Agile development cycles.

- Provide guidance and mentorship to junior DevOps engineers and developers on best practices for cloud infrastructure, CI/CD, and automation.

Security & Compliance :

- Ensure cloud infrastructure complies with industry standards for security, privacy, and governance.

- Implement security best practices, including identity and access management (IAM), encryption, and network security.

Skills & Qualifications :

Essential :

Cloud Experience :

- Extensive experience with Azure and AWS cloud platforms, with a strong understanding of their respective services (e.g., Azure Kubernetes Service, AWS EC2, S3, Lambda, CloudFormation, and Azure Resource Manager).

- Experience with multi-tenant architecture at scale on cloud platforms, including designing isolated environments and managing cross-tenant resources.

CI/CD & Automation :

- Strong experience with CI/CD tools such as Jenkins, GitLab CI, Azure DevOps, or AWS CodePipeline.

- Familiarity with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Azure ARM templates.

- Proficiency in containerization and orchestration technologies such as Docker and Kubernetes (AKS, EKS).

AI/ML Ops Knowledge :

- Familiarity with AI/ML Ops practices and tools (e.g., Azure Machine Learning, AWS SageMaker, MLFlow, Kubeflow).

- Experience automating the deployment and monitoring of machine learning models in production environments.

Programming & Scripting :

- Strong programming skills in languages such as Python, Go, or Ruby.

- Experience with scripting in Bash, PowerShell, or Python for automating tasks.

Test-Driven Development (TDD) :

- Proven experience integrating Test-Driven Development (TDD) into CI/CD pipelines.

- Familiarity with testing frameworks like JUnit, pytest, Mocha, or equivalent for unit, integration, and functional tests.

Monitoring & Logging :

- Experience with monitoring tools such as Prometheus, Grafana, CloudWatch, Azure Monitor, and Datadog.

- Familiarity with logging frameworks such as ELK Stack, Fluentd, or Splunk.

Security Best Practices :

- Experience with cloud security practices, including IAM, security groups, VPNs, and encryption standards.

The job is for:

May work from home