Site Reliability Engineer - CI/CD Pipeline

SYMPHONYINCUBATOR BUSINESS SERVICES PRIVATE LIMITED
7 - 18 Years
Bangalore

Posted on: 07/04/2026

Job Description

Job Title : Site Reliability Engineer - Architect / Principal


Department :


Engineering - IRIS Smart Manufacturing Platform


Overview :


SymphonyAI is at the forefront of innovation, leveraging cutting-edge artificial intelligence and machine learning technologies to transform industries and drive business growth.


As a global leader in AI-powered solutions, SymphonyAI empowers organizations with enterprise applications that rapidly deliver transformative business value across retail, CPG, financial services, manufacturing, media, Enterprise IT, and the public sector.


We are on a mission to build a world-class engineering team with a high-performance culture.


Our solutions, hosted on the IRIS Smart Manufacturing platform, combine equipment and process domain expertise in Mining & Metals, Oil & Gas, and Chemicals & Petrochemicals with the state of the art in data science, machine learning, and process optimization.


The IRIS platform supports hybrid deployments and is built on a microservices architecture.


We are seeking a highly skilled SRE Architect / Principal to design, implement, and maintain highly available, scalable, and secure systems across cloud and on-premise environments.


The ideal candidate will combine deep technical expertise with a strategic mindset to drive reliability, automation, and performance across mission-critical applications.


This is a hands-on role with architect-level responsibilities, including mentoring teams, shaping platform reliability practices, and influencing operational strategy.



Responsibilities :


- Contribute to the IRIS Platform operations roadmap and execute planned research and development.


- Lead the design, deployment, and operations of large-scale systems on AWS (EKS) or Azure (AKS), ensuring reliability, scalability, and security.


- Serve as the principal architect for platform reliability, performance, and disaster recovery strategies.


- Build and maintain CI/CD pipelines for microservices and containerized applications deployable across EKS/AKS clusters.


- Implement Infrastructure as Code using Terraform, CloudFormation, or AWS CDK for cloud and Kubernetes environments.


- Apply SRE best practices, including SLIs, SLOs, error budgets, incident management, and post-mortem analysis.


- Conduct root cause analysis for production incidents and drive continuous improvement in reliability and operational efficiency.


- Implement monitoring, logging, and alerting across Kubernetes clusters using Prometheus, Grafana, and the EFK stack.


- Optimize platform performance, scalability, and cost in cloud and hybrid environments.


- Mentor junior engineers and act as a technical authority on reliability, cloud architecture, and DevOps practices.


- Ensure compliance with security, governance, and operational standards across all deployments.


Required Skills & Qualifications:


- 7+ years of experience in Site Reliability Engineering (SRE).


- 3+ years of hands-on experience working with Linux systems.


- 4+ years of commercial experience with Kubernetes.


- 2+ years of experience working with Docker.


- 4+ years of experience setting up and managing CI/CD pipelines.


- 4+ years of experience working with automation tools such as Terraform and Ansible.


- Experience with containerization technologies and related tooling, including Helm, in CI/CD pipelines.


- Good knowledge of security best practices and vulnerability management tools (e.g., Acunetix, Snyk, Checkmarx, Trivy).


- Experience troubleshooting production issues and performing root cause analysis.


- Ability to work effectively in an Agile environment.


Preferred Skills & Qualifications:


- Operational knowledge of databases (PostgreSQL, Elasticsearch, Redis, or similar).


- Exposure to configuring web servers such as Nginx.


- Working knowledge of monitoring tools such as Grafana and Prometheus.


- Working knowledge of a messaging framework such as Event Hub, Kafka, RabbitMQ, or similar.


Diversity & Inclusion Statement :


We are committed to building a diverse and inclusive team and encourage candidates from all backgrounds to apply.


About Us :


SymphonyAI is building the leading enterprise AI SaaS company for digital transformation across the most critical and resilient growth industries, including retail, consumer packaged goods, financial crime prevention, manufacturing, media, and IT service management.


Founded in 2017, SymphonyAI today serves 1,500+ enterprise customers globally and has grown to 3,000 talented leaders, data scientists, and other professionals across more than 30 countries.

