Posted on: 06/10/2025
Description :
Role : SRE DevOps Engineer with Kafka Expertise
About the Role :
We are looking for a highly skilled and motivated Site Reliability Engineering (SRE) / DevOps Engineer with strong, hands-on experience in Apache Kafka to join our dynamic team. In this role, you will be instrumental in building, maintaining, and scaling our mission-critical infrastructure, focusing heavily on automation, deployment pipelines, and ensuring the reliability and performance of our streaming data platform. This is a fantastic opportunity to drive technical excellence and stability in a fast-paced environment.
Key Responsibilities :
- Kafka Platform Management : Design, implement, and manage large-scale, high-availability Kafka clusters. Optimize Kafka brokers, topics, and consumer/producer configurations for performance and resilience.
- Automation & CI/CD : Design, implement, and maintain robust Continuous Integration and Continuous Deployment (CI/CD) pipelines using Jenkins and scripting languages to automate infrastructure provisioning and application deployments.
- Infrastructure Management : Provision, configure, and manage infrastructure components using Infrastructure as Code (IaC) tools like Ansible or Chef. Ensure consistency and repeatability across environments.
- Scripting & Tooling : Develop and maintain automation scripts in Shell and Groovy, or define deployment logic in YAML, to streamline operational tasks and integrate tools.
- System Operations : Administer and troubleshoot Linux-based systems, ensuring optimal performance, security, and stability.
- Database Operations : Perform basic database administration and querying tasks using SQL to support application deployments and troubleshooting.
- Reliability & Monitoring : Implement and manage monitoring, alerting, and logging systems to proactively identify and resolve performance bottlenecks and operational issues, especially within the Kafka ecosystem.
- Process Improvement : Apply basic principles of ITIL / ITSM to manage changes, incidents, and problems efficiently, contributing to service improvement efforts.
Required Skills & Experience :
- Experience : 4+ years of professional experience in a DevOps, SRE, or similar infrastructure role.
- Kafka Expertise (Strong Requirement) : Deep, demonstrable experience in operating, tuning, monitoring, and troubleshooting large-scale, production-grade Apache Kafka environments.
- CI/CD : Proficiency with Jenkins for pipeline creation, job configuration, and deployment automation.
- Configuration Management : Hands-on experience with at least one major configuration management tool (Ansible or Chef).
- Scripting : Advanced proficiency in Shell Scripting and experience with Groovy Scripting and/or configuring systems using YAML.
- Operating System : Strong administration and troubleshooting skills in Linux environments.
- Database Skills : Working knowledge of SQL for querying and basic database operations.
- IT Service Management : Basic understanding of ITIL / ITSM principles (e.g., Incident, Change, and Problem Management).
- Problem-Solving : Excellent analytical and problem-solving skills with a focus on root cause analysis.
Location & Commitment :
- Locations : Pune & Coimbatore
- Notice Period : Immediate joiners preferred
Posted in : DevOps / SRE
Functional Area : DevOps / Cloud
Job Code : 1556532