Posted on: 16/01/2026
Description:
We are looking for a hands-on candidate with real-time streaming implementation experience spanning both Kafka and DevOps.
Key Responsibilities:
- Install, configure, and manage highly available Confluent Platform / Apache Kafka clusters (on-premises, cloud, and/or hybrid)
- Monitor cluster health and performance metrics, detect and troubleshoot issues, and perform capacity planning
- Implement and maintain security (SASL, SSL/TLS, Kerberos), ACLs, RBAC, and data encryption
- Automate Kafka operational tasks using Bash/Python/Ansible/Terraform and integrate with CI/CD pipelines
- Collaborate with development, data engineering, and DevOps teams to onboard new streaming use cases
- Perform upgrades, patching, disaster recovery, and ensure high availability
- Maintain documentation and provide 24x7 production support (rotational shifts)
Requirements:
- Experience with Confluent Cloud.
- Experience in automating Kafka infrastructure using Terraform.
- Hands-on knowledge of managing Kafka configurations using Ansible.
- Experience with observability and monitoring tools such as Datadog and Splunk.
- AWS cloud experience with working knowledge of EC2, ALB, EBS volumes, and security groups, plus the ability to navigate networking challenges confidently.
- Strong understanding of system design principles for distributed architectures.
Posted in
DevOps / SRE
Functional Area
Systems Administration
Job Code
1602515