Posted on: 22/08/2025
Job Description :
Roles and Responsibilities :
Cluster Architecture & Deployment :
- Design, deploy, and manage RabbitMQ and Kafka clusters across on-premises, cloud, or hybrid environments.
- Configure brokers, exchanges, queues, topics, partitions, replication, and retention policies.
Operation & Monitoring :
- Monitor platform health using tools such as Prometheus, Grafana, Kafka Manager, and the RabbitMQ Management UI, and set up proactive alerting.
- Investigate and troubleshoot issues (e.g., latency, stale messages, downtime).
Security & Compliance :
- Implement access control, authentication (e.g., SASL, Kerberos), encryption (SSL/TLS), and RBAC/ACL policies.
- Enforce data governance and regulatory compliance.
Performance Tuning & Capacity Planning :
- Optimize throughput, latency, and resource usage by fine-tuning broker and pipeline configurations; plan capacity for future scaling.
Automation & Tooling :
- Automate routine tasks using scripting (Shell, Python) and IaC tools such as Ansible or Terraform, and integrate the platforms with CI/CD pipelines.
Disaster Recovery & Resilience :
- Design and maintain backup, restore, and failover strategies to achieve high availability across datacenters or cloud zones.
Collaboration & Support :
- Work with development, DevOps, and data engineering teams to align middleware integration with business requirements.
- Offer guidance, documentation, and knowledge transfer for messaging best practices.
Documentation & Governance :
- Maintain clear records of architecture, configuration standards, incident logs, and operational procedures.
Must Have Skills :
- 7+ years of overall experience, including 3+ years administering Kafka (Apache, Confluent, MSK) and RabbitMQ in production environments.
- Proven track record in monitoring, optimization, and incident resolution.
- Deep understanding of the RabbitMQ and Kafka ecosystems: brokers, connectors, ZooKeeper/KRaft, and Schema Registry.
- Proficiency with monitoring tools and middleware performance metrics.
- Strong collaboration, communication, and documentation abilities.
- Experience supporting cross-functional teams and mentoring juniors.
- Strong problem-solving skills and attention to detail.
Posted in: DevOps / SRE
Functional Area: Systems Administration
Job Code: 1533688