Jarvis Technology & Strategy Consulting - Senior DevOps Engineer - Cloud Infrastructure

Posted on: 18/07/2025

Job Description

Role Summary :

We are seeking a Senior DevOps Engineer (SRE) to manage and optimize large-scale, mission-critical production systems.

- The ideal candidate will have a strong problem-solving mindset, extensive experience in troubleshooting, and expertise in scaling, automating, and enhancing system reliability.

- This role requires hands-on proficiency in tools like Kubernetes, Terraform, CI/CD, and cloud platforms (AWS, GCP), along with scripting skills in Ruby, Bash, or Go.

- The candidate will drive observability and monitoring initiatives using tools like Prometheus, Grafana, and APM solutions (New Relic).

- Strong communication, incident management skills, and a collaborative approach are essential.

- Experience in team leadership and multi-client engagement is a plus.

Ideal Candidate Profile :

- 4-6 years of solid experience as an SRE/DevOps engineer, with a proven track record of handling large-scale production environments.

- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

- Strong hands-on experience managing large-scale production systems.

- Strong production troubleshooting skills and the ability to handle high-pressure situations.

- Strong experience with data stores and streaming platforms (PostgreSQL, MongoDB, Elasticsearch, Kafka).

- Experience making production systems more scalable, highly available, and fault-tolerant.

- Hands-on experience with ELK or other logging and observability tools.

- Hands-on experience with Prometheus, Grafana, and APM tools.

- Problem-solving mindset.

- Strong skills in K8s, Terraform, Helm, AWS/GCP, etc.

- Proficient in scripting and automation with Ruby or Go.

- Strong fundamentals in DNS, networking, and Linux.

- Experience with APM tools such as New Relic and Datadog.

- Good experience with incident response, incident management, and writing detailed root cause analyses (RCAs).

- Experience applying application best practices to make apps more reliable and fault-tolerant.

- Strong leadership skills and the ability to mentor team members and provide guidance on best practices.

- Able to manage multiple clients and take ownership of client issues.

- Experience with Git and coding best practices.

Good to have :

- Team leadership experience.

- Experience handling multiple clients.

- Good communication skills.

Key Responsibilities :

Design and Development :

- Architect, design, and develop high-quality, scalable, and secure cloud-based software solutions.

- Collaborate with product and engineering teams to translate business requirements into technical specifications.

- Write clean, maintainable, and efficient code, following best practices and coding standards.

Cloud Infrastructure :

- Develop and optimize cloud-native applications, leveraging cloud services like AWS, Azure, or Google Cloud Platform (GCP).

- Implement and manage CI/CD pipelines for automated deployment and testing.

- Ensure the security, reliability, and performance of cloud infrastructure.

Technical Leadership :

- Mentor and guide junior engineers, providing technical leadership and fostering a collaborative team environment.

- Participate in code reviews, ensuring adherence to best practices and high-quality code delivery.

- Lead technical discussions and contribute to architectural decisions.

Problem Solving and Troubleshooting :

- Identify, diagnose, and resolve complex software and infrastructure issues.

- Perform root cause analysis for production incidents and implement preventative measures.

Continuous Improvement :

- Stay up-to-date with the latest industry trends, tools, and technologies in cloud computing and software engineering.

- Contribute to the continuous improvement of development processes, tools, and methodologies.

- Drive innovation by experimenting with new technologies and solutions to enhance the platform.

Collaboration :

- Work closely with DevOps, QA, and other teams to ensure smooth integration and delivery of software releases.

- Communicate effectively with stakeholders, including technical and non-technical team members.

Client Interaction & Management :

- Serve as a direct point of contact for multiple clients.

- Handle the unique technical needs and challenges of two or more clients concurrently.

- Engage in both direct interaction with clients and internal team coordination.

Production Systems Management :

- Must have extensive experience in managing, monitoring, and debugging production environments.

- Troubleshoot complex issues and ensure that production systems run smoothly with minimal downtime.
