Posted on: 12/11/2025
Description :
Requirements :
A day in the life :
- Architect and deploy scalable infrastructure and platform services (monitoring, logging, etc.) with a focus on simplicity and automation.
- Own the performance and reliability of backend services, data pipelines, platform services, etc., and work with developers to ensure monitoring and alerting best practices are adopted.
- Develop platform capabilities, such as release management, monitoring infrastructure, logging centralization, CI/CD, etc., that enable developers to develop and deploy software with high velocity and quality.
- Implement DevSecOps principles and continuously secure systems by conforming to InfoSec best practices.
- Keep a keen eye on infrastructure costs and capacity, and design systems to be cost-effective.
- Develop a high-level understanding of Fairmatic services and their relations, enabling you to debug and address critical issues and bottlenecks.
- Exemplify and foster Fairmatic's humble, collaborative and impact-obsessed culture.
What you will need :
- 6+ years in a Site Reliability role (DevOps/System Administration) maintaining Linux systems in cloud environments (we use AWS)
- Excellent understanding of Linux systems and networking fundamentals
- Deep understanding of monitoring and alerting systems such as Prometheus, Graphite or equivalent
- Experience with infrastructure automation tools such as Ansible, Puppet or equivalent (we use Ansible)
- Expertise with general-purpose scripting and programming languages like Python, Ruby or equivalent, and shell scripting (Bash)
- Experience managing HTTP APIs, message brokers like Kafka, and relational databases like PostgreSQL
- Knowledge of Hadoop and Big Data processing frameworks like Spark is a plus!
- Comfortable working in a highly agile, intensely iterative software development environment
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1573866