Description :
Join us as we work to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.
Role Summary :
Join athenahealth as a Senior Site Reliability Engineer MTS at the Associate level, based in Whitefield, Bangalore, in a hybrid work environment.
This role offers an exciting opportunity to contribute to the reliability and scalability of critical cloud infrastructure and services.
You will collaborate closely with engineering teams to ensure high availability and performance of our systems.
This position reports directly to a Senior Manager.
Team Summary :
The Logging, Metrics, and Monitoring (LMM) team plays a pivotal role in delivering observability services and tools that empower engineering teams across Cloud Engineering & Operations and Research & Development.
Our team builds and maintains large-scale, distributed, fault-tolerant systems that collect, store, and analyze vast volumes of log and metric data.
These solutions are essential for hundreds of developers daily, enabling them to monitor, troubleshoot, and optimize web services effectively.
By providing robust observability infrastructure, the LMM team supports data-driven decision-making and continuous improvement across the organization.
Essential Job Responsibilities :
- Develop and maintain scalable, reliable logging, metrics, and monitoring systems using modern cloud-native technologies.
- Manage containerized environments leveraging Docker and Kubernetes to support application deployment and orchestration.
- Analyze system performance and reliability metrics to identify and resolve issues proactively.
- Collaborate with development teams to integrate observability best practices into the software development lifecycle.
- Automate operational processes to improve efficiency and reduce manual intervention.
- Participate in incident response and root cause analysis to enhance system resilience.
- Contribute to the design and implementation of infrastructure as code and configuration management solutions.
Additional Job Responsibilities :
- Assist in capacity planning and infrastructure scaling strategies.
- Support continuous integration and continuous deployment (CI/CD) pipelines to streamline releases.
- Document system architecture, operational procedures, and troubleshooting guides.
- Engage in knowledge sharing and mentoring within the team and broader engineering community.
- Evaluate and recommend new tools and technologies to enhance observability capabilities.
- Participate in cross-functional projects to improve overall platform reliability.
- Support compliance and security initiatives related to infrastructure and monitoring systems.
Expected Education & Experience :
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- 2 to 5 years of experience in site reliability engineering, systems engineering, or a related role.
- Proficiency with Linux, Docker, and Kubernetes in production environments.
- Experience with logging, metrics, and monitoring tools and frameworks.
- Strong scripting and automation skills using languages such as Python, Bash, or similar.
- Familiarity with cloud platforms and infrastructure as code tools is preferred.
- Excellent problem-solving skills and ability to work collaboratively in a team environment.
About Athenahealth :
Our vision: In an industry that becomes more complex by the day, we stand for simplicity.
We offer IT solutions and expert services that eliminate the daily hurdles preventing healthcare providers from focusing entirely on their patients, powered by our vision to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.
Our company culture: Our talented employees, or athenistas as we call ourselves, spark the innovation and passion needed to accomplish our vision.
We are a diverse group of dreamers and do-ers with unique knowledge, expertise, backgrounds, and perspectives.
We unite as mission-driven problem-solvers with a deep desire to achieve our vision and make our time here count.
Our award-winning culture is built around shared values of inclusiveness, accountability, and support.
What We Can Do For You :
Along with health and financial benefits, athenistas enjoy perks specific to each location, including commuter support, employee assistance programs, tuition assistance, employee resource groups, and collaborative workspaces. Some offices even welcome dogs.
We also encourage a better work-life balance for athenistas through our flexible approach to work.
While we know in-office collaboration is critical to our vision, we recognize that not all work needs to be done in an office full-time.
With consistent communication and digital collaboration tools, athenahealth enables employees to find a balance that feels fulfilling and productive for each individual situation.
In addition to our traditional benefits and perks, we sponsor events throughout the year, including book clubs, external speakers, and hackathons.
We provide athenistas with a company culture based on learning, the support of an engaged team, and an inclusive environment where all employees are valued.