Posted on: 09/04/2026
Key Responsibilities:
- Own and manage the ELK platform across development, test, and production environments
- Design, build, enhance, and troubleshoot Logstash pipelines and Elastic Agent / Fleet-based integrations for ingestion, parsing, enrichment, filtering, routing, and indexing
- Build and support integrations between ELK and enterprise/cloud data sources, especially AWS services including CloudWatch, Kinesis, Firehose, S3, SQS/SNS, along with selected HTTP endpoint-based sources
- Administer and optimize Elasticsearch clusters including index design, mappings, templates, component templates, ILM, shard and replica strategy, storage planning, indexing/search performance, and cluster health
- Manage multi-node Elasticsearch cluster operations including master/data node behavior, shard allocation, rebalancing, failover awareness, node recovery, and resilience planning
- Support Kibana administration including data views, saved objects, alerting, reporting, spaces, roles, role mappings, and overall platform usability
- Manage OpenShift-hosted containerized ELK deployments including services, routes, persistent storage, rollout validation, pod health, operational stability, and platform troubleshooting
- Support Elastic Agent and Fleet Server administration, including agent policies, onboarding, enrollment/connectivity troubleshooting, and ingestion continuity monitoring
- Lead and support deployment, migration, upgrade, patching, version uplift, cutover, validation, and post-deployment monitoring for Elastic Stack 9.x
- Investigate and resolve ingestion failures, mapping conflicts, schema drift, missing data, continuity issues, cluster health issues, and production incidents across the full ingestion chain
- Support security, authentication, RBAC, SSL/TLS, and access control across the ELK ecosystem
- Manage Git-based ELK configuration and deployment artefacts, including required XML, YAML, and other configuration changes, with controlled promotion across environments
- Drive backup, recovery, disaster recovery validation, capacity planning, and platform readiness activities for the on-prem ELK environment
- Maintain technical documentation, SOPs, runbooks, and knowledge transfer artefacts for platform continuity and support readiness
- Work closely with analytics, engineering, infrastructure, and business stakeholders to ensure platform reliability, data availability, and operational support readiness
Required Skills and Qualifications:
- 7+ years of overall experience with strong hands-on expertise in ELK / Elastic Stack engineering and administration
- Deep expertise in Elasticsearch, Logstash, Kibana, Elastic Agent, Fleet, and Fleet Server
- Strong experience with AWS-integrated ELK ingestion, especially CloudWatch, Kinesis, Firehose, S3, SQS/SNS, and similar enterprise ingestion patterns
- Strong experience in on-prem, OpenShift-hosted / containerized ELK environments
- Proven hands-on experience in Elasticsearch cluster administration, including mappings, templates, ILM, shard strategy, storage planning, node recovery, performance tuning, and cluster operations
- Strong experience in Logstash pipeline development and troubleshooting
- Hands-on experience supporting Elastic Stack 9.x, including upgrades, migrations, patching, cutovers, and production support
- Good understanding of Docker, Kubernetes/OpenShift concepts, stateful workloads, persistent storage, services, routes, and platform operations
- Strong experience with monitoring, alerting, backup/recovery, RBAC, SSL/TLS, and operational support in ELK environments
- Hands-on experience with Git-based version control, preferably GitHub, for pipeline code, configuration changes, templates, and deployment artefacts
- Working knowledge of Python for scripting, automation, and ELK-related engineering tasks
- Strong troubleshooting, communication, and stakeholder management skills with the ability to work independently in a production support environment
Posted in: DevOps / SRE
Functional Area: DevOps / Cloud
Job Code: 1627279