Posted on: 29/10/2025
Job Title : Windows System Engineer (Disaster Recovery)
Location : Bangalore, Chennai, Pune, Mumbai, Hyderabad
Experience Level : 5 to 8 years
Job Summary :
We are seeking an experienced Windows System Engineer (DR) to manage, support, and optimize Windows-based enterprise infrastructure with a focus on Disaster Recovery (DR) and Business Continuity Planning (BCP).
The ideal candidate will have strong expertise in failover recovery, networking, database health checks, load balancing, and monitoring/log analytics using Splunk.
The role requires a proactive engineer who ensures systems are resilient, secure, and compliant with defined RTO/RPO objectives.
Key Responsibilities :
Disaster Recovery (DR) Management :
- Plan, implement, and maintain DR environments for Windows-based systems.
- Conduct failover and failback exercises to validate DR readiness and minimize downtime.
- Define and manage RTO (Recovery Time Objective) and RPO (Recovery Point Objective) metrics for critical systems.
- Perform DR drills and document lessons learned for continuous improvement.
System and Network Administration :
- Manage Windows Server environments (2016/2019/2022) including patching, performance tuning, and troubleshooting.
- Configure and monitor network components (DNS, DHCP, IP routing, firewall rules, VLANs) to ensure connectivity during DR operations.
- Collaborate with network teams to validate load balancing and failover mechanisms.
Database and Application Checks :
- Perform DB sanity checks and ensure data consistency post-DR switchovers.
- Coordinate with DBAs and application owners to validate system availability post-recovery.
Load Balancer and Failover Operations :
- Configure, test, and monitor load balancers (F5, Citrix, HAProxy, or similar) to ensure high availability.
- Validate load balancing rules, session persistence, and failover logic during DR scenarios.
Monitoring and Incident Management :
- Use Splunk to monitor system health, event logs, and performance metrics.
- Develop dashboards and alerts for proactive issue detection and resolution.
- Perform root cause analysis (RCA) for system outages and DR failures.
Documentation and Compliance :
- Maintain DR documentation, runbooks, and standard operating procedures (SOPs).
- Support audit and compliance activities by providing recovery metrics and validation reports.
- Collaborate with IT Security and Compliance teams to ensure DR processes adhere to organizational standards.
Required Skills & Qualifications :
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5 to 8 years of hands-on experience in Windows System Administration and Disaster Recovery Planning.
- Strong knowledge of Windows Server (2016/2019/2022) and Active Directory Services.
- Hands-on experience with failover clustering, replication, and backup technologies (e.g., Veeam, Commvault, Azure Backup).
- Solid understanding of RTO/RPO concepts and disaster recovery frameworks.
- Experience with networking fundamentals: TCP/IP, DNS, DHCP, VLANs, routing, and firewalls.
- Practical knowledge of load balancer configuration and failover testing.
- Experience using Splunk for log management, monitoring, and alerting.
- Familiarity with PowerShell scripting for automation and system checks.
- Excellent troubleshooting, analytical, and documentation skills.