Windows System Engineer - IT Disaster Recovery

SMARTWORK IT SERVICES
Multiple Locations
5 - 8 Years

Posted on: 29/10/2025

Job Description

Job Title : Windows System Engineer (Disaster Recovery)

Location : Bangalore, Chennai, Pune, Mumbai, Hyderabad

Experience Level : 5 to 8 years

Job Summary :

We are seeking an experienced Windows System Engineer (DR) to manage, support, and optimize Windows-based enterprise infrastructure with a focus on Disaster Recovery (DR) and Business Continuity Planning (BCP).

The ideal candidate will have strong expertise in failover recovery, networking, database health checks, load balancing, and monitoring/log analytics using Splunk.

The role requires a proactive engineer who ensures systems are resilient, secure, and compliant with defined RTO/RPO objectives.

Key Responsibilities :

Disaster Recovery (DR) Management :

- Plan, implement, and maintain DR environments for Windows-based systems.

- Conduct failover and failback exercises to validate DR readiness and minimize downtime.

- Define and manage RTO (Recovery Time Objective) and RPO (Recovery Point Objective) metrics for critical systems.

- Perform DR drills and document lessons learned for continuous improvement.

System and Network Administration :

- Manage Windows Server environments (2016/2019/2022) including patching, performance tuning, and troubleshooting.

- Configure and monitor network components (DNS, DHCP, IP routing, firewall rules, VLANs) to ensure connectivity during DR operations.

- Collaborate with network teams to validate load balancing and failover mechanisms.

Database and Application Checks :

- Perform DB sanity checks and ensure data consistency post-DR switchovers.

- Coordinate with DBAs and application owners to validate system availability post-recovery.

Load Balancer and Failover Operations :

- Configure, test, and monitor load balancers (F5, Citrix, HAProxy, or similar) to ensure high availability.

- Validate load balancing rules, session persistence, and failover logic during DR scenarios.

Monitoring and Incident Management :

- Use Splunk to monitor system health, event logs, and performance metrics.

- Develop dashboards and alerts for proactive issue detection and resolution.

- Perform root cause analysis (RCA) for system outages and DR failures.

Documentation and Compliance :

- Maintain DR documentation, runbooks, and standard operating procedures (SOPs).

- Support audit and compliance activities by providing recovery metrics and validation reports.

- Collaborate with IT Security and Compliance teams to ensure DR processes adhere to organizational standards.

Required Skills & Qualifications :

- Bachelor's degree in Computer Science, Information Technology, or a related field.

- 5 to 8 years of hands-on experience in Windows System Administration and Disaster Recovery Planning.

- Strong knowledge of Windows Server (2016/2019/2022) and Active Directory Services.

- Hands-on experience with failover clustering, replication, and backup technologies (e.g., Veeam, Commvault, Azure Backup).

- Solid understanding of RTO/RPO concepts and disaster recovery frameworks.

- Experience with networking fundamentals: TCP/IP, DNS, DHCP, VLANs, routing, and firewalls.

- Practical knowledge of load balancer configuration and failover testing.

- Experience using Splunk for log management, monitoring, and alerting.

- Familiarity with PowerShell scripting for automation and system checks.

- Excellent troubleshooting, analytical, and documentation skills.
