(Failover Cluster) VM Failover Scenarios When the Cluster and Client Adapter Fails

아이셩짱셩 2025. 5. 30. 11:10

📌 Your Configuration

  • Failover Cluster: Windows Server Failover Cluster (WSFC).
  • Cluster Shared Volumes (CSV): Shared storage used by virtual machines.
  • Two network adapters per node:
    • Adapter 1: Handles client access and storage traffic.
    • Adapter 2: Handles only storage traffic.

🔌 Scenario: You unplug the “Client and Storage” adapter

✅ What remains:

  • The storage-only adapter is still connected and active.
  • The CSV network may still be reachable (if the storage-only path is used for CSV redirection).
  • The node is still up and running and possibly reachable by the cluster through other network paths.

❌ What you lose:

  • Client connectivity to that node (e.g., RDP, VM access via that adapter).
  • Possibly cluster communication, if that adapter also carried cluster heartbeat and quorum traffic and there is no alternative path.

💡 Should the VM move to another node?

Not necessarily, and here’s why:

  • Cluster failover logic is based on cluster health and resource health, not just client access.
  • As long as:
    • The node is not isolated from the cluster (i.e., cluster heartbeat still working),
    • The node still has access to the CSV and the VM's VHDX files,
    • The VM is still running and healthy (from cluster’s perspective),

👉 The VM will stay on the same node.
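
The three conditions above can be sketched as a simple check. This is only an illustrative model, not how WSFC is implemented; `NodeState` and `vm_stays_on_node` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of a node as the cluster sees it."""
    heartbeat_ok: bool    # node still exchanges heartbeats with its peers
    csv_accessible: bool  # node can still reach the CSV and the VM's VHDX files
    vm_healthy: bool      # the VM resource is still reported as healthy


def vm_stays_on_node(state: NodeState) -> bool:
    """Return True only when all three conditions hold at the same time."""
    return state.heartbeat_ok and state.csv_accessible and state.vm_healthy


# Client adapter unplugged, but heartbeat and storage survive on the other adapter.
after_unplug = NodeState(heartbeat_ok=True, csv_accessible=True, vm_healthy=True)
print(vm_stays_on_node(after_unplug))  # True
```

Losing client access alone does not flip any of the three flags, which is why the VM stays put in this model.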


📉 When would the VM move?

The VM would typically fail over to another node if:

  • The node is removed from cluster membership (e.g., no network path to quorum).
  • The node loses access to the CSV (both adapters down or CSV redirected path fails).
  • The VM resource itself goes into a failed state (e.g., storage IO failure, health check failure).

 

🧯 VM Failover (Live Migration or Crash Failover) Will Happen If:

Case 1: Node loses cluster membership

If the node can't communicate with a majority of the cluster votes (quorum), its cluster service stops and the node drops out of active cluster membership. As a result:

  • All cluster roles on that node (including the VM) are restarted on another node.

Triggers:

  • You unplug the only adapter used for cluster communication.
  • There is no alternate network path to maintain cluster heartbeat.
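
The quorum rule behind Case 1 comes down to majority arithmetic. The sketch below assumes a fixed vote count and a hypothetical helper named `has_quorum`; a real cluster can adjust votes dynamically, which this example ignores.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return 2 * reachable_votes > total_votes


# Assumed setup: two nodes plus one witness = 3 votes in total.
TOTAL_VOTES = 3

# The isolated node can only count its own vote.
print(has_quorum(1, TOTAL_VOTES))  # False

# The surviving node still reaches the witness: 2 of 3 votes.
print(has_quorum(2, TOTAL_VOTES))  # True
```

In this model the unplugged node is the one that loses membership, and the surviving side takes over its roles.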

Case 2: Node loses CSV access

If the unplugged adapter was the only network path that allows the node to access the CSV (Cluster Shared Volume), then:

  • The node loses access to VM disks.
  • The VM may freeze or crash, and the cluster will eventually detect failure and move it.

Triggers:

  • Storage traffic is not redirected over the remaining adapter.
  • No SMB Multichannel or MPIO configured.
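
Whether Case 2 applies depends on whether any surviving adapter can still carry storage traffic. A minimal sketch, assuming each adapter is described by two flags; `Adapter` and `has_storage_path` are hypothetical names for this example only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Adapter:
    """Hypothetical description of one network adapter on the node."""
    name: str
    carries_storage: bool  # adapter is configured for storage traffic
    link_up: bool          # cable plugged in and link active


def has_storage_path(adapters: List[Adapter]) -> bool:
    """True if at least one adapter is up and allowed to carry storage traffic."""
    return any(a.link_up and a.carries_storage for a in adapters)


# The configuration from this post, after unplugging Adapter 1.
adapters = [
    Adapter("Adapter 1 (client + storage)", carries_storage=True, link_up=False),
    Adapter("Adapter 2 (storage only)", carries_storage=True, link_up=True),
]
print(has_storage_path(adapters))  # True
```

With the two-adapter layout described at the top, the storage-only adapter keeps the path alive, so Case 2 only triggers if storage traffic cannot actually use that second adapter.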

Case 3: VM resource health failure

If the VM itself fails (e.g., because it loses disk access or encounters I/O timeout):

  • The VM role in the cluster will be marked as Failed.
  • The cluster will attempt to restart the VM or fail it over to another node.

Triggers:

  • Disk latency or I/O errors due to sudden loss of access.
  • VHDX becomes inaccessible.

Case 4: CSV goes into redirected mode, then fails

If unplugging causes CSV to enter redirected I/O mode, and:

  • The redirection doesn’t work (e.g., remaining adapter isn't suitable),
  • Or performance is too poor and health checks fail,

Then the cluster may move ownership of the CSV to another node, or move the VM itself.
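
Case 4 can be pictured as a small fallback chain: direct I/O first, redirected I/O second, failure last. The sketch below is a simplified model with a hypothetical `csv_io_mode` function and made-up mode labels; it does not capture the performance or health-check side of the real behavior.

```python
def csv_io_mode(direct_path_ok: bool, redirect_path_ok: bool) -> str:
    """Pick the I/O mode a node would fall back to in this simplified model."""
    if direct_path_ok:
        return "direct"       # node reads and writes the storage itself
    if redirect_path_ok:
        return "redirected"   # I/O is sent over the network to another node
    return "unavailable"      # no usable path, so the VM's disks become unreachable


# Direct path lost, redirection works: the VM keeps running, only slower.
print(csv_io_mode(direct_path_ok=False, redirect_path_ok=True))   # redirected

# Direct path lost and redirection fails too: this is where a move becomes likely.
print(csv_io_mode(direct_path_ok=False, redirect_path_ok=False))  # unavailable
```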
