
(Failover Cluster) Script to Move VMs When the Client and Cluster Network Is Lost

아이셩짱셩 2025. 5. 30. 11:32

✅ PowerShell Script: Monitor & Auto-Evict Node on Network Loss

Here’s a working PowerShell script that:

  1. Checks if the local node has an interface in the "Client and Cluster" network that is disconnected.
  2. Waits for X seconds to see if the state persists.
  3. If so, forces a failover by:
    • Moving VMs (if still possible), or
    • Pausing and/or evicting the node to force failover.

🔧 Script Template:

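# Load the failover clustering cmdlets explicitly (useful when run non-interactively)
Import-Module FailoverClusters

# CONFIGURATION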
$WatchedNetworkName = "Client and Cluster"   # Name of the cluster network to monitor
$GracePeriodSeconds = 30                     # How long to wait before taking action
$LocalNode = $env:COMPUTERNAME               # This node

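function IsWatchedNetworkDown {
    $net = Get-ClusterNetwork | Where-Object { $_.Name -eq $WatchedNetworkName }
    if (-not $net) {
        Write-Host "Network '$WatchedNetworkName' not found." -ForegroundColor Yellow
        return $false
    }

    # Filter by network name and node name instead of comparing cluster objects to a string
    $iface = Get-ClusterNetworkInterface -Network $WatchedNetworkName -Node $LocalNode
    if (-not $iface) {
        # No matching interface: do not treat a missing result as "down"
        Write-Host "No interface on '$WatchedNetworkName' found for $LocalNode." -ForegroundColor Yellow
        return $false
    }

    # Down if any matching interface reports a state other than Up
    return [bool]($iface | Where-Object { $_.State -ne 'Up' })
}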

# Monitor network status
if (IsWatchedNetworkDown) {
    Write-Host "[$LocalNode] '$WatchedNetworkName' is down. Waiting $GracePeriodSeconds seconds..." -ForegroundColor Yellow
    Start-Sleep -Seconds $GracePeriodSeconds

    if (IsWatchedNetworkDown) {
        Write-Host "[$LocalNode] '$WatchedNetworkName' is still down. Taking action..." -ForegroundColor Red

        # 1. Pause the node and drain its roles (prevents VMs from returning here).
        #    -ForceDrain is the full parameter name; the node stays paused until Resume-ClusterNode is run.
        Suspend-ClusterNode -Name $LocalNode -Drain -ForceDrain

        # 2. Try to move any clustered VMs still owned by this node
        $vms = Get-ClusterGroup | Where-Object { $_.OwnerNode.Name -eq $LocalNode -and $_.GroupType -eq 'VirtualMachine' }
        foreach ($vm in $vms) {
            Write-Host "Attempting to move VM '$($vm.Name)' to another node..."
            # With no -Node given, the cluster picks the destination node
            Move-ClusterGroup -Name $vm.Name
        }

        # 3. Optional: Evict the node (use with caution!)
        # Remove-ClusterNode -Name $LocalNode -Force

        # 4. Optional: Restart node (to force failover + reset NIC)
        # Restart-Computer -Force
    } else {
        Write-Host "[$LocalNode] Network recovered during grace period. No action needed." -ForegroundColor Green
    }
} else {
    Write-Host "[$LocalNode] '$WatchedNetworkName' is up. No action needed." -ForegroundColor Green
}
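
The script above samples the interface only twice, once before and once after the grace period, so a link that flaps in between still triggers the action. As a minimal sketch, the grace period could instead be polled at a fixed interval, acting only if every sample reports the network as down. The helper name Test-DownForWholePeriod and its -Probe and -IntervalSeconds parameters are hypothetical and not part of the original script.

```powershell
# Hypothetical helper (not in the original script): returns $true only if
# every probe taken during the grace period reports the network as down.
function Test-DownForWholePeriod {
    param(
        [Parameter(Mandatory)] [scriptblock] $Probe,          # must return $true while the network is down
        [ValidateRange(1, 3600)] [int] $GracePeriodSeconds = 30,
        [ValidateRange(1, 3600)] [int] $IntervalSeconds = 5   # assumed polling interval
    )

    # Number of samples to take across the grace period (at least one)
    $attempts = [math]::Max(1, [math]::Ceiling($GracePeriodSeconds / $IntervalSeconds))

    for ($i = 0; $i -lt $attempts; $i++) {
        if (-not (& $Probe)) { return $false }   # network recovered: stop early
        if ($i -lt $attempts - 1) { Start-Sleep -Seconds $IntervalSeconds }
    }
    return $true
}
```

With this helper, the Start-Sleep call and the second IsWatchedNetworkDown check could be replaced by `if (Test-DownForWholePeriod -Probe { IsWatchedNetworkDown } -GracePeriodSeconds $GracePeriodSeconds) { ... }`.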

📝 Notes:

  • Run this script as a scheduled task or service on each node.
  • You can customize the action:
    • Suspend-ClusterNode: drains VMs and prevents new placement until Resume-ClusterNode is run.
    • Move-ClusterGroup: triggers a graceful migration (can fail if DC/DNS is unreachable).
    • Restart-Computer: forces a restart to simulate node failure (failover will occur).
    • Remove-ClusterNode: extreme; evicts the node from the cluster, so use with care.
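
Because the script performs a single check per run, it needs to be scheduled repeatedly. The following is a rough sketch of registering it with the built-in ScheduledTasks cmdlets. The script path, the task name, the one-minute interval, and running as SYSTEM are all assumptions; whether SYSTEM has sufficient cluster permissions depends on your environment, and an account with cluster administrator rights may be needed instead.

```powershell
# Assumed location of the saved monitoring script; adjust to your environment
$ScriptPath = 'C:\Scripts\Watch-ClusterNetwork.ps1'

# Run the script with Windows PowerShell, without loading a profile
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -ExecutionPolicy Bypass -File `"$ScriptPath`""

# Start now and repeat every minute (assumed interval; some older Windows Server
# versions may also require -RepetitionDuration)
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Minutes 1)

# Run as SYSTEM with highest privileges (assumption, see the note above)
$principal = New-ScheduledTaskPrincipal -UserId 'SYSTEM' -LogonType ServiceAccount -RunLevel Highest

# 'ClusterNetworkWatchdog' is a hypothetical task name
Register-ScheduledTask -TaskName 'ClusterNetworkWatchdog' `
    -Action $action -Trigger $trigger -Principal $principal
```

Run this once per node from an elevated session. The interval should be longer than $GracePeriodSeconds so that runs do not overlap.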