Windows Updates Master Runbook v2 (Consolidated)

Categories: Documentation · Exported: 2026-04-18 00:44

Document Control


1) General Process Information

Objective

Standardize Windows update execution across standalone servers, failover clusters, and SQL Always On environments with controlled failover, restart, and validation.

Core Principles

  1. Track progress in the approved tracker (e.g., CriticalAppsSOCluster.xlsx).
  2. Start with systems where synchronization delay is longest (SCOM/SQL primary paths).
  3. Validate before and after each role move/failover/restart.
  4. Execute only in approved change windows.

High-Level Flow

  1. Check update status and readiness.
  2. Move servers to the correct batch (if needed).
  3. Apply updates (interactive or scheduled).
  4. Perform failover/failback procedures (where applicable).
  5. Validate synchronization, services, and evidence.

2) Check Update Status

Script Method

On the designated admin host (example in source: dfs-asufs-51), run:

C:\temp\CheckUpdateStatus1.ps1

Steps

  1. Open PowerShell as Administrator.
  2. Run script against current batch list.
  3. If checking another list, edit server list variable (line 2 in source note).
  4. Review results for:
    • pending updates
    • pending reboot
    • install state
  5. Save outputs (updates.csv + evidence screenshots).
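
Once the results are saved, the review in step 4 could be automated. The following is a minimal Python sketch, not the contents of CheckUpdateStatus1.ps1. It assumes updates.csv has columns named Server, PendingUpdates, PendingReboot, and InstallState; these names are hypothetical because the source does not show the file layout, and the server names are made up.

```python
import csv
import io

# Hypothetical column names; adjust to match the real updates.csv header.
SERVER, PENDING, REBOOT, STATE = "Server", "PendingUpdates", "PendingReboot", "InstallState"


def servers_needing_attention(csv_file):
    """Return {server: [reasons]} for rows that are not fully patched."""
    findings = {}
    for row in csv.DictReader(csv_file):
        reasons = []
        if int(row[PENDING] or 0) > 0:
            reasons.append("pending updates")
        if row[REBOOT].strip().lower() in ("true", "yes", "1"):
            reasons.append("pending reboot")
        if row[STATE].strip().lower() != "installed":
            reasons.append("install state: " + row[STATE].strip())
        if reasons:
            findings[row[SERVER]] = reasons
    return findings


# Inline data stands in for the exported file.
sample = io.StringIO(
    "Server,PendingUpdates,PendingReboot,InstallState\n"
    "srv-a,0,False,Installed\n"
    "srv-b,2,True,Downloading\n"
)
report = servers_needing_attention(sample)
for server, reasons in report.items():
    print(server, "->", ", ".join(reasons))
```

With a real export, pass an open file handle instead of the inline sample, for example open("updates.csv", newline="").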

Notes


3) Calendar / Change Window

Before Change

During Change

After Change


4) Move Batches

Purpose

Use this procedure when servers are in the wrong Software Center batch or are missing updates.

Procedure

  1. Send request to operations team (example: SGITTHybridCloudOperations).
  2. Include: server list, current batch, target batch, reason, evidence.
  3. Copy affected support owners.
  4. Link request/ticket ID in change tracker.
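
To keep requests complete, the items listed in step 2 can be checked before the message is sent. This is a small Python sketch under assumptions: the field names, the function name, and the sample values are all hypothetical, and sending the request is left to whatever mail or ticketing tool the team uses.

```python
# The five items from step 2, as hypothetical field names.
REQUIRED_FIELDS = ("servers", "current_batch", "target_batch", "reason", "evidence")


def build_batch_move_request(**fields):
    """Return the request text, or raise ValueError naming any missing item."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    lines = [
        "Servers: " + ", ".join(fields["servers"]),
        "Current batch: " + fields["current_batch"],
        "Target batch: " + fields["target_batch"],
        "Reason: " + fields["reason"],
        "Evidence: " + fields["evidence"],
    ]
    return "\n".join(lines)


# Made-up values for illustration only.
print(build_batch_move_request(
    servers=["srv-a", "srv-b"],
    current_batch="Batch 1",
    target_batch="Batch 2",
    reason="Updates not offered in current batch",
    evidence="updates.csv and screenshots attached",
))
```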

Batch Mapping Example (Reference)

Use this pattern to separate primary/secondary update waves.


5) Microsoft System Center (Software Center)

Option 1 — Interactive Install

  1. Launch Software Center.
  2. In Available Software, select only entries with Type = Update.
  3. Click Install Selected.
  4. Monitor status: Waiting to install → Installing → Finished / Requires restart.
  5. If prompted, click Restart (or reboot per policy).
  6. After reboot, verify host/services health.

Option 2 — Scheduled Install

  1. Select all required updates (Type = Update).
  2. Click Schedule Selected.
  3. In scheduling prompt, set:
    • Install outside business hours
    • Restart automatically after installation if needed
  4. Click Change software installation settings.
  5. In Options → Work Information:
    • define business hours correctly,
    • set valid workdays,
    • confirm non-working install window.
  6. Click Apply, then OK.
  7. Confirm updates show as Scheduled to install after ....
  8. Validate completion after schedule window.
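
A wrong business-hours definition in step 5 is an easy way to get an install or restart at the wrong time. The following Python sketch shows the check that the settings imply: a moment counts as "outside business hours" when it falls on a non-workday or outside the daily window. The 08:00 to 18:00 window and Monday to Friday workdays are placeholder assumptions, not values from the source, and the function does not read Software Center settings.

```python
from datetime import datetime, time


def is_outside_business_hours(moment, start=time(8, 0), end=time(18, 0),
                              workdays=(0, 1, 2, 3, 4)):
    """True when `moment` is on a non-workday or outside start..end on a workday.

    Workdays use datetime.weekday() numbering: Monday is 0, Sunday is 6.
    """
    if moment.weekday() not in workdays:
        return True
    return not (start <= moment.time() < end)


# 2026-04-17 is a Friday; 2026-04-18 is a Saturday.
print(is_outside_business_hours(datetime(2026, 4, 17, 22, 0)))  # Friday evening
print(is_outside_business_hours(datetime(2026, 4, 17, 10, 0)))  # Friday mid-morning
print(is_outside_business_hours(datetime(2026, 4, 18, 10, 0)))  # Saturday
```

Checking the planned install time this way before the change window is a cheap sanity test for the values entered under Work Information.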

Notes


6) SCOM Servers During Change

Why first

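The old primary database can take a long time to resynchronize after failover (up to ~2 hours). Starting with SCOM lets that sync run while the remaining servers are patched.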

Steps

  1. Verify updates installed correctly.
  2. Verify Always On synchronization healthy.
  3. Perform failover.
  4. When the new primary is synchronized, restart the old primary (no need to wait for full old-primary DB recovery first).
  5. Continue remaining servers; periodically check old primary sync progress.
  6. Perform failback when all DBs synchronized (if needed).
  7. Validate final primary sync.
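
The gate in step 6 (fail back only when all databases are synchronized) can be expressed as a simple check. This is a Python sketch under assumptions: the state strings follow the synchronization_state_desc values SQL Server reports in sys.dm_hadr_database_replica_states, the state data is assumed to have been collected already (for example from the dashboard or a query), and the database names are illustrative.

```python
def unsynchronized_databases(db_states):
    """Return, sorted, the databases whose state is not SYNCHRONIZED."""
    return sorted(
        name for name, state in db_states.items()
        if state.strip().upper() != "SYNCHRONIZED"
    )


def ready_for_failback(db_states):
    """True only when at least one database is listed and all are SYNCHRONIZED."""
    return bool(db_states) and not unsynchronized_databases(db_states)


# Illustrative snapshot taken while the old primary is still catching up.
snapshot = {
    "OperationsManager": "SYNCHRONIZED",
    "OperationsManagerDW": "SYNCHRONIZING",
}
print("Still syncing:", unsynchronized_databases(snapshot))
print("Ready for failback:", ready_for_failback(snapshot))
```

An empty snapshot is treated as not ready, so a failed data collection cannot be mistaken for a healthy state.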

Notes from source


7) Always On (Example: VAVM1327 / VAVM3773)

These servers may require a manual process (not fully covered by the update job in the global domain).

Procedure

  1. Open Failover Cluster Manager.
    • Check roles, topology, shared disk expectations.
  2. Open SSMS.
    • Ensure no critical jobs running.
    • Confirm the behavior of AO-related jobs.
  3. In Always On High Availability:
    • Availability Groups → Dashboard
    • verify replicas green
    • run failover wizard to synchronous counterpart
  4. Refresh dashboard and verify synchronization.
  5. If dashboard shows reverting/transient state:
    • do not force restart immediately
    • validate role state and primary ownership first
  6. Restart server only after role/sync validation.
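
Steps 5 and 6 amount to a small decision rule, sketched below in Python. It is one possible reading of the procedure, not an official policy: it assumes the role strings match role_desc (PRIMARY, SECONDARY, RESOLVING) and the state strings match synchronization_state_desc as reported by SQL Server, and it treats REVERTING and INITIALIZING as the transient states to hold on. The function name and database names are hypothetical.

```python
# States treated as transient in this sketch (an assumption, see lead-in).
TRANSIENT_STATES = {"REVERTING", "INITIALIZING"}


def restart_decision(local_role, db_states):
    """Return (ok, reason) for restarting the server that was just failed over from."""
    if local_role.strip().upper() != "SECONDARY":
        return False, "local replica is " + local_role + "; confirm primary ownership moved first"
    transient = sorted(
        name for name, state in db_states.items()
        if state.strip().upper() in TRANSIENT_STATES
    )
    if transient:
        return False, "transient state on: " + ", ".join(transient)
    return True, "role and synchronization validated"


# Made-up database names for illustration.
print(restart_decision("SECONDARY", {"AppDb1": "SYNCHRONIZED", "AppDb2": "REVERTING"}))
print(restart_decision("SECONDARY", {"AppDb1": "SYNCHRONIZED", "AppDb2": "SYNCHRONIZED"}))
```

The role check comes first because a transient dashboard state matters less than restarting a server that still owns the primary role.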

Failback to Original Primary

  1. Log onto original primary.
  2. Check pending updates (allow settle period).
  3. In SSMS/AG, connect to current primary and verify no jobs running.
  4. Run failback wizard to original primary.
  5. Validate synchronization and database online status.

8) Failover Clusters (Shared Disks)

Example nodes in source

Procedure

  1. Confirm updates installed on all relevant nodes (including pending restart state if expected).
  2. In Failover Cluster Manager, identify active node.
  3. Verify disk mount ownership on active node.
  4. In SSMS, confirm that no jobs are running and that none are scheduled to start during the move.
  5. Move SQL roles to target node:
    • Roles → Move → Select Node
  6. Verify role transition and disk remount on target node.
  7. Confirm SQL service and DB availability.
  8. Restart as required from Software Center and re-validate.
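
The verification in steps 6 and 7 can be reduced to comparing each role's owner and state against the target node. The following Python sketch assumes the owner and state of each role have already been read from Failover Cluster Manager (or exported by other means) and that a healthy role reports the state "Online". The role and node names are made up.

```python
def move_problems(roles, target_node):
    """Return a list of problems; an empty list means every role is Online on the target.

    `roles` maps role name -> (owner node, state).
    """
    problems = []
    for name, (owner, state) in sorted(roles.items()):
        if owner.lower() != target_node.lower():
            problems.append(name + ": still owned by " + owner)
        elif state.lower() != "online":
            problems.append(name + ": state is " + state)
    return problems


# Illustrative snapshot taken after moving roles to node-b.
after_move = {
    "SQL Server (INST1)": ("node-b", "Online"),
    "SQL Server (INST2)": ("node-a", "Online"),
}
for problem in move_problems(after_move, "node-b"):
    print(problem)
```

Only when the list is empty should the restart in step 8 proceed on the node that was vacated.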

9) Post-Change Validation Checklist


10) Troubleshooting Notes


11) Suggested AI Wiki Structure


12) Scheduling Split + CM Preparation (Friday/Saturday)

Note

Steps to follow

  1. Create a CM for servers highlighted for Thursday/Friday before automatic restart.
  2. Create a second CM for Saturday for the remaining servers.
  3. GIS and OCHA application servers require a special graceful shutdown procedure before restart.

Referenced procedure docs


13) Automatic Launch via SQL Jobs on ARS

CM timing note

Execution model

Job hosts by domain

File location (both servers)

Referenced files