# WheaErrorLog Module Analysis

## Overview

WheaErrorLog is the SMM WHEA (Windows Hardware Error Architecture) error logging driver for the Intel Purley platform. The module initializes WHEA error record storage in SMRAM, registers SMI handlers for error record update and notification callbacks, and interfaces with platform MP (Multi-Processor) sync data, PCIe MM config space, and WHEA boot-time protocols to log hardware errors during OS runtime.

The module acts as the SMM-side counterpart to the DXE WHEA infrastructure, providing persistent error record buffers that survive across SMI entries.

## Address Range

`0x300` - `0x2FA8` (63 functions, ~153 KB image total including .data)

## Image Characteristics

| Property | Value |
|----------|-------|
| Full path | 0226_WheaErrorLog_7676e07fb4c8/WheaErrorLog.efi |
| MD5 | 24a2707e0606e07658e042b6a8f332e2 |
| Architecture | x64 (PE32+) |
| Image size | 0x25780 (153 KB) |
| .text | 0x300 - 0x3080 (11.6 KB, RX) |
| .rdata | 0x3080 - 0x3D80 (3.3 KB, R) |
| .data | 0x3D80 - 0x25380 (137 KB, RW) |

## Key Functions

| Address | Name | Size | Purpose |
|---------|------|------|---------|
| 0x5E8 | `_ModuleEntryPoint` | 0xED | DXE/SMM entry point |
| 0x21C0 | `sub_21C0` | 0x3B9 | Constructor chain: calls 14 init sub-functions |
| 0x25EC | `sub_25EC` | 0x78 | Main init: disables SET alarm, calls sub_26F4 |
| 0x26F4 | `sub_26F4` | 0x50D | Protocol installation + SMI handler registration |
| 0x2C04 | `sub_2C04` | 0x2EC | WHEA error record creation/handler |
| 0x2EF0 | `sub_2EF0` | 0xB6 | Pre-initialize error status blocks |
| 0x2FA8 | `sub_2FA8` | 0x28 | SMM notify callback to reset state |
| 0x2688 | `sub_2688` | 0x22 | SMI handler to enable WHEA logging |
| 0x2664 | `sub_2664` | 0x22 | SMI handler to disable WHEA logging |
| 0x2170 | `sub_2170` | 0x50 | SMM callback registration wrapper |
| 0x1ACC | `sub_1ACC` | 0x3A3 | Protocol resolution (WheaSiliconHooksLib) |
| 0x13B0 | `sub_13B0` | 0x4B8 | MP sync data initialization |
| 0x11A0 | `sub_11A0` | 0x20F | ACPI PM1 timer-based delay / WHEA boot protocol init |
| 0xC3C | `sub_C3C` | 0x172 | SMM memory allocation - get SMRAM ranges |
| 0xEBC | `sub_EBC` | 0xBA | PCI MMIO base (DxeMmPciBase) initialization |
| 0x1F5C | `sub_1F5C` | 0x60 | WHEA silicon hooks config data initialization |
| 0x2040 | `sub_2040` | 0xB6 | PCD protocol locator |
| 0xFDC | `sub_FDC` | 0x110 | HOB list locator (via SystemTable config table, GUID at 0x3D80) |
| 0x1154 | `sub_1154` | 0x49 | HOB list traversal by GUID |
| 0xA6C | `sub_A6C` | 0x67 | GUID comparison (8-byte QWORD compare) |
| 0x9E8 | `sub_9E8` | 0x82 | ZeroMem wrapper |
| 0x93C | `sub_93C` | 0xA9 | CopyMem wrapper |
| 0xB24 | `sub_B24` | 0x8C | Debug assertion + debug print via CMOS check |
| 0x890 | `sub_890` | 0x4E | LongJump wrapper |
| 0x1F04 | `sub_1F04` | 0x57 | Initialize error record structure (check subtype, translate IDs) |
| 0x1E70 | `sub_1E70` | 0x93 | ID translation table lookup (255-terminated array) |
| 0x26AC | `sub_26AC` | 0x46 | Find WHEA error status block by severity level |
| 0x6D8 | `sub_6D8` | 0x9B | Boot services table library init |
| 0x774 | `sub_774` | 0x3A | Runtime services table lib init |
| 0x7B0 | `sub_7B0` | 0xDF | SMM services table lib init |
| 0x2130 | `sub_2130` | 0x3D | SpinLock init (set to 1) |
| 0x1FBC | `sub_1FBC` | 0x48 | IO port word write (16-bit) |
| 0x2004 | `sub_2004` | 0x39 | IO port dword read (32-bit) |
| 0xAD4 | `sub_AD4` | 0x4F | Get debug print function pointer |

## Entry Points (Public API)

### Protocols Implemented / Consumed

The module consumes the following protocols (identified by GUID):

| GUID Address | GUID | Protocol Name (Inferred) | Consumer |
|-------------|------|--------------------------|----------|
| 0x3D80 | `7739F24C-93D7-11D4-9A3A-0090273FC14D` | EFI_HOB_LIST_GUID (System Table config table) | `sub_FDC` |
| 0x3DA0 | `F4CCBFB7-F6E0-47FD-9DD4-10A8F150C191` | EFI_SMM_BASE2_PROTOCOL_GUID | `sub_7B0`, `sub_13B0` |
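| 0x3DB0 | `86B091ED-1463-43B5-82A1-2C8B83CB8917` | Unidentified SMM protocol (not SmmBase2, which is at 0x3DA0; originally guessed as an SMM CPU or system-table protocol) | `sub_26F4`, `sub_1F5C`, `sub_1ACC` |
| 0x3DC0 | `6820ABD4-A292-4817-9147-D91DC8C53542` | Unidentified (treated as an SMM CPU protocol in this analysis; does not match gEfiSmmCpuProtocolGuid `EB346B97-975F-4A9F-8B22-F8E92BB3D569`) | `sub_26F4`, `sub_1ACC` |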
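| 0x3DD0 | `18A3C6DC-5EEA-48C8-A1C1-B53389F98999` | EFI_SMM_SW_DISPATCH2_PROTOCOL_GUID (gEfiSmmSwDispatch2ProtocolGuid) | `sub_26F4` |
| 0x3DE0 | `EEE07404-26EE-43C9-9071-4E48008C4691` | Unidentified (does not match the PI SW dispatch GUID, which is at 0x3DD0; possibly the protocol used to register `sub_2C04`) | `sub_26F4` |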
| 0x3DF0 | `11B34006-D85B-4D0A-A290-D5A571310EF7` | gEfiPcdProtocolGuid (PCD protocol) | `sub_2040` |
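| 0x3E00 | `441FFA18-8714-421E-8C95-587080796FEE` | Unknown (tentatively WHEA-related; 0x3E80 is the GUID treated as gEfiWheaBootProtocolGuid elsewhere in this analysis) | `sub_1ACC` |
| 0x3E10 | `C2702B74-800C-4131-8746-8FB5B89CE4AC` | EFI_SMM_ACCESS2_PROTOCOL (gEfiSmmAccess2ProtocolGuid) | `sub_C3C` |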
| 0x3E20 | `FD480A76-B134-4EF7-ADFE-B0E054639807` | EFI_MM_PCI_ROOT_BRIDGE_PROTOCOL or gEfiMmPciBaseProtocolGuid | `sub_EBC` |
| 0x3E30 | `A7CED760-C71C-4E1A-ACB1-89604D5216CB` | Unknown - sync data protocol | `sub_13B0` |
| 0x3E40 | `0067835F-9A50-433A-8CBB-852078197814` | EFI_MP_SERVICE_PROTOCOL (MP sync service) | `sub_13B0`, `sub_1F5C`, `sub_1ACC` |
| 0x3E50 | `5B1B31A1-9562-11D2-8E3F-00A0C969723B` | EFI_LOADED_IMAGE_PROTOCOL (gEfiLoadedImageProtocolGuid) | `sub_13B0` |
| 0x3E70 | `6D7E4A32-9A73-46BA-94A1-5F2F25EF3E29` | Unknown - WheaSilicon HOB protocol | `sub_1ACC` |
| 0x3E80 | `4A0266FE-FE57-4738-80AB-146E46F03A65` | Unknown - EFI_WHEA_BOOT_PROTOCOL | `sub_1ACC` |
| 0x3E90 | `F08FC315-CC4F-4D8C-B34C-B030C4E7B919` | Unknown - another platform protocol | `sub_1ACC` |
| 0x3EA0 | `9876CCAD-47B4-4BDB-B65E-16F193C4F3DB` | WHEA error source GUID A (CPER section type: Processor Generic) | `sub_2C04` |
| 0x3EB0 | `DC3EA0B0-A144-4797-B55B-53FA242B6E1D` | WHEA error source GUID B (closest CPER section type: Processor Specific IA32/X64, whose UEFI value has `B95B` in the fourth group) | `sub_2C04` |
| 0x3EC0 | `A5BC1114-6F64-4EDE-B863-3E83ED7C83B1` | WHEA error source GUID C (CPER section type: Platform Memory) | `sub_2C04`, `sub_1F04` |
| 0x3ED0 | `71761D37-32B2-45CD-A7D0-B0FEDD93E8CF` | WHEA error source GUID D (CPER section type: Intel VT for Directed I/O DMAr) | `sub_2C04` |
| 0x3EE0 | `D995E954-BBC1-430F-AD91-B44DCB3C6F35` | WHEA error source GUID E (CPER section type: PCI Express) | `sub_2C04` |
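The GUIDs in this table are matched by `sub_A6C`, which is described as an 8-byte QWORD compare. A minimal sketch of that idea follows, assuming the usual EFI_GUID in-memory layout (first three fields little-endian) and that the comparison covers both 8-byte halves; the helper names are hypothetical and not taken from the binary.

```python
import struct
import uuid
from typing import Tuple

def guid_to_qwords(text: str) -> Tuple[int, int]:
    """Return the two little-endian QWORDs of a GUID as laid out in memory."""
    raw = uuid.UUID(text).bytes_le  # Data1..Data3 little-endian, Data4 as-is
    return struct.unpack("<QQ", raw)

def guid_equal(a: str, b: str) -> bool:
    """Compare two GUIDs as two 64-bit halves (assumed behaviour of sub_A6C)."""
    lo_a, hi_a = guid_to_qwords(a)
    lo_b, hi_b = guid_to_qwords(b)
    return lo_a == lo_b and hi_a == hi_b

HOB_LIST_GUID = "7739F24C-93D7-11D4-9A3A-0090273FC14D"      # table entry 0x3D80
PCD_PROTOCOL_GUID = "11B34006-D85B-4D0A-A290-D5A571310EF7"  # table entry 0x3DF0

print([hex(q) for q in guid_to_qwords(HOB_LIST_GUID)])
print(guid_equal(HOB_LIST_GUID, PCD_PROTOCOL_GUID))
```

The same conversion is handy when searching a disassembly for a GUID, since the first QWORD is what appears as an immediate or data constant.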

### SMI Handlers (Registered via SmmSwDispatch2)

- **`sub_2688`** (0x2688) - SMI handler for SwSmi 157: Enables WHEA error logging (`byte_40A0 = 1`), installs PCI MMIO base
- **`sub_2664`** (0x2664) - SMI handler for SwSmi 158: Disables WHEA error logging (`byte_40A0 = 0`), releases PCI MMIO base
- **`sub_2C04`** (0x2C04) - Handler registered by `sub_26F4` as a callback on a consumed protocol: processes WHEA error records, reads CPER data from error sources, and writes it into the error status block
- **`sub_2FA8`** (0x2FA8) - SMM notification callback: clears WHEA state flags (`byte_40A1`, `byte_40A2`, `byte_40A3`)

## Internal Helpers

### Initialization Chain (from `sub_21C0`)

`sub_21C0` calls 14 init functions in sequence, each wrapped in ASSERT_EFI_ERROR. The 12 identified so far are:

| Order | Function | Purpose |
|-------|----------|---------|
| 1 | `sub_6D8` (0x6D8) | Init gImageHandle, gST, gBS (BootServices) |
| 2 | `sub_774` (0x774) | Init gRT (RuntimeServices) |
| 3 | `sub_7B0` (0x7B0) | Init gSmst (SMM Services Table), locate SMM Base2 protocol |
| 4 | `sub_C3C` (0xC3C) | Get SMRAM ranges via SMM Access Protocol |
| 5 | `sub_EB8` (0xEB8) | Thunk to SMM PCI MMIO base init |
| 6 | `sub_EBC` (0xEBC) | Init PCI MMIO base (DxeMmPciBase) |
| 7 | `sub_F78` (0xF78) | Get PCD value (token 5) - platform configuration |
| 8 | `sub_10EC` (0x10EC) | Init HOB list |
| 9 | `sub_11A0` (0x11A0) | ACPI PM timer delay + WHEA boot protocol init |
| 10 | `sub_13B0` (0x13B0) | MP sync data table init (per-CPU structures) |
| 11 | `sub_1ACC` (0x1ACC) | WHEA silicon hooks: resolve all remaining protocols |
| 12 | `sub_1868` (0x1868) | CPU topology detection (thread/core bits via CPUID) |

### Protocol Resolution (`sub_1ACC`)

`sub_1ACC` locates the protocols at 0x3E70, 0x3E40 (MP), 0x3E80 (gEfiWheaBootProtocolGuid), and 0x3E90, and looks up the HOB identified by the GUID at 0x3D90.

The function `sub_1154` searches the HOB list for a specific GUID (at 0x3D90: `5138B5C5-9369-48EC-5B97-38A2F7096675`). The address of the matching HOB's data, at offset +24 (past the 8-byte HOB header and the 16-byte GUID), is stored in `qword_4148` as a pointer to the WHEA silicon configuration data.
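The lookup performed by `sub_1154` can be sketched as a walk over a HOB list held in a byte buffer. This is an illustration under stated assumptions rather than a reconstruction of the function: it assumes the standard PI HOB header (16-bit type, 16-bit length, 32-bit reserved), GUID extension HOBs of type 0x0004, and an end-of-list HOB of type 0xFFFF. The function name `find_guid_hob` is hypothetical.

```python
import struct
import uuid
from typing import Optional

HOB_TYPE_GUID_EXTENSION = 0x0004
HOB_TYPE_END_OF_LIST = 0xFFFF

def find_guid_hob(hob_list: bytes, guid_text: str) -> Optional[int]:
    """Return the offset of the data of the first matching GUID HOB, or None."""
    wanted = uuid.UUID(guid_text).bytes_le
    offset = 0
    while offset + 8 <= len(hob_list):
        hob_type, hob_length, _reserved = struct.unpack_from("<HHI", hob_list, offset)
        if hob_type == HOB_TYPE_END_OF_LIST or hob_length < 8:
            return None  # end of list, or a malformed entry that cannot be skipped
        if (hob_type == HOB_TYPE_GUID_EXTENSION
                and hob_list[offset + 8:offset + 24] == wanted):
            return offset + 24  # data follows the 8-byte header and 16-byte GUID
        offset += hob_length
    return None

# Synthetic list: one unrelated HOB, one GUID HOB, then the end marker.
WHEA_HOB_GUID = "5138B5C5-9369-48EC-5B97-38A2F7096675"
sample = (
    struct.pack("<HHI", 3, 16, 0) + bytes(8)
    + struct.pack("<HHI", HOB_TYPE_GUID_EXTENSION, 32, 0)
    + uuid.UUID(WHEA_HOB_GUID).bytes_le + b"\xAA" * 8
    + struct.pack("<HHI", HOB_TYPE_END_OF_LIST, 8, 0)
)
print(find_guid_hob(sample, WHEA_HOB_GUID))
```

The returned offset plays the role of the value stored in `qword_4148`, except that the firmware stores an absolute pointer rather than a buffer offset.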

### Error Record Processing (`sub_2C04`)

The SMI handler at 0x2C04 is the core error logging function:

1. Checks if `byte_40A0` is set (WHEA logging enabled). Returns `EFI_NOT_READY` if not.
2. Calls `sub_2EF0`, which initializes the error status blocks if they have not been initialized yet (tracked by `byte_40A3`).
3. Extracts error record details from the input context (offset +128 from `a1` / CommunicationBuffer).
4. Calls `sub_1F04` to determine the error record subtype. If it matches the GUID at 0x3EC0 (the CPER Platform Memory section type), performs ID translation via `sub_1E70`.
5. Calls `sub_26AC` to find the appropriate error status block entry by severity (1 or 2).
6. Extracts a pointer from the error status block entry.
7. Determines error severity (192=non-fatal, 80=corrected, 228=fatal) based on source GUID matching (0x3EA0, 0x3EB0, 0x3EC0, 0x3ED0, 0x3EE0) and policy config.
8. If the error source matches the policy config, fills in the error record:
   - Sets severity flags
   - Copies error timestamp
   - Sets error source GUID
   - Copies FRU (Field Replaceable Unit) text if present (16/20 bytes)
   - Copies the variable-size error data payload (length held in the decompiler variable `n192`)
9. Updates the error block header fields (length, severity flags).

## State Management

### Global Variables

| Address | Name | Size | Purpose |
|---------|------|------|---------|
| 0x4098 | `qword_4098` | 8 | Error status accumulator / global init status |
| 0x40A0 | `byte_40A0` | 1 | WHEA logging enabled flag |
| 0x40A1 | `byte_40A1` | 1 | Dual-rank/dual-socket error pending flag |
| 0x40A2 | `byte_40A2` | 1 | Single-rank/single-socket error pending flag |
| 0x40A3 | `byte_40A3` | 1 | Error status blocks initialized flag |
| 0x40A8 | `SystemTable` | 8 | EFI System Table (gST) |
| 0x40B0 | `BootServices` | 8 | EFI Boot Services (gBS) |
| 0x40B8 | `qword_40B8` | 8 | Image handle (gImageHandle) |
| 0x40C0 | `qword_40C0` | 8 | Runtime Services (gRT) |
| 0x40C8 | `qword_40C8` | 8 | SMM System Table (gSmst) |
| 0x40D8 | `qword_40D8` | 8 | PCI MMIO base protocol pointer |
| 0x40E0 | `qword_40E0` | 8 | PCD value (token 5) |
| 0x40E8 | `qword_40E8` | 8 | HOB list pointer (VendorTable) |
| 0x40F0 | `qword_40F0` | 8 | MP sync data protocol pointer |
| 0x40F8 | `qword_40F8` | 8 | SMM Base2 protocol pointer |
| 0x4100 | `qword_4100` | 8 | SMM MP protocol pointer |
| 0x4108 | `unk_4108` | ... | MP sync data structure |
| 0x4110 | `qword_4110` | 8 | Number of WHEA error source entries |
| 0x4126 | `unk_4126` | ... | WHEA error source array (variable, 26 bytes per entry) |
| 0x4138 | `unk_4138` | 8 | MP Service protocol pointer (saved) |
| 0x4140 | `unk_4140` | 8 | SmmCpu protocol pointer (saved) |
| 0x4148 | `qword_4148` | 8 | WHEA silicon HOB data pointer (+24) |
| 0x4150 | `unk_4150` | 8 | WHEA Boot protocol pointer |
| 0x4158 | `unk_4158` | 8 | Saved protocol pointer |
| 0x4160 | `qword_4160` | 8 | PCD protocol pointer |
| 0x4170 | `unk_4170` | ... | WHEA silicon config data structure (328 bytes) |
| 0x4178 | `qword_4178` | 8 | Policy configuration data pointer (from SMM protocol) |
| 0x4180 | `qword_4180` | 8 | SMM CPU Service protocol pointer |
| 0x42E8 | `unk_42E8` | ... | WHEA silicon hooks config |
| 0x42F0 | `unk_42F0` | 8 | Protocol pointer |

### Per-CPU Data Structures (initialized in `sub_13B0`)

Four parallel arrays indexed by CPU (up to 512 CPUs):

| Address | Name | Element Size | Purpose |
|---------|------|-------------|---------|
| 0x9320 | `byte_9320` | 1 | Per-CPU active flag |
| 0x9B20 | `qword_9B20` | 8 | Per-CPU APIC ID map |
| 0xDB20 | `dword_DB20` | 4 | Per-CPU register values |
| 0xFB20 | `byte_FB20` | 1 | Per-CPU status (0=inactive, 1=active, 2=running) |

Also:

| Address | Name | Element Size | Purpose |
|---------|------|-------------|---------|
| 0x4308 | `byte_4308` | 40 | Per-CPU (up to 0x5000/40=512) sync data structure header |

Layout: entry offset = `64 * (socket + 448 * package)`, i.e. a 64-byte stride per socket and a `448 * 64` = 28672-byte stride per package.

### SpinLocks

| Address | Name | Size | Purpose |
|---------|------|------|---------|
| 0x9308 | `unk_9308` | 8 | SpinLock for MP sync |
| 0x9310 | `unk_9310` | 8 | SpinLock for MP sync |

### SMRAM Range Info

| Address | Name | Size | Purpose |
|---------|------|------|---------|
| 0x25328 | `qword_25328` | 8 | Number of SMRAM descriptors (total_size >> 5) |
| 0x25330 | `qword_25330` | 8 | SMRAM range descriptor array pointer |
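The `total_size >> 5` computation is consistent with 32-byte SMRAM descriptors. The sketch below assumes the standard `EFI_SMRAM_DESCRIPTOR` layout of four 64-bit fields (PhysicalStart, CpuStart, PhysicalSize, RegionState); the parsing helper and the sample values are hypothetical.

```python
import struct
from typing import List, NamedTuple

class SmramDescriptor(NamedTuple):
    physical_start: int
    cpu_start: int
    physical_size: int
    region_state: int

DESCRIPTOR_FORMAT = "<4Q"
DESCRIPTOR_SIZE = struct.calcsize(DESCRIPTOR_FORMAT)  # 32 bytes, hence ">> 5"

def parse_smram_map(buffer: bytes) -> List[SmramDescriptor]:
    """Split a raw SMRAM map into descriptors; count = size >> 5."""
    count = len(buffer) >> 5
    return [
        SmramDescriptor(*struct.unpack_from(DESCRIPTOR_FORMAT, buffer, i * DESCRIPTOR_SIZE))
        for i in range(count)
    ]

# Two made-up ranges, only to exercise the parser.
raw_map = (struct.pack(DESCRIPTOR_FORMAT, 0x7A000000, 0x7A000000, 0x400000, 0x1A)
           + struct.pack(DESCRIPTOR_FORMAT, 0x7A400000, 0x7A400000, 0x400000, 0x0A))
for descriptor in parse_smram_map(raw_map):
    print(hex(descriptor.physical_start), hex(descriptor.physical_size))
```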

### WHEA Error Status Block Table (at 0x3F28)

Entry structure (24 bytes per entry):
```
Offset  Size  Field
0x00    2     Type/severity level (1 or 2 = corrected or non-fatal)
0x02    4     Data size
0x08    8     Pointer to error status block structure
0x10    8     Pointer to error source specific data
```

Entry count at `qword_3F90`. Each entry points to an error status block whose first field is a flags/status value.
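A lookup by severity level over this table, in the spirit of `sub_26AC`, might look like the following sketch. It assumes the 24-byte entry layout shown above with two padding bytes after the data size, and a linear scan; both are assumptions, and the names are hypothetical.

```python
import struct
from typing import NamedTuple, Optional

ENTRY_FORMAT = "<HI2xQQ"  # level, data size, 2 pad bytes, two 64-bit pointers
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 24 bytes per entry

class StatusBlockEntry(NamedTuple):
    level: int
    data_size: int
    status_block_ptr: int
    source_data_ptr: int

def find_by_level(table: bytes, count: int, level: int) -> Optional[StatusBlockEntry]:
    """Return the first entry whose type/severity level matches, or None."""
    for index in range(count):
        fields = struct.unpack_from(ENTRY_FORMAT, table, index * ENTRY_SIZE)
        entry = StatusBlockEntry(*fields)
        if entry.level == level:
            return entry
    return None

# Two made-up entries standing in for the table at 0x3F28.
table = (struct.pack(ENTRY_FORMAT, 1, 0x1000, 0x8000, 0x9000)
         + struct.pack(ENTRY_FORMAT, 2, 0x2000, 0xA000, 0xB000))
print(find_by_level(table, 2, 2))
```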

## Data Structures

### Error Status Block (referenced from 0x3F28 table entries)

```
Offset  Size  Field
0x00    4     Status/flags bits (bit 0=valid, bit 1=overflow, bit 2=corrected overflow,
              bits 4-5=severity, bits 12-15=type mask)
0x04    4     Reserved
0x08    2     Error severity code
0x0A    2     Reserved
0x0C    4     Error source ID
0x10    16    Error source GUID
0x20    2     Timestamp seconds
0x22    2     Timestamp microseconds
```

### Per-CPU Sync Data Structure Entry (64 bytes each)

Layout derived from `sub_13B0` loops:
```
Offset  Size  Field
0x00    8     Reserved (zeroed)
0x08    8     APIC ID (qword_9B20 entry)
0x10    4     Register value (dword_DB20 entry)
0x14    1     Status byte (0/1/2)
0x18    40    Reserved (zeroed)
```

## Calling Patterns

### Module Entry Flow
```
_ModuleEntryPoint (0x5E8)
  -> sub_21C0 (0x21C0)  [constructors]
      -> sub_6D8   - Init boot services (gBS)
      -> sub_774   - Init runtime services (gRT)
      -> sub_7B0   - Init SMM services (gSmst)
      -> sub_C3C   - Get SMRAM ranges
      -> sub_EB8   - Thunk: init PCI MMIO
      -> sub_EBC   - PCI MMIO base init
      -> sub_F78   - Get PCD token 5
      -> sub_10EC  - Init HOB list
      -> sub_11A0  - ACPI PM timer + WHEA boot protocol
      -> sub_13B0  - MP sync data table init
      -> sub_1ACC  - WHEA silicon hooks protocol resolution
      -> sub_1868  - CPU topology detection (CPUID)
  -> sub_25EC (0x25EC)  [main init]
      -> sub_300   - Save/restore callee regs + SetJump
      -> sub_26F4  - Main protocol installation (SmiHandler registration)
  -> sub_257C (0x257C)  [on failure: unload]
      -> sub_DB0   - Thunk: unload handler
```

### SMI Handler Flow (WHEA error logging)
```
SMI Entry -> SmmCpuService Router -> sub_2C04 (0x2C04)
  -> sub_2EF0 (0x2EF0)  [pre-init error status blocks if needed]
      -> sub_9E8  [ZeroMem on each block]
  -> sub_1F04 (0x1F04)  [determine error type/severity]
      -> sub_A6C  [GUID compare - check for the 0x3EC0 section type]
      -> sub_1E70 [ID translation via lookup table]
  -> sub_26AC (0x26AC)  [find error status block by severity]
  -> sub_93C   [CopyMem - copy FRU text, timestamps, error data]
```

### SMM Enable/Disable Callback Flow
```
sub_2688 - SMI handler 157: enables WHEA
  -> Sets byte_40A0 = 1
  -> Installs PCI MMIO base

sub_2664 - SMI handler 158: disables WHEA
  -> Sets byte_40A0 = 0
  -> Releases PCI MMIO base

sub_2FA8 - SMM Notify callback: reset WHEA state
  -> Clears byte_40A1, byte_40A2, byte_40A3
```

## Dependencies

### Consumed Protocols (located via SMM protocol database / Boot Services)

| Protocol | Located by | Used by |
|----------|-----------|---------|
| EFI_SMM_BASE2_PROTOCOL | `sub_7B0` | `sub_13B0` |
| EFI_SMM_CPU_PROTOCOL | `sub_1ACC` | `sub_26F4`, `sub_1ACC` |
| EFI_SMM_SYSTEM_TABLE2 | `sub_7B0` | All SMM operations |
| EFI_SMM_SW_DISPATCH2 | `sub_26F4` | SMI handler registration |
| EFI_SMM_ACCESS2_PROTOCOL | `sub_C3C` | SMRAM enumeration |
| EFI_MM_PCI_ROOT_BRIDGE | `sub_EBC` | PCI configuration access |
| PCD Protocol | `sub_2040` | `sub_F78` |
| MP Service Protocol | `sub_1ACC` | `sub_13B0` |
| Protocol at 0x3E50 (GUID matches EFI_LOADED_IMAGE_PROTOCOL) | `sub_13B0` | CPU topology init |
| gEfiWheaBootProtocolGuid | `sub_1ACC` | Boot-time WHEA info |
| SystemTable HOB GUID | `sub_FDC` | HOB list discovery |

### Consumed Boot Services Functions

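The module calls through the following service-table offsets:
- Offset 192 (0xC0) - matches `gSmst->SmmRegisterProtocolNotify` in `EFI_SMM_SYSTEM_TABLE2` (in `EFI_BOOT_SERVICES` this offset is `InstallConfigurationTable`, and `RegisterProtocolNotify` sits at 0xA8)
- Offset 208 (0xD0) - `gSmst->SmmLocateProtocol` (via gSmst at 0x40C8+208)
- Offset 320 (0x140) - `gBS->LocateProtocol` (DXE boot services version)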
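Offsets like these can be cross-checked by laying out the two service tables by hand. The sketch below assumes an x64 build where every member after the 24-byte `EFI_TABLE_HEADER` occupies an 8-byte slot (the 32-bit `SmmFirmwareRevision` is assumed to be padded to 8 bytes), and it lists members in the order given by the UEFI and PI specifications. Under those assumptions, 0xC0 and 0xD0 land on protocol services only in the SMM system table.

```python
POINTER_SIZE = 8        # assumption: x64 image, one 8-byte slot per member
TABLE_HEADER_SIZE = 24  # EFI_TABLE_HEADER

SMST_FIELDS = [
    "SmmFirmwareVendor", "SmmFirmwareRevision", "SmmInstallConfigurationTable",
    "SmmIo.Mem.Read", "SmmIo.Mem.Write", "SmmIo.Io.Read", "SmmIo.Io.Write",
    "SmmAllocatePool", "SmmFreePool", "SmmAllocatePages", "SmmFreePages",
    "SmmStartupThisAp", "CurrentlyExecutingCpu", "NumberOfCpus",
    "CpuSaveStateSize", "CpuSaveState", "NumberOfTableEntries",
    "SmmConfigurationTable", "SmmInstallProtocolInterface",
    "SmmUninstallProtocolInterface", "SmmHandleProtocol",
    "SmmRegisterProtocolNotify", "SmmLocateHandle", "SmmLocateProtocol",
    "SmiManage", "SmiHandlerRegister", "SmiHandlerUnRegister",
]

BS_FIELDS = [
    "RaiseTPL", "RestoreTPL", "AllocatePages", "FreePages", "GetMemoryMap",
    "AllocatePool", "FreePool", "CreateEvent", "SetTimer", "WaitForEvent",
    "SignalEvent", "CloseEvent", "CheckEvent", "InstallProtocolInterface",
    "ReinstallProtocolInterface", "UninstallProtocolInterface",
    "HandleProtocol", "Reserved", "RegisterProtocolNotify", "LocateHandle",
    "LocateDevicePath", "InstallConfigurationTable", "LoadImage",
    "StartImage", "Exit", "UnloadImage", "ExitBootServices",
    "GetNextMonotonicCount", "Stall", "SetWatchdogTimer",
    "ConnectController", "DisconnectController", "OpenProtocol",
    "CloseProtocol", "OpenProtocolInformation", "ProtocolsPerHandle",
    "LocateHandleBuffer", "LocateProtocol",
]

def offsets(fields):
    """Map byte offset -> member name for a table that follows the header."""
    return {TABLE_HEADER_SIZE + i * POINTER_SIZE: name for i, name in enumerate(fields)}

SMST = offsets(SMST_FIELDS)
BS = offsets(BS_FIELDS)

for off in (0xC0, 0xD0, 0x140):
    print(hex(off), "gSmst:", SMST.get(off, "-"), "| gBS:", BS.get(off, "-"))
```

Which table a given call goes through still has to be confirmed from the base register in the disassembly (`qword_40C8` for gSmst, `BootServices` at 0x40B0 for gBS).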

### Consumed Libraries (linked statically)

- UefiBootServicesTableLib - gBS, gImageHandle init
- UefiRuntimeServicesTableLib - gRT init
- SmmServicesTableLib - SMM table init
- BaseLib - SetJump/LongJump
- BaseMemoryLibRepStr - CopyMem, ZeroMem
- SmmMemoryAllocationLib - SMRAM allocation
- DxeMmPciBaseLib - PCI MMIO configuration
- DxeHobLib - HOB list traversal
- DxePcdLib - PCD protocol access
- BaseIoLibIntrinsic - IO port access (PM1, CMOS)
- BaseSynchronizationLib - SpinLock init
- mpsyncdatalib - MP sync data
- WheaSiliconHooksLib - Platform WHEA hooks

### Consumed By (SMI Handlers)

- SMM Core dispatches to registered SwSmi handlers (0x9D / 157, 0x9E / 158)
- SMM CPU Service Callback calls `sub_2C04` for WHEA error notification
- SMM Notify calls `sub_2FA8` for state cleanup

### Platform Configuration

PCD token 5 is read via `sub_F78`:
- Value stored at `qword_40E0`
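- Used in `sub_11A0` for port accesses at 0x504 and 0x508 (1288), typical of an ACPI PM base of 0x500 (the decimal 1280 recorded for the first port corresponds to 0x500 rather than 0x504, so that address is uncertain)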
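For the delay loop in `sub_11A0`, which waits 357 units on port 0x508, a rough timing estimate is possible if those units are ACPI PM timer ticks. The sketch below assumes the standard 3.579545 MHz PM timer rate and a 24-bit counter (a 32-bit counter is also permitted by ACPI); the helper names are hypothetical.

```python
PM_TIMER_HZ = 3_579_545   # ACPI PM timer frequency
PM_TIMER_MASK = 0xFFFFFF  # assumption: 24-bit counter variant

def elapsed_ticks(start: int, now: int) -> int:
    """Ticks between two counter reads, tolerating one wraparound."""
    return (now - start) & PM_TIMER_MASK

def ticks_to_microseconds(ticks: int) -> float:
    """Convert PM timer ticks to microseconds."""
    return ticks * 1_000_000 / PM_TIMER_HZ

print(round(ticks_to_microseconds(357), 2))  # roughly 100 microseconds
print(elapsed_ticks(0xFFFFF0, 0x000010))     # wraps past zero
```

Under these assumptions the 357-unit wait corresponds to a delay of about 100 microseconds, which is a common granularity for firmware stall loops.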

CMOS access in debug logging (`sub_B24`):
- CMOS index 0x70 / 0x71, register 0x4C
- Determines debug verbosity level based on platform type

## Error Source Types (at 0x3EA0-0x3EE0)

| GUID | Inferred Error Type | Severity Code |
|------|--------------------|---------------|
| 0x3EA0 | Processor Generic (CPER section type) | 0 (non-updated) |
| 0x3EB0 | Processor Specific IA32/X64 (closest CPER section type) | Skip in handler |
| 0x3EC0 | Platform Memory (CPER section type) | 80 (corrected) |
| 0x3ED0 | Intel VT for Directed I/O DMAr (CPER section type) | 80 (corrected, special flag) |
| 0x3EE0 | PCI Express (CPER section type) | 228 (fatal) |

## Notes

- The module is an SMM driver that registers into gSmst (EFI_SMM_SYSTEM_TABLE2). Despite referencing BootServices, it operates in SMM phase.
- Error status block table at 0x3F28 contains entries with 24-byte stride. Entry count is at qword_3F90.
- WHEA error source configuration array at 0x4126 has 26-byte per-entry stride. Count at qword_4110.
- The error handler at 0x2C04 processes a communication buffer (in SMM) at offset +128 from the input argument, which contains: timestamp at +8, FRU text at +16(16 bytes)/+32(20 bytes), error data at +72, subtype GUID at +16, error severity flags at +10.
- The ID translation table in `sub_1E70` is a 255-terminated array of 8-byte entries (4 WORDs each).
- Debug logging uses port 0x70/0x71 CMOS register 0x4C to detect platform debug level.
- The ACPI PM timer is used in `sub_11A0` for timed delay loops (waiting 357 units on port 0x508/1288; if the units are PM timer ticks at 3.579545 MHz, this is about 100 microseconds).
- The module path indicates this is built from `PurleyPlatPkg/Ras/Whea/WheaErrorLog/`.