AMI-Aptio-BIOS-Reversed / WheaErrorLog / WheaErrorLog_analysis.md
@Ajax Dong Ajax Dong 2 days ago 18 KB Init

WheaErrorLog Module Analysis

Overview

SMM WHEA (Windows Hardware Error Architecture) error logging driver for the Intel Purley platform. This module initializes WHEA error record storage in SMRAM, registers SMI handlers for error record update and notification callbacks, and interfaces with platform MP (Multi-Processor) sync data, PCIe MM config space, and WHEA boot-time protocols to log hardware errors during OS runtime.

The module acts as the SMM-side counterpart to the DXE WHEA infrastructure, providing persistent error record buffers that survive across SMI entries.

Address Range

0x300 - 0x2FA8 (63 functions; 0x2FA8 is the start address of the last one, sub_2FA8; ~153 KB image total including .data)

Image Characteristics

| Property | Value |
|---|---|
| Full path | 0226_WheaErrorLog_7676e07fb4c8/WheaErrorLog.efi |
| MD5 | 24a2707e0606e07658e042b6a8f332e2 |
| Architecture | x64 (PE32+) |
| Image size | 0x25780 (153 KB) |
| .text | 0x300 - 0x3080 (11.6 KB, RX) |
| .rdata | 0x3080 - 0x3D80 (3.2 KB, R) |
| .data | 0x3D80 - 0x25380 (135 KB, RW) |

Key Functions

| Address | Name | Size | Purpose |
|---|---|---|---|
| 0x5E8 | _ModuleEntryPoint | 0xED | DXE/SMM entry point |
| 0x21C0 | sub_21C0 | 0x3B9 | Constructor chain: calls 14 init sub-functions |
| 0x25EC | sub_25EC | 0x78 | Main init: disables SET alarm, calls sub_26F4 |
| 0x26F4 | sub_26F4 | 0x50D | Protocol installation + SMI handler registration |
| 0x2C04 | sub_2C04 | 0x2EC | WHEA error record creation/handler |
| 0x2EF0 | sub_2EF0 | 0xB6 | Pre-initialize error status blocks |
| 0x2FA8 | sub_2FA8 | 0x28 | SMM notify callback to reset state |
| 0x2688 | sub_2688 | 0x22 | SMI handler to enable WHEA logging |
| 0x2664 | sub_2664 | 0x22 | SMI handler to disable WHEA logging |
| 0x2170 | sub_2170 | 0x50 | SMM callback registration wrapper |
| 0x1ACC | sub_1ACC | 0x3A3 | Protocol resolution (WheaSiliconHooksLib) |
| 0x13B0 | sub_13B0 | 0x4B8 | MP sync data initialization |
| 0x11A0 | sub_11A0 | 0x20F | ACPI PM timer-based delay / WHEA boot protocol init |
| 0xC3C | sub_C3C | 0x172 | SMM memory allocation - get SMRAM ranges |
| 0xEBC | sub_EBC | 0xBA | PCI MMIO base (DxeMmPciBase) initialization |
| 0x1F5C | sub_1F5C | 0x60 | WHEA silicon hooks config data initialization |
| 0x2040 | sub_2040 | 0xB6 | PCD protocol locator |
| 0xFDC | sub_FDC | 0x110 | HOB list locator (via SystemTable config table, GUID at 0x3D80) |
| 0x1154 | sub_1154 | 0x49 | HOB list traversal by GUID |
| 0xA6C | sub_A6C | 0x67 | GUID comparison (8-byte QWORD compare) |
| 0x9E8 | sub_9E8 | 0x82 | ZeroMem wrapper |
| 0x93C | sub_93C | 0xA9 | CopyMem wrapper |
| 0xB24 | sub_B24 | 0x8C | Debug assertion + debug print via CMOS check |
| 0x890 | sub_890 | 0x4E | LongJump wrapper |
| 0x1F04 | sub_1F04 | 0x57 | Initialize error record structure (check subtype, translate IDs) |
| 0x1E70 | sub_1E70 | 0x93 | ID translation table lookup (255-terminated array) |
| 0x26AC | sub_26AC | 0x46 | Find WHEA error status block by severity level |
| 0x6D8 | sub_6D8 | 0x9B | Boot services table library init |
| 0x774 | sub_774 | 0x3A | Runtime services table lib init |
| 0x7B0 | sub_7B0 | 0xDF | SMM services table lib init |
| 0x2130 | sub_2130 | 0x3D | SpinLock init (set to 1) |
| 0x1FBC | sub_1FBC | 0x48 | IO port word write (16-bit) |
| 0x2004 | sub_2004 | 0x39 | IO port dword read (32-bit) |
| 0xAD4 | sub_AD4 | 0x4F | Get debug print function pointer |

Entry Points (Public API)

Protocols Implemented / Consumed

The module consumes the following protocols (identified by GUID):

| GUID Address | GUID | Protocol Name (Inferred) | Consumer |
|---|---|---|---|
| 0x3D80 | 7739F24C-93D7-11D4-9A3A-0090273FC14D | EFI_HOB_LIST_GUID (System Table config table) | sub_FDC |
| 0x3DA0 | F4CCBFB7-F6E0-47FD-9DD4-10A8F150C191 | EFI_SMM_BASE2_PROTOCOL_GUID | sub_7B0, sub_13B0 |
| 0x3DB0 | 86B091ED-1463-43B5-82A1-2C8B83CB8917 | EFI_SMM_SYSTEM_TABLE2_PROTOCOL_GUID (gEfiSmmCpuProtocolGuid or SmmBase2) | sub_26F4, sub_1F5C, sub_1ACC |
| 0x3DC0 | 6820ABD4-A292-4817-9147-D91DC8C53542 | EFI_SMM_CPU_PROTOCOL_GUID | sub_26F4, sub_1ACC |
| 0x3DD0 | 18A3C6DC-5EEA-48C8-A1C1-B53389F98999 | Value matches the standard EFI_SMM_SW_DISPATCH2_PROTOCOL_GUID; treated in this analysis as EFI_SMM_CPU_SERVICE_PROTOCOL (SMM CPU Service) | sub_26F4 |
| 0x3DE0 | EEE07404-26EE-43C9-9071-4E48008C4691 | Treated in this analysis as EFI_SMM_SW_DISPATCH2_PROTOCOL_GUID, although the standard SwDispatch2 GUID is the value at 0x3DD0 | sub_26F4 |
| 0x3DF0 | 11B34006-D85B-4D0A-A290-D5A571310EF7 | gPcdProtocolGuid (PCD protocol) | sub_2040 |
| 0x3E00 | 441FFA18-8714-421E-8C95-587080796FEE | Unknown - EFI_WHEA_BOOT_PROTOCOL | sub_1ACC |
| 0x3E10 | C2702B74-800C-4131-8746-8FB5B89CE4AC | EFI_SMM_ACCESS2_PROTOCOL_GUID | sub_C3C |
| 0x3E20 | FD480A76-B134-4EF7-ADFE-B0E054639807 | EFI_MM_PCI_ROOT_BRIDGE_PROTOCOL or gEfiMmPciBaseProtocolGuid | sub_EBC |
| 0x3E30 | A7CED760-C71C-4E1A-ACB1-89604D5216CB | Unknown - sync data protocol | sub_13B0 |
| 0x3E40 | 0067835F-9A50-433A-8CBB-852078197814 | EFI_MP_SERVICE_PROTOCOL (MP sync service) | sub_13B0, sub_1F5C, sub_1ACC |
| 0x3E50 | 5B1B31A1-9562-11D2-8E3F-00A0C969723B | Value matches the standard EFI_LOADED_IMAGE_PROTOCOL_GUID; treated in this analysis as EFI_SMM_MP_PROTOCOL | sub_13B0 |
| 0x3E70 | 6D7E4A32-9A73-46BA-94A1-5F2F25EF3E29 | Unknown - WheaSilicon HOB protocol | sub_1ACC |
| 0x3E80 | 4A0266FE-FE57-4738-80AB-146E46F03A65 | Unknown - EFI_WHEA_BOOT_PROTOCOL | sub_1ACC |
| 0x3E90 | F08FC315-CC4F-4D8C-B34C-B030C4E7B919 | Unknown - another platform protocol | sub_1ACC |
| 0x3EA0 | 9876CCAD-47B4-4BDB-B65E-16F193C4F3DB | WHEA error source GUID A (Corrected Machine Check); value matches the UEFI CPER Processor Generic error section GUID | sub_2C04 |
| 0x3EB0 | DC3EA0B0-A144-4797-B55B-53FA242B6E1D | WHEA error source GUID B (Recoverable) | sub_2C04 |
| 0x3EC0 | A5BC1114-6F64-4EDE-B863-3E83ED7C83B1 | WHEA error source GUID C (PCIe / Corrected); value matches the UEFI CPER Platform Memory error section GUID | sub_2C04, sub_1F04 |
| 0x3ED0 | 71761D37-32B2-45CD-A7D0-B0FEDD93E8CF | WHEA error source GUID D (Fatal/Non-Maskable); value matches the UEFI CPER Intel VT-d DMAr error section GUID | sub_2C04 |
| 0x3EE0 | D995E954-BBC1-430F-AD91-B44DCB3C6F35 | WHEA error source GUID E (Corrected Machine Check variant); value matches the UEFI CPER PCI Express error section GUID | sub_2C04 |

SMI Handlers (Registered via SmmSwDispatch2)

  • sub_2688 (0x2688) - SMI handler for SwSmi 157: Enables WHEA error logging (byte_40A0 = 1), installs PCI MMIO base
  • sub_2664 (0x2664) - SMI handler for SwSmi 158: Disables WHEA error logging (byte_40A0 = 0), releases PCI MMIO base
  • sub_2C04 (0x2C04) - SMI handler registered via sub_26F4 callback (caller protocol): Processes WHEA error records, reads CPER data from error sources, writes them into the error status block
  • sub_2FA8 (0x2FA8) - SMM notification callback: Clears WHEA state flags (byte_40A1, byte_40A2, byte_40A3)

Internal Helpers

Initialization Chain (from sub_21C0)

sub_21C0 calls 14 init functions in sequence, each wrapped in ASSERT_EFI_ERROR. The table lists the 12 calls identified in the chain:

| Order | Function | Purpose |
|---|---|---|
| 1 | sub_6D8 (0x6D8) | Init gImageHandle, gST, gBS (BootServices) |
| 2 | sub_774 (0x774) | Init gRT (RuntimeServices) |
| 3 | sub_7B0 (0x7B0) | Init gSmst (SMM Services Table), locate SMM Base2 protocol |
| 4 | sub_C3C (0xC3C) | Get SMRAM ranges via SMM Access Protocol |
| 5 | sub_EB8 (0xEB8) | Thunk to SMM PCI MMIO base init |
| 6 | sub_EBC (0xEBC) | Init PCI MMIO base (DxeMmPciBase) |
| 7 | sub_F78 (0xF78) | Get PCD value (token 5) - platform configuration |
| 8 | sub_10EC (0x10EC) | Init HOB list |
| 9 | sub_11A0 (0x11A0) | ACPI PM timer delay + WHEA boot protocol init |
| 10 | sub_13B0 (0x13B0) | MP sync data table init (per-CPU structures) |
| 11 | sub_1ACC (0x1ACC) | WHEA silicon hooks: resolve all remaining protocols |
| 12 | sub_1868 (0x1868) | CPU topology detection (thread/core bits via CPUID) |

Protocol Resolution (sub_1ACC)

sub_1ACC looks up the protocols whose GUIDs are at 0x3E70, 0x3E40 (MP), 0x3E80 (gEfiWheaBootProtocolGuid), and 0x3E90, and also searches for the HOB identified by the GUID at 0x3D90.

The function sub_1154 searches the HOB list for a specific GUID (at 0x3D90: 5138B5C5-9369-48EC-5B97-38A2F7096675). The address of the HOB payload (HOB base + 24) is stored in qword_4148 as a pointer to the WHEA silicon configuration data.
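
The +24 offset is consistent with a PI GUID-extension HOB, where an 8-byte generic header is followed by a 16-byte GUID. The following is a minimal sketch of such a search, not the decompiled code: the type names, `guid_equal`, and `find_guid_hob_data` are hypothetical, and the two-QWORD comparison mirrors what sub_A6C is described as doing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic HOB header from the PI specification (8 bytes). */
typedef struct {
    uint16_t HobType;
    uint16_t HobLength;   /* total length of this HOB, header included */
    uint32_t Reserved;
} HobHeader;

#define HOB_TYPE_GUID_EXTENSION 0x0004
#define HOB_TYPE_END_OF_LIST    0xFFFF

/* GUID-extension HOB: header (8) + GUID (16), so the payload starts at +24. */
typedef struct {
    HobHeader Header;
    uint8_t   Name[16];
} HobGuidType;

static_assert(sizeof(HobGuidType) == 24, "payload must start at offset +24");

/* Compare two GUIDs as a pair of 64-bit words (hypothetical helper). */
static int guid_equal(const uint8_t *a, const uint8_t *b)
{
    uint64_t a0, a1, b0, b1;
    memcpy(&a0, a, 8); memcpy(&a1, a + 8, 8);
    memcpy(&b0, b, 8); memcpy(&b1, b + 8, 8);
    return a0 == b0 && a1 == b1;
}

/* Walk the HOB list; return the payload address (+24) or NULL if absent. */
static const void *find_guid_hob_data(const void *hob_list,
                                      const uint8_t guid[16])
{
    const uint8_t *p = hob_list;
    for (;;) {
        HobHeader h;
        memcpy(&h, p, sizeof h);
        /* Stop at the end marker, or on a malformed length. */
        if (h.HobType == HOB_TYPE_END_OF_LIST || h.HobLength < sizeof h)
            return NULL;
        if (h.HobType == HOB_TYPE_GUID_EXTENSION &&
            guid_equal(p + sizeof h, guid))
            return p + sizeof(HobGuidType);
        p += h.HobLength;
    }
}
```

In this sketch the returned pointer plays the role of the value saved in qword_4148.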

Error Record Processing (sub_2C04)

The SMI handler at 0x2C04 is the core error logging function:

  1. Checks if byte_40A0 is set (WHEA logging enabled). Returns EFI_NOT_READY if not.
  2. Calls sub_2EF0, which initializes the error status blocks if they have not been initialized yet.
  3. Extracts error record details from the input context (offset +128 from a1 / CommunicationBuffer).
  4. Calls sub_1F04 to determine error record subtype. If it matches 0x3EC0 (Corrected/PCIe), performs ID translation via sub_1E70.
  5. Calls sub_26AC to find the appropriate error status block entry by severity (1 or 2).
  6. Extracts a pointer from the error status block entry.
  7. Determines error severity (192=non-fatal, 80=corrected, 228=fatal) based on source GUID matching (0x3EA0, 0x3EB0, 0x3EC0, 0x3ED0, 0x3EE0) and policy config.
  8. If the error source matches the policy config, fills in the error record:
    • Sets severity flags
    • Copies error timestamp
    • Sets error source GUID
    • Copies FRU (Field Replaceable Unit) text if present (16/20 bytes)
    • Copies error data payload of variable size (n192 bytes)
  9. Updates the error block header fields (length, severity flags).
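
Steps 1 and 2, together with the enable/disable handlers and the notify callback, amount to a small state machine. The sketch below models only that gating logic under stated assumptions: all function names are hypothetical, the two booleans stand in for byte_40A0 and byte_40A3, and it assumes sub_2EF0 sets the "initialized" flag that sub_2FA8 later clears.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t efi_status_t;
#define EFI_SUCCESS   ((efi_status_t)0)
#define EFI_NOT_READY ((efi_status_t)0x8000000000000006ULL) /* EFIERR(6), 64-bit */

static bool logging_enabled;     /* models byte_40A0 */
static bool blocks_initialized;  /* models byte_40A3 */
static unsigned init_calls;      /* instrumentation for this sketch only */

/* Stand-in for sub_2EF0: zero and prepare the error status blocks. */
static void init_status_blocks(void)
{
    init_calls++;
    blocks_initialized = true;   /* assumption: the flag is set here */
}

static void smi_enable(void)   { logging_enabled = true; }     /* sub_2688 */
static void smi_disable(void)  { logging_enabled = false; }    /* sub_2664 */
static void notify_reset(void) { blocks_initialized = false; } /* sub_2FA8 */

/* Stand-in for the entry of sub_2C04. */
static efi_status_t log_error(void)
{
    if (!logging_enabled)
        return EFI_NOT_READY;    /* step 1 */
    if (!blocks_initialized)
        init_status_blocks();    /* step 2: lazy initialization */
    /* steps 3-9 (record parsing and copy) are omitted in this sketch */
    return EFI_SUCCESS;
}
```

The lazy initialization means the first error after a reset pays the cost of clearing the blocks; later errors skip it.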

State Management

Global Variables

| Address | Name | Size | Purpose |
|---|---|---|---|
| 0x4098 | qword_4098 | 8 | Error status accumulator / global init status |
| 0x40A0 | byte_40A0 | 1 | WHEA logging enabled flag |
| 0x40A1 | byte_40A1 | 1 | Dual-rank/dual-socket error pending flag |
| 0x40A2 | byte_40A2 | 1 | Single-rank/single-socket error pending flag |
| 0x40A3 | byte_40A3 | 1 | Error status blocks initialized flag |
| 0x40A8 | SystemTable | 8 | EFI System Table (gST) |
| 0x40B0 | BootServices | 8 | EFI Boot Services (gBS) |
| 0x40B8 | qword_40B8 | 8 | Image handle (gImageHandle) |
| 0x40C0 | qword_40C0 | 8 | Runtime Services (gRT) |
| 0x40C8 | qword_40C8 | 8 | SMM System Table (gSmst) |
| 0x40D8 | qword_40D8 | 8 | PCI MMIO base protocol pointer |
| 0x40E0 | qword_40E0 | 8 | PCD value (token 5) |
| 0x40E8 | qword_40E8 | 8 | HOB list pointer (VendorTable) |
| 0x40F0 | qword_40F0 | 8 | MP sync data protocol pointer |
| 0x40F8 | qword_40F8 | 8 | SMM Base2 protocol pointer |
| 0x4100 | qword_4100 | 8 | SMM MP protocol pointer |
| 0x4108 | unk_4108 | ... | MP sync data structure |
| 0x4110 | qword_4110 | 8 | Number of WHEA error source entries |
| 0x4126 | unk_4126 | ... | WHEA error source array (variable, 26 bytes per entry) |
| 0x4138 | unk_4138 | 8 | MP Service protocol pointer (saved) |
| 0x4140 | unk_4140 | 8 | SmmCpu protocol pointer (saved) |
| 0x4148 | qword_4148 | 8 | WHEA silicon HOB data pointer (+24) |
| 0x4150 | unk_4150 | 8 | WHEA Boot protocol pointer |
| 0x4158 | unk_4158 | 8 | Saved protocol pointer |
| 0x4160 | qword_4160 | 8 | PCD protocol pointer |
| 0x4170 | unk_4170 | ... | WHEA silicon config data structure (328 bytes) |
| 0x4178 | qword_4178 | 8 | Policy configuration data pointer (from SMM protocol) |
| 0x4180 | qword_4180 | 8 | SMM CPU Service protocol pointer |
| 0x42E8 | unk_42E8 | ... | WHEA silicon hooks config |
| 0x42F0 | unk_42F0 | 8 | Protocol pointer |

Per-CPU Data Structures (initialized in sub_13B0)

Four parallel arrays indexed by CPU (up to 512 CPUs):

| Address | Name | Element Size | Purpose |
|---|---|---|---|
| 0x9320 | byte_9320 | 1 | Per-CPU active flag |
| 0x9B20 | qword_9B20 | 8 | Per-CPU APIC ID map |
| 0xDB20 | dword_DB20 | 4 | Per-CPU register values |
| 0xFB20 | byte_FB20 | 1 | Per-CPU status (0=inactive, 1=active, 2=running) |

Also:

| Address | Name | Size | Purpose |
|---|---|---|---|
| 0x4308 | byte_4308 | 40 | Per-CPU (up to 0x5000/40=512) sync data structure header |

Layout: entry offset = 64 * (socket + 448 * package), i.e. a socket stride of 64 bytes and a package stride of 448 * 64 bytes.
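
The layout expression can be written out as a small helper. This is only a restatement of the formula above as code; `sync_entry_offset` and the macro names are hypothetical, and the meaning of "socket" and "package" is taken from the notes as-is.

```c
#include <stddef.h>

#define SYNC_ENTRY_SIZE 64u    /* bytes per entry (socket stride) */
#define PACKAGE_STRIDE  448u   /* entries per package */

/* Byte offset of an entry: 64 * (socket + 448 * package). */
static size_t sync_entry_offset(unsigned package, unsigned socket)
{
    return (size_t)SYNC_ENTRY_SIZE *
           ((size_t)socket + (size_t)PACKAGE_STRIDE * package);
}
```

For example, package 0 / socket 1 lands at byte 64, and package 1 / socket 0 at byte 448 * 64 = 28672 (0x7000).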

SpinLocks

| Address | Name | Size | Purpose |
|---|---|---|---|
| 0x9308 | unk_9308 | 8 | SpinLock for MP sync |
| 0x9310 | unk_9310 | 8 | SpinLock for MP sync |

SMRAM Range Info

| Address | Name | Size | Purpose |
|---|---|---|---|
| 0x25328 | qword_25328 | 8 | Number of SMRAM descriptors (total_size >> 5) |
| 0x25330 | qword_25330 | 8 | SMRAM range descriptor array pointer |

WHEA Error Status Block Table (at 0x3F28)

Entry structure (24 bytes per entry):

Offset  Size  Field
0x00    2     Type/severity level (1 or 2 = corrected or non-fatal)
0x02    4     Data size
0x08    8     Pointer to error status block structure
0x10    8     Pointer to error source specific data

Entry count at qword_3F90. Each entry points to an error status block whose first field is a flags/status value.
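
A lookup of the kind attributed to sub_26AC can be sketched over this 24-byte entry layout. The struct and function names are hypothetical, pointers are modelled as 64-bit integers, and 1-byte packing is used only because the notes place the 4-byte size field at offset 0x02.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
typedef struct {
    uint16_t level;        /* 0x00: 1 or 2 (corrected / non-fatal) */
    uint32_t data_size;    /* 0x02: data size */
    uint16_t pad;          /* 0x06: padding up to the first pointer */
    uint64_t status_block; /* 0x08: pointer to error status block */
    uint64_t source_data;  /* 0x10: pointer to source-specific data */
} StatusBlockEntry;
#pragma pack(pop)

static_assert(sizeof(StatusBlockEntry) == 24, "24-byte stride");
static_assert(offsetof(StatusBlockEntry, data_size) == 0x02, "size at 0x02");
static_assert(offsetof(StatusBlockEntry, status_block) == 0x08, "ptr at 0x08");
static_assert(offsetof(StatusBlockEntry, source_data) == 0x10, "ptr at 0x10");

/* Linear search for the first entry with the requested level. */
static const StatusBlockEntry *find_block(const StatusBlockEntry *table,
                                          size_t count, uint16_t level)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].level == level)
            return &table[i];
    }
    return NULL;
}
```

Here `count` corresponds to the value read from qword_3F90, and `table` to the array at 0x3F28.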

Data Structures

Error Status Block (referenced from 0x3F28 table entries)

Offset  Size  Field
0x00    4     Status/flags bits (bit 0=valid, bit 1=overflow, bit 2=corrected overflow,
              bits 4-5=severity, bits 12-15=type mask)
0x04    4     Reserved
0x08    2     Error severity code
0x0A    2     Reserved
0x0C    4     Error source ID
0x10    16    Error source GUID
0x20    2     Timestamp seconds
0x22    2     Timestamp microseconds
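
The status/flags word at offset 0x00 can be unpacked with plain shifts and masks. The sketch below follows the bit positions listed above; the `BlockStatus` type and `decode_status` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;               /* bit 0 */
    bool     overflow;            /* bit 1 */
    bool     corrected_overflow;  /* bit 2 */
    unsigned severity;            /* bits 4-5 */
    unsigned type_mask;           /* bits 12-15 */
} BlockStatus;

/* Split the 32-bit status/flags value into its documented fields. */
static BlockStatus decode_status(uint32_t raw)
{
    BlockStatus s;
    s.valid              = (raw & (1u << 0)) != 0;
    s.overflow           = (raw & (1u << 1)) != 0;
    s.corrected_overflow = (raw & (1u << 2)) != 0;
    s.severity           = (raw >> 4) & 0x3u;
    s.type_mask          = (raw >> 12) & 0xFu;
    return s;
}
```

For instance, a raw value of 0xA021 decodes to valid = true, severity = 2, type_mask = 0xA.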

Per-CPU Sync Data Structure Entry (64 bytes each)

Layout derived from sub_13B0 loops:

Offset  Size  Field
0x00    8     Reserved (zeroed)
0x08    8     APIC ID (qword_9B20 entry)
0x10    4     Register value (dword_DB20 entry)
0x14    1     Status byte (0/1/2)
0x18    40    Reserved (zeroed)

Calling Patterns

Module Entry Flow

_ModuleEntryPoint (0x5E8)
  -> sub_21C0 (0x21C0)  [constructors]
      -> sub_6D8   - Init boot services (gBS)
      -> sub_774   - Init runtime services (gRT)
      -> sub_7B0   - Init SMM services (gSmst)
      -> sub_C3C   - Get SMRAM ranges
      -> sub_EB8   - Thunk: init PCI MMIO
      -> sub_EBC   - PCI MMIO base init
      -> sub_F78   - Get PCD token 5
      -> sub_10EC  - Init HOB list
      -> sub_11A0  - ACPI PM timer + WHEA boot protocol
      -> sub_13B0  - MP sync data table init
      -> sub_1ACC  - WHEA silicon hooks protocol resolution
  -> sub_25EC (0x25EC)  [main init]
      -> sub_300   - Save/restore callee regs + SetJump
      -> sub_26F4  - Main protocol installation (SmiHandler registration)
  -> sub_257C (0x257C)  [on failure: unload]
      -> sub_DB0   - Thunk: unload handler

SMI Handler Flow (WHEA error logging)

SMI Entry -> SmmCpuService Router -> sub_2C04 (0x2C04)
  -> sub_2EF0 (0x2EF0)  [pre-init error status blocks if needed]
      -> sub_9E8  [ZeroMem on each block]
  -> sub_1F04 (0x1F04)  [determine error type/severity]
      -> sub_A6C  [GUID compare - check if PCIe error type]
      -> sub_1E70 [ID translation via lookup table]
  -> sub_26AC (0x26AC)  [find error status block by severity]
  -> sub_93C   [CopyMem - copy FRU text, timestamps, error data]

SMM Enable/Disable Callback Flow

sub_2688 - SMI handler 157: enables WHEA
  -> Sets byte_40A0 = 1
  -> Installs PCI MMIO base

sub_2664 - SMI handler 158: disables WHEA
  -> Sets byte_40A0 = 0
  -> Releases PCI MMIO base

sub_2FA8 - SMM Notify callback: reset WHEA state
  -> Clears byte_40A1, byte_40A2, byte_40A3

Dependencies

Consumed Protocols (located via SMM protocol database / Boot Services)

| Protocol | Located by | Used by |
|---|---|---|
| EFI_SMM_BASE2_PROTOCOL | sub_7B0 | sub_13B0 |
| EFI_SMM_CPU_PROTOCOL | sub_1ACC | sub_26F4, sub_1ACC |
| EFI_SMM_SYSTEM_TABLE2 | sub_7B0 | All SMM operations |
| EFI_SMM_SW_DISPATCH2 | sub_26F4 | SMI handler registration |
| EFI_SMM_ACCESS_PROTOCOL | sub_C3C | SMRAM enumeration |
| EFI_MM_PCI_ROOT_BRIDGE | sub_EBC | PCI configuration access |
| PCD Protocol | sub_2040 | sub_F78 |
| MP Service Protocol | sub_1ACC | sub_13B0 |
| SMM MP Protocol | sub_13B0 | CPU topology init |
| gEfiWheaBootProtocolGuid | sub_1ACC | Boot-time WHEA info |
| SystemTable HOB GUID | sub_FDC | HOB list discovery |

Consumed Boot Services Functions

The module uses the following service table offsets (x64 layout):

  • Offset 192 (0xC0) - gSmst->SmmRegisterProtocolNotify (in EFI_BOOT_SERVICES, offset 0xC0 is InstallConfigurationTable, so this call goes through the SMM System Table)
  • Offset 208 (0xD0) - gSmst->SmmLocateProtocol (via gSmst at 0x40C8+208)
  • Offset 320 (0x140) - gBS->LocateProtocol (DXE boot services version)
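
These offsets can be cross-checked against the public table layouts. The sketch below is an assumption-laden model rather than the EDK II headers: it declares the PI `EFI_SMM_SYSTEM_TABLE2` field order with every pointer and UINTN widened to `uint64_t` (x64), and `boot_service_offset` is a hypothetical helper that indexes the function pointers following the 24-byte `EFI_BOOT_SERVICES` header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t Signature;
    uint32_t Revision, HeaderSize, CRC32, Reserved;
} TableHeader;                                /* 0x18 bytes */

typedef struct {
    TableHeader Hdr;                          /* 0x00 */
    uint64_t SmmFirmwareVendor;               /* 0x18 */
    uint32_t SmmFirmwareRevision;             /* 0x20 */
    uint32_t Pad;                             /* alignment padding */
    uint64_t SmmInstallConfigurationTable;    /* 0x28 */
    uint64_t SmmIo[4];                        /* 0x30: Mem R/W, Io R/W */
    uint64_t SmmAllocatePool;                 /* 0x50 */
    uint64_t SmmFreePool;
    uint64_t SmmAllocatePages;
    uint64_t SmmFreePages;
    uint64_t SmmStartupThisAp;                /* 0x70 */
    uint64_t CurrentlyExecutingCpu;
    uint64_t NumberOfCpus;
    uint64_t CpuSaveStateSize;
    uint64_t CpuSaveState;                    /* 0x90 */
    uint64_t NumberOfTableEntries;
    uint64_t SmmConfigurationTable;           /* 0xA0 */
    uint64_t SmmInstallProtocolInterface;
    uint64_t SmmUninstallProtocolInterface;
    uint64_t SmmHandleProtocol;               /* 0xB8 */
    uint64_t SmmRegisterProtocolNotify;       /* 0xC0 */
    uint64_t SmmLocateHandle;
    uint64_t SmmLocateProtocol;               /* 0xD0 */
    uint64_t SmiManage;
    uint64_t SmiHandlerRegister;              /* 0xE0 */
    uint64_t SmiHandlerUnRegister;
} SmmSystemTable2;

static_assert(offsetof(SmmSystemTable2, SmmRegisterProtocolNotify) == 0xC0,
              "SmmRegisterProtocolNotify at 0xC0");
static_assert(offsetof(SmmSystemTable2, SmmLocateProtocol) == 0xD0,
              "SmmLocateProtocol at 0xD0");

/* Offset of the n-th function pointer in EFI_BOOT_SERVICES (0 = RaiseTPL). */
static size_t boot_service_offset(unsigned index)
{
    return sizeof(TableHeader) + 8u * (size_t)index;
}
```

With RaiseTPL as index 0, InstallConfigurationTable is index 21 (0xC0) and LocateProtocol is index 37 (0x140), which is why a call through offset 0xC0 or 0xD0 is better explained by gSmst than by gBS.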

Consumed Libraries (linked statically)

  • UefiBootServicesTableLib - gBS, gImageHandle init
  • UefiRuntimeServicesTableLib - gRT init
  • SmmServicesTableLib - SMM table init
  • BaseLib - SetJump/LongJump
  • BaseMemoryLibRepStr - CopyMem, ZeroMem
  • SmmMemoryAllocationLib - SMRAM allocation
  • DxeMmPciBaseLib - PCI MMIO configuration
  • DxeHobLib - HOB list traversal
  • DxePcdLib - PCD protocol access
  • BaseIoLibIntrinsic - IO port access (PM1, CMOS)
  • BaseSynchronizationLib - SpinLock init
  • mpsyncdatalib - MP sync data
  • WheaSiliconHooksLib - Platform WHEA hooks

Consumed By (SMI Handlers)

  • SMM Core dispatches to registered SwSmi handlers (0x9D / 157, 0x9E / 158)
  • SMM CPU Service Callback calls sub_2C04 for WHEA error notification
  • SMM Notify calls sub_2FA8 for state cleanup

Platform Configuration

PCD token 5 is read via sub_F78:

  • Value stored at qword_40E0
  • Used in sub_11A0 at ports 0x500 (1280) and 0x508 (1288), which is consistent with an ACPI PM base of 0x500 and the ACPI PM timer at base + 8
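
If port 0x508 is indeed the ACPI PM timer, its tick rate is the fixed 3.579545 MHz defined by ACPI, so tick counts in the delay loop translate directly to time. The helpers below are a hypothetical sketch of that arithmetic; the 24-bit counter width is an assumption (the timer can also be 32 bits wide, depending on the FADT).

```c
#include <stdint.h>

#define ACPI_PM_TIMER_HZ 3579545u   /* fixed ACPI PM timer frequency */

/* Convert a PM timer tick count to nanoseconds (rounded down). */
static uint32_t pm_ticks_to_ns(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000000000ull) / ACPI_PM_TIMER_HZ);
}

/* Ticks elapsed between two reads, assuming a 24-bit wrapping counter. */
static uint32_t pm_elapsed(uint32_t start, uint32_t now)
{
    return (now - start) & 0x00FFFFFFu;
}
```

Under these assumptions, a wait of 357 ticks is about 99.7 microseconds, i.e. a nominal 100 microsecond delay.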

CMOS access in debug logging (sub_B24):

  • CMOS index 0x70 / 0x71, register 0x4C
  • Determines debug verbosity level based on platform type

Error Source Types (at 0x3EA0-0x3EE0)

| GUID Address | Inferred Error Type | Severity Code |
|---|---|---|
| 0x3EA0 | Machine Check (Corrected) | 0 (non-updated) |
| 0x3EB0 | Recoverable Error | Skipped in handler |
| 0x3EC0 | PCIe Corrected Error | 80 (corrected) |
| 0x3ED0 | Fatal/Non-Maskable Interrupt | 80 (corrected, special flag) |
| 0x3EE0 | Corrected Machine Check Variant | 228 (fatal) |

Notes

  • The module is an SMM driver that registers into gSmst (EFI_SMM_SYSTEM_TABLE2). Despite referencing BootServices, it operates in SMM phase.
  • Error status block table at 0x3F28 contains entries with 24-byte stride. Entry count is at qword_3F90.
  • WHEA error source configuration array at 0x4126 has 26-byte per-entry stride. Count at qword_4110.
  • The error handler at 0x2C04 processes a communication buffer (in SMM) at offset +128 from the input argument, which contains: timestamp at +8, FRU text at +16(16 bytes)/+32(20 bytes), error data at +72, subtype GUID at +16, error severity flags at +10.
  • The ID translation table in sub_1E70 is a 255-terminated array of 8-byte entries (4 WORDs each).
  • Debug logging uses port 0x70/0x71 CMOS register 0x4C to detect platform debug level.
  • The ACPI PM timer is used in sub_11A0 for timed delay loops (waiting 357 units on port 0x508/1288).
  • The module path indicates this is built from PurleyPlatPkg/Ras/Whea/WheaErrorLog/.