AMI-Aptio-BIOS-Reversed / LastBootErrorLog / LastBootErrorLog_analysis.md
@Ajax Dong Ajax Dong 2 days ago 20 KB Init

LastBootErrorLog Module

Overview

DXE driver that processes the last boot error log for RAS (Reliability, Availability, Serviceability) on Intel Purley (Xeon Scalable) platforms. It reads error records from a HOB (Hand-Off Block), translates them through the WHEA (Windows Hardware Error Architecture) protocol, and dispatches the error information to the crash handler, the error log, and platform-specific storage.

File Identity

  • MD5: 137910c91f16445d1110a15497cad810
  • SHA256: 59eac854dd652a03a271728efee20722b59d502ac19da02d0a5870ade0d3d0ea
  • Build Path: e:\hs\Build\HR6N0XMLK\DEBUG_VS2015\X64\PurleyPlatPkg\Ras\Whea\LastBootErrorLog\LastBootErrorLog\DEBUG\
  • Source: PurleyPlatPkg/Ras/Whea/LastBootErrorLog/
  • Library Dependencies: WheaSiliconHooksLib, mpsyncdatalib, DxeMmPciBaseLib, DxeHobLib, SmmMemoryAllocationLib, UefiBootServicesTableLib, DxePcdLib, various MdePkg base libs

Address Range

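| Segment | Start | End | Size | Permissions |
| --- | --- | --- | --- | --- |
| HEADER | 0x0000 | 0x02A0 | 0x2A0 | --- |
| .text | 0x02A0 | 0x44A0 | 0x4200 | rx |
| .rdata | 0x44A0 | 0x5100 | 0xC60 | r |
| .data | 0x5100 | 0x268A0 | 0x217A0 | rw |
| seg004 | 0x268A0 | 0x26BA0 | 0x300 | r |
| .xdata | 0x26BA0 | 0x26D40 | 0x1A0 | r |
| GAP | 0x26D40 | 0x27000 | 0x2C0 | rw |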

Total: 84 functions, 65 strings.

Entry Points (Public API)

| Address | Name | Purpose |
| --- | --- | --- |
| 0x588 | _ModuleEntryPoint | DXE driver entry: calls sub_3D94 (AutoGen init), then sub_41C0 (main init); sub_4150 (unload) runs if sub_41C0 fails |

Key Functions

Initialization Flow

_ModuleEntryPoint (0x588)

  +-- sub_3D94 (0x3D94) -- AutoGen-generated driver init: calls sub-functions in a long chain
     Calls: sub_678, sub_714, sub_750, sub_A44, sub_CC0, sub_D24, sub_E38,
            sub_EEC, sub_10FC, sub_1210, sub_192C
     Each call is wrapped with ASSERT_EFI_ERROR / DebugAssert checks

  +-- sub_41C0 (0x41C0) -- Main DXE driver logic
     |
     +-- sub_2A0 (0x2A0) - SetJump (saves CPU context for LongJump)
     +-- sub_43FC (0x43FC) -- Core initialization
     |     +-- sub_4238 (0x4238) -- Protocol resolution & data structure init
     |     +-- sub_EA0 (0xEA0) -- HOB traversal to find error log HOB
     |     +-- sub_23FC (0x23FC) -- Process last boot error from HOB (main logic)
     |     +-- sub_208C (0x208C) -- Process platform-specific error data
     +-- sub_3D44 (0x3D44) -- Record driver status, LongJump

  +-- sub_4150 (0x4150) -- Unload handler (called if sub_41C0 fails)

Library Initialization Helpers (AutoGen)

| Address | Function | Library | Purpose |
| --- | --- | --- | --- |
| 0x678 | sub_678 | UefiBootServicesTableLib | Stores ImageHandle, SystemTable, BootServices globals |
| 0x714 | sub_714 | UefiRuntimeServicesTableLib | Stores RuntimeServices global |
| 0x750 | sub_750 | UefiDriverEntryPoint | Sets up driver entry point library |
| 0xA44 | sub_A44 | SmmMemoryAllocationLib | Gets SMRAM ranges via gEfiSmmAccess2ProtocolGuid |
| 0xCC0 | sub_CC0 | SmmServicesTableLib | Initializes Smst global |
| 0xD24 | sub_D24 | (stub) | No-op |
| 0xE38 | sub_E38 | DxeMmPciBaseLib | Gets MmPciBase (PCIe config space) |
| 0xEEC | sub_EEC | AcpiTimerLib | Calibrates ACPI timer via I/O ports |
| 0x10FC | sub_10FC | SmmMmPciBaseLib | Gets PCI USRA protocol (gEfiMmPciBaseProtocolGuid) |
| 0x1210 | sub_1210 | mpsyncdatalib | Initializes sync data structures (spinlocks, CPU topology arrays) |
| 0x192C | sub_192C | WheaSiliconHooksLib | Main WHEA setup: resolves protocols, reads HOB for error data |

WHEA / Last Boot Error Processing

| Address | Function | Purpose |
| --- | --- | --- |
| 0x192C | sub_192C | WHEA Silicon Hooks init: registers callback, resolves WHEA protocol, reads HOB |
| 0x18F4 | sub_18F4 | Notification callback: re-resolves WHEA boot protocol GUID |
| 0x23FC | sub_23FC | Process last boot error record: parses HOB data, dispatches based on error type |
| 0x1CD0 | sub_1CD0 | Handle error type: MSR + 0x19 range (error groups 7-11) |
| 0x1D88 | sub_1D88 | Handle error type: MSR + 0x12 range (error groups 13-18) |
| 0x1E40 | sub_1E40 | Handle error type: MSR range (error groups 0-3) |
| 0x1EF8 | sub_1EF8 | Handle error type: corrected error (error groups 4-5, 12, 19) via sub_34BC processor error decode |
| 0x1F60 | sub_1F60 | Handle error type: Clear error flags (type 2 - WHEA clear) |

Processor Error Decode Pipeline

| Address | Function | Purpose |
| --- | --- | --- |
| 0x34BC | sub_34BC | Main error decode: reads MSR 0x179 (MCG_CAP), decodes error source address |
| 0x33A8 | sub_33A8 | Get CPU topology info (socket/core/thread) for the error address |
| 0x32B8 | sub_32B8 | Check if this is a recoverable error via smi_handler or cmc_handler |
| 0x2CD8 | sub_2CD8 | Read MC (Machine Check) MSRs to validate error address |
| 0x2B64 | sub_2B64 | Verify error address matches actual MSR state with CPU topology checks |
| 0x2EC4 | sub_2EC4 | Search/find error in WHEA error bank, dispatch to handler callback |
| 0x3158 | sub_3158 | Decode corrected machine check and update error status structure |
| 0x2E48 | sub_2E48 | Read CPU topology info from CPU_CSR protocol via SBIOS interface |

Error Handler Callbacks (registered in WHEA protocol table)

| Address | Function | Purpose |
| --- | --- | --- |
| 0x2A74 | sub_2A74 | cmc_handler (Corrected Machine Check): stores error record in cache table |
| 0x2E08 | sub_2E08 | smi_handler (SMI): calls sub_2A74 if error type is 14 (Corrected, per the error_type values of the WHEA error output structure) |
| 0x2E30 | sub_2E30 | ue_handler (Uncorrectable Error): stub, just validates pointer |

Error Record Cache Management

| Address | Function | Purpose |
| --- | --- | --- |
| 0x2914 | sub_2914 | Store error record fields into cache entry (socket, core, thread, severity) |
| 0x2964 | sub_2964 | Find matching error in cache table by address, update fields |
| 0x27D4 | sub_27D4 | Build error notification structure for crash handler / WHEA event |
| 0x208C | sub_208C | Process platform-specific errors (4 sockets x 21 threads) via WHEA boot protocol |
| 0x202C | sub_202C | Clear platform-specific error via WHEA protocol |
| 0x3918 | sub_3918 | Init WHEA protocol table: resolves MM_IO protocol, registers callbacks |
| 0x11B8 | sub_11B8 | Build PCIe config space address from socket/core/thread/bus/function/register |

Data Flow Functions

| Address | Function | Purpose |
| --- | --- | --- |
| 0xEA0 | sub_EA0 | Find HOB matching a GUID reference |
| 0xD28 | sub_D28 | Get HOB list from system configuration table (gEfiHobListGuid) |
| 0x3C9C | sub_3C9C | Compare two GUIDs |
| 0x3C18 | sub_3C18 | ZeroMem wrapper |
| 0x3B68 | sub_3B68 | CopyMem wrapper |
| 0x3D04 | sub_3D04 | Initialize spinlock |
| 0x3D44 | sub_3D44 | LongJump: restore CPU context from SetJump buffer |
| 0x3B2C | sub_3B2C | I/O port read (inl) |
| 0x3AE0 | sub_3AE0 | I/O port write (outw) |

Debug/Assert Infrastructure

| Address | Function | Purpose |
| --- | --- | --- |
| 0x92C | sub_92C | Debug print / assert log (writes to debug console) |
| 0x9B8 | sub_9B8 | DeadLoop / CpuBreakpoint on assertion failure |
| 0xA1C | sub_A1C | Check if debug asserts are enabled (returns gBS global) |
| 0xA28 | sub_A28 | Check report status code enable |
| 0xA34 | sub_A34 | Check severity level for debug output |

Global Variables

| Address | Name | Type | Purpose |
| --- | --- | --- | --- |
| 0x5620 | SystemTable | ptr | UEFI System Table pointer |
| 0x5628 | BootServices | ptr | UEFI Boot Services pointer |
| 0x5630 | qword_5630 | ptr | ImageHandle |
| 0x5640 | qword_5640 | ptr | Runtime Services (or Smst from SMM) |
| 0x5658 | VendorTable | ptr | HOB list pointer (from gEfiHobListGuid) |
| 0x5660 | qword_5660 | ptr | PCI USRA protocol (gEfiMmPciBaseProtocolGuid) |
| 0x5668 | qword_5668 | ptr | SMM sync data protocol (mSmst_syncdata) |
| 0x5670 | qword_5670 | ptr | MP sync protocol |
| 0x5678 | qword_5678 | ptr | CPU data protocol (gEfiCpuDataProtocolGuid) |
| 0x5680 | unk_5680 | GUID | Smst sync protocol GUID |
| 0x5688 | qword_5688 | ptr | WHEA protocol entry from SMM sync |
| 0x5690 | qword_5690 | ptr | WHEA boot protocol (gEfiWheaBootProtocolGuid) |
| 0x5698 | qword_5698 | ptr | HOB error record pointer (from gEfiLastBootErrorHobGuid) |
| 0x56A0 | qword_56A0 | ptr | Alternate error source (from WHEA protocol) |
| 0x56A8 | qword_56A8 | ptr | WHEA boot protocol for error dispatch |
| 0x56B0 | qword_56B0 | ptr | CPU_CSR access protocol (from SMM sync data) |
| 0x56CA | byte_56CA | u8 | Flag: platform error data processed |
| 0x56CB | byte_56CB | u8 | Flag: error clear has been performed |
| 0x56D0 | qword_56D0 | ptr | SMI handler / CMC handler protocol (MM_IO) |
| 0x56E0 | unk_56E0 | struct | Error cache table (20 entries x 16 bytes each) |
| 0x5820 | byte_5820 | u8 | Flag: cache table wrapping (round-robin) |
| 0x5824 | dword_5824 | u32 | Cache table round-robin index |
| 0x5828 | psub_2E30 | ptr | Function pointer: UE handler (registered for WHEA) |
| 0x5830 | qword_5830 | ptr | MM_IO protocol interface |
| 0x5838 | qword_5838 | ptr | MM_IO protocol instance (resolved from gEfiMmIoTrapProtocolGuid) |
| 0x5840 | psub_2E08 | ptr | Function pointer: SMI handler (registered for WHEA) |
| 0x5848 | psub_2A74 | ptr | Function pointer: CMC handler (registered for WHEA) |
| 0x5850 | qword_5850 | ptr | SMI handler / CMC handler protocol (second instance) |
| 0xA868 | qword_A868 | u64 | Spinlock for CPU topology |
| 0xA870 | qword_A870 | u64 | Spinlock for error tracking |
| 0xA878 | dword_A878 | u32 | Threads per core bitmask |
| 0x26880 | dword_26880 | u32 | Cores per socket bitmask |
| 0x26888 | qword_26888 | u64 | Number of SMRAM ranges |
| 0x26890 | qword_26890 | u64 | SMRAM range table pointer |
| 0x5241 | i | u8 | Loop counter (leftover from AutoGen) |

Error Cache Structure (at 0x56E0)

20 entries, each 16 bytes:

+0x00: u8    valid
+0x01: u8    socket_id
+0x02: u8    core_bit  (bit in core mask)
+0x03: u8    thread_id
+0x04: u8    severity/type
+0x05-0x07: padding
+0x08: u64   error_address (bits 63:6 masked = address, bits 5:0 = status)
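The 16-byte layout above maps directly onto a fixed-size record. The following is a minimal sketch, assuming little-endian packing and assuming that bits 63:6 hold a 64-byte-aligned address with the status in bits 5:0; the `CacheEntry` type and the `parse_cache_entry` helper are illustrative names, not identifiers from the binary.

```python
import struct
from collections import namedtuple

# Layout as listed: five u8 fields, 3 bytes of padding, one u64.
ENTRY_FORMAT = "<BBBBB3xQ"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)
TABLE_ENTRIES = 20

CacheEntry = namedtuple(
    "CacheEntry", "valid socket_id core_bit thread_id severity address status"
)


def parse_cache_entry(raw: bytes) -> CacheEntry:
    """Decode one 16-byte cache entry (hypothetical helper)."""
    valid, socket_id, core_bit, thread_id, severity, packed = struct.unpack(
        ENTRY_FORMAT, raw
    )
    # Assumption: bits 63:6 are the address, bits 5:0 the status.
    address = packed & ~0x3F
    status = packed & 0x3F
    return CacheEntry(valid, socket_id, core_bit, thread_id, severity, address, status)


# Made-up sample values, only to exercise the decoder.
sample = struct.pack(ENTRY_FORMAT, 1, 0, 3, 1, 2, 0x12345678C0 | 0x05)
entry = parse_cache_entry(sample)
```

With 20 entries of 16 bytes the table spans 0x140 bytes, which places its end at 0x5820, the address of the wrapping flag listed under Global Variables.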

Protocol GUIDs Used

The module resolves protocols via BootServices->LocateProtocol (offset 320, i.e. 0x140, in the EFI_BOOT_SERVICES table on X64). The untyped GUID references are embedded in the .data section at 0x5100-0x5230. Based on function context:

| Reference Address | Protocol Name | Usage |
| --- | --- | --- |
| 0x5100 | gEfiHobListGuid | Get HOB list from configuration table |
| 0x5110 | Guid (16 bytes) | Sub-GUID for error notification structure |
| 0x5120 | Guid (16 bytes) | Sub-GUID for error notification structure |
| 0x5130 | gEfiMpSyncProtocolGuid | MP sync protocol for multi-core data |
| 0x5140 | gEfiSmmCpuSyncProtocolGuid | SMM CPU sync data |
| 0x5150 | gEfiSmmCpuProtocolGuid | SMM CPU protocol |
| 0x5160 | Guid | Platform-specific structure pointer |
| 0x5170 | gEfiWheaBootProtocolGuid | WHEA boot-time protocol |
| 0x5180 | gEfiMpServiceProtocolGuid | MP service protocol |
| 0x51B0 | gEfiSmmAccess2ProtocolGuid | SMRAM access protocol |
| 0x51C0 | gEfiMmPciBaseProtocolGuid | MM PCI base protocol |
| 0x51D0 | gEfiCpuDataProtocolGuid | CPU data (topology) |
| 0x51F0 | gEfiSmiHandlerProtocolGuid | SMI handler registration |
| 0x5200 | gEfiWheaBootProtocolGuid | WHEA boot protocol (notification registration) |
| 0x5210 | gEfiMmIoTrapProtocolGuid | MM I/O trap protocol |
| 0x5220 | gEfiLastBootErrorHobGuid | Last boot error HOB GUID |
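The offset quoted for LocateProtocol can be cross-checked against the `EFI_BOOT_SERVICES` layout in the UEFI specification: a 24-byte table header followed by 8-byte function pointers on X64. The sketch below lists the members in specification order and computes byte offsets; `boot_service_offset` is an illustrative helper, not something present in the module.

```python
# EFI_TABLE_HEADER: Signature(8) + Revision(4) + HeaderSize(4) + CRC32(4) + Reserved(4)
EFI_TABLE_HEADER_SIZE = 24
POINTER_SIZE = 8  # X64 function pointer

# EFI_BOOT_SERVICES members in specification order, up to LocateProtocol.
BOOT_SERVICES_MEMBERS = [
    "RaiseTPL", "RestoreTPL",
    "AllocatePages", "FreePages", "GetMemoryMap", "AllocatePool", "FreePool",
    "CreateEvent", "SetTimer", "WaitForEvent", "SignalEvent", "CloseEvent",
    "CheckEvent",
    "InstallProtocolInterface", "ReinstallProtocolInterface",
    "UninstallProtocolInterface", "HandleProtocol", "Reserved",
    "RegisterProtocolNotify", "LocateHandle", "LocateDevicePath",
    "InstallConfigurationTable",
    "LoadImage", "StartImage", "Exit", "UnloadImage", "ExitBootServices",
    "GetNextMonotonicCount", "Stall", "SetWatchdogTimer",
    "ConnectController", "DisconnectController",
    "OpenProtocol", "CloseProtocol", "OpenProtocolInformation",
    "ProtocolsPerHandle", "LocateHandleBuffer", "LocateProtocol",
]


def boot_service_offset(name: str) -> int:
    """Byte offset of a boot service pointer inside EFI_BOOT_SERVICES on X64."""
    return EFI_TABLE_HEADER_SIZE + POINTER_SIZE * BOOT_SERVICES_MEMBERS.index(name)


locate_protocol_offset = boot_service_offset("LocateProtocol")
```

The same helper is handy when labelling other indirect calls through the BootServices global, for example the notification registration performed during WHEA setup.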

Error Record Processing Flow

Step 1: HOB Discovery (sub_192C)

  1. Resolves gEfiWheaBootProtocolGuid via BootServices->LocateProtocol
  2. Registers sub_18F4 as notification callback
  3. Resolves gEfiSmmCpuSyncProtocolGuid (protocol at 0x5140) to get SMM sync data
  4. Resolves gEfiSmmCpuProtocolGuid (protocol at 0x5150) for WHEA protocol table
  5. Resolves gEfiSmiHandlerProtocolGuid for SMI handler registration
  6. Calls sub_EA0 to find HOB matching gEfiLastBootErrorHobGuid
  7. Stores HOB pointer in qword_5698 (offset +24 from HOB header)
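The +24 offset in step 7 follows from the GUID extension HOB layout in the PI specification: an 8-byte generic header (type, length, reserved) followed by a 16-byte GUID, with the payload after that. A minimal sketch of such a lookup over a raw HOB list is shown below; `find_guid_hob_data`, `make_guid_hob`, and `EXAMPLE_GUID` are hypothetical and the GUID value is a placeholder, since the real value of gEfiLastBootErrorHobGuid is not given here.

```python
import struct
import uuid

EFI_HOB_TYPE_GUID_EXTENSION = 0x0004
EFI_HOB_TYPE_END_OF_HOB_LIST = 0xFFFF
HOB_HEADER = struct.Struct("<HHI")  # HobType, HobLength, Reserved
GUID_SIZE = 16
PAYLOAD_OFFSET = HOB_HEADER.size + GUID_SIZE  # 24


def find_guid_hob_data(hob_list: bytes, guid: uuid.UUID):
    """Return the payload of the first GUID extension HOB matching guid, or None."""
    offset = 0
    while offset + HOB_HEADER.size <= len(hob_list):
        hob_type, hob_length, _ = HOB_HEADER.unpack_from(hob_list, offset)
        if hob_type == EFI_HOB_TYPE_END_OF_HOB_LIST or hob_length < HOB_HEADER.size:
            break
        if hob_type == EFI_HOB_TYPE_GUID_EXTENSION:
            name = hob_list[offset + HOB_HEADER.size : offset + PAYLOAD_OFFSET]
            if name == guid.bytes_le:  # EFI_GUID uses the mixed-endian layout
                return hob_list[offset + PAYLOAD_OFFSET : offset + hob_length]
        offset += hob_length
    return None


def make_guid_hob(guid: uuid.UUID, payload: bytes) -> bytes:
    """Build a GUID extension HOB for testing (hypothetical helper)."""
    length = PAYLOAD_OFFSET + len(payload)
    return HOB_HEADER.pack(EFI_HOB_TYPE_GUID_EXTENSION, length, 0) + guid.bytes_le + payload


# Placeholder GUID and payload, only to exercise the walker.
EXAMPLE_GUID = uuid.UUID("11111111-2222-3333-4444-555555555555")
payload = bytes([0x12, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00])
hob_list = (
    make_guid_hob(uuid.UUID(int=0), bytes(8))
    + make_guid_hob(EXAMPLE_GUID, payload)
    + HOB_HEADER.pack(EFI_HOB_TYPE_END_OF_HOB_LIST, HOB_HEADER.size, 0)
)
found = find_guid_hob_data(hob_list, EXAMPLE_GUID)
```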

Step 2: Error Record Parsing (sub_23FC)

The HOB error record starts with a header:

+0x00: u16  structure_size (must be >= 2)
+0x02: u8   type (1=error, 2=clear)
+0x03: ...
+0x08: u16  sub_type (error group identifier):
            Groups: 4-5, 12, 19 -> sub_1EF8 (corrected errors via processor decode)
                    9-11, 7-8  -> sub_1CD0
                    0-3        -> sub_1E40
                    13-18      -> sub_1D88
+0x0A: u64  flags (bit 61=valid, bit 57=platform_error)

Based on error type:

  • type 1 (error event): Dispatches to type-specific handler based on sub_type
  • type 2 (clear event): Calls sub_1F60 to clear error flags, sets byte_56CB = 1
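The header fields and the group-to-handler mapping above can be sketched as a small decoder. This assumes little-endian packing and treats bytes +0x03 to +0x07 as opaque; the function names `handler_for_group` and `dispatch` are illustrative, and the returned strings are just the handler labels used in this analysis.

```python
import struct

# +0x00 u16 size, +0x02 u8 type, 5 undescribed bytes, +0x08 u16 sub_type, +0x0A u64 flags
RECORD_HEADER = struct.Struct("<HB5xHQ")

TYPE_ERROR = 1
TYPE_CLEAR = 2
FLAG_VALID = 1 << 61
FLAG_PLATFORM_ERROR = 1 << 57


def handler_for_group(sub_type: int):
    """Map an error group to the handler named in the header layout above."""
    if 0 <= sub_type <= 3:
        return "sub_1E40"
    if sub_type in (4, 5, 12, 19):
        return "sub_1EF8"
    if 7 <= sub_type <= 11:
        return "sub_1CD0"
    if 13 <= sub_type <= 18:
        return "sub_1D88"
    return None  # group not covered by the mapping


def dispatch(record: bytes):
    """Return the handler label for a raw record, or None if nothing applies."""
    size, rec_type, sub_type, _flags = RECORD_HEADER.unpack_from(record)
    if size < 2:
        return None
    if rec_type == TYPE_CLEAR:
        return "sub_1F60"
    if rec_type == TYPE_ERROR:
        return handler_for_group(sub_type)
    return None


# Made-up record: an error event in group 12 with the valid flag set.
example = RECORD_HEADER.pack(RECORD_HEADER.size, TYPE_ERROR, 12, FLAG_VALID)
example_handler = dispatch(example)
```

Whether sub_23FC tests the valid or platform_error flag before dispatching is not stated above, so the sketch only decodes the flags and does not act on them.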

Step 3: Alternative Path (sub_23FC - no HOB)

If qword_5698 is NULL but qword_56A0 (WHEA protocol alternate) is set:

  • Iterates 8 CPU slots
  • For each active slot, calls SMI handler via qword_5688+8 to check/clear the error
  • If the return code indicates a recoverable error, calls sub_202C to clear it

Step 4: Platform Error Processing (sub_208C)

For each of 4 sockets:

  1. Checks if the socket has any active threads (from unk_5240 structure)
  2. For each of 21 threads per socket:
    • Reads error status via PCIe config space (sub_11B8 builds the config address)
    • If an error is present (checked via the MMIO read sub_3AA8 and via sub_3B28), dispatches the error data to the WHEA boot protocol (qword_56A8+40, function index 5 = type 25)
  3. Also checks 8 "special" error slots

Step 5: Processor Error Decode (sub_34BC)

The main decode function:

  1. Reads error type from input structure field at +6
  2. Determines memory error type (Sparing/Lockstep/Any) based on type value
  3. Reads MSR 0x179 (MCG_CAP - Machine Check Global Capability)
  4. Determines error correction mode (6 modes: 0=Corrected, 2=Deferred, 3=Recoverable, 5=Uncorrected, etc.)
  5. Routes to appropriate handler callback based on mode:
    • Mode 2: cmc_handler (qword_5848 = sub_2A74)
    • Mode 5: ue_handler (qword_5828 = sub_2E30)
    • Other: smi_handler (qword_5840 = sub_2E08)
  6. For mode 0 (corrected), decodes memory controller address to DIMM location
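The routing in step 5 amounts to a three-way selection with the SMI handler as the default. A minimal sketch, using the slot addresses from the Global Variables table; `handler_for_mode` is an illustrative name and the mode numbers are taken as listed above.

```python
# Callback slots as named in the Global Variables table: (slot address, target).
HANDLER_SLOTS = {
    "cmc_handler": (0x5848, "sub_2A74"),
    "ue_handler": (0x5828, "sub_2E30"),
    "smi_handler": (0x5840, "sub_2E08"),
}


def handler_for_mode(mode: int) -> str:
    """Pick the callback for a correction mode, following step 5."""
    if mode == 2:
        return "cmc_handler"
    if mode == 5:
        return "ue_handler"
    return "smi_handler"  # every other mode


def target_for_mode(mode: int) -> str:
    """Resolve the mode to the function the slot points at."""
    return HANDLER_SLOTS[handler_for_mode(mode)][1]
```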

Data Structures

Last Boot Error HOB (input from PEI)

+0x00: u16  Length          // Structure size
+0x02: u8   Type            // 1=error, 2=clear
+0x03: u8   Severity        // Error severity
+0x04: u8   Socket/Bus      // Socket identifier
+0x05-0x07: u8[3]           // Error address fields
+0x08: u16  SubType         // Error group identifier
+0x0A: u64  ErrorFlags      // Flags with valid/severity bits

WHEA Error Output Structure (64 bytes, on the stack)

Output from sub_34BC decode, consumed by handler callbacks:

+0x00: u16  length           // Structure size
+0x02: u64  status_flags     // Status bits (0x3B = valid mask)
+0x0A: u16  reserved
+0x0C: u16  error_type       // 0=No error, 2=Deferred, 3=Recoverable, 8=SMI, 14=Corrected
+0x0E: u16  socket_id|flags  // Socket ID in upper bits (bytes 0x10-0x11 are not accounted for in this listing)
+0x12: u16  core_id
+0x14: u16  thread_id
+0x16: u16  severity
+0x18: u16  error_code
+0x1A: u16  reserved2
+0x1C: u32  reserved3
+0x20: u32  mc_status_low
+0x24: u32  mc_status_high
+0x28: u32  mc_addr_low
+0x2C: u32  mc_addr_high
+0x30: u32  mc_misc
+0x34-0x3F: u8[12] reserved
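The listed offsets can be checked by expressing them as a packed format string. The sketch below assumes little-endian packing and skips the two bytes at 0x10-0x11, which the listing leaves undescribed; `WheaErrorOutput` and `parse_output` are illustrative names, and the sample values are made up.

```python
import struct
from collections import namedtuple

# u16 length, u64 status_flags, u16 reserved, u16 error_type, u16 socket_id|flags,
# 2 undescribed bytes, five u16 fields, six u32 fields, 12 reserved bytes.
OUTPUT_FORMAT = "<HQHHH2xHHHHHIIIIII12x"
OUTPUT_SIZE = struct.calcsize(OUTPUT_FORMAT)

WheaErrorOutput = namedtuple(
    "WheaErrorOutput",
    "length status_flags reserved error_type socket_id_flags "
    "core_id thread_id severity error_code reserved2 reserved3 "
    "mc_status_low mc_status_high mc_addr_low mc_addr_high mc_misc",
)


def parse_output(raw: bytes):
    """Decode the 64-byte block and join the split 64-bit MC values."""
    rec = WheaErrorOutput._make(struct.unpack(OUTPUT_FORMAT, raw))
    mc_status = (rec.mc_status_high << 32) | rec.mc_status_low
    mc_addr = (rec.mc_addr_high << 32) | rec.mc_addr_low
    return rec, mc_status, mc_addr


# Made-up sample, only to exercise the decoder.
sample = struct.pack(
    OUTPUT_FORMAT,
    64, 0x3B, 0, 14, 0x0100, 5, 1, 2, 0x0150, 0, 0,
    0x00000150, 0x8C000040, 0x12345000, 0x00000001, 0,
)
record, mc_status, mc_addr = parse_output(sample)
```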

Cache Table Entry (20 entries at 0x56E0, 16 bytes each)

+0x00: u8   valid
+0x01: u8   socket_id
+0x02: u8   core_bit       // bit in the core mask
+0x03: u8   thread_id
+0x04: u8   error_type/severity
+0x05-0x07: padding
+0x08: u64  error_address  // bits 63:6 = address, bits 5:0 = status
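The round-robin replacement described for this table (an insertion index in dword_5824 and a wrapping flag in byte_5820) can be sketched as follows. The exact moment the binary sets the wrapping flag is an assumption here, and the `ErrorCache` class is purely illustrative.

```python
TABLE_ENTRIES = 20


class ErrorCache:
    """Fixed-size cache with round-robin replacement (illustrative model)."""

    def __init__(self):
        self.entries = [None] * TABLE_ENTRIES
        self.next_index = 0   # plays the role of dword_5824
        self.wrapped = False  # plays the role of byte_5820

    def insert(self, record: dict) -> int:
        """Store a record in the next slot, overwriting the oldest once full."""
        slot = self.next_index
        self.entries[slot] = record
        self.next_index = (slot + 1) % TABLE_ENTRIES
        if self.next_index == 0:
            self.wrapped = True  # assumption: set when the index rolls over
        return slot

    def find(self, address: int):
        """Return the first record with a matching address, or None."""
        for record in self.entries:
            if record is not None and record["address"] == address:
                return record
        return None


cache = ErrorCache()
slots = [cache.insert({"address": i << 6}) for i in range(TABLE_ENTRIES + 1)]
```

After 21 insertions the first record has been overwritten by the last one, which is the behavior a fixed 20-entry table with a wrapping index would be expected to show.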

Calling Patterns

Module Initialization

_ModuleEntryPoint(ImageHandle, SystemTable)
  sub_3D94()  -- AutoGen: initialize all library constructors
  sub_41C0()  -- Main: SetJump, init protocols, process HOB
    if success: sub_3D44(LongJump) -- commit
    if fail: sub_4150() -- unload libs

Error Record Processing

HOB available (qword_5698 != NULL):
  sub_23FC()
    +-- type==1 (error):
    |     +-- sub_type in {4,5,12,19}: sub_1EF8() -> sub_34BC(decode) -> dispatch callback
    |     +-- sub_type in {7,8,9,10,11}: sub_1CD0() -> WHEA boot (type=1)
    |     +-- sub_type in {0,1,2,3}: sub_1E40() -> WHEA boot (type=1)
    |     +-- sub_type in {13,14,15,16,17,18}: sub_1D88() -> WHEA boot (type=1)
    +-- type==2 (clear): sub_1F60() -> WHEA boot (type=25)
          sets byte_56CB = 1

HOB not available but qword_56A0 set:
  sub_23FC() alternate path
    for 8 CPU slots:
      if active: call smi_handler -> sub_202C() to clear

Error Decode + Handler Dispatch

sub_34BC(input_hdr, output_block)
  +-- Read MCG_CAP MSR (0x179)
  +-- Determine correction mode (0-6)
  +-- If mode==0 (corrected):
  |     sub_33A8() -> Get CPU topology for error address
  |     sub_2CD8() -> Read MCi_STATUS MSRs, validate address match
  |     sub_2B64() -> Verify with SMI handler protocol
  |     sub_2EC4() -> Find error in bank, dispatch to handler callback
  |       +-- sub_2E48() -> Read CPU topology from CSR
  |       +-- For found error: run callback (cmc/smi/ue handler)
  |       +-- For not found: sub_27D4(build notification) -> callback
  +-- Return to sub_1EF8, route to WHEA boot protocol

Dependencies

Consumed (this module calls)

| Module/Protocol | Functions | Purpose |
| --- | --- | --- |
| UefiBootServicesTableLib | LocateProtocol, etc. | Protocol resolution, BS table access |
| UefiRuntimeServicesTableLib | RT table access | Runtime services pointer |
| DxeHobLib | GetFirstGuidHob, GetNextGuidHob | HOB traversal for error records |
| WheaSiliconHooksLib | Register notification, protocol lookup | WHEA boot protocol interface |
| mpsyncdatalib | MP sync, CPU data init | Multi-processor synchronization |
| DxeMmPciBaseLib | MmPciBase | MMIO PCIe config space access |
| SmmMemoryAllocationLib | SMRAM range discovery | Memory allocation in SMM |
| DxePcdLib | PCD get/set | Fixed-at-build PCDs |
| BaseIoLibIntrinsic | inl, outw | I/O port access for ACPI timer |
| BaseLib | SetJump, LongJump | Error recovery context save/restore |
| BaseMemoryLibRepStr | CopyMem, ZeroMem | Memory operations |
| BaseSynchronizationLib | Initialize spinlock | Spinlock for CPU topology data |

Consumed By (other modules call this)

This module is a DXE driver that installs WHEA protocol interfaces. Based on context, the likely consumers are:

  • WheaErrorInj: May use the WHEA boot protocol interfaces installed here
  • ProcessorErrorHandler: Shares the same WHEA protocol table
  • PlatformErrorHandler: May consume error status populated by this driver
  • CrystalRidge (NVDIMM): May use the same sync data infrastructure

Notes

  1. DXE-only phase: This driver operates in DXE, not SMM. It reads HOB data produced by PEI modules and translates it.

  2. SetJump/LongJump pattern: sub_2A0 saves the CPU context into a JMP_BUF-style buffer (248 bytes: 9 callee-saved GP registers, the return address, MXCSR, and XMM6-XMM15). sub_3D44 restores it via LongJump. This is used as a poor-man's try/catch around the error processing pipeline.

  3. Round-robin cache: The 20-entry error cache at 0x56E0 uses a round-robin replacement policy (dword_5824 tracks the insertion index, byte_5820 = wrapping flag).

  4. CPU topology encoding: The sub_16C8 function derives thread/core bit widths from CPUID, used to pack APIC IDs into the error structures. Values stored in dword_26880 (cores bitmask) and dword_A878 (threads bitmask).

  5. PCIe config address format (sub_11B8): Builds address as:

    • Low 12 bits: register offset
    • Bits 12-14: function (3 bits)
    • Bits 15-19: device (5 bits)
    • Bits 20-24: bus (5 bits)
    • Bits 25-31: segment/domain
    • High bits: socket id
    • Bit 9 of flags: extended config space access (0x200 = 512 byte access)
  6. Build path string: Matches Build\HR6N0XMLK\DEBUG_VS2015\X64\ indicating this is a debug build for the HR6N0XMLK platform variant (likely HR650X).

  7. WHEA protocol interface offsets (from qword_56A8):

    • +8: Unknown function
    • +32: Function index 4 (type 17) -- log/clear error
    • +40: Function index 5 (type 25) -- log platform error
    • +56: Function index 7 (type 1) -- dispatch error record
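The 248-byte jump buffer size in note 2 can be checked with simple arithmetic. The sketch assumes the X64 `BASE_LIBRARY_JUMP_BUFFER` layout from EDK2's BaseLib without the `Ssp` field that later revisions append: nine callee-saved general-purpose registers, RIP, an 8-byte MXCSR slot, and a 160-byte buffer for XMM6-XMM15. The constant names are illustrative.

```python
# Callee-saved general-purpose registers stored by SetJump on X64 (assumed layout).
GP_REGISTERS = ["Rbx", "Rsp", "Rbp", "Rdi", "Rsi", "R12", "R13", "R14", "R15"]
# Non-volatile XMM registers in the Microsoft x64 calling convention.
XMM_REGISTERS = [f"Xmm{n}" for n in range(6, 16)]

GP_SIZE = 8
RIP_SIZE = 8
MXCSR_SLOT_SIZE = 8  # MXCSR is 32 bits wide but stored in a UINT64 slot
XMM_SIZE = 16

JUMP_BUFFER_SIZE = (
    len(GP_REGISTERS) * GP_SIZE
    + RIP_SIZE
    + MXCSR_SLOT_SIZE
    + len(XMM_REGISTERS) * XMM_SIZE
)
```

Under these assumptions the total is 72 + 8 + 8 + 160 bytes, which matches the size observed for the buffer used by sub_2A0.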