# FpgaErrorHandler Module

## Overview
SMM driver that monitors and handles FPGA (Field Programmable Gate Array) errors on the Intel Purley platform. It runs in System Management Mode (SMM), reads FPGA error registers via MMIO, acknowledges errors, and resets the system when a critical FPGA error is detected. It also uses the MpSyncData library for multi-processor synchronization.

## Address Range
0x280 - 0x1E80 (0x1C00 bytes .text, 44 functions)

## Key Functions

| Address | Name | Purpose |
|---------|------|---------|
| 0x514 | _ModuleEntryPoint | SMM module entry point, initializes error handler |
| 0x5C0 | sub_5C0 | Auto-generated UEFI init: saves ImageHandle, SystemTable, BootServices, RuntimeServices, gSmst |
| 0xEAC | sub_EAC | Main FPGA error handler registration -- locates protocols, registers callbacks |
| 0xDFC | sub_DFC | Error status collection callback -- reads FPGA error status registers per socket |
| 0xD48 | sub_D48 | Error polling function -- checks FPGA error bits and triggers correction |
| 0xCB4 | sub_CB4 | Fatal error handler -- logs error, writes GPIO, triggers warm reset via 0xCF9 |
| 0xC90 | sub_C90 | Error status query -- checks if a specific error bit is set |
| 0xB48 | sub_B48 | Error clear function -- clears FPGA error status registers |
| 0xBF0 | sub_BF0 | Error buffer clear function -- resets FPGA error buffer to zero |
| 0xB38 | sub_B38 | FPGA error presence check -- returns whether any FPGA error is active |
| 0xA30 | sub_A30 | Error logging function -- writes error info via MpSyncData protocol |
| 0x1580 | sub_1580 | MpSyncData library initialization -- sets up CPU topology tracking |

## Entry Points (Public API)

- **0x514** `_ModuleEntryPoint`: Standard UEFI SMM driver entry point. Saves boot services, locates protocols, registers FPGA error callbacks, and initializes error monitoring.

## Internal Helpers

### UEFI Infrastructure (Auto-generated or library code)

- **0x5C0** `sub_5C0`: UEFI boot services initialization. Saves ImageHandle, SystemTable, gBS, gRT, gSmst pointers from the UEFI system table. Generated by AutoGen.c
- **0x280** `sub_280`: `SetJump` implementation -- saves CPU register context (GP registers, XMM, MXCSR) into a jump buffer structure for non-local goto. Used to protect the entry point during error handling.
- **0x320** `sub_320`: `LongJump` implementation -- restores CPU context from a jump buffer and longjmps back. Used to recover from errors during initialization.
- **0x11E0** `sub_11E0`: `SetJump` validation wrapper -- validates jump buffer alignment (8-byte aligned) and non-null.
- **0x3A0** `sub_3A0`: `ZeroMem` -- zero-fills memory buffers (aligned + tail).
- **0x12FC** `sub_12FC`: `ZeroMem` wrapper with validation -- validates buffer non-null and length bounds.
- **0x430** `sub_430`: `RDTSC` wrapper -- reads timestamp counter.
- **0x420** `sub_420`: `_mm_pause` wrapper -- CPU spin-loop hint.
- **0x4A0** `sub_4A0`: `_enable()` -- enable interrupts.
- **0x4B0** `sub_4B0`: `_disable()` -- disable interrupts.
- **0x4C0** `sub_4C0`: `__getcallerseflags()` -- read EFLAGS.
- **0x440** `sub_440`: `__cpuid` wrapper (leaf-based) -- CPUID with EAX input, returns EAX/EBX/ECX/EDX.
- **0x470** `sub_470`: `__cpuid` wrapper (function-based) -- CPUID with query type input.

### MMIO/I/O Access Wrappers

- **0x128C** `sub_128C`: 64-bit MMIO read -- reads a QWORD from an MMIO address with alignment check.
- **0x12BC** `sub_12BC`: 64-bit MMIO write -- writes a QWORD to an MMIO address with alignment check.
- **0x1228** `sub_1228`: 16-bit MMIO read -- reads a WORD from an MMIO address with alignment check.
- **0x1258** `sub_1258`: 16-bit MMIO write (constant 0x500) -- writes 0x500 to a WORD MMIO address.
- **0x1D54** `sub_1D54`: `__indword` -- reads a DWORD from an I/O port.
- **0x1D24** `sub_1D24`: Unaligned read -- reads a QWORD from potentially unaligned address.
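
The alignment-checked accessors follow a familiar pattern. A minimal sketch, assuming the check is a plain natural-alignment assertion (as in EDK II's IoLib); the names `mmio_read64` and `mmio_write64` are hypothetical, and an ordinary variable stands in for device memory:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for sub_128C: read a QWORD after checking 8-byte alignment. */
static uint64_t mmio_read64(uintptr_t address)
{
    assert((address & 7u) == 0);            /* QWORD access must be 8-byte aligned */
    return *(volatile uint64_t *)address;   /* volatile: the access must really happen */
}

/* Hypothetical stand-in for sub_12BC: write a QWORD with the same check. */
static uint64_t mmio_write64(uintptr_t address, uint64_t value)
{
    assert((address & 7u) == 0);
    *(volatile uint64_t *)address = value;
    return value;                           /* returning the written value is an assumption */
}
```

The `volatile` qualifier is what keeps the compiler from merging or dropping accesses to the same address.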

### PCI Express Helpers

- **0x1544** `sub_1544`: PCI Express MMIO address translation -- validates PCIe address (upper bits must be zero) and adds the MMIO base from `qword_2E68`.
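- **0x143C** `sub_143C`: PCI config read via PciRootBridge protocol -- reads a DWORD from PCI address 0xF8000 (1015808) using the protocol at `qword_2E58`. Under the usual bus<<20 | device<<15 | function<<12 encoding, 0xF8000 decodes to Bus 0, Dev 31, Func 0 (the PCH LPC bridge), register 0, rather than Dev 0.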
- **0x1C00** `sub_1C00`: PCH info query -- reads LPC device ID to determine PCH SKU, validates it (checks for supported PCH), returns PCH-specific data from `unk_2B00`.
- **0x1BA0** `sub_1BA0`: GPIO value read -- reads a GPIO value via MMIO (address 0xFD000148 + shifted offset) using PCH info.
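
The address translation in `sub_1544` can be sketched as follows, assuming the standard PCIe ECAM layout (8 bus bits, 5 device bits, 3 function bits, 12 register bits). The function names and the base value used in any call are hypothetical; only the "upper bits must be zero, then add the base" behaviour comes from these notes:

```c
#include <assert.h>
#include <stdint.h>

/* Compose an ECAM offset from bus/device/function/register. */
static uint32_t pcie_ecam_offset(uint32_t bus, uint32_t dev, uint32_t func, uint32_t reg)
{
    assert(bus < 256 && dev < 32 && func < 8);
    return (bus << 20) | (dev << 15) | (func << 12) | (reg & 0xFFFu);
}

/* Hypothetical stand-in for sub_1544: reject offsets with bits above bit 27 set,
   then add the ECAM base (held in qword_2E68 in the module). */
static uint64_t pcie_mmio_address(uint64_t ecam_base, uint64_t offset)
{
    assert((offset & ~0x0FFFFFFFull) == 0);   /* only 28 offset bits are valid */
    return ecam_base + offset;
}
```

With this encoding, Bus 0, Dev 31, Func 0, register 0 yields 0xF8000, the constant seen in `sub_143C`.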

### Memory Management

- **0x13D4** `sub_13D4`: Free pool -- frees the allocated buffer at `qword_23FF0`. Uses either gSmst->SmmFreePool or gBS->FreePool, apparently depending on whether the buffer lies in SMRAM.
- **0x1360** `sub_1360`: SMRAM range check -- checks if an address is within any registered SMRAM region.
- **0x13A4** `sub_13A4`: SMM allocate pool -- allocates memory via the SMM memory allocator.
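
A range check like the one in `sub_1360` can be sketched as below. The `smram_range` struct is a hypothetical reduction of an SMRAM descriptor to the two fields the check needs; the real descriptor layout was not recovered here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t start;   /* first byte of the region */
    uint64_t size;    /* length in bytes */
} smram_range;

/* Returns 1 if addr falls inside any of the count regions, else 0. */
static int address_in_smram(const smram_range *ranges, size_t count, uint64_t addr)
{
    assert(ranges != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* The subtraction form avoids overflow in start + size. */
        if (addr >= ranges[i].start && addr - ranges[i].start < ranges[i].size) {
            return 1;
        }
    }
    return 0;
}
```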

### Protocol Lookups (Singleton pattern)

- **0x10C8** `sub_10C8`: Lazy-loads debug logging protocol at GUID `unk_2A60` into `qword_2E50`.
- **0x1C98** `sub_1C98`: Lazy-loads PCD protocol at GUID `unk_2A50` into `qword_2EA0`.
- **0x146C** `sub_146C`: Lazy-loads the HOB list pointer into `qword_2E60` by searching the SystemTable configuration table for the HOB list GUID.
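
The lazy-load pattern shared by these helpers looks roughly like this. It is a sketch only: `locate_fn`, `fake_locate`, and `get_protocol` are hypothetical names, and the retry-on-failure behaviour is an assumption:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*locate_fn)(const void *guid, void **interface);  /* hypothetical locator */

static void *g_cached_protocol;   /* plays the role of qword_2E50 / qword_2EA0 */
static int   g_locate_calls;      /* instrumentation for the example only */

static int fake_locate(const void *guid, void **interface)
{
    static int dummy_protocol;
    (void)guid;
    g_locate_calls++;
    *interface = &dummy_protocol;
    return 0;                      /* 0 = success, as with EFI_SUCCESS */
}

/* Look the protocol up on first use, then return the cached pointer. */
static void *get_protocol(locate_fn locate, const void *guid)
{
    assert(locate != NULL);
    if (g_cached_protocol == NULL && locate(guid, &g_cached_protocol) != 0) {
        g_cached_protocol = NULL;  /* stay uncached so a later call can retry */
    }
    return g_cached_protocol;
}
```

Calling `get_protocol` twice performs only one lookup; the second call returns the cached pointer.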

### HOB/Data Comparison

- **0x1D84** `sub_1D84`: GUID comparison -- compares two GUID values (16 bytes each, split as two QWORDs). Used to find matching HOB entries.
- **0x1990** `sub_1990`: CPU topology discovery -- uses CPUID to determine thread bits and core bits for the CPU topology.
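
The two-QWORD GUID comparison in `sub_1D84` can be sketched as below. `guid_equal` is a hypothetical name, and `memcpy` is used here as a portable unaligned load; the module plausibly uses `sub_1D24` for that purpose, but that link is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compare two 16-byte GUIDs as two 64-bit halves. */
static int guid_equal(const void *a, const void *b)
{
    uint64_t a_lo, a_hi, b_lo, b_hi;
    assert(a != NULL && b != NULL);
    memcpy(&a_lo, a, 8);                          /* unaligned-safe load of bytes 0..7 */
    memcpy(&a_hi, (const uint8_t *)a + 8, 8);     /* bytes 8..15 */
    memcpy(&b_lo, b, 8);
    memcpy(&b_hi, (const uint8_t *)b + 8, 8);
    return a_lo == b_lo && a_hi == b_hi;
}
```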

### Spinlock

- **0x1DF4** `sub_1DF4`: Spinlock initialization -- sets a spinlock QWORD to 1 (unlocked).

### Assertion/Logging

- **0x11A0** `sub_11A0`: Debug logging -- calls debug print protocol if available (lazy-loaded).
- **0x1118** `sub_1118`: Conditional debug print with platform check -- checks CMOS (I/O 0x70/0x71) for platform type and error severity level before printing.
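
The CMOS access behind `sub_1118` uses the classic index/data port pair. The sketch below simulates the ports with an array so it can run anywhere; `port_write8`, `port_read8`, and `cmos_read` are hypothetical names, and real firmware would use `__outbyte`/`__inbyte` and typically preserve the NMI-mask bit of port 0x70:

```c
#include <assert.h>
#include <stdint.h>

static uint8_t g_cmos[128];        /* simulated CMOS bank */
static uint8_t g_cmos_index;       /* last index written to port 0x70 */

static void port_write8(uint16_t port, uint8_t value)   /* stand-in for __outbyte */
{
    if (port == 0x70) {
        g_cmos_index = value & 0x7F;
    }
}

static uint8_t port_read8(uint16_t port)                 /* stand-in for __inbyte */
{
    return (port == 0x71) ? g_cmos[g_cmos_index] : 0xFF;
}

/* Select the CMOS offset through port 0x70, then read it through port 0x71. */
static uint8_t cmos_read(uint8_t offset)
{
    assert(offset < 128);
    port_write8(0x70, offset);
    return port_read8(0x71);
}
```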

## State Management

### Global Variables (.data section, 0x2A00 - 0x24800)

| Address | Size | Name | Purpose |
|---------|------|------|---------|
| 0x2E28 | 8 | SystemTable | Saved UEFI SystemTable pointer |
| 0x2E30 | 8 | BootServices | Saved gBS pointer |
| 0x2E38 | 8 | ImageHandle | Saved driver image handle |
| 0x2E40 | 8 | RuntimeServices | Saved gRT pointer |
| 0x2E48 | 8 | qword_2E48 | SMM System Table pointer (gSmst) |
| 0x2E50 | 8 | qword_2E50 | Debug print protocol (lazy-loaded) |
| 0x2E58 | 8 | qword_2E58 | PciRootBridge protocol |
| 0x2E60 | 8 | qword_2E60 | HOB list pointer (lazy-loaded) |
| 0x2E68 | 8 | qword_2E68 | PCI Express MMIO base address |
| 0x2E78 | 8 | qword_2E78 | MpSyncData protocol pointer |
| 0x2E80 | 8 | qword_2E80 | MpSyncData service protocol |
| 0x2E88 | 8 | qword_2E88 | MpSyncData CPU info protocol |
| 0x2E90 | 8 | unk_2E90 | MpSyncData second protocol |
| 0x2E98 | 8 | qword_2E98 | FPGA MMIO protocol (PciRootBridge IO access) |
| 0x2EA0 | 8 | qword_2EA0 | PCD protocol pointer |
| 0x2FA8 | 8 | qword_2FA8 | Module return status (initialized to 0x8000000000000001, EFI_LOAD_ERROR) |
| 0x24050 | 8 | qword_24050 | FPGA state structure pointer (from MmPciBase protocol) |
| 0x24058 | 8 | qword_24058 | FPGA protocol interface for callback registration |
| 0x24000 | 8 | qword_24000 | MmPciBase protocol instance |
| 0x23FF0 | 8 | qword_23FF0 | SMM memory allocation buffer |
| 0x23FE8 | 8 | qword_23FE8 | SMM descriptor count |
| 0x23FE0 | 4 | dword_23FE0 | Topology: socket mask (1 << socket) |
| 0x7FD8 | 4 | dword_7FD8 | Topology: thread mask (1 << (core + socket)) |
| 0x24020 | 8 | psub_B38 | Function pointer: FPGA error presence check |
| 0x24028 | 8 | psub_B48 | Function pointer: FPGA error clear |
| 0x24030 | 8 | psub_BF0 | Function pointer: FPGA error buffer clear |
| 0x24038 | 8 | psub_C90 | Function pointer: FPGA error status query |
| 0x24040 | 8 | psub_CB4 | Function pointer: FPGA fatal error handler |
| 0x24048 | 8 | psub_D48 | Function pointer: FPGA error poll handler |
| 0x24060 | 16 | buf_ | FPGA error status buffer (4 x DWORD, one per socket) |
| 0x2EB0 | 248 | unk_2EB0 | SetJump buffer (248 bytes for CPU context save) |

### Large Data Tables (0x2FC0 - 0xE7E0)

| Address | Purpose |
|---------|---------|
| 0x7FC8 | unk_7FC8 - Spinlock |
| 0x7FD0 | unk_7FD0 - Spinlock |
| 0x7FE0 | byte_7FE0 - Per-CPU active state flags |
| 0x87E0 | qword_87E0 - Per-CPU APIC ID mapping table |
| 0xC7E0 | dword_C7E0 - Per-CPU initial APIC ID table |
| 0xE7E0 | byte_E7E0 - Per-CPU state byte table |
| 0x2FC0 | unk_2FC0 - Per-CPU spinlock array (0x5000 bytes, 40 bytes per entry, 512 entries) |
| 0x2FC8 | byte_2FC8 - Per-CPU active flag, the byte at +8 within each 40-byte unk_2FC0 entry |

### Reference Data (.rdata section, 0x1E80 - 0x2A00)

| Address | Data | Purpose |
|---------|------|---------|
| 0x1FD0 | 6 x WORD offsets | FPGA register offset table (0x394, 0x39C, 0x3A4, 0x3AC, 0x3B4, 0x3BC) |
| 0x1FE0 | 6 x WORD offsets | FPGA register offset table group 2 (0x390, 0x398, 0x3A0, 0x3A8, 0x3B0, 0x3B8) |

## Protocol GUIDs

### Located via SMM System Table (qword_2E48 + 208 = Smst->SmmLocateProtocol)

| Address | GUID Binary | Likely Protocol |
|---------|-------------|-----------------|
| 0x2A00 | 3BA7E14B-176D-4B2A-948A-C86FB001943C | MmPciBase protocol (get FPGA base address) |
| 0x2A10 | 86B091ED-1463-43B5-82A1-2C8B83CB8917 | MmPciBase FPGA cfg protocol |
| 0x2A20 | 0067835F-9A50-433A-8CBB-852078197814 | MpSyncData protocol |
| 0x2A70 | ED32D533-99E6-4209-9CC0-2D72CDD998A7 | FPGA MMIO access protocol |
| 0x2A80 | 1D202CAB-C8AB-4D5C-94F7-3CFCC0D3D335 | MpSyncData CPU info protocol |
| 0x2A90 | 6820ABD4-A292-4817-9147-D91DC83542 | PCI config protocol |
| 0x2AB0 | 47B7FA8C-F4BD-4AF6-8200-333086F0D2C8 | FPGA callback registration protocol |
| 0x2AC0 | 7739F24C-93D7-11D4-9A3A-0090273FC14D | HOB list GUID (gEfiHobListGuid) |
| 0x2AC8 | 0090273FC14D... | Tail of the HOB list GUID at 0x2AC0, not a separate GUID |

### Located via BootServices (gBS + 320 = gBS->LocateProtocol)

| Address | GUID Binary | Likely Protocol |
|---------|-------------|-----------------|
| 0x2AD0 | F4CCBFB7-F6E0-47FD-9DD4-10A8F150C191 | MpSyncData protocol |
| 0x2A40 | A7CED760-C71C-4E1A-ACB1-89604D5216CB | MpSyncData protocol |
| 0x2A50 | 11B34006-D85B-4D0A-A290-D5A571310EF7 | PCD protocol |

### Located via MpSyncData (qword_2E78 + 208 = MpSyncData->SmmLocateProtocol)

| Address | GUID Binary | Purpose |
|---------|-------------|---------|
| 0x2A20 | 0067835F-9A50-433A-8CBB-852078197814 | MpSyncData protocol |
| 0x2A80 | 1D202CAB-C8AB-4D5C-94F7-3CFCC0D3D335 | MpSyncData CPU info |

## Data Structures

### FPGA State Structure (accessed via qword_24050)
The FPGA state object is obtained from MmPciBase protocol. Key offsets:
- **+22**: Byte field with bitmask of active FPGA sockets (bits 0-3 for sockets 0-3)
- **+14958** (0x3A6E): Per-socket FPGA error status array (starts at offset 14958 from base)
- Each socket's FPGA status spans 14944 bytes
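
These offsets can be turned into small accessors. The sketch assumes the per-socket entries are laid out back to back starting at +14958; the constant and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FPGA_SOCKET_MASK_OFFSET  = 22,      /* byte with one bit per socket */
    FPGA_STATUS_ARRAY_OFFSET = 14958,   /* 0x3A6E */
    FPGA_STATUS_STRIDE       = 14944,   /* 0x3A60 bytes per socket */
    FPGA_MAX_SOCKETS         = 4
};

/* Returns 1 if the socket's bit is set in the mask byte at +22. */
static int fpga_socket_active(const uint8_t *state, unsigned socket)
{
    assert(state != NULL && socket < FPGA_MAX_SOCKETS);
    return (state[FPGA_SOCKET_MASK_OFFSET] >> socket) & 1;
}

/* Byte offset of a socket's status entry from the structure base (assumed layout). */
static size_t fpga_status_offset(unsigned socket)
{
    assert(socket < FPGA_MAX_SOCKETS);
    return FPGA_STATUS_ARRAY_OFFSET + (size_t)socket * FPGA_STATUS_STRIDE;
}
```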

### FPGA Error Register Set
- Based on PCI config space at MMIO base derived from MmPciBase
- Register offsets at 0x1FD0 table: 0x394, 0x39C, 0x3A4, 0x3AC, 0x3B4, 0x3BC
- Register offsets at 0x1FE0 table: 0x390, 0x398, 0x3A0, 0x3A8, 0x3B0, 0x3B8
- These appear to be FPGA error status/clear registers
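
The two offset tables interleave: each entry of the 0x1FE0 table sits at 0x390 + 8*i, and the matching 0x1FD0 entry is 4 bytes above it. A short sketch that captures this relationship (the array and function names are hypothetical; the values are copied from the tables):

```c
#include <assert.h>
#include <stdint.h>

#define FPGA_ERR_REG_COUNT 6

/* Offsets copied from the .rdata tables at 0x1FD0 and 0x1FE0. */
static const uint16_t fpga_reg_table_1fd0[FPGA_ERR_REG_COUNT] =
    { 0x394, 0x39C, 0x3A4, 0x3AC, 0x3B4, 0x3BC };
static const uint16_t fpga_reg_table_1fe0[FPGA_ERR_REG_COUNT] =
    { 0x390, 0x398, 0x3A0, 0x3A8, 0x3B0, 0x3B8 };

/* Offset of the lower register in pair `index`, computed instead of looked up. */
static uint16_t fpga_reg_pair_base(unsigned index)
{
    assert(index < FPGA_ERR_REG_COUNT);
    return (uint16_t)(0x390 + 8 * index);
}
```

Whether the paired registers are status/clear or low/high halves of one 64-bit register is not established by the tables alone.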

### FPGA Callback Registration Structure (at qword_24058)
Protocol with a function at offset 0 that takes an array of 6 function pointers and a parameter:
```c
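typedef void *FPGA_CALLBACK_ARRAY[6]; /* placeholder type: six handler pointers, signatures not recovered */
typedef struct {
    /* Offset 0: registers the six handlers; Param is 3 in sub_EAC. */
    void (*Register)(FPGA_CALLBACK_ARRAY *Callbacks, UINT8 Param);
} FPGA_CALLBACK_PROTOCOL;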
```
The callback array has 6 entries:
- [0] = sub_B38: FPGA error presence check
- [1] = sub_B48: FPGA error clear
- [2] = sub_BF0: FPGA error buffer clear
- [3] = sub_C90: FPGA error status query
- [4] = sub_CB4: FPGA fatal error handler
- [5] = sub_D48: FPGA error poll handler
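
The registration step can be illustrated with a mock framework. All type and function names here are hypothetical, the handler signatures are simplified to `void (*)(void)` because the real ones were not recovered, and `mock_register` merely records what it was given:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*fpga_handler)(void);
typedef struct { fpga_handler handler[6]; } fpga_callback_array;

static const fpga_callback_array *g_registered;  /* what the mock framework remembers */
static uint8_t g_registered_param;

static void mock_register(const fpga_callback_array *callbacks, uint8_t param)
{
    assert(callbacks != NULL);
    g_registered = callbacks;
    g_registered_param = param;
}

static void presence_check(void) {}   /* [0] sub_B38 */
static void error_clear(void)    {}   /* [1] sub_B48 */
static void buffer_clear(void)   {}   /* [2] sub_BF0 */
static void status_query(void)   {}   /* [3] sub_C90 */
static void fatal_handler(void)  {}   /* [4] sub_CB4 */
static void poll_handler(void)   {}   /* [5] sub_D48 */

static const fpga_callback_array g_callbacks = {{
    presence_check, error_clear, buffer_clear,
    status_query, fatal_handler, poll_handler
}};

static void register_fpga_callbacks(void)
{
    mock_register(&g_callbacks, 3);   /* parameter value 3 as seen in sub_EAC */
}
```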

### Per-CPU Data Tables (MpSyncData)
The module allocates large tables indexed by (socket, core, thread):
- **byte_7FE0**: Per-CPU active flag (1 = active)
- **byte_E7E0**: Per-CPU state byte (0 = invalid, 1 = present, 2 = initialized)
- **dword_C7E0**: Per-CPU initial APIC ID
- **qword_87E0**: Per-CPU APIC ID (64-bit)
- **byte_2FC8**: Per-CPU active flag (40-byte stride)
- **unk_2FC0**: Per-CPU spinlock (40-byte stride)

Offset calculation: `idx = thread + (core + 448 * socket) * 64`
- 448 cores per socket max, 64 threads per core max
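
The offset calculation translates directly into a helper. This sketch reproduces the formula exactly as written above and does not check it against the actual table sizes; `percpu_index` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Flat per-CPU index: thread + (core + 448 * socket) * 64. */
static size_t percpu_index(size_t socket, size_t core, size_t thread)
{
    assert(thread < 64);             /* the thread occupies the low 6 bits of the index */
    return thread + (core + 448 * socket) * 64;
}
```

For example, socket 1, core 2, thread 3 maps to 3 + (2 + 448) * 64 = 28803.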

## Calling Patterns

### 1. Module Initialization Flow
```
_ModuleEntryPoint (0x514)
  -> sub_5C0 (0x5C0): Save boot services, gBS, gRT, gSmst
  -> sub_280 (0x280): SetJump to protect against errors
  -> sub_EAC (0xEAC): Main FPGA error handler setup
      -> LocateProtocol (MmPciBase) -> gets FPGA base -> qword_24050
      -> LocateProtocol (FPGA callback registration) -> qword_24058
      -> LocateProtocol (MpSyncData) -> qword_2E78
      -> Register protocol callbacks:
           RegisterCallback( {sub_B38, sub_B48, sub_BF0, sub_C90, sub_CB4, sub_D48}, 3)
  -> sub_11E0: Validate SetJump buffer
  -> sub_320: LongJump back if error occurred
```
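
The SetJump/LongJump control flow can be illustrated with the standard C `setjmp`/`longjmp` pair, which behaves analogously. This is a sketch of the pattern, not the module's code; `guarded_init` and `setup_handlers` are hypothetical names:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf g_init_jump;          /* plays the role of the buffer at unk_2EB0 */

static void setup_handlers(int fail)
{
    if (fail) {
        longjmp(g_init_jump, 1);     /* abandon setup and resume at the setjmp site */
    }
}

/* Returns 0 when setup completed, 1 when it bailed out through longjmp. */
static int guarded_init(int fail)
{
    if (setjmp(g_init_jump) != 0) {
        return 1;                    /* second return from setjmp: setup was abandoned */
    }
    setup_handlers(fail);
    return 0;
}
```

The mechanism only handles deliberate long jumps; it cannot intercept CPU exceptions.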

### 2. FPGA Error Status Collection Flow (sub_DFC)
```
sub_DFC (0xDFC) - called from FPGA callback framework
  -> For each of 4 sockets:
     -> Check if bit N is set in FPGA state byte (+22)
     -> Read FPGA error register via MMIO protocol at qword_2E98
     -> Store status in buf_ array
```
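
The per-socket collection loop can be sketched with a pluggable register reader. `collect_status`, `read_status_fn`, and `fake_read` are hypothetical, and leaving inactive sockets' entries untouched is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SOCKETS 4

typedef uint32_t (*read_status_fn)(unsigned socket);   /* hypothetical register reader */

/* Fill status[] for sockets whose bit is set in mask; returns how many were read. */
static unsigned collect_status(uint8_t mask, read_status_fn read, uint32_t status[MAX_SOCKETS])
{
    unsigned count = 0;
    assert(read != NULL && status != NULL);
    for (unsigned s = 0; s < MAX_SOCKETS; s++) {
        if ((mask >> s) & 1) {       /* socket s is populated */
            status[s] = read(s);
            count++;
        }
    }
    return count;
}

/* Test double: returns a recognizable value per socket. */
static uint32_t fake_read(unsigned socket)
{
    return 0x100u + socket;
}
```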

### 3. FPGA Error Correction Flow (sub_D48)
```
sub_D48 (0xD48) - poll handler
  -> For each active socket:
     -> Check error register at offset +16400 (error pending)
     -> If pending: call sub_A30 to log, set output flag
     -> Check register at offset +968 (secondary error)
     -> If pending: call sub_A30 to log, set output flag
```

### 4. Fatal Error Flow (sub_CB4)
```
sub_CB4 (0xCB4) - fatal FPGA error handler
  -> Read GPIO value via sub_1BA0 (MMIO 0xFD000148 + shift)
  -> Write GPIO output bit
  -> __outbyte(0xCF9, 2): Select hard reset (SYS_RST bit), no reset yet
  -> __outbyte(0xCF9, 6): Set RST_CPU bit to trigger the reset
  -> Infinite loop
```
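
The two-step write to the reset control port can be sketched with a recording mock in place of real port I/O. `port_write8` and `request_reset` are hypothetical names; on hardware the second write would not return:

```c
#include <assert.h>
#include <stdint.h>

#define RESET_PORT 0xCF9

static uint8_t  g_writes[4];     /* values written to the reset port, in order */
static unsigned g_write_count;

static void port_write8(uint16_t port, uint8_t value)    /* stand-in for __outbyte */
{
    assert(port == RESET_PORT && g_write_count < 4);
    g_writes[g_write_count++] = value;
}

/* Write 0x02 first, then 0x06. Real code would spin forever afterwards. */
static void request_reset(void)
{
    port_write8(RESET_PORT, 0x02);
    port_write8(RESET_PORT, 0x06);
}
```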

### 5. MpSyncData Initialization Flow (sub_1580)
```
sub_1580 (0x1580) - initialize CPU sync data
  -> LocateProtocol (MpSyncData)
  -> LocateProtocol (MpSyncData second)
  -> LocateProtocol (MpSyncData CPU info)
  -> sub_1990: Get CPU topology (thread_bits, core_bits)
  -> Initialize per-CPU data tables (spinlocks, flags, APIC IDs)
  -> Enumerate all CPUs via MpSyncData CPU info protocol
```
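
Once the thread and core bit widths are known, an APIC ID can be split into its topology fields. The sketch assumes `core_bits` is the width of the core field alone (not the cumulative package shift); `decode_apic_id` and `cpu_location` are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t socket, core, thread; } cpu_location;

/* thread_bits: width of the SMT field; core_bits: width of the core field. */
static cpu_location decode_apic_id(uint32_t apic_id, unsigned thread_bits, unsigned core_bits)
{
    cpu_location loc;
    assert(thread_bits + core_bits < 32);
    loc.thread = apic_id & ((1u << thread_bits) - 1);
    loc.core   = (apic_id >> thread_bits) & ((1u << core_bits) - 1);
    loc.socket = apic_id >> (thread_bits + core_bits);
    return loc;
}
```

With 1 thread bit and 4 core bits, APIC ID 0x2B decodes to socket 1, core 5, thread 1.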

## Dependencies

### Consumed (this module calls out to)
- **UEFI Boot Services** (gBS): LocateProtocol, FreePool
- **SMM System Table** (gSmst): SmmLocateProtocol, SmmAllocatePool, SmmFreePool
- **MmPciBase Protocol**: Provides FPGA MMIO config space base address
- **MmPci IO Protocol**: FPGA register read/write access
- **FPGA Callback Registration Protocol**: Registers error handler callbacks
- **MpSyncData Protocol**: Multi-processor synchronization data management
- **PciRootBridge Protocol**: PCI config space access for PCH detection
- **PCD Protocol**: Platform Configuration Database
- **Debug Print Protocol**: DEBUG/ASSERT message output
- **GPIO Private Library**: GPIO status register access via MMIO
- **PCH Info Library**: PCH SKU detection via LPC device ID

### Consumed By (other modules call this)
- **SMM Core**: Via the registered FPGA callback protocol -- the 6 callbacks (sub_B38, sub_B48, sub_BF0, sub_C90, sub_CB4, sub_D48) are registered with the FPGA framework and called when FPGA errors occur.

## Notes

1. **Source file**: `PurleyPlatPkg/Ras/Smm/ErrHandling/FpgaErrorHandler/FpgaErrorHandler.c` -- SMM driver for FPGA error handling on Intel Purley platform.

2. **Multi-socket support**: All error handling iterates over sockets 0-3, using a bitmask at FPGA state byte +22 to determine which sockets are populated.

3. **FPGA Register Layout**: The FPGA error status registers are at offsets 0x390-0x3BC in the FPGA PCI config space. They are described by two tables of six WORD-sized offsets (set1: 0x394/0x39C/0x3A4/0x3AC/0x3B4/0x3BC, set2: 0x390/0x398/0x3A0/0x3A8/0x3B0/0x3B8), which interleave into six register pairs spaced 8 bytes apart.

4. **Error severity levels**: The callbacks are registered with parameter "3", and the sub_1118 debug print function checks the platform type via CMOS to control which error levels get logged.

5. **CMOS-based debug control**: sub_1118 reads CMOS offset 0x4C (via I/O ports 0x70/0x71) to determine platform type and control error message output.

6. **Warm reset signaling**: Fatal errors trigger a warm reset via the standard 0xCF9 reset port (write 2 then 6). The GPIO register at 0xFD000148 is used to signal the error cause to the hardware.

7. **CPU topology**: sub_1990 uses CPUID leaf 0xB (Extended Topology) to discover thread bits per core and core bits per package, supporting up to 4 sockets, ~448 cores per socket, and 64 threads per core.

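8. **SetJump protection**: The entry point wraps initialization in SetJump/LongJump (sub_280/sub_320). If initialization long-jumps back, `_ModuleEntryPoint` resumes after the SetJump call and exits, presumably with the status held in qword_2FA8. This is an early-exit path rather than a fault handler: SetJump/LongJump cannot catch CPU exceptions.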
