system address map initialization in x86

Upload: dhaakchik

Post on 02-Jun-2018


  • 8/10/2019 System Address Map Initialization in x86

    1/71

    System Address Map Initialization in x86/x64 Architecture Part 1: PCI-Based Systems

    This article serves as a clarification about the PCI expansion ROM address mapping, which was not sufficiently covered in my "Malicious PCI Expansion ROM" article published by Infosec Institute last year (http://resources.infosecinstitute.com/pci-expansion-rom/). Low-level programmers are sometimes puzzled about the mapping of device memory, such as PCI device memory, to the system address map. This article explains the initialization of the system address map, focusing on the initialization of the PCI chip registers that control PCI device memory address mapping to the system address map. PCI device memory address mapping is only required if the PCI device contains memory, such as a video card, a network card with an onboard buffer, or a network card that supports PCI expansion ROM.

    The x86/x64 system address map is complex because of the backward compatibility that the bus protocol must maintain in the x86/x64 architecture. The bus protocol used in a system dictates how the memory of a device attached to the bus is mapped to the system address map. Therefore, you must understand the address mapping mechanism of the specific bus protocol to understand the system address map initialization. This article focuses on systems based on the PCI bus protocol. The PCI bus protocol is a legacy bus protocol by today's standards. However, it's very important to understand how it works at the lowest level in terms of software/firmware, because it's impossible to understand the later bus protocol, PCI Express (PCIe), without understanding the PCI bus protocol. PCIe is virtually the main bus protocol in every x86/x64 system today. Part 2 of this article will focus on PCIe-based systems.

    Conventions

    There are several different usages of the word "memory" in this article. It can be confusing for those new to the subject. Therefore, this article uses these conventions:

    1. The word "main memory" refers to the RAM modules installed on the motherboard.

    2. The word "memory controller" refers to the part of the chipset or the CPU that controls the RAM modules and accesses to the RAM modules.

    3. Flash memory refers to either the chip on the motherboard that stores the BIOS [...]

    5. The word "memory space" means the set of memory addresses accessible by the CPU, i.e., the memory that is addressable from the CPU. Memory in this context could mean RAM, ROM or other forms of memory which can be addressed by the CPU.

    6. The word "PCI expansion ROM" mostly refers to the ROM chip on a PCI device, except when the context contains other specific explanation.


    The Boot Process at a Glance

    This section explains the boot process in sufficient detail to understand the system address map and other bus protocol-related matters that are explained later in this article. You need to have a clear understanding of the boot process before we get into the system address map and bus protocol-related discussion.

    The boot process in x86/x64 starts with the platform firmware (BIOS [...]


    at http://support.amd.com/us/Processor_TechDocs/31116.pdf.


    and the flash ROM. Details of the "bit twiddling" are outside the scope of this article. You can read details of the mapping in the respective chipset datasheet.

    3. Redirecting memory transactions to the correct target. This is a continuation of the "shadowing" step. The details depend on the platform (CPU and chipset combination) and the runtime setup, i.e., whether to shadow the platform firmware or not at runtime (when the OS runs).

    4. Transferring platform firmware execution to RAM. This is a "jump" to the platform firmware code which is "shadowed" to the RAM in step b.

    6. Miscellaneous platform enabling. This step depends on the specific system configuration, i.e., the motherboard and supporting chips. Usually, it consists of clock generator chip initialization, to run the platform at the intended speed, and in some platforms this step also consists of initializing the general purpose I/O (GPIO) registers.

    7. Interrupt enabling. Previous steps assume that interrupts are not yet enabled because the interrupt hardware is not yet fully configured. In this step the interrupt hardware, such as the interrupt controller(s), and the associated interrupt handler software are initialized. There are several possible interrupt controllers in x86/x64, i.e., the 8259 programmable interrupt controller (PIC), the local advanced programmable interrupt controller (LAPIC) present in most CPUs today, and the I/O advanced programmable interrupt controller (IOxAPIC) present in most chipsets today. After the hardware and software required to handle interrupts are ready, interrupts are enabled.

    8. Timer initialization. In this step, the hardware timer is enabled. The timer generates a timer interrupt when a certain interval is reached. The OS and some applications running on top of the OS use the timer to work. There are also several possible pieces of hardware (or some combination) that could act as the timer in an x86/x64 platform, i.e., the 8254 programmable interrupt timer (PIT) chip that resides in the chipset, the high precision event timer (HPET) also residing in the chipset (this timer doesn't need initialization and is used only by the OS [...]


    cores (the AP) must be initialized accordingly before the OS boot-loader takes control of the system. One of the most important things to initialize in the AP is the MTRRs. The MTRRs must be consistent in all CPU cores, otherwise memory reads and writes could misbehave and bring the system to a halt.

    11. [...]


    System Based on Intel 815E-ICH2 Chipset

    Figure 1 shows the "simplified" block diagram of the system that uses the Intel 815E-ICH2 chipset combination. Figure 1 doesn't show the entire connection from the chipset to other components in the system, only those related to the address mapping in the system.

    Figure 1 Intel 815E-ICH2 (Simplified) Block Diagram

    The Intel 815E-ICH2 chipset pair is not a "pure" PCI chipset, because it implements a non-PCI bus to connect the northbridge and the southbridge, called the hub interface (HI), as you can see in Figure 1. However, from a logical point of view, the HI bus is basically a PCI bus with a faster transfer speed. Our focus here is not on the transfer speed or techniques related to data transfers, but on the address mapping. Since HI doesn't alter anything related to address mapping, we can safely ignore the HI bus specifics and treat it the same way as a PCI bus.

    The Intel 815E chipset is ancient by present standards, but it's a very interesting case study for those new to PCI chipset low-level details because it's very close to "pure" PCI-based systems.

    As you can see in Figure 1, the Intel 815E chipset was one of the northbridge chipsets for Intel CPUs that use socket 370, such as Pentium III (code name Coppermine) and Intel Celeron CPUs.

    Figure 1 shows how the CPU connects (logically) to the rest of the system via the Intel 815E northbridge. This implies that any access to any device outside the CPU must pass through the northbridge. From an address mapping standpoint, this means that the Intel 815E acts as a sort of "address mapping router," i.e., the device that routes read or write transactions to a certain address (or address range(s)) to the correct device. In fact, that is how the northbridge works in practice. The difference with present-day systems is the physical location of the northbridge, which is integrated into the CPU, instead of being an individual component on the motherboard like the Intel 815E back then. The configuration of the "address mapping router" in the northbridge at boot differs from the runtime configuration. In practice, the "address mapping router" is a series of (logical) PCI device registers in the northbridge that control the system address map. Usually, these registers are part of the memory controller and the AGP/PCI Bridge logical devices in the northbridge chipset. The platform firmware initializes these address mapping-related registers at boot to prepare for runtime usage inside an OS [...] 815E chipset.


    Figure 2 Intel 815E System Address Map

    Figure 2 shows the Intel 815E system address map. You can find a complementary address mapping explanation in the Intel 815E chipset datasheet in S[...] 815E chipset only supports up to 512MB of main memory (RAM) and only uses a 4GB memory space. Figure 2 also shows that PCI devices consume (use) the CPU memory space. A device that consumes CPU memory space is termed a memory mapped I/O device or MMIO device for short. The MMIO term is widely used in the industry and applies to all other CPUs, not just x86/x64.

    Figure 2 shows that the RAM occupies (at most) the lowest 512MB of the memory space. However, above the 16MB physical address, the space seems to be shared between RAM and PCI devices. Actually, there is no sharing of memory range happening in the system because the platform firmware initializes the PCI device's memory to use the memory range above the memory range consumed by main memory (RAM) in the CPU memory space. This memory range "free" from RAM depends on the amount of RAM installed in the system. If the installed RAM size is 256MB, the PCI memory range starts right after the 256MB boundary up until the 4GB memory space boundary; if the installed RAM size is 384MB, the PCI memory range starts right after the 384MB boundary up until the 4GB memory space boundary, and so on. Therefore, this implies that the memory range consumed by the PCI devices is relocatable, i.e., can be relocated within the CPU memory space, otherwise it's impossible to prevent conflict in memory range usage. That is true, and it is one of the features of the PCI bus which sets it apart from the IS[...]


    details on accessing the PCI configuration register in x86, please read this section of my past article: https://sites.google.com/site/pinczakko/pinczakko-s-guide-to-award-bios-reverse-engineering#PCI_BUS. The material in that link should give you a clearer view about access to the PCI configuration register in x86/x64 CPUs. Mind you that the code in that link must be executed in ring 0 (kernel mode) or under DOS (if you're using a 16-bit assembler), otherwise the OS responds with an access permission exception.

    Now, let's look more closely at the PCI configuration register. The PCI configuration register consists of 256 bytes of registers, from (byte) offset 00h to (byte) offset FFh. The 256-byte PCI configuration register consists of two parts: the first 64 bytes are called the PCI configuration register header and the rest are called the device-specific PCI configuration register. This article only deals with the BARs, which are located in the PCI configuration register header. It doesn't deal with the device-specific PCI configuration registers because only BARs affect the PCI device memory mapping to the system address map.

    There are two types of PCI configuration register header, a type 0 and a type 1 header. A PCI-to-PCI bridge device must implement the PCI configuration register type 1 header, while other PCI devices must implement the PCI configuration register type 0 header. This article only deals with the PCI configuration register type 0 header and focuses on the BARs. Figure 3 shows the PCI configuration register type 0 header. The registers are accessed via I/O ports CF8h-CFBh and CFCh-CFFh, as explained previously.
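As a rough illustration of the port-based access mentioned above, the sketch below builds the 32-bit value that software writes to port CF8h before reading or writing the selected register through port CFCh. The bit layout follows PCI configuration mechanism #1. The function and macro names are hypothetical and are not taken from the linked guide, and the actual port I/O (which needs ring 0) is deliberately left out so the snippet stays runnable in user mode.

```c
#include <assert.h>
#include <stdint.h>

/* I/O ports named in the text (PCI configuration mechanism #1). */
#define PCI_CONFIG_ADDRESS_PORT 0xCF8u
#define PCI_CONFIG_DATA_PORT    0xCFCu

/* Build the value written to the address port to select one register.
 * Layout: bit 31 = enable, bits 23:16 = bus, bits 15:11 = device,
 * bits 10:8 = function, bits 7:2 = dword-aligned register offset.
 * The name pci_config_address is an assumption made for this sketch. */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t offset)
{
    assert(dev < 32 && fn < 8);          /* 5-bit device, 3-bit function */
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)dev << 11)
         | ((uint32_t)fn  << 8)
         | ((uint32_t)offset & 0xFCu);   /* low two bits are always zero */
}
```

For example, `pci_config_address(0, 0, 0, 0x10)` selects the register at offset 10h of bus 0, device 0, function 0.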


    Figure 3 PCI Configuration Registers Type 0

    Figure 3 shows that there are two types of BAR, highlighted on a blue background: the BARs themselves and the expansion ROM base address register (XROMBAR). BARs span the range of six 32-bit registers (24 bytes), from offset 10h to offset 27h in the PCI configuration header type 0. BARs are used for mapping the non-expansion ROM PCI device memory (usually RAM on the PCI device) to the system memory map, while XROMBAR is specifically used for mapping the PCI expansion ROM to the system address map. It's the job of the platform firmware to initialize the values of the BARs. Each BAR is a 32-bit register, hence each of them can map PCI device memory in the 32-bit system address map, i.e., can map the PCI device memory to the 4GB memory address space.
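The BAR layout described above can be sketched in a few helper functions. This is a minimal illustration, not the firmware's actual code: the helper names are hypothetical, and the bit meanings (bit 0 selects memory or I/O space, bit 3 marks a memory BAR as prefetchable, the low four bits of a memory BAR are flags) and the XROMBAR offset of 30h come from the PCI specification rather than from this article.

```c
#include <assert.h>
#include <stdint.h>

#define XROMBAR_OFFSET 0x30u  /* expansion ROM BAR in the type 0 header (PCI spec) */

/* Offset of BAR n (0..5): six 32-bit registers from 10h to 27h. */
uint8_t bar_offset(unsigned n)
{
    assert(n < 6);
    return (uint8_t)(0x10u + 4u * n);
}

/* Bit 0 clear means the BAR maps into memory space (set means I/O space). */
int bar_is_memory(uint32_t bar) { return (bar & 0x1u) == 0; }

/* Bit 3 of a memory BAR marks the range as prefetchable. */
int bar_is_prefetchable(uint32_t bar)
{
    return bar_is_memory(bar) && (bar & 0x8u) != 0;
}

/* Base address held in a 32-bit memory BAR: mask off the four flag bits. */
uint32_t bar_mem_base(uint32_t bar) { return bar & 0xFFFFFFF0u; }

/* Size of the range, computed from the value read back after writing
 * FFFF_FFFFh to the BAR (the usual sizing procedure). */
uint32_t bar_mem_size(uint32_t readback)
{
    return ~(readback & 0xFFFFFFF0u) + 1u;
}
```

A read-back value of FE00_0008h, for instance, describes a prefetchable memory range of 32 MB.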


    The BARs and XROMBAR are readable and writeable registers, i.e., the contents of the BARs/XROMBAR can be modified. That's the core capability required to relocate the PCI device memory in the system address map. PCI device memory is said to be relocatable in the system address map, because you can change the "base" address (start address) of the PCI device memory in the system address map by changing the contents of the BARs/XROMBAR. It works like this:

    • Different systems can have different main memory (RAM) sizes.

    • A different RAM size means that the area in the system address map set aside for the PCI MMIO range differs.

    • A different PCI MMIO range means that the PCI device memory occupies a different address in the CPU memory space. That means you have to be able to change the base address of the PCI device memory in the CPU memory space when migrating the PCI device to a system with a different amount of RAM; the same is true if you add more RAM to the same system.

    • BARs and XROMBAR control the address occupied by the PCI device memory. Because both of them are modifiable, you can change the memory range occupied by the PCI device memory (in the CPU memory space) as required.

    Practical System Address Map on Intel 815E-ICH2 Platform

    Perhaps the XROMBAR and BARs explanation is still a bit confusing for beginners. Let's look at some practical examples. The scenario is as follows:

    1. The system in focus uses an Intel Pentium III CPU with 256 MB RAM, a motherboard with the Intel 815E chipset, and an AGP video card with 32 MB onboard memory. The AGP video card is basically a PCI device with onboard memory from the system address map point of view. We call this configuration the first system configuration from now on.

    2. The same system as in point 1. However, we add a new 256 MB RAM module. Therefore, the system now has 512 MB RAM, the maximum amount of RAM supported by the Intel 815E chipset based on its datasheet. We call this configuration the second system configuration from now on.

    We'll have a look at what the system address map looks like in both of the configurations above. Figure 4 shows the system address map for the first system configuration (256 MB RAM) and the system address map for the second system configuration (512 MB RAM). As you can see, the memory range occupied by the PCI devices shrinks from 3840 MB (4GB - 256 MB) in the first system configuration (256 MB RAM) to 3584 MB (4GB - 512 MB) in the second system configuration (512 MB RAM). The change also causes the base address of the AGP video card memory to change; in the first system configuration the base address is 256 MB while in the second system configuration the base address is 512 MB. The change in the AGP video card memory base address is possible because the contents of the BAR of the AGP video card (its video chip) can be modified.
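The arithmetic behind the two configurations can be checked with a short sketch. It is a simplification under stated assumptions: it treats everything from the top of RAM to the 4GB boundary as the PCI range, exactly as the paragraph above does, and ignores the legacy holes below 1 MB and the hardcoded ranges near 4GB. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MB      (1024ull * 1024ull)
#define FOUR_GB (4096ull * MB)

/* A contiguous range in the CPU memory space (hypothetical helper type). */
typedef struct {
    uint64_t base;  /* first address of the range */
    uint64_t size;  /* length of the range in bytes */
} mem_range;

/* Range left over for PCI devices: from the top of RAM to the 4GB boundary. */
mem_range pci_mmio_window(uint64_t ram_size)
{
    assert(ram_size <= 512 * MB);  /* Intel 815E maximum stated in the text */
    mem_range r = { ram_size, FOUR_GB - ram_size };
    return r;
}
```

With 256 MB of RAM the window starts at 256 MB and spans 3840 MB; with 512 MB it starts at 512 MB and spans 3584 MB, which matches the figures quoted above.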


    Figure 4 System Address Map for the First (256 MB RAM) and Second (512 MB RAM) System Configuration

    Now, let's see how the Intel 815E northbridge routes access to the CPU memory space in the first system configuration shown in Figure 4 (256 MB RAM). Let's break down the steps for a read request to the video (AGP) memory in the first system configuration. In the first system configuration, the platform firmware initializes the video memory to be mapped in the memory range 256 MB to 288 MB, because the video memory size is 32 MB and the first 256 MB is mapped to RAM. The platform firmware does so by configuring/initializing the video chip BARs to accept accesses in the 256 MB to 288 MB memory range. Figure 5 shows the steps to read the contents of the video card memory starting at physical address 11C0_0000h (284MB) at runtime (inside an OS [...]


    Figure 5 Steps for Memory Request to the AGP Video Card in the First System Configuration (256 MB RAM)

    Details of the steps shown in Figure 5 are as follows:

    1. [...] address (11C0_0000h) is within the PMBAS[...] initialization of the four address mapping registers is the job of the platform firmware.

    3. Because the requested address (11C0_0000h) is within the PMBAS[...] memory range, the northbridge forwards the read request to the AGP, i.e., to the video card chip.

    4. The video card chip's BAR setting allows it to respond to the request from the northbridge. The video card chip then reads the video memory at the requested address (at 11C0_0000h) and returns the requested contents to the northbridge.

    5. The northbridge forwards the result returned by the video card to the CPU. After this step, the CPU receives the result from the northbridge and the read transaction completes.

    The breakdown of a memory read in the sample above should be clear enough for those not yet accustomed to the role of the Intel 815E northbridge chipset as "address router."
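The routing decision in the breakdown above can be sketched as a simple range comparison. This is a hedged model, not the 815E's real decode logic: the structure, its field names and the three targets are assumptions made for illustration, the real northbridge uses several base/limit registers, and the legacy ranges below 1 MB and the hardcoded ranges near 4GB are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Possible destinations of a CPU memory transaction in this model. */
typedef enum { TARGET_RAM, TARGET_AGP, TARGET_PCI } route_target;

/* Hypothetical stand-in for the address mapping registers that the
 * platform firmware programs at boot. */
typedef struct {
    uint32_t ram_top;    /* first address above the installed RAM */
    uint32_t agp_base;   /* first address forwarded to the AGP */
    uint32_t agp_limit;  /* last address forwarded to the AGP */
} route_config;

/* Decide where the northbridge sends a transaction to addr. */
route_target route(const route_config *cfg, uint32_t addr)
{
    assert(cfg->agp_base <= cfg->agp_limit);
    if (addr < cfg->ram_top)
        return TARGET_RAM;
    if (addr >= cfg->agp_base && addr <= cfg->agp_limit)
        return TARGET_AGP;
    return TARGET_PCI;  /* everything else goes toward the southbridge */
}
```

In the first system configuration (RAM below 1000_0000h, video memory from 1000_0000h to 11FF_FFFFh), a read of 11C0_0000h is routed to the AGP.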

    The System Management Mode (SMM) Memory Mapping

    In the beginning of this section, I talked about the "special" memory range in the PCI memory range, i.e., the memory range occupied by the flash ROM chip containing the platform firmware and the APIC. If you look at the system address map in Figure 2, you can see that there are two more memory ranges in the system address map that show up mysteriously. They are the HS[...]

    1. HS[...]

    815E chipset must be programmed (at boot in the BIOS code) to enable this hardcoded memory range as HS[...]

    not mapped to RAM, but to part of the video memory to provide compatibility to DOS [...] http://msdn.microsoft.com/en-us/library/windows/hardware/gg463285.aspx.

    Now you should have a very good overall view of the effect of the AGP in the system address map. The most important thing to remember is that the GART logic consults a translation table, i.e., the GART data structure, in order to access the real contents of the additional video memory in RAM. A similar technique, the use of a translation table, is employed in the IOMMU.

    Hijacking BIOS Interrupt 15h AX=E820h Interface

    Now let's see how you can query the system for the system address map. In legacy systems with legacy platform firmware, i.e., BIOS [...] 15h function E820h (ax = E820h). The interrupt must be performed when the x86/x64 system runs in real mode, right after the BIOS completes platform initialization. You can find details of this interface at: http://www.uruk.org/orig-grub/mem64mb.html. Interrupt 15h function E820h is sometimes called the E820h interface. We'll adopt this naming here.

    A legacy boot rootkit (let's call it a bootkit) could hide in the system by patching the interrupt 15h, function E820h handler. One of the ways to do that is to alter the address range descriptor structure returned by the E820h interface. The address range descriptor structure is defined as follows:

    /* unsigned long is assumed to be 32 bits wide here */
    typedef struct AddressRangeDescriptorTag {
        unsigned long BaseAddrLow;   /* low 32 bits of the range base address */
        unsigned long BaseAddrHigh;  /* high 32 bits of the range base address */
        unsigned long LengthLow;     /* low 32 bits of the range length in bytes */
        unsigned long LengthHigh;    /* high 32 bits of the range length in bytes */
        unsigned long Type;          /* address range type */
    } AddressRangeDescriptor;

    The type field in the address range descriptor structure determines whether the memory range is available to be used by the OS [...]
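To make the role of the type field concrete, the sketch below sums the memory that an OS would consider usable from a list of address range descriptors. It is a minimal illustration under stated assumptions: the structure mirrors the descriptor layout with fixed-width fields, the type values 1 (usable RAM) and 2 (reserved) are the ones defined by the E820h interface, and the sample map in the usage note is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the address range descriptor, with explicit widths. */
typedef struct {
    uint32_t base_low;
    uint32_t base_high;
    uint32_t length_low;
    uint32_t length_high;
    uint32_t type;
} e820_entry;

#define E820_TYPE_USABLE   1u  /* memory available to the OS */
#define E820_TYPE_RESERVED 2u  /* memory the OS must not use */

/* Combine the two 32-bit halves of the length into one 64-bit value. */
uint64_t e820_length(const e820_entry *e)
{
    return ((uint64_t)e->length_high << 32) | e->length_low;
}

/* Total number of bytes in ranges marked as usable. */
uint64_t e820_usable_bytes(const e820_entry *map, size_t count)
{
    assert(map != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type == E820_TYPE_USABLE)
            total += e820_length(&map[i]);
    }
    return total;
}
```

If a range in the map is relabeled from usable to reserved, the total shrinks accordingly, which is why tampering with the type field changes what the OS believes it may use.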


    4. Call the GetMemoryMap() function.

    The GetMemoryMap() function returns a data structure that is similar to the one returned by the legacy E820h interface. The data structure is called EFI_MEMORY_DESCRIPTOR. This article doesn't try to delve deeper into this interface. You can read details of the interface and the EFI_MEMORY_DESCRIPTOR in the UEFI specification.


    System Address Map Initialization in x86/x64 Architecture Part 2: PCI Express-Based Systems

    This article is the second part of a series that clarifies PCI expansion ROM address mapping to the system address map. The mapping was not sufficiently covered in my "Malicious PCI Expansion ROM" article (http://resources.infosecinstitute.com/pci-expansion-rom/). You are assumed to have a working knowledge of the PCI bus protocol and details of the x86/x64 boot process. If you don't, then please read the first part to get up to speed with the background knowledge required to understand this article (at http://resources.infosecinstitute.com/system-address-map-initialization-in-x86x64-architecture-part-1-pci-based-systems/).

    The first part focuses on system address map initialization in an x86/x64 PCI-based system. This article focuses on more recent systems, i.e., x86/x64 PCI Express-based systems. From this point on, PCI Express is abbreviated as PCIe throughout this article, in accordance with the official PCI Express specification.

    We are going to look at system address map initialization in x86/x64 PCIe-based systems.

    5. "Memory space" means the set of memory addresses accessible by the CPU, i.e., the memory that is addressable from the CPU. Memory in this context could mean RAM, ROM, or other forms of memory that can be addressed by the CPU.

    6. "PCI expansion ROM" refers to the ROM chip on a PCI device or the contents of the chip, except when the context contains other specific explanation.

    7. The terms "hostbridge" and "northbridge" refer to the same logic components in this article. Both terms refer to the digital logic component that glues the CPU cores to the rest of the system, i.e., connecting the CPU cores to RAM modules, PCIe graphics, the "southbridge" chipset, etc.

    8. Intel 4th Generation Core Architecture CPUs are called "Haswell CPU" or simply "Haswell" in most of this article. Intel uses "Haswell" as the codename for this CPU generation.

    9. Hexadecimal values end with "h" as in 0B0Ah or start with "0x" as in 0x5B0A.

    10. Binary values end with "b" as in 1010b.

    11. The term "memory transactions routing" means memory transactions routing based on the target address of the transaction, unless stated otherwise.

    Another recurring word in this article is platform firmware. Platform firmware refers to code that initializes the platform upon reset, i.e., the BIOS or UEFI code residing in the flash ROM chip of the motherboard.

    Preserving Firmware Code Compatibility in Modern-Day x64 Hardware

    The x64 architecture is an extension of the x86 architecture. Therefore, x64 inherits most of the x86 architecture characteristics, including its very early boot characteristics and most of its system address map. There are two important aspects that x64 preserves from x86 with respect to firmware code execution:

    1. The CPU reset vector location. Even though the x64 architecture is a 64-bit CPU architecture, the reset vector remains the same as in the x86 (32-bit) architecture, i.e., at address 4GB-16 bytes (FFFF_FFF0h). This is meant to preserve compatibility with old add-on hardware migrated to x64 platforms and also compatibility with the large body of low-level code that depends on the reset vector.

    2. The "compatibility/legacy" memory ranges in the system address map. The "compatibility" memory ranges are used for legacy devices. For example, some ranges in the lowest 1MB memory area are mapped to legacy hardware or their hardware emulation equivalent. More important, part of the memory range is mapped to the bootblock part of the BIOS [...]

    compatibility between the two different architectures is important, down to the firmware and chip level.

    Let's look at what is needed at the chip level to preserve backward compatibility with the x86 architecture, now that you know the reason for preserving the compatibility. Figure 1 shows the logic components of the Haswell platform with relation to the UEFI/BIOS code fetch/read. As you can see, two blocks of logic, one in the CPU and one in the Platform Controller Hub (PCH), are provided to preserve the backward compatibility. They are the compatibility memory range logic in the CPU and the internal memory target decoder logic in the PCH. As for the Direct Media Interface (DMI) 2.0 controller logic, it's transparent with respect to software, including firmware code. It just acts as a very fast "pass-through" device; it doesn't alter any of the transactions initiated by the firmware code that pass through it.


    Figure 1. BIOS/UEFI Code Read Transaction in Modern Platform

    Figure 1 shows the CPU core fetching code from the BIOS [...]


    Figure 1 shows there are four CPU cores in the CPU. However, not all of them are the same; one of them is marked as boot strap processor (BS[...]


    Figure 2. PCI Configuration Register Type 1 Header (for PCI-to-PCI Bridge)

    The numbers at the top of Figure 2 mark the bit position in the registers of the PCI configuration space header. The numbers to the right of Figure 2 mark the offset of the registers in the PCI configuration space header. Registers marked with yellow in Figure 2 determine the memory and IO range forwarded by the PCI-to-PCI bridge from its primary interface (the interface closer to the CPU) to its secondary interface (the interface farther away from the CPU). Registers marked with green in Figure 2 determine the PCI bus number of the bus in the PCI-to-PCI bridge primary interface (Primary Bus Number), the PCI bus number of the PCI bus in its secondary interface (S[...]


    Figure 3 shows an illustration of the PCI-to-PCI bridge primary and secondary interfaces in a hypothetical platform. The inner workings of the platform components are the same as in a real world system even though the platform is hypothetical; it's just simplified to make it easier to understand. PCI bus 1 connects to the PCI-to-PCI bridge primary interface and PCI bus 2 connects to the PCI-to-PCI bridge secondary interface in Figure 3.

    Figure 3. PCI-to-PCI Bridge Interfaces

    The PCI-to-PCI bridge forwards an IO transaction "downstream" (from the primary interface to the secondary interface) if the IO limit register contains a value greater than the IO base register value and the transaction address falls within the range covered by both registers. Likewise, the PCI-to-PCI bridge forwards a memory transaction "downstream" if the memory limit register contains a value greater than the memory base register value and the transaction address falls within the range covered by both registers.
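The downstream forwarding rule above boils down to a window check. The sketch below assumes the base and limit register contents have already been decoded into full byte addresses, with the limit being the last address covered; the type and function names are hypothetical and the register decoding itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One base/limit pair of the bridge, decoded to byte addresses
 * (hypothetical helper type; limit is the last address covered). */
typedef struct {
    uint64_t base;
    uint64_t limit;
} bridge_window;

/* The window only forwards anything when the limit is above the base. */
int window_enabled(const bridge_window *w)
{
    return w->limit > w->base;
}

/* A transaction arriving on the primary interface is forwarded
 * downstream when the window is enabled and covers its address. */
int forwards_downstream(const bridge_window *w, uint64_t addr)
{
    assert(w != NULL);
    return window_enabled(w) && addr >= w->base && addr <= w->limit;
}
```

The same check applies to the IO base/limit pair, with IO addresses instead of memory addresses.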

    There is a fundamental difference between the memory base/limit registers and the prefetchable memory base/limit registers. The memory base/limit registers are used for memory ranges occupied by devices that have a side effect on read transactions. The prefetchable memory base/limit registers are used only for devices that don't have side effects on read because, in this case, the PCI-to-PCI bridge can prefetch the data on a read transaction from the device without problems. Prefetching works because there is no side effect on the read transaction. Another difference is that the prefetchable memory base/limit registers are able to handle devices located above the 4GB limit because they can handle a 64-bit address space.

    There are no memory base/limit registers for devices mapped above 4GB because the PCI specification assumes all devices that require large address ranges behave like memory, i.e., their "memory" contents are prefetchable and don't have side effects on reads. Therefore, the PCI specification implies that devices with large address range consumption should implement prefetchable memory base/limit registers instead of memory base/limit registers, and all devices with memory that has side effects on read should be mapped to address ranges below the 4GB limit by the platform firmware.

    A fact sometimes overlooked when dealing with a PCI-to-PCI bridge is that the bridge forwards memory transactions "upstream" (from the secondary interface to the primary interface, i.e., from the PCI device in the direction of the CPU) if the transaction address doesn't fall within the range covered by the memory base/limit or prefetchable memory base/limit registers. Perhaps you're asking, why is this behavior needed? The answer is that we need direct memory access (DMA) to work for devices connected to the PCI-to-PCI bridge secondary interface. In DMA, the device "downstream" of the PCI-to-PCI bridge initiates the transaction (to read from or write to RAM) and the PCI-to-PCI bridge must ensure that the transaction is forwarded from the device in the "upstream" direction toward the RAM.

    Devices in DMA (in this case PCI devices) need to write data into the system memory, the so-called DMA write transaction. If you look at Figure 3, the DMA write transaction for devices connected to the PCI-to-PCI bridge secondary interface must go through the PCI-to-PCI bridge to reach the system memory; if the PCI-to-PCI bridge doesn't forward the write transaction "upstream," DMA cannot work because the contents from the device cannot be written to the system memory.
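The upstream rule that makes DMA work is the complement of the downstream check: a memory transaction seen on the secondary interface is passed toward the CPU when neither memory window claims its address. The sketch below is an illustration under the same assumptions as before (windows already decoded into byte addresses, hypothetical names), and the window values in the usage note are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A decoded base/limit pair (hypothetical; limit is the last address). */
typedef struct {
    uint64_t base;
    uint64_t limit;
} addr_window;

/* True when the window is enabled and covers addr. */
int in_window(const addr_window *w, uint64_t addr)
{
    return w->limit > w->base && addr >= w->base && addr <= w->limit;
}

/* A transaction arriving on the secondary interface is forwarded upstream
 * when it falls outside both the memory and the prefetchable memory window. */
int forwards_upstream(const addr_window *mem, const addr_window *pref,
                      uint64_t addr)
{
    assert(mem != NULL && pref != NULL);
    return !in_window(mem, addr) && !in_window(pref, addr);
}
```

With a memory window of E000_0000h to E1FF_FFFFh and a prefetchable window of C000_0000h to DFFF_FFFFh, a DMA write to RAM at 0020_0000h is forwarded upstream, while a transaction to E000_0010h stays on the secondary side.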

    Now, let's have a look at an example of a memory transaction that's forwarded "downstream" by the PCI-to-PCI bridge in Figure 3. Before we proceed to examine the example, we are going to make several assumptions:


    The system in Figure 3 has 8GB system memory. The first 3GB of the system memory is mapped to the lowest 3GB of the CPU memory address space; the rest is mapped to address range 4GB-to-9GB in the CPU memory address space, above the 4GB limit.

    The platform firmware has initialized all of the base address registers (BARs) of the PCI devices in the system, including the PCI-to-PCI bridge BARs. The platform firmware initialized the CPU memory range from 3GB to 4GB to be used by PCI devices, outside of the hardcoded range used by the advanced programmable interrupt controller (APIC) logic, the platform firmware flash ROM chip and some other legacy system functions in the memory range close to the 4GB limit.

    Contents of the initialized PCI devices BARs and related registers are as follows:

    PCI device 1, only one BAR in use with 16 MB (non-prefetchable) memory space consumption starting at address E000_0000h (3.5GB). This BAR claims transactions to the E000_0000h to E0FF_FFFFh non-prefetchable memory range.

    PCI device 2, only one BAR in use with 16 MB (non-prefetchable) memory space consumption starting at address E100_0000h (3.5GB + 16 MB). This BAR claims transactions to the E100_0000h to E1FF_FFFFh non-prefetchable memory range.

    PCI device 3, only one BAR in use with 32 MB (prefetchable) memory space consumption starting at address E200_0000h (3.5GB + 32 MB). This BAR claims transactions to the E200_0000h to E3FF_FFFFh prefetchable memory range.

    PCI device 4, only one BAR in use with 128 MB (prefetchable) memory space consumption starting at address C000_0000h (3GB). This BAR claims transactions to the C000_0000h to C7FF_FFFFh prefetchable memory range.

    PCI device 5, only one BAR in use with 128 MB (prefetchable) memory space consumption starting at address C800_0000h (3GB + 128MB). This BAR claims transactions to the C800_0000h to CFFF_FFFFh prefetchable memory range.

    PCI device 6, only one BAR in use with 256 MB (prefetchable) memory space consumption starting at address D000_0000h (3GB + 256MB). This BAR claims transactions to the D000_0000h to DFFF_FFFFh prefetchable memory range.

    PCI-to-PCI bridge address and routing related configuration registers contents:

    Primary Bus Number Register: 1


    1. The CPU core issues a "read" transaction. This read transaction reaches the integrated hostbridge/northbridge.

    2. The northbridge forwards the "read" transaction to the southbridge because it knows that the requested address resides in the southbridge.

    3. The southbridge forwards the "read" transaction to PCI bus 1, which is connected directly to it.

    4. The PCI-to-PCI bridge claims the "read" transaction and responds to it because the requested address is within its prefetchable memory range (between the value of the prefetchable memory base and prefetchable memory limit).

    5. The PCI-to-PCI bridge forwards the "read" transaction to its secondary bus, PCI bus 2.

    6. PCI device 6 claims the "read" transaction in PCI bus 2 because it falls within the range of its BAR.

    7. PCI device 6 returns the data at the target address (D100_0000h) via PCI bus 2.

    8. The PCI-to-PCI bridge forwards the data to the southbridge via PCI bus 1.

    9. The southbridge forwards the data to the CPU.

    10. The northbridge in the CPU then places the data in RAM and the read transaction completes.

    From the sample above, you can see that the PCI-to-PCI bridge forwards a read/write transaction from its primary interface to its secondary interface if the requested address falls within its range. If the read/write transaction doesn't fall within its configured range, the PCI-to-PCI bridge does not forward the transaction from its primary interface to its secondary interface.
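
    The two forwarding rules (downstream only inside the bridge windows, upstream only outside them) can be summarized in a few lines of Python. This is only an illustrative sketch: the names Window and bridge_forwards are made up here, and the single prefetchable window is an assumed value chosen to cover PCI device 6, not the register contents from the sample.

```python
from dataclasses import dataclass


@dataclass
class Window:
    base: int   # first address covered by the window
    limit: int  # last address covered by the window (inclusive)

    def contains(self, address: int) -> bool:
        return self.base <= address <= self.limit


def bridge_forwards(address: int, from_primary: bool, windows: list) -> bool:
    """Return True if the bridge passes the transaction to its other interface."""
    inside = any(w.contains(address) for w in windows)
    # Primary -> secondary (downstream): forward only addresses inside a window.
    # Secondary -> primary (upstream): forward only addresses outside every window.
    return inside if from_primary else not inside


# Assumed prefetchable window covering PCI device 6 (D000_0000h to DFFF_FFFFh).
windows = [Window(0xD000_0000, 0xDFFF_FFFF)]
```

    With this window, a CPU read of D100_0000h is forwarded downstream, while a DMA write from a device on the secondary bus to an address in RAM is forwarded upstream.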

    A seldom known fact about PCI-to-PCI bridges is the existence of the subtractive decode PCI-to-PCI bridge. The "decoding" method explained in the example above, used to claim the "read" transaction, is known as positive decode, i.e., the device claims the transaction if it's within its assigned range (in one of its BARs). The reverse of positive decode is known as subtractive decode. In subtractive decode, the device that supports it claims the transaction if no other device on the bus claims the transaction, irrespective of whether the transaction is within the device range or not. There can only be one subtractive decode device in one PCI bus tree. There is a certain class of PCI-to-PCI bridge device that supports subtractive decode. It was used to support address decoding of legacy devices, such as a BIOS chip, in older chipsets. However, this technique is largely abandoned in modern-day chipsets because there is already legacy-support logic in the chipset and the CPU.
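
    The difference between the two decode methods can be sketched as follows. The function name claimant and the device names in the usage are hypothetical; real hardware resolves this with bus signal timing rather than a loop.

```python
def claimant(address, positive_devices, subtractive_device=None):
    """Return the name of the device that claims the address, or None.

    positive_devices maps a device name to an inclusive (base, limit) range.
    """
    for name, (base, limit) in positive_devices.items():
        if base <= address <= limit:
            return name  # positive decode: the address is in the device's range
    # Nobody claimed the transaction, so the subtractive decode agent (if any) does.
    return subtractive_device


# Hypothetical bus with one positive decode device and one subtractive agent.
devices = {"device 6": (0xD000_0000, 0xDFFF_FFFF)}
```

    Here claimant(0xD100_0000, devices, "legacy bridge") picks device 6, while an address outside every range falls through to the subtractive agent.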

    PCIe Device Types

    You have learned all the required prerequisites to understand the PCIe protocol in the previous section. Now let's start by looking into PCIe device types based on their role in a PCIe device tree topology. This is important because you need a fundamental understanding of PCIe device types to understand PCIe device initialization. PCIe devices are categorized as follows:

    1. PCIe root complex. The root complex is similar to the northbridge in a PCI-based system. It acts as the "glue" logic to connect the PCIe device tree to main memory (RAM) and the CPU. In many cases, the root complex also provides a high speed PCIe connection to the GPU. The root complex can be implemented as part of the northbridge in systems that employ two physical chips for the chipset logic. However, nowadays the root complex is always integrated into the CPU chip, as you can see in Figure 1. Figure 1 shows the PCIe root complex as part of the hostbridge that's integrated into the CPU. The root complex connects to the PCIe device tree through a logical port known as the root port. It is a logical port because the root port can be implemented physically in a chip outside the chip containing the root complex. For example, the root complex can reside in the CPU, but the root port is located in the chipset. The Haswell CPU and Intel 8-series PCH implement this root port arrangement. Note that the root complex can have more than one root port.

    2. PCIe s

    links to the root port via the chipset interconnect. There is no difference between them from a PCIe logic point of view.

    Figure 4. PCIe S

    two different PCIe devices. Each link consists of one or more lanes. Each lane consists of a pair of physical interconnects, one in the outgoing direction from the PCIe device and one in the incoming direction to the PCIe device. The physical interconnect uses differential signaling to transmit the PCIe packets in either direction.

    At this point, PCIe device basics should be clear to you. In the next section I'll go through the details of communication between PCIe devices.

    PCIe Packets and Device

    Figure 5. PCIe Packet-Based Communication

    There are three types of packets in the PCIe protocol (listed from the highest level of abstraction down to the lowest level packet sent over the PCIe link):

    1. Transaction layer packet (TLP): The transaction layer in the PCIe device constructs this packet, as seen in Figure 5. The TLP consists of a TLP header and the data content being transmitted. The source of the data content is the PCIe device core and the PCIe core logic interface in the device. The TLP header contains CRC, among other data. A TLP can travel through the PCIe device tree, passing through more than one PCIe device between the source and the destination device. This looks like the packet "tunnels" through the PCIe devices in between the source and the destination. But in reality the packet is just routed through the PCIe device tree. The PCIe device in between the source and the destination must be a PCIe switch because only a switch can forward/route packets from its ingress port to its egress port.

    2. Data link layer packet (DLLP): The data link layer in the PCIe device constructs this packet, as seen in Figure 5. The DLLP wraps the TLP in yet another header. The DLLP provides another CRC for the packet in the DLLP header. A DLLP can only travel between PCIe devices directly connected to each other through a PCIe link. Therefore, the purpose of this CRC is different from that provided by the TLP: the DLLP CRC is used to make sure that the neighboring PCIe device receives the packet correctly. There are also some specific DLLPs which don't contain any TLP, such as DLLPs for link power management, flow control, etc.

    3. Physical layer packet (PLP): The physical layer in the PCIe device constructs this packet, as seen in Figure 5. The physical layer wraps the DLLP into one or several PLPs, depending on the size of the DLLP; if the DLLP cannot fit into a single PLP, the PLP logic divides the DLLP into several "frames" of PLPs. The PLPs are transmitted in the link between two connected PCIe devices. There are also some specific PLPs that don't contain any DLLP, such as PLPs for link training, clock tolerance compensation, etc.

    The explanation about PCIe packet types above implies that a PCIe device must have three device layers, one for each type of packet. In practice, that's not always the case. As long as the PCIe device can create PCIe packets that conform to the specification, it's fine.

    PCIe Address Spaces

    You know from the previous section that PCIe is a packet-based chip-to-chip communication protocol. This means that the protocol requires some means to route the DLLP or TLP between chips. A DLLP can only reach directly linked PCIe chips. Therefore, we are more interested in TLP routing because in several cases the target of a read/write transaction lies several chips away from the source of the read/write transaction. There are several mechanisms to route the TLP. Here, we are going to look into one of them, namely, TLP routing based on address, also known as address routing.

    There are four address spaces in PCIe. In contrast, PCI only has three address spaces. PCIe address spaces are as follows:

    1. PCIe configuration space: This address space is used to access the PCI-compatible configuration registers in PCIe devices and also the PCIe enhanced configuration registers. The part of this address space that resides in the CPU IO space is provided for backward compatibility reasons, i.e., compatibility with the PCI bus protocol. The rest of the PCIe configuration space resides in the CPU memory space. The access mechanism for the first 256 registers is the same as in PCI for x64 architecture, i.e., using IO ports CF8h-CFBh for the address and CFCh-CFFh for the data. As in PCI devices, there are 256 eight-bit configuration space registers that are mapped to this IO address space in PCIe. The first 256 bytes of configuration registers are immediately available at the very early boot stage via the CPU IO space (because the mapping doesn't require firmware initialization), while the rest are available only after the platform firmware finishes initializing the CPU memory space used for PCIe configuration space. PCIe supports a larger number of configuration space registers than PCI. Each PCIe device has 4KB of configuration space registers. The first 256 bytes of those registers are mapped to both the "legacy" PCI configuration space and to the PCIe configuration space in the CPU memory space. The entire 4KB of PCIe configuration space registers can be accessed via the PCIe enhanced configuration mechanism. The PCIe enhanced configuration mechanism uses the CPU memory space instead of the CPU IO space (the PCI configuration mechanism uses the CPU IO space in x86/x64 architecture).

    2. PCIe memory space: This address space lies in the CPU memory address space, just as in PCI. However, PCIe supports 64-bit addressing by default. Part of the PCIe configuration registers is located in the PCIe memory space. However, what is meant by PCIe memory space in this context is the CPU memory space consumed by PCIe devices for non-configuration purposes. For example, the CPU memory space used to store PCIe device data, such as local RAM in a PCIe network controller card or PCIe graphics card local RAM used for the graphics buffer.

    3. PCIe IO space: This is the same as the IO space in the PCI bus protocol. It exists only for PCI backward compatibility reasons.

    4. PCIe message space: This is a new address space not previously implemented in PCI. This address space exists to eliminate the need for physical sideband signals. Therefore, everything that used to be implemented physically in previous bus protocols, such as the interrupt sideband signal, is now implemented as messages in the PCIe device tree. We are not going to look deeper into this address space. It's enough to know its purpose.
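
    To make the legacy access path in item 1 more concrete, the sketch below builds the 32-bit value written to the address port at CF8h before the data is read or written through CFCh. The bit layout follows the conventional PCI configuration mechanism on x86; the function name config_address is made up for this illustration and no real port IO is performed.

```python
def config_address(bus: int, device: int, function: int, offset: int) -> int:
    """Build the dword written to the CF8h address port (PCI-compatible access)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 256:
        raise ValueError("only the first 256 bytes are reachable through the IO ports")
    # Bit 31 enables the configuration cycle; the register offset is dword aligned.
    return (1 << 31) | (bus << 16) | (device << 11) | (function << 8) | (offset & 0xFC)
```

    For instance, config_address(1, 0, 0, 0x10) evaluates to 8001_0010h, while any offset of 100h or above is rejected because it lies outside the 256 bytes that this mechanism can reach.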

    This article only deals with two of the four PCIe address spaces explained above: PCIe configuration space and PCIe memory space. We are going to look into the PCIe configuration space in the PCIe configuration mechanism section later. In this section, we're going to look into the PCIe memory space in detail.

    For the sample, we're going to scrutinize a PCIe memory read transaction that goes through the PCIe fabric (device tree), a read transaction routed via address-routing. We're going to look at a quite complicated PCIe platform that contains a PCIe switch. This kind of configuration usually doesn't exist on a desktop-class PCIe platform, only on a server-class PCIe platform. The complexity of the sample should make it a lot easier for the reader to deal with desktop-class hardware in a real-world scenario because the latter is simpler than a server-class platform.

    Figure 6 shows the sample memory read transaction with a target address of C000_0000h (3GB). The memory read transaction originates in CPU core 1, and the target is the contents of the PCIe Infiniband network controller local memory because that address is mapped to the latter device's memory. The transaction is routed through the PCIe fabric. The double arrow in the read transaction path in Figure 6, marked as a dashed purple line, indicates that the path taken to get to the PCIe device memory contents is identical to the path taken by the requested data back to CPU core 1.

    Address-routing in the PCIe fabric can only happen after all the address-related registers in all PCIe devices in the fabric are initialized. We assume that the platform firmware initializes the platform in Figure 6 as follows:


    1. The system has 8GB RAM; 3GB mapped to the 0-to-3GB memory range and the rest mapped to the 4GB-to-9GB memory range. The mapping is controlled by the respective mapping registers in the hostbridge.

    2. The PCIe Infiniband network controller has 32MB of local memory, mapped to addresses C000_0000h to C1FF_FFFFh (3GB to 3GB+32MB-1).

    3. The PCIe

    Figure 6. PCIe Memory Read Transaction Sample Going through the PCIe Fabric via Address Routing

    Now let's look at the steps taken by the read transaction shown in Figure 6:

    1. The memory read transaction with target address at C000_0000h originates in the CPU's core 1.

    2. The memory read transaction reaches the hostbridge. The hostbridge mapping register directs the transaction to the PCIe root complex logic because the address maps to the memory range claimed by PCIe.

    3. The PCIe root complex logic in the hostbridge recognizes the memory read transaction as targeting the PCIe fabric, due to the hostbridge mapping registers setting, and converts the memory read transaction into a PCIe read TLP.

    4. The TLP is placed in "logical" PCI bus 0. PCI bus 0 originates in the PCIe root complex logic and ends in virtual PCI-to-PCI bridge 1 in the PCIe switch inside the chipset. Note that the chipset interconnect logic is transparent with respect to the PCI and PCIe protocols.

    5. Virtual PCI-to-PCI bridge 1 checks the target address of the TLP. First, virtual PCI-to-PCI bridge 1 checks whether the TLP target address is within virtual PCI-to-PCI bridge 1 itself by comparing the target address with the value of its BAR 0 and BAR 1 registers. However, virtual PCI-to-PCI bridge 1 BAR 0 and BAR 1 don't claim any memory read/write transaction as per the platform firmware initialization value. Then it checks whether the target address of the TLP is within the range of one of its memory base/limit or prefetchable memory base/limit registers. Figure 6 shows both of these steps in points (a) and (b). Virtual PCI-to-PCI bridge 1 finds that the target address of the TLP is within its prefetchable memory range. Therefore, virtual PCI-to-PCI bridge 1 accepts the TLP in PCI bus 0 and routes the TLP to PCI bus 1.

    6. Virtual PCI-to-PCI bridge 2 and virtual PCI-to-PCI bridge 3 do a similar thing to the TLP in PCI bus 1 as virtual PCI-to-PCI bridge 1 did on PCI bus 0, i.e., check their own BARs and their memory base/limit and prefetchable memory base/limit. Virtual PCI-to-PCI bridge 2 finds that the TLP target address is in its secondary interface. Therefore, virtual PCI-to-PCI bridge 2 accepts the TLP in PCI bus 1 and routes the TLP to PCI bus 2.

    7. The PCIe Infiniband network controller in PCI bus 2 checks the target address of the TLP routed by virtual PCI-to-PCI bridge 2 and accepts the TLP because the target address is within the range of one of its BARs.

    8. The PCIe Infiniband network controller returns the contents of the target address to the CPU via the PCIe fabric. Note: we are not going to scrutinize this process in detail because we have learned the details of the TLP address-based routing from the CPU to the PCIe Infiniband network controller.
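
    The per-bridge checks in steps 5 to 7 can be mimicked with a small recursive walk over a device tree. This is a rough sketch, not the switch's actual logic: the dictionary layout, the function names and every bridge window value are assumptions made for illustration; only the Infiniband controller range (C000_0000h to C1FF_FFFFh) comes from the assumptions listed earlier.

```python
def in_range(address, window):
    base, limit = window
    return base <= address <= limit


def route(address, node, path=()):
    """Return the names of the nodes that accept the TLP, or None if nobody does."""
    path = path + (node["name"],)
    # Step (a): does the node itself claim the address through one of its BARs?
    if any(in_range(address, bar) for bar in node.get("bars", [])):
        return path
    # Step (b): a bridge forwards the TLP only if a base/limit window covers it.
    if any(in_range(address, w) for w in node.get("windows", [])):
        for child in node.get("children", []):
            found = route(address, child, path)
            if found:
                return found
    return None


# Hypothetical fabric: the bridge windows are assumed values, not taken from Figure 6.
infiniband = {"name": "Infiniband controller", "bars": [(0xC000_0000, 0xC1FF_FFFF)]}
bridge2 = {"name": "virtual bridge 2", "windows": [(0xC000_0000, 0xC1FF_FFFF)],
           "children": [infiniband]}
bridge3 = {"name": "virtual bridge 3", "windows": [(0xC200_0000, 0xC3FF_FFFF)]}
fabric = {"name": "virtual bridge 1", "windows": [(0xC000_0000, 0xC3FF_FFFF)],
          "children": [bridge2, bridge3]}
```

    Calling route(0xC000_0000, fabric) walks bridge 1, bridge 2 and then the Infiniband controller, mirroring the path of the read TLP, while an address that no window covers is not routed at all.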

    At this point, PCIe address spaces and PCIe address routing should be clear. The next section focuses on PCIe configuration space and the mechanism for routing PCIe configuration transactions to their targets.

    PCIe Configuration Mechanisms

    You need to know the PCIe configuration mechanisms because they are the methods used to initialize all of the PCIe devices in a platform that implements PCIe. There are two types of configuration mechanisms in PCIe, as follows:

    1. The PCI-compatible configuration mechanism: This configuration mechanism is identical to the PCI configuration mechanism. On an x86/x64 platform, this configuration mechanism uses IO port CF8h-CFBh as the address port and IO port CFCh-CFFh as the data port to read/write values from/into the PCI configuration registers of the PCIe device. This configuration mechanism can access 256 bytes of configuration registers per device; refer to the PCI configuration register section in the first part of this series (at http://resources.infosecinstitute.com/system-address-map-initialization-in-x86x64-architecture-part-1-pci-based-systems/) for details on PCI configuration mechanisms. PCIe supports 4KB of configuration registers per device, in contrast to "only" 256 bytes supported by PCI. The rest of the configuration registers can be accessed via the second PCIe configuration mechanism, the PCIe enhanced configuration mechanism.


    2. The PCIe enhanced configuration mechanism: In this configuration mechanism, the entire configuration registers of all of the PCIe devices are mapped to the CPU memory space, including the first 256 bytes of PCI-compatible configuration registers, which are mapped to both the CPU IO space and the CPU memory space (see point 1 above). The CPU memory range occupied by the PCIe configuration registers must be aligned to a 256MB boundary. The size of the memory range is 256MB. The calculation to arrive at this memory range size is simple: each PCIe function has 4KB of configuration registers, and PCIe supports the same number of buses as PCI, i.e., 256 buses, 32 devices per bus, and 8 functions per device. Therefore, the total size of the required memory range is 256 x 32 x 8 x 4KB, which is equal to 256MB.
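
    The size calculation in point 2 is easy to check with a couple of lines of Python. The constant and function names are invented for this sketch, and the base addresses in the usage note are arbitrary examples.

```python
BUSES = 256
DEVICES_PER_BUS = 32
FUNCTIONS_PER_DEVICE = 8
CONFIG_BYTES_PER_FUNCTION = 4 * 1024  # 4KB of configuration registers


def enhanced_config_window_size() -> int:
    """Bytes of CPU memory space needed to map every possible function."""
    return BUSES * DEVICES_PER_BUS * FUNCTIONS_PER_DEVICE * CONFIG_BYTES_PER_FUNCTION


def is_aligned_base(base: int) -> bool:
    """Check the 256MB alignment requirement stated above."""
    return base % enhanced_config_window_size() == 0
```

    The product is 268,435,456 bytes, i.e., 256MB, so a base such as E000_0000h passes the alignment check while E800_0000h does not.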

    One of the implications of the PCIe configuration mechanism is that the first 256 bytes of each PCIe device's configuration registers are mapped into two different spaces: the CPU IO space (through the PCI-compatible configuration mechanism) and the CPU memory space (through the PCIe enhanced configuration mechanism). If you are still confused about this explanation, take a look at Figure 7. Figure 7 shows the mapping of the configuration space registers of one PCIe device into the CPU IO space and CPU memory space.

    Figure 7. PCIe Device Configuration Space Register Mapping as Seen from the CPU

    You might be asking why PCIe systems still need to implement the PCI configuration mechanism. The first reason is to provide backward compatibility with operating systems that existed prior to PCIe being adopted, and the second reason is to provide a way to initialize the PCIe enhanced configuration mechanism. On an x64 platform, the CPU memory range consumed by the PCIe enhanced configuration mechanism is not hardcoded to a certain CPU memory range; it's relocatable in the 64-bit CPU memory space. The platform firmware must initialize a certain register in the PCIe root complex logic to map the PCIe devices' configuration registers to a certain address in the 64-bit CPU memory space. The start address to map the PCIe configuration registers must be aligned to a 256MB boundary. On the other hand, the location of the PCI configuration registers in the CPU IO space is hardcoded in x86 and x64; this provides a way to initialize the register in the PCIe root complex that controls the mapping of all of the PCIe configuration registers, via the PCI-compatible configuration mechanism, because the PCI-compatible configuration mechanism is available at all times, including very early at system boot.

    The PCIe enhanced configuration mechanism implies that reading or writing the configuration registers of a PCIe device requires a memory read or write. This is in contrast to the PCI configuration mechanism, where the code to do the same thing requires an IO read or IO write. This approach was a trend in the hardware world in the late 1990s, i.e., moving all hardware-related registers to CPU memory space to simplify hardware and system software design. It was adopted not just by the PCIe bus protocol but also by other bus protocols in CPU architectures other than x64.

    Figure 8. PCIe Enhanced Configuration Mechanism Address Bits Mapping to CPU Memory Space

    Figure 8 shows the mapping of the PCIe enhanced configuration space into the 64-bit CPU memory space. This is the breakdown of the 64-bit PCIe enhanced configuration space register address:

    1. Address bits 28-63 are the upper bits of the 256MB-aligned base address of the 256MB memory-mapped IO address range allocated for the enhanced configuration mechanism. The manner in which the base address is allocated is implementation-specific. Platform firmware supplies the base address to the O


    As in PCI configuration register accesses, reads and writes to PCIe enhanced configuration registers must be aligned to a dword (32-bit) boundary. This is because the CPU and the chipset in the path to the PCIe enhanced configuration register only guarantee the delivery of configuration transactions if they are aligned to a 32-bit boundary.
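
    As a rough illustration, the memory address of an enhanced configuration register can be computed as shown below. The bit positions used (bus in bits 20-27, device in bits 15-19, function in bits 12-14, register offset in bits 0-11) are the standard layout of the PCIe enhanced configuration mechanism; the function name and the base address E000_0000h in the usage note are placeholders, not values from a real platform.

```python
def enhanced_config_address(base: int, bus: int, device: int,
                            function: int, offset: int) -> int:
    """Compute the CPU memory address of a PCIe enhanced configuration register."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 4096:
        raise ValueError("offset must lie inside the 4KB configuration space")
    if offset % 4:
        raise ValueError("accesses must be dword (32-bit) aligned")
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)
```

    With a placeholder base of E000_0000h, the register at offset 100h of bus 1, device 0, function 0 lands at E010_0100h; an unaligned offset such as 102h is rejected by the alignment check.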

    In x64 architecture, a special register in the CPU, part of the PCIe root complex logic, controls the 36-bit PCIe enhanced configuration space base address. This base address register must be initialized by the platform firmware on boot. The register initialization is carried out through the PCI-compatible configuration mechanism because, at very early boot, the register contains a default value that is not usable to address the registers in the PCIe enhanced configuration space. We'll have a deeper look into the implementation of this base address when we dissect the PCIe-based system address map later.

    Now, let's look at a simple example of PCIe enhanced configuration register mapping into the CPU address space. Let's make these assumptions:

    1. The base address of the PCIe enhanced configuration address space is set to C400_0000h (3GB+64MB) in the PCIe root complex register.

    2. The target PCIe device resides in bus number one (1).

    3. The target PCIe device is device number zero (0) in the corresponding bus.

    4. The target PCIe function is function number zero (0).

    5. The target register resides at offset 256 (100h) in the PCIe device configuration space.

    6.

    Figure 9. PCIe Device Capabilities Register Set

    Figure 9 shows a capabilities pointer register, highlighted in purple, in the PCIe device configuration space pointing to the PCIe capabilities register set. In practice, the capabilities pointer register points to the start of the PCIe capabilities register set by using an 8-bit offset (in bytes) of the start of the PCIe capabilities register set. The offset is calculated from the start of the PCIe device configuration space. This 8-bit offset is stored in the capabilities pointer register. The position of the PCIe capabilities register set is device-specific. However, the PCIe capabilities register set is guaranteed to be placed in the first 256 bytes of the PCIe device configuration space and located after the mandatory PCI header. Both type 0 and type 1 headers must implement the PCIe capabilities register set in a PCIe device configuration space.
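
    In general, the capabilities pointer leads to a linked list of capability structures, one of which is the PCIe capabilities register set. The sketch below walks such a list over a raw copy of the first 256 bytes of configuration space. The offset 34h for the capabilities pointer and the capability ID 10h for PCI Express are the standard values; the function name and the sample contents in the usage note are hypothetical.

```python
CAP_POINTER_OFFSET = 0x34  # capabilities pointer register in the PCI header
PCIE_CAP_ID = 0x10         # capability ID assigned to PCI Express


def find_capability(config, cap_id=PCIE_CAP_ID):
    """Return the byte offset of the capability with cap_id, or None if absent."""
    offset = config[CAP_POINTER_OFFSET] & 0xFC
    visited = set()
    while offset and offset not in visited:  # guard against malformed loops
        visited.add(offset)
        if config[offset] == cap_id:         # first byte: capability ID
            return offset
        offset = config[offset + 1] & 0xFC   # second byte: next capability pointer
    return None


# Hypothetical configuration space: a power management capability at 40h
# followed by the PCIe capabilities register set at 60h.
sample = bytearray(256)
sample[0x34] = 0x40
sample[0x40], sample[0x41] = 0x01, 0x60
sample[0x60], sample[0x61] = 0x10, 0x00
```

    For this sample, find_capability(sample) returns 60h, the 8-bit offset of the PCIe capabilities register set measured from the start of the configuration space.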

    Now, let's look more closely at part of the PCIe capabilities register set. Figure 9 shows that the third register in the capabilities register set is the PCIe capabilities register. Figure 10 shows the format of this register's contents.

    Figure 10. PCIe Capabilities Register Format

    Device/port type bits (bits 4-7) in the PCIe capabilities register are the ones that affect the PCIe device mapping to the system address map. Device/port type bits determine whether the PCIe device is a native PCIe endpoint function or a legacy PCIe endpoint function. The differences between the two types of PCIe device are:

    1. The value of the device/port type bits in a native PCIe endpoint function is 0000b. Native PCIe endpoint function devices must map all of the device components, such as their registers and local memory, to the CPU memory space at runtime, from inside a running O

    2. The value of the device/port type bits in a legacy PCIe endpoint function is 0001b. Legacy PCIe endpoint function devices are permitted to use the CPU IO space even at runtime. The PCIe specification assumes that legacy PCIe endpoint function devices act as a front-end to a legacy bus, such as PCI or PCI-X.
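
    Extracting the device/port type bits is a simple shift-and-mask operation. The sketch below only covers the two encodings discussed above; the function names are invented for the illustration.

```python
NATIVE_ENDPOINT = 0b0000
LEGACY_ENDPOINT = 0b0001


def device_port_type(pcie_capabilities: int) -> int:
    """Extract bits 4-7 of the 16-bit PCIe capabilities register."""
    return (pcie_capabilities >> 4) & 0xF


def may_use_io_space_at_runtime(pcie_capabilities: int) -> bool:
    """Only legacy PCIe endpoint functions may keep using the CPU IO space."""
    return device_port_type(pcie_capabilities) == LEGACY_ENDPOINT
```

    A register value of 0012h therefore identifies a legacy endpoint, and 0002h a native endpoint.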

    Now, it's clear that the contents of the PCIe capabilities register determine whether the PCIe device will map its BARs to the CPU memory space or to the CPU IO space at runtime. There are special cases though, especially when dealing with legacy IO devices. For example, legacy PC-compatible devices such as VGA and IDE controllers frequently expect to be located within fixed legacy IO ranges.

    Figure 11. PCI/PCIe Memory BAR Format

    Figure 11 shows the memory BAR format. It shows that the lowest bit is hardcoded to 0 in BARs that map to the CPU memory space. It also shows that bit 1 and bit 2 determine whether the BAR is a 32-bit BAR or a 64-bit BAR.

    Figure 11 also shows that bit 3 controls prefetching in BARs that map to the CPU memory space. Prefetching in this context means that the CPU fetches the contents of memory addressed by the BAR before a request to that specific memory address is made, i.e., the "fetching" happens in advance, hence "pre"-fetching. This feature is used to improve the overall PCI/PCIe device memory read speed.

    The main difference between a PCI and a PCIe memory BAR is that all memory BARs in PCIe endpoint functions with the prefetchable bit set to 1 must be implemented as 64-bit memory BARs. Memory BARs that do not have the prefetchable bit set to 1 may be implemented as 32-bit BARs. The minimum memory range requested by a memory BAR is 128 bytes.
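
    The memory BAR fields described above can be decoded with a few bit operations. This is a minimal sketch; the function name and the returned dictionary keys are made up for illustration, and the type encodings (00b for 32-bit, 10b for 64-bit) are the standard ones.

```python
def decode_memory_bar(bar_low: int) -> dict:
    """Decode the low dword of a memory BAR."""
    if bar_low & 0x1:
        raise ValueError("bit 0 is 1: this is an IO BAR, not a memory BAR")
    bar_type = (bar_low >> 1) & 0x3         # bits 1-2: 00b = 32-bit, 10b = 64-bit
    return {
        "is_64bit": bar_type == 0b10,
        "prefetchable": bool(bar_low & 0x8),  # bit 3
        "base_low": bar_low & 0xFFFF_FFF0,    # address bits, flag bits cleared
    }
```

    A value of C000_000Ch decodes as a 64-bit prefetchable BAR whose lower base address dword is C000_0000h, whereas E000_0000h decodes as a 32-bit non-prefetchable BAR.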

    Another difference between PCIe and PCI is the notion of a dual address cycle (DAC). PCIe is a serial bus protocol and doesn't implement DAC. PCIe was designed with native 64-bit addressing in mind. Therefore, support for memory transactions targeting 64-bit addresses is native in PCIe. There is no performance penalty for carrying out memory transactions targeting 64-bit addresses.

    PCIe BAR Sizing

    The algorithm for PCIe BAR sizing is the same as the algorithm for PCI device BAR sizing explained in the first article. The difference lies only in the prefetchable memory BAR: because a prefetchable memory BAR in PCIe must be 64 bits wide, the BAR sizing algorithm must use two consecutive 32-bit BARs instead of one 32-bit BAR during BAR sizing.
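
    The sketch below shows how the usual sizing procedure (write all ones, read back, mask the flag bits, invert and add one) extends to a pair of 32-bit BARs. FakeBarPair is a hypothetical stand-in for real hardware so that the example can run anywhere; on a real system the reads and writes would be configuration space accesses.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


class FakeBarPair:
    """Hypothetical stand-in for two consecutive 32-bit BAR registers."""

    def __init__(self, size: int, flags: int = 0xC):
        self._mask = ~(size - 1) & MASK64  # address bits the device lets us write
        self._flags = flags                # read-only low bits: 64-bit, prefetchable
        self._value = 0

    def read(self, index: int) -> int:
        raw = (self._value & self._mask) | self._flags
        return (raw >> (32 * index)) & 0xFFFF_FFFF

    def write(self, index: int, value: int) -> None:
        shift = 32 * index
        cleared = self._value & ~(0xFFFF_FFFF << shift)
        self._value = cleared | ((value & 0xFFFF_FFFF) << shift)


def size_64bit_bar(bar) -> int:
    """Return the number of bytes requested by a 64-bit memory BAR."""
    saved = bar.read(0), bar.read(1)       # remember the original contents
    bar.write(0, 0xFFFF_FFFF)
    bar.write(1, 0xFFFF_FFFF)
    probed = (bar.read(1) << 32) | (bar.read(0) & 0xFFFF_FFF0)  # drop flag bits
    bar.write(0, saved[0])                 # restore the original contents
    bar.write(1, saved[1])
    return (~probed & MASK64) + 1
```

    Sizing FakeBarPair(32 * 1024 * 1024) yields 200_0000h bytes (32MB), and the same routine handles a BAR requesting more than 4GB because both dwords take part in the calculation.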

    Dissecting PCIe-Based System Address Map

    In this section we look at an implementation sample of the system address map in x86/x64 before proceeding to the system address map initialization in more detail. The implementation sample is based on Haswell, with integrated northbridge/hostbridge, and the Intel 8-series PCH platform. This platform implements the PCIe bus and it's an up-to-date platform. Therefore, it's a perfect example for learning a real-world PCIe implementation.

    The Intel 8-series PCH can be viewed as the southbridge in the classic system layout; however, the two are not the same logic because there are some functions in the PCH that are absent in the "classic" southbridge. You can download the CPU datasheet from http://www.intel.com/content/www/us/en/processors/core/CoreTechnicalResources.html and the PCH datasheet from http://www.intel.com/content/www/xr/en/chipsets/8-series-chipset-pch-datasheet.html.

    PCIe differs from PCI in that PCIe moves everything to the CPU memory space, including its configuration space, as you can see from the PCIe configuration mechanisms section. The presence of part of the PCIe configuration registers in the CPU IO space is only for backward compatibility reasons. This fact means the CPU memory space in a PCIe-based system is a bit more fragmented compared to PCI-based systems. However, this approach pays back in terms of less complication in CPU design and quicker access to all of the memory ranges mapped to the CPU memory space, including PCIe configuration registers, because access to CPU memory space is quicker than access to IO space by default.

    Haswell CPU and Intel 8-series Chipset Platform

    Figure 12 shows a block diagram of systems with the Haswell CPU and 8-series chipset combination. Figure 12 shows the entire connection from the chipset to other components in the system, including those that might not exist in all chipset stock keeping units (SKUs).

    [Figure 12: block diagram of a system with a Haswell CPU and an Intel 8-series chipset]

    [Figure 13: memory transactions routing in Haswell]

    [...] entering the memory/DRAM controller. The control registers that control this memory range are the top of low usable DRAM (TOLUD) register and the remap base register. Both registers are in the hostbridge. TOLUD controls the CPU memory range occupied by the DRAM below 4GB. Remap base is only in use if the system DRAM size is equal to or larger than 4GB; in this case remap base marks the end of the "normal" CPU DRAM range above 4GB.

    2. Remapped DRAM range logic: This logic block routes memory transactions (read/write) targeting the range covered by DRAM that requires remapping, i.e., the target address of the transaction needs to be translated before entering the memory/DRAM controller. Two control registers control the remapped memory range: the remap base and remap limit registers. Both registers are in the hostbridge.

    3. Compatibility memory range logic: This logic block routes memory transactions (read/write) targeting the range covered by the compatibility memory range. This memory range comprises the range from A_0000h to F_FFFFh and the ISA hole (15MB to 16MB). This memory range is further divided into three sub-ranges:

        1. Legacy VGA memory range, which lies between A_0000h and B_FFFFh: The VGA memory map mode control register controls the mapping of the compatibility memory range from A_0000h to B_FFFFh. This range may be mapped to PCIe, DMI or the Internal Graphics Device (IGD), depending on the VGA memory map mode control register value. Therefore, a memory transaction targeting the memory range between A_0000h and B_FFFFh will be routed to PCIe, DMI or the IGD.

        2. Non-VGA compatibility and non-"ISA hole" [...] (15MB-16MB): A legacy access control (LAC) register in the hostbridge controls routing of memory transactions targeting the ISA hole (15MB-16MB). All memory transactions targeting this compatibility memory range are routed either to the memory/DRAM controller or to the 8-series PCH chipset (Intel H87 chipset in Figure 12) via the DMI interface, depending on values in the LAC control register. The I[...]


    [...] targeting this compatibility memory range are always routed to the DMI interface, except those targeting the M[...]

    [...]IMIT registers control access to the PCIe graphics memory range if the external PCIe graphics uses a memory range above 4GB. Both registers are part of the PCIe controllers integrated into the Haswell CPU.

    5. PCIe 2.0/PCI memory range logic: This logic block routes memory transactions (read/write) targeting the range from the value of the TOLUD register to 4GB to the 8-series PCH chipset via the DMI interface. This logic block also routes memory transactions (read/write) targeting the range between the PMBASE and PMLIMIT registers (but not falling within the range covered by the PCIe 3.0 graphics memory range) if the system has 4GB of RAM or more. The range from the TOLUD value to 4GB is set aside for PCI/PCIe memory. The PCI/PCIe memory range that's not claimed by the PCIe 3.0 graphics resides in the 8-series PCH chipset. The control register for this range is the TOLUD, PM[...]
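
    Each routing logic block listed above claims memory transactions whose target address falls in its range. As a rough sketch of how one could check that no address is claimed by more than one block, the code below counts the claims for a given address. The structure claimed_range and the function count_claims are hypothetical names used for illustration only; a real check would have to read the actual control registers of each block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the range one routing logic block claims. */
struct claimed_range {
    uint64_t base;   /* first address claimed */
    uint64_t limit;  /* last address claimed (inclusive) */
};

/* Count how many logic blocks claim the given target address.
 * With correctly initialized control registers the result is at most 1. */
size_t count_claims(const struct claimed_range *r, size_t n, uint64_t addr)
{
    size_t claims = 0;

    for (size_t i = 0; i < n; i++) {
        if (addr >= r[i].base && addr <= r[i].limit)
            claims++;
    }
    return claims;
}
```

    A result greater than 1 for any address would indicate overlapping control register settings.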

    All five memory transaction routing logic blocks are mutually exclusive, i.e., every memory transaction must be claimed by exactly one of them. "Anarchy" in memory transaction routing could happen, though. Anarchy in this context means more than one logic block claims a memory transaction. Anarchy happens if the platform firmware initializes one or more control registers of these logic blocks incorrectly.

    Haswell System Address Map

    In the preceding section, you learned how memory transactions are routed in Haswell by the northbridge based on the target address of the transactions. This section delves into the result of the routing, the system address map. The presence of address remapping in the northbridge makes the system address map quite complicated, i.e., the address map depends on the point of view, whether the address map is seen from the CPU core(s) perspective or not. Figure 14 shows a Haswell system address map with 4GB of RAM or more. I choose not to talk about Haswell systems with less than 4GB of RAM because address remapping is not in use in such a configuration.

    [Figure 14: Haswell system address map (system with 4GB of RAM or more)]

    Boxes with light blue color in Figure 14 represent memory ranges occupied by RAM. This means that the DRAM controller sees the available RAM as a contiguous memory range while the CPU core doesn't. The CPU core view contains "holes" in the memory range below 4GB that don't belong to RAM; the "holes" are marked as boxes with non-light-blue colors in Figure 14.
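
    The difference between the two views can be illustrated with a simplified sketch. It assumes that the DRAM hidden behind the hole between TOLUD and 4GB is remapped in full to a window that starts at the end of the "normal" DRAM above 4GB, and it ignores ME UMA memory and any register alignment granularity. The names remap_window, compute_remap_window and cpu_to_dram_address are hypothetical.

```c
#include <stdint.h>

#define FOUR_GB 0x100000000ull

/* Hypothetical description of the remap window in the CPU address space. */
struct remap_window {
    uint64_t base;   /* first CPU address of the window */
    uint64_t limit;  /* last CPU address of the window (inclusive) */
};

/* Simplified model: assumes dram_size >= 4GB and tolud < 4GB. */
struct remap_window compute_remap_window(uint64_t dram_size, uint64_t tolud)
{
    struct remap_window w;

    w.base  = dram_size;                          /* end of "normal" DRAM above 4GB */
    w.limit = dram_size + (FOUR_GB - tolud) - 1;  /* the hole size is added on top */
    return w;
}

/* Translate a CPU address that targets DRAM into the address that the
 * DRAM controller sees in its contiguous view of RAM. */
uint64_t cpu_to_dram_address(uint64_t cpu_addr, uint64_t tolud,
                             struct remap_window w)
{
    if (cpu_addr >= w.base && cpu_addr <= w.limit)
        return tolud + (cpu_addr - w.base);  /* remapped range: translated */

    return cpu_addr;                         /* normal range: passes through */
}
```

    With 8GB of DRAM and TOLUD at 3GB, this model places the remap window at 8GB to 9GB in the CPU view and translates it to the 3GB to 4GB range of the DRAM.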

    The details of the memory ranges in Figure 14 are as follows:

    1. Legacy address range (as seen from the CPU core perspective): This range is the DO[...]

    [...] 256-byte PCIe configuration space registers are mapped to the CPU IO space at ports CF8h-CFFh, just as in the legacy PCI bus; in addition, these registers are also mapped to the PCIe enhanced configuration space.

    Contrary to the legacy PCI configuration space, the entire PCIe configuration space (4KB per function) is located in the CPU memory space. On the x86/x64 platform, the memory range consumed by the PCIe configuration space is relocatable in the CPU memory space. The platform firmware must initialize the location of this configuration space in the CPU memory space. We look more closely into the Haswell-specific implementation in this section.

    Now, let's calculate the memory space requirement of the PCIe configuration space registers:

    1. The maximum number of PCIe buses in the system is 256.
    2. The maximum number of PCIe devices per bus is 32.
    3. The maximum number of functions per device is 8.
    4. Each function can implement up to 4KB of configuration registers.

    Using the numbers above, the entire PCIe configuration space requires 256 x 32 x 8 x 4KB of memory space. This amounts to 256MB of memory space. Therefore, the platform firmware must initialize the system address map to accommodate this PCIe configuration space requirement. However, in practice, the memory space requirement of the PCIe enhanced configuration space in a particular system can be less than 256MB because the system cannot support that many PCIe devices physically.
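
    The same arithmetic can be written down as a short sketch. The constant and function names are chosen for illustration only.

```c
#include <stdint.h>

enum {
    PCIE_DEVICES_PER_BUS  = 32,    /* maximum devices per bus */
    PCIE_FUNCS_PER_DEVICE = 8,     /* maximum functions per device */
    PCIE_CFG_BYTES_PER_FN = 4096   /* 4KB of configuration space per function */
};

/* Memory space needed for the enhanced configuration space of `buses` buses. */
uint64_t pcie_cfg_space_bytes(uint32_t buses)
{
    return (uint64_t)buses * PCIE_DEVICES_PER_BUS *
           PCIE_FUNCS_PER_DEVICE * PCIE_CFG_BYTES_PER_FN;
}
```

    Calling pcie_cfg_space_bytes(256) gives 268,435,456 bytes, i.e., 256MB, while a single bus needs 1MB.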

    In most cases, the PCIe enhanced configuration space is carved out of the PCI/PCIe memory range. The PCIe configuration space can be mapped to the PCI/PCIe memory range below 4GB (from TOLUD to the 4GB limit) or mapped to PCI/PCIe memory above the 4GB limit (above TOUUD) in the Haswell memory map, as shown in Figure 16.

    On the Haswell platform, the PCI express register range base address (PCIEXBAR), a register in the hostbridge, determines the location of the PCIe enhanced configuration space. PCIEXBAR contents determine the start address and the size of the PCIe enhanced configuration space. Figure 16 shows the two possible alternatives to map the PCIe enhanced configuration space. They are marked as "Mapping Alternative 1" (within the PCI/PCIe memory range below 4GB) and "Mapping Alternative 2" (within the PCI/PCIe memory range above TOUUD). PCIEXBAR can set the size of the PCIe enhanced configuration space to 64MB, 128MB or 256MB. The platform firmware should initialize the bits that control the size of the PCIe enhanced configuration space in PCIEXBAR at boot.
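
    Because each bus needs 32 x 8 x 4KB = 1MB of configuration space, the chosen size directly limits how many buses are reachable through the enhanced configuration space. The sketch below illustrates that relationship; it does not model the actual bit encoding of the size field in PCIEXBAR, and the function name is hypothetical.

```c
#include <stdint.h>

#define MB (1024u * 1024u)

/* Number of buses covered by an enhanced configuration space window of the
 * given size, or 0 if the size is not one of the three supported values. */
uint32_t ecam_window_bus_count(uint32_t window_bytes)
{
    switch (window_bytes) {
    case 64u * MB:
    case 128u * MB:
    case 256u * MB:
        return window_bytes / MB;  /* 1MB of configuration space per bus */
    default:
        return 0;
    }
}
```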

    [Figure 16: PCIe enhanced configuration space register mapping on Haswell]

    PCIe_reg_addr_in_CPU_memory_space = PCIEXBAR + Bus_Number * 1MB +
        Device_Number * 32KB + Function_Number * 4KB +
        Register_Offset

    Perhaps you're asking where the 1MB, 32KB, and 4KB multipliers come from. It's simple, actually: for each bus, we need 32 (devices) x 8 (functions) x 4KB of memory space, which is equal to 1MB; for each device, we need 8 (functions) x 4KB of memory space, which is equal to 32KB.

    Now, let's look into a simple sample. Let's assume that PCIEXBAR is initialized to C000_0000h (3GB) and we want to access the PCIe configuration register in bus 0, device 2, function 1, at offset 40h. What is the address of this particular register? Let's calculate it:

    Register_address_in_memory = C000_0000h + 0 * 1MB + 2 * 32KB + 1 * 4KB + 40h
    Register_address_in_memory = C000_0000h + 0 + 1_0000h + 1000h + 40h
    Register_address_in_memory = C001_1040h

    We found that the target PCIe configuration register is located at C001_1040h in the CPU memory space. With this sample, you should now have no problem dealing with the PCIe enhanced configuration space.
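
    The calculation above translates directly into code. The following is a minimal sketch; the function name pcie_cfg_address is hypothetical, and the base address passed in is assumed to be the start address programmed into PCIEXBAR.

```c
#include <stdint.h>

/* Address of a PCIe configuration register in the CPU memory space:
 * base + bus * 1MB + device * 32KB + function * 4KB + register offset. */
uint64_t pcie_cfg_address(uint64_t pciexbar_base, uint8_t bus, uint8_t dev,
                          uint8_t func, uint16_t offset)
{
    return pciexbar_base +
           ((uint64_t)bus << 20) +            /* 1MB per bus */
           ((uint64_t)(dev & 0x1Fu) << 15) +  /* 32KB per device, 32 devices */
           ((uint64_t)(func & 0x07u) << 12) + /* 4KB per function, 8 functions */
           (uint64_t)(offset & 0x0FFFu);      /* offset within the 4KB space */
}
```

    Calling pcie_cfg_address(0xC0000000, 0, 2, 1, 0x40) reproduces the worked sample and yields C001_1040h.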

    System Management Mode (SMM) Memory on the Haswell Platform

    In the first article of this series, you learned that there are two memory ranges used to store [...] has been deprecated and unsupported. Therefore, there is only one memory range used to store [...] Figure 14 shows the location of the [...]

    [...] main reason for this is that the security of the system is compromised if a device other than the CPU is given access to T[...]


    In this section we are going to delve into GART. In the first article, I talked about GART in a legacy system, i.e., AGP GART. This section talks about present-day GART, i.e., GART in a PCIe-based system. Microsoft outlines requirements for GART implementation in a PCIe-based system (PCIe GART for short). You can read the requirements at http://msdn.microsoft.com/en-us/library/windows/hardware/gg463285.aspx. This is the relevant excerpt:

    By definition, AGP requires a chipset with a graphics address relocation table (GART), which provides a linear view of nonlinear system memory to the graphics device. PCIe, however, [...]

    [Figure 17: GART on the Haswell platform]

    [...] system memory but in the IGD. You can think of it as a buffer (memory) containing GART entries but residing in the IGD. This is different from GART entries in a legacy AGP system, where the GART entries reside in system memory.

    3. GMADR is the graphics memory aperture base register. This register is part of the GART logic. This register contains the start address of the graphics aperture in the CPU memory space. Figure 17 shows that GMADR points to the start of Block #1, which is the first block of the graphics aperture range.

    4. TLBs are the translation look-aside buffers used to handle graphics memory transactions. They are part of the GART logic in the IGD. These TLBs are similar to the TLBs you would find in the CPU's memory management unit (MMU).

    5. PTEs means page table entries. I use this term to highlight the fact that the GART entries are basically similar to PTEs in the CPU but are used specifically for graphics.

    6. D[...]


    Let's summarize the differences between legacy AGP GART and modern-day PCIe GART. The first one is that AGP GART logic was implemented as part of the hostbridge while modern-day GART logic is implemented as part of the PCIe graphics chip. In case the PCIe graphics chip is located in the hostbridge (like in the Haswell case), the GART logic will be part of the hostbridge. The operating system treats AGP GART and PCIe GART differently. AGP GART has its own miniport driver, while the PCIe GART driver is part of the PCIe graphics device driver. The second major difference is in the location of the graphics aperture: in a legacy AGP system, the graphics aperture always resides below 4GB while the modern-day PCIe graphics aperture can lie either below 4GB or above 4GB.
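
    To make the idea of GART entries more concrete, here is a simplified software model of the translation that the GART logic performs in hardware. It assumes 4KB pages and a made-up entry format in which bit 0 marks a valid entry and the upper bits hold the page address in system memory; the real PTE layout of the IGD is not shown here, and the function name gart_translate is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SHIFT 12u     /* assumption: 4KB pages */
#define GART_PTE_VALID  0x1ull  /* hypothetical "valid" bit */

/* Translate an address inside the graphics aperture (which starts at the
 * address held in GMADR) into a system memory address by looking up the
 * matching GART entry. Returns 1 on success and 0 on failure. */
int gart_translate(uint64_t gmadr, const uint64_t *ptes, size_t n_ptes,
                   uint64_t cpu_addr, uint64_t *out)
{
    if (cpu_addr < gmadr)
        return 0;                            /* below the aperture */

    uint64_t off = cpu_addr - gmadr;
    uint64_t idx = off >> GART_PAGE_SHIFT;   /* one entry per page */

    if (idx >= n_ptes || !(ptes[idx] & GART_PTE_VALID))
        return 0;                            /* outside the table or unmapped */

    *out = (ptes[idx] & ~0xFFFull) | (off & 0xFFFull);
    return 1;
}
```

    The model shows why the aperture looks linear to its user even though consecutive entries may point to scattered pages in system memory.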

    At this point you should have a clear understanding of GART on the Haswell platform. Even if this section talks about GART in the IGD PCIe graphics chip, you should be able to understand GART implemented by an add-on PCIe graphics card easily because its principle is just the same. The difference is only in the location of the graphics memory/buffer, which is basically very similar from the system address map standpoint.

    Haswell System Address Map Initialization

    In this section we'll have a look at the Haswell system address map initialization. We're not going to dive into the minute details of the initialization, just deep enough to understand the whole process. Several steps in the Haswell boot process are part of system address map initialization. They are as follows:

    1. Manageability engine (ME) initialization: ME initialization happens prior to platform firmware code execution. ME initializes the Intel management engine UMA register in the 8-series PCH to signal to the platform firmware how much space it requires in the system DRAM for use as the ME UMA memory region.

    2. Chipset initialization: In this step, the chipset registers are initialized, including the chipset base address registers (BARs). We are particularly interested in chipset BAR initialization because this initialization affects the system address map. There are two chipsets in the Haswell platform, the northbridge and the southbridge. The northbridge is part of the CPU (sometimes called the uncore part) and the southbridge is the 8-series PCH. Many registers involved in the system address map are part of the chipset, as you can see from the previous sections. TOLUD, T[...] DRAM is known. Therefore, most of them are initialized as part of or after main memory initialization.

    3. Main memory (RAM) initialization: In this step, the memory controller initialization happens. The memory controller initialization and RAM initialization happen together as complementary code, because the platform firmware code must figure out the correct parameters supported by both the memory controller and the RAM modules installed on the system and then initialize both of the components into the "correct" setup. The memory sizing process is carried out in this step. The memory sizing determines the size of system DRAM. This is mostly carried out by reading the contents of the serial presence detect (S[...]

    [...] assignment of memory or IO address space happens via the use of BAR in the PCI/PCIe devices. Initialization of U[...]


    In the first part of this series, you learned about the BIOS E820h interface. In this article I will only reiterate the UEFI equivalent of that function, the UEFI GetMemoryMap() function. This function is available as part of the UEFI boot services. Therefore, you need to traverse into the UEFI boot services table to "call" the function. The "simplified" algorithm to call this function is as follows:

    1. Locate the EFI system table.
    2. Traverse to the EFI_BOOT_SERVICES_TABLE in the EFI system table.
    3. Traverse the EFI_BOOT_SERVICES_TABLE to locate the GetMemoryMap() function.
    4. Call the GetMemoryMap() function.

    The GetMemoryMap() function returns a similar data structure to the one returned by the legacy E820h interface. The data structure is called EFI_MEMORY_DESCRIPTOR. EFI_MEMORY_DESCRIPTOR is defined as follows:

    //*******************************************************
    // EFI_MEMORY_DESCRIPTOR
    //*******************************************************
    typedef struct {
        UINT32                Type;
        EFI_PHYSICAL_ADDRESS  PhysicalStart;
        EFI_VIRTUAL_ADDRESS   VirtualStart;
        UINT64                NumberOfPages;
        UINT64                Attribute;
    } EFI_MEMORY_DESCRIPTOR;

    The GetMemoryMap() function returns a copy of the current memory map. The map is an array of memory descriptors, each of which describes a contiguous block of memory. The map describes all of memory, no matter how it is being used. The memory map is only used to describe memory that is present in the system. Memory descriptors are never used to describe holes in the system memory map.
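
    As an illustration of how such an array of descriptors can be consumed, the sketch below walks a memory map buffer and sums up the bytes of one memory type. It is a stand-alone model: mem_desc is a local mirror of the descriptor layout shown above using standard C types, and total_bytes_of_type is a hypothetical helper rather than a UEFI function. The walk advances by the descriptor size reported alongside the map instead of sizeof, because that size may be larger than the structure itself. The value 7 for conventional memory and the 4KB page size follow the UEFI specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the EFI_MEMORY_DESCRIPTOR layout, for illustration only. */
typedef struct {
    uint32_t type;
    uint64_t physical_start;
    uint64_t virtual_start;
    uint64_t number_of_pages;
    uint64_t attribute;
} mem_desc;

#define EFI_PAGE_BYTES          4096u  /* UEFI pages are 4KB */
#define EFI_CONVENTIONAL_MEMORY 7u     /* EfiConventionalMemory */

/* Sum the size in bytes of all descriptors of the given type. */
uint64_t total_bytes_of_type(const void *map, size_t map_size,
                             size_t desc_size, uint32_t type)
{
    const uint8_t *p = (const uint8_t *)map;
    uint64_t total = 0;

    if (desc_size < sizeof(mem_desc))
        return 0;                      /* malformed descriptor size */

    for (size_t off = 0; off + sizeof(mem_desc) <= map_size; off += desc_size) {
        mem_desc d;
        memcpy(&d, p + off, sizeof d); /* copy out to avoid alignment issues */
        if (d.type == type)
            total += d.number_of_pages * EFI_PAGE_BYTES;
    }
    return total;
}
```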

    This article doesn't try to delve deeper into the UEFI GetMemoryMap() interface. You can read the details of the interface and of EFI_MEMORY_DESCRIPTOR in the UEFI specification.


    [...] intriguing regarding the Haswell platform, it's the manageability engine (ME). This part of the system deserves its own scrutiny and further research. I'm aware of at least one proof-of-concept work in this particular field, but it was not on Haswell.