l01: intro, combinational logicl22: virtual memory iii cse369, … · 2017-03-02 · 11/23/2016 3...
TRANSCRIPT
11/23/2016
1
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
VirtualMemoryIIICSE351Winter2017
https://xkcd.com/720/
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AddressTranslation:PageHit
2
1) ProcessorsendsvirtualaddresstoMMU(memorymanagementunit)
2-3)MMUfetchesPTEfrompagetableincache/memory(UsesPTBRtofindbeginningofpagetableforcurrentprocess)
4) MMUsendsphysicaladdress tocache/memoryrequestingdata
5) Cache/memorysendsdatatoprocessor
MMU Cache/MemoryPA
Data
CPU VA
CPUChip PTEA
PTE1
2
3
4
5
VA=VirtualAddress PTEA=PageTableEntryAddress PTE=PageTableEntryPA=PhysicalAddress Data=ContentsofmemorystoredatVAoriginallyrequestedbyCPU
11/23/2016
2
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AddressTranslation:PageFault
3
1) ProcessorsendsvirtualaddresstoMMU2-3) MMUfetchesPTEfrompagetableincache/memory4) Validbitiszero,soMMUtriggerspagefaultexception5) Handleridentifiesvictim(and,ifdirty,pagesitouttodisk)6) HandlerpagesinnewpageandupdatesPTEinmemory7) Handlerreturnstooriginalprocess,restartingfaultinginstruction
MMU Cache/Memory
CPU VA
CPUChip PTEA
PTE1
2
3
4
5
Disk
Pagefaulthandler
Victimpage
Newpage
Exception
6
7
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Hmm…TranslationSoundsSlow
v TheMMUaccessesmemorytwice:oncetogetthePTEfortranslation,andthenagainfortheactualmemoryrequest§ ThePTEsmay becachedinL1likeanyothermemoryword
• Buttheymaybeevictedbyotherdatareferences
• AndahitintheL1cachestillrequires1-3cycles
v Whatcanwedotomakethisfaster?§ Solution:addanothercache!🎉
4
11/23/2016
3
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SpeedingupTranslationwithaTLB
v TranslationLookasideBuffer (TLB):§ SmallhardwarecacheinMMU§ Mapsvirtualpagenumberstophysicalpagenumbers§ Containscompletepagetableentries forsmallnumberofpages• ModernIntelprocessorshave128or256entriesinTLB
§ Muchfasterthanapagetablelookupincache/memory
5
TLB
PTEVPN →
PTEVPN →
PTEVPN →
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
TLBHit
v ATLBhiteliminatesamemoryaccess!
6
MMU Cache/Memory
PA
Data
CPU VA
CPU Chip
PTE
1
2
4
5
TLB
VPN 3
TLBPTEVPN →
PTEVPN →
PTEVPN →
11/23/2016
4
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
TLBMiss
v ATLBmissincursanadditionalmemoryaccess(thePTE)§ Fortunately,TLBmissesarerare
7
MMU Cache/MemoryPA
Data
CPU VA
CPU Chip
PTE
1
2
5
6
TLB
VPN
4
PTEA3
TLBPTEVPN →
PTEVPN →
PTEVPN →
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
FetchingDataonaMemoryRead
1) CheckTLB§ Input:VPN,Output:PPN§ TLBHit: Fetchtranslation,returnPPN§ TLBMiss: Checkpagetable(inmemory)
• PageTableHit: LoadpagetableentryintoTLB• PageFault: Fetchpagefromdisktomemory,updatecorrespondingpagetableentry,thenloadentryintoTLB
2) Checkcache§ Input:physicaladdress,Output:data§ CacheHit: Returndatavaluetoprocessor§ CacheMiss: Fetchdatavaluefrommemory,storeitincache,returnittoprocessor
8
11/23/2016
5
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AddressTranslationVirtualAddress
TLBLookup
PageTable“Walk”
UpdateTLB
PageFault(OSloadspage)
ProtectionCheck
PhysicalAddress
TLBMiss TLBHit
PagenotinMem
AccessDenied
AccessPermitted
ProtectionFault
SIGSEGV
PageinMem
CheckcacheFindinDisk FindinMem
9
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SummaryofAddressTranslationSymbols
v BasicParameters§ N = 2$ Numberofaddressesinvirtualaddressspace§ M = 2& Numberofaddressesinphysicaladdressspace§ P = 2( Pagesize(bytes)
v Componentsofthevirtualaddress(VA)§ VPO Virtualpageoffset§ VPN Virtualpagenumber§ TLBI TLBindex§ TLBT TLBtag
v Componentsofthephysicaladdress(PA)§ PPO Physicalpageoffset(sameasVPO)§ PPN Physicalpagenumber
10
11/23/2016
6
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SimpleMemorySystemExample(small)
v Addressing§ 14-bitvirtualaddresses§ 12-bitphysicaladdress§ Pagesize=64bytes
13 12 11 10 9 8 7 6 5 4 3 2 1 0
VPOVPNVirtualPageNumber VirtualPageOffset
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPNPhysicalPageNumber PhysicalPageOffset
11
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SimpleMemorySystem:PageTable
v Onlyshowingfirst16entries(outof_____)§ Note:showing2hexdigitsforPPNeventhoughonly6bits
12
VPN PPN Valid0 28 11 – 02 33 13 02 14 – 05 16 16 – 07 – 0
VPN PPN Valid8 13 19 17 1A 09 1B – 0C – 0D 2D 1E – 0F 0D 1
11/23/2016
7
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SimpleMemorySystem:TLB
v 16entriestotalv 4-waysetassociative
13
13 12 11 10 9 8 7 6 5 4 3 2 1 0
virtualpageoffsetvirtualpagenumber
TLBindexTLBtag
0–021340A10D030–0730–030–060–080–0220–0A0–040–0212D031102070–0010D090–030
ValidPPNTagValidPPNTagValidPPNTagValidPPNTagSet
WhydoestheTLBignorethepageoffset?
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
SimpleMemorySystem:Cache
v Direct-mappedwithK =4B,C/K =16v Physicallyaddressed
14
11 10 9 8 7 6 5 4 3 2 1 0
physicalpageoffsetphysicalpagenumber
cacheoffsetcacheindexcachetag
Note: ItisjustcoincidencethatthePPNisthesamewidth
asthecacheTag
Index Tag Valid B0 B1 B2 B30 19 1 99 11 23 111 15 0 – – – –2 1B 1 00 02 04 083 36 0 – – – –4 32 1 43 6D 8F 095 0D 1 36 72 F0 1D6 31 0 – – – –7 16 1 11 C2 DF 03
Index Tag Valid B0 B1 B2 B38 24 1 3A 00 51 899 2D 0 – – – –A 2D 1 93 15 DA 3BB 0B 0 – – – –C 12 0 – – – –D 16 1 04 96 34 15E 13 1 83 77 1B D3F 14 0 – – – –
11/23/2016
8
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
CurrentStateofMemorySystem
15
Cache:
TLB:Pagetable(partial):
Index Tag V B0 B1 B2 B30 19 1 99 11 23 111 15 0 – – – –2 1B 1 00 02 04 083 36 0 – – – –4 32 1 43 6D 8F 095 0D 1 36 72 F0 1D6 31 0 – – – –7 16 1 11 C2 DF 03
Index Tag V B0 B1 B2 B38 24 1 3A 00 51 899 2D 0 – – – –A 2D 1 93 15 DA 3BB 0B 0 – – – –C 12 0 – – – –D 16 1 04 96 34 15E 13 1 83 77 1B D3F 14 0 – – – –
Set Tag PPN V Tag PPN V Tag PPN V Tag PPN V0 03 – 0 09 0D 1 00 – 0 07 02 11 03 2D 1 02 – 0 04 – 0 0A – 02 02 – 0 08 – 0 06 – 0 03 – 03 07 – 0 03 0D 1 0A 34 1 02 – 0
VPN PPN V0 28 11 – 02 33 13 02 14 – 05 16 16 – 07 – 0
VPN PPN V8 13 19 17 1A 09 1B – 0C – 0D 2D 1E – 0F 0D 1
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryRequestExample#1
v VirtualAddress:0x03D4
v PhysicalAddress:
TLBITLBT
013
012
011
010
19
18
17
16
05
14
03
12
01
00
VPOVPN
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
VPN______ TLBT_____ TLBI_____ TLBHit?___ PageFault?___ PPN _____
CT______ CI_____ CO_____ CacheHit?___ Data(byte)_______
Note: ItisjustcoincidencethatthePPNisthesamewidth
asthecacheTag
16
11/23/2016
9
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryRequestExample#2
v VirtualAddress:0x038F
v PhysicalAddress:
TLBITLBT
013
012
011
010
19
18
17
06
05
04
13
12
11
10
VPOVPN
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
VPN______ TLBT_____ TLBI_____ TLBHit?___ PageFault?___ PPN _____
CT______ CI_____ CO_____ CacheHit?___ Data(byte)_______
Note: ItisjustcoincidencethatthePPNisthesamewidth
asthecacheTag
17
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryRequestExample#3
v VirtualAddress:0x0020
v PhysicalAddress:
TLBITLBT
013
012
011
010
09
08
07
06
15
04
03
02
01
00
VPOVPN
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
VPN______ TLBT_____ TLBI_____ TLBHit?___ PageFault?___ PPN _____
CT______ CI_____ CO_____ CacheHit?___ Data(byte)_______
Note: ItisjustcoincidencethatthePPNisthesamewidth
asthecacheTag
18
11/23/2016
10
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryRequestExample#4
v VirtualAddress:0x036B
v PhysicalAddress:
TLBITLBT
013
012
011
010
19
18
07
16
15
04
13
02
11
10
VPOVPN
11 10 9 8 7 6 5 4 3 2 1 0
PPOPPN
COCICT
VPN______ TLBT_____ TLBI_____ TLBHit?___ PageFault?___ PPN _____
CT______ CI_____ CO_____ CacheHit?___ Data(byte)_______
Note: ItisjustcoincidencethatthePPNisthesamewidth
asthecacheTag
19
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
VirtualMemorySummary
v Programmer’sviewofvirtualmemory§ Eachprocesshasitsownprivatelinearaddressspace§ Cannotbecorruptedbyotherprocesses
v Systemviewofvirtualmemory§ Usesmemoryefficientlybycachingvirtualmemorypages
• Efficientonlybecauseoflocality
§ Simplifiesmemorymanagementandsharing§ Simplifiesprotectionbyprovidingpermissionschecking
20
11/23/2016
11
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemorySystemSummaryv MemoryCaches(L1/L2/L3)
§ Purelyaspeed-uptechnique§ Behaviorinvisibletoapplicationprogrammerand(mostly)OS§ Implementedtotallyinhardware
v VirtualMemory§ SupportsmanyOS-relatedfunctions
• Processcreation,taskswitching,protection§ OperatingSystem(software)
• Allocates/sharesphysicalmemoryamongprocesses• Maintainshigh-leveltablestrackingmemorytype,source,sharing• Handlesexceptions,fillsinhardware-definedmappingtables
§ Hardware• Translatesvirtualaddressesviamappingtables,enforcingpermissions• Acceleratesmappingviatranslationcache(TLB)
21
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemorySystem– Whocontrolswhat?
v MemoryCaches(L1/L2/L3)§ Controlledbyhardware§ Programmercannotcontrolit§ Programmercan writecodetotakeadvantageofit
v VirtualMemory§ ControlledbyOSandhardware§ Programmercannotcontrolmappingtophysicalmemory§ Programmercancontrolsharingandsomeprotection
• viaOSfunctions(notinCSE351)
22
11/23/2016
12
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
QuickReviewv WhatdoPageTablesmap?
v WherearePageTableslocated?
v HowmanyPageTablesarethere?
v Canyourprogramtellifapagefaulthasoccurred?
v Whatisthrashing?
v True/False:VirtualAddressesthatarecontiguouswillalwaysbecontiguousinphysicalmemory
v TLBstandsfor_______________________andstores_______________
23
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
QuickReviewAnswersv WhatdoPageTablesmap?
§ VPN→ PPNordiskaddressv WherearePageTableslocated?
§ Inphysicalmemoryv HowmanyPageTablesarethere?
§ Oneperprocessv Canyourprogramtellifapagefaulthasoccurred?
§ Nope,butithastowaitalongtimev Whatisthrashing?
§ Constantlypagingoutandpaginginv True/False:VirtualAddressesthatarecontiguouswillalwaysbe
contiguousinphysicalmemory§ Couldfallacrossapageboundary
v TLBstandsforTranslationLookasideBuffer andstorespagetableentries
24
11/23/2016
13
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Review:AddressTranslation
v VMiscomplicated,butalsoelegantandeffective§ Levelofindirectiontoprovideisolatedmemory&caching§ TLBasacacheofpagetablesavoidstwotripstomemoryforeverymemoryaccess
VirtualAddress
TLBLookup
PageTable“Walk”
UpdateTLB
PageFault(OSloadspage)
ProtectionCheck
PhysicalAddress
TLBMiss TLBHit
PagenotinMem
AccessDenied
AccessPermitted
ProtectionFault
SIGSEGV
PageinMem
CheckcacheFindinDisk FindinMem25
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryOverview:Puttingitalltogether
26
Disk
Mainmemory(DRAM)
CacheCPU
Page
PageLine
Line
Word
v movl 0x8043ab, %rdi
TLB
MMU
11/23/2016
14
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ContextSwitchingRevisited
v WhatneedstohappenwhentheCPUswitchesprocesses?§ Registers:
• Savestateofoldprocess,loadstateofnewprocess• IncludingthePageTableBaseRegister(PTBR)
§ Memory:• Nothingtodo!Pagesforprocessesalreadyexistinmemory/diskandprotectedfromeachother
§ TLB:• Invalidate allentriesinTLB– mappingisforoldprocess’VAs
§ Cache:• CanleavealonebecausestoringbasedonPAs– goodforshareddata
27
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PageTableReality
v Justoneissue…thenumbersdon’tworkoutforthestorysofar!
v Theproblemisthepagetableforeachprocess:§ Suppose64-bitVAs,8KiBpages,8GiB physicalmemory§ Howmanypagetableentriesisthat?
§ AbouthowlongiseachPTE?
§ Moral: Cannotusethisnaïveimplementationofthevirtual→physical-page mapping– it’sway toobig
28
11/23/2016
15
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ASolution:Multi-levelPageTables
29
Pagetablebaseregister
(PTBR)
VPN10p-1n-1
VPOVPN2 ... VPNk
PPN
0p-1m-1PPOPPN
VirtualAddress
PhysicalAddress
... ...
Level1pagetable
Level2pagetable
Levelkpagetable
TLB
PTEVPN →
PTEVPN →
PTEVPN →
Thisiscalledapagewalk
Thisisextra(non-testable)
material
Whydoesthiswork?
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Multi-levelPageTablesv Atreeofdepth𝑘 whereeachnodeatdepth𝑖 hasupto2/
childrenifpart𝑖 oftheVPNhas𝑗 bitsv Hardwareformulti-levelpagetablesinherentlymore
complicated§ Butit’sanecessarycomplexity– 1-leveldoesnotfit
v Whyitworks:Mostsubtreesarenotusedatall,sotheyarenevercreatedanddefinitelyaren’tinphysicalmemory§ Partscreatedcanbeevictedfromcache/memorywhennotbeingused§ Eachnodecanhaveasizeof~1-100KB
v Butnowfora𝑘-levelpagetable,aTLBmissrequires𝑘 + 1cache/memoryaccesses§ FinesolongasTLBmissesarerare– motivateslargerTLBs
30
Thisisextra(non-testable)
material
11/23/2016
16
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PracticeVMQuestion
v Oursystemhasthefollowingproperties§ 1MiB ofphysicaladdressspace§ 4GiB ofvirtualaddressspace§ 32KiBpagesize§ 4-entryfullyassociativeTLBwithLRUreplacement
a) Fillinthefollowingblanks:
________ Entriesinpage table ________ Minimumbit-widthofPTBR
________ TLBTbits ________ Max#ofvalidentriesinapage table
31
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PracticeVMQuestion
v Oneprocessusesapage-alignedsquarematrixmat[] of32-bitintegersinthecodeshownbelow:
#define MAT_SIZE = 2048for(int i=0; i<MAT_SIZE; i++)
mat[i*(MAT_SIZE+1)] = i;
b) Whatisthelargeststride(inbytes)betweensuccessivememoryaccesses(intheVAspace)?
32
11/23/2016
17
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PracticeVMQuestion
v Oneprocessusesapage-alignedsquarematrixmat[] of32-bitintegersinthecodeshownbelow:
#define MAT_SIZE = 2048for(int i=0; i<MAT_SIZE; i++)
mat[i*(MAT_SIZE+1)] = i;
c) Whatarethefollowinghitratesforthefirstexecutionoftheforloop?
________ TLBHitRate ________ PageTableHitRate
33
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Roadmap
34
car *c = malloc(sizeof(car));c->miles = 100;c->gals = 17;float mpg = get_mpg(c);free(c);
Car c = new Car();c.setMiles(100);c.setGals(17);float mpg =
c.getMPG();
get_mpg:pushq %rbpmovq %rsp, %rbp...popq %rbpret
Java:C:
Assemblylanguage:
Machinecode:
01110100000110001000110100000100000000101000100111000010110000011111101000011111
Computersystem:
OS:
Memory&dataIntegers&floatsMachinecode&Cx86assemblyProcedures&stacksArrays&structsMemory&cachesProcessesVirtualmemoryMemoryallocationJavavs.C
11/23/2016
18
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MultipleWaystoStoreProgramDatav Staticglobaldata
§ Fixedsizeatcompile-time§ Entirelifetimeoftheprogram
(loadedfromexecutable)§ Portionisread-only
(e.g.stringliterals)
v Stack-allocateddata§ Local/temporaryvariables
• Can bedynamicallysized(insomeversionsofC)
§ Knownlifetime(deallocatedonreturn)
v Dynamic(heap)data§ Sizeknownonlyatruntime(i.e.basedonuser-input)§ Lifetimeknownonlyatruntime(long-liveddatastructures)
int array[1024];
void foo(int n) {int tmp;int local_array[n];
int* dyn = (int*)malloc(n*sizeof(int));
}
35
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryAllocation
v Dynamicmemoryallocation§ Introductionandgoals§ Allocationanddeallocation(free)§ Fragmentation
v Explicitallocationimplementation§ Implicitfreelists§ Explicitfreelists(Lab5)§ Segregatedfreelists
v Implicitdeallocation:garbagecollectionv Commonmemory-relatedbugsinC
36
11/23/2016
19
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
DynamicMemoryAllocation
v Programmersusedynamicmemoryallocatorstoacquirevirtualmemoryatruntime§ Fordatastructureswhosesize(orlifetime)isknownonlyatruntime
§ Managetheheapofaprocess’virtualmemory:
v Typesofallocators§ Explicit allocator:programmerallocatesandfreesspace
• Example:malloc andfree inC
§ Implicit allocator: programmeronlyallocatesspace(nofree)• Example:garbagecollectioninJava,Caml,andLisp
37
Programtext(.text)Initializeddata(.data)
Userstack
0
Heap(viamalloc)
Uninitializeddata(.bss)
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
DynamicMemoryAllocation
v Allocatororganizesheapasacollectionofvariable-sizedblocks,whichareeitherallocated orfree§ Allocatorrequestspagesintheheapregion;virtualmemoryhardwareandOSkernelallocatethesepagestotheprocess
§ Applicationobjectsaretypicallysmallerthanpages,sotheallocatormanagesblockswithin pages• (Largerobjectshandledtoo;ignoredhere)
38
Topofheap(brk ptr)
Programtext(.text)Initializeddata(.data)
Userstack
0
Heap(viamalloc)
Uninitializeddata(.bss)
11/23/2016
20
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AllocatingMemoryinCv Needto#include <stdlib.h>v void* malloc(size_t size)
§ Allocatesacontinuousblockofsize bytesofuninitializedmemory§ Returnsapointertothebeginningoftheallocatedblock;NULLindicates
failedrequest• Typicallyalignedtoan8-byte(x86)or16-byte(x86-64)boundary• ReturnsNULL ifallocationfailed(alsosetserrno)orsize==0
§ Differentblocksnotnecessarilyadjacent
v Goodpractices:§ ptr = (int*) malloc(n*sizeof(int));
• sizeofmakescodemoreportable• void* isimplicitlycastintoanypointertype;explicittypecastwillhelpyoucatchcodingerrorswhenpointertypesdon’tmatch
39
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AllocatingMemoryinCv Needto#include <stdlib.h>v void* malloc(size_t size)
§ Allocatesacontinuousblockofsize bytesofuninitializedmemory§ Returnsapointertothebeginningoftheallocatedblock;NULLindicates
failedrequest• Typicallyalignedtoan8-byte(x86)or16-byte(x86-64)boundary• ReturnsNULL ifallocationfailed(alsosetserrno)orsize==0
§ Differentblocksnotnecessarilyadjacent
v Relatedfunctions:§ void* calloc(size_t nitems, size_t size)
• “Zerosout”allocatedblock§ void* realloc(void* ptr, size_t size)
• Changesthesizeofapreviouslyallocatedblock(ifpossible)§ void* sbrk(intptr_t increment)
• Usedinternallybyallocatorstogroworshrinktheheap40
11/23/2016
21
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
FreeingMemoryinCv Needto#include <stdlib.h>v void free(void* p)
§ Releaseswholeblockpointedtobyp tothepoolofavailablememory§ Pointerpmustbetheaddressoriginally returnedbym/c/realloc
(i.e.beginningoftheblock),otherwisethrowssystemexception§ Don’tcallfree onablockthathasalreadybeenreleasedoronNULL
41
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
MemoryAllocationExampleinC
42
void foo(int n, int m) {int i, *p;p = (int*) malloc(n*sizeof(int)); /* allocateblockofnints */if (p == NULL) { /* checkforallocationerror */
perror("malloc");exit(0);
}for (i=0; i<n; i++) /* initializeint array */
p[i] = i;/* addspaceformintstoendofpblock */
p = (int*) realloc(p,(n+m)*sizeof(int));if (p == NULL) { /* checkforallocationerror */
perror("realloc");exit(0);
}for (i=n; i < n+m; i++) /* initializenewspaces */
p[i] = i;for (i=0; i<n+m; i++) /* printnewarray */
printf("%d\n", p[i]);free(p); /* freep*/
}
11/23/2016
22
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
NotationNode(theseslides,book,videos)
v Memoryisdrawndividedintowords§ Eachword canholdanint (32bits/4bytes)§ Allocationswillbeinsizesthatareamultipleofwords,i.e.multiplesof4bytes
§ Inpicturesinslides,book,videos:
43
Allocatedblock(4words)
Freeblock(3words) Freeword
Allocatedword
=oneword,4bytes
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
AllocationExample
44
p1 = malloc(16)
p2 = malloc(20)
p3 = malloc(24)
free(p2)
p4 = malloc(8)
=4-byteword
11/23/2016
23
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Constraints (interface/contract)v Applications
§ Canissuearbitrarysequenceofmalloc andfree requests§ Mustneveraccessmemorynotcurrentlyallocated§ Mustneverfreememorynotcurrentlyallocated
• Alsomustonlyusefree withpreviouslymalloc’ed blocks(not,e.g.,stackdata)
v Allocators§ Can’tcontrolnumberorsizeofallocatedblocks§ Mustrespondimmediatelytomalloc (i.e.can’treorderorbuffer)§ Mustallocateblocksfromfreememory(i.e.blockscan’toverlap–Whynot?)
§ Mustalignblockssotheysatisfyallalignmentrequirements§ Can’tmovetheallocatedblocks(i.e.compaction/defragmentationisnot
allowed–Whynot?)
45
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PerformanceGoals
v Goals: Givensomesequenceofmalloc andfreerequests𝑅4, 𝑅6, … , 𝑅8, … , 𝑅9:6,maximizethroughputandpeakmemoryutilization§ Thesegoalsareoftenconflicting
1) Throughput§ Numberofcompletedrequestsperunittime§ Example:
• If5,000malloc callsand5,000free callscompletedin10seconds,thenthroughputis1,000operations/second
46
11/23/2016
24
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
PerformanceGoals
v Definition: Aggregatepayload𝑃8§ malloc(p) resultsinablockwithapayload ofp bytes§ Afterrequest𝑅8 hascompleted,theaggregatepayload𝑃8isthesumofcurrentlyallocatedpayloads
v Definition: Currentheapsize𝐻8§ Assume𝐻8 ismonotonicallynon-decreasing
• Allocatorcanincreasesizeofheapusingsbrk
2) PeakMemoryUtilization§ Definedas𝑈8 = (max
BC8𝑃B)/𝐻8 after𝑘+1requests
§ Goal:maximizeutilizationforasequenceofrequests§ Whyisthishard?Andwhathappenstothroughput?
47
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Fragmentation
v Poormemoryutilizationiscausedbyfragmentation§ Sectionsofmemoryarenotusedtostoreanythinguseful,butcannotsatisfyallocationrequests
§ Twotypes:internal andexternal
v Recall: Fragmentationinstructs§ Internalfragmentationwaswastedspaceinside ofthestruct(betweenfields)duetoalignment
§ Externalfragmentationwaswastedspacebetween structinstances(e.g.inanarray)duetoalignment
v Nowreferringtowastedspaceintheheapinside orbetween allocatedblocks
48
11/23/2016
25
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
InternalFragmentation
v Foragivenblock,internalfragmentation occursifpayloadissmallerthantheblock
v Causes:§ Paddingforalignmentpurposes§ Overheadofmaintainingheapdatastructures(insideblock,outsidepayload)
§ Explicitpolicydecisions(e.g.,toreturnabigblocktosatisfyasmallrequest)
v Easytomeasurebecauseonlydependsonpastrequests
payload Internalfragmentation
block
Internalfragmentation
49
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ExternalFragmentationv Fortheheap,externalfragmentation occurswhen
allocation/freepatternleaves“holes”betweenblocks§ Thatis,theaggregatepayloadisnon-continuous§ Cancausesituationswherethereisenoughaggregateheapmemoryto
satisfyrequest,butnosinglefreeblockislargeenough
v Don’tknowwhatfuturerequestswillbe§ Difficulttoimpossibletoknowifpastplacementswillbecome
problematic50
p1 = malloc(16)
p2 = malloc(20)
p3 = malloc(24)
free(p2)
p4 = malloc(24) Ohno!(Whatwouldhappennow?)
=4-byteword
11/23/2016
26
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ImplementationIssues
v Howdoweknowhowmuchmemorytofreegivenjustapointer?
v Howdowekeeptrackofthefreeblocks?v Howdowepickablocktouseforallocation(whenmanymightfit)?
v Whatdowedowiththeextraspacewhenallocatingastructurethatissmallerthanthefreeblockitisplacedin?
v Howdowereinsertafreedblockintotheheap?
51
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
KnowingHowMuchtoFree
v Standardmethod§ Keepthelengthofablockinthewordprecedingtheblock
• Thiswordisoftencalledtheheaderfield or header
§ Requiresanextrawordforeveryallocatedblock
52
free(p0)
p0 = malloc(16)
p0
blocksize data
20
=4-byteword(free)
=4-byteword(allocated)
11/23/2016
27
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ForFun:DRAMMERSecurityAttackv Whyarewetalkingaboutthis?
§ Current: AnnouncedinOctober2016;GooglereleasedAndroidpatchonNovember8
§ Relevant: Usesyoursystem’smemorysetuptogainelevatedprivileges• Tiestogethersomeofwhatwe’velearnedaboutvirtualmemoryandprocesses
§ Interesting: It’sasoftwareattackthatusesonlyhardwarevulnerabilities andrequiresnouserpermissions
53
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
UnderlyingVulnerability:RowHammer
v DynamicRAM(DRAM)hasgottendenserovertime§ DRAMcellsphysicallycloserandusesmallercharges
§ Moresusceptibleto“disturbanceerrors”(interference)
v DRAMcapacitorsneedtobe“refreshed”periodically(~64ms)§ Losedatawhenlossofpower§ Capacitorsaccessedinrows
v Rapidaccessestoonerowcanflipbitsinanadjacentrow!§ ~100Kto1Mtimes 54
ByDsimic (modified),CCBY-SA4.0,https://commons.wikimedia.org/w
/index.php?curid=38868341
11/23/2016
28
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
RowHammerExploit
v Forcememoryaccessbyconstantlyreadingandthenflushingthecache§ clflush – flushcacheline
• Invalidatescachelinecontainingthespecifiedaddress
• Notavailableinallmachinesorenvironments
§ WantaddressesX andY tofallinactivationtargetrow(s)• Goodtounderstandhowbanks ofDRAMcellsarelaidout
v Therowhammereffectwasdiscoveredin2014§ OnlyworksoncertaintypesofDRAM(2010onwards)§ Thesetechniquestargetx86machines
55
hammertime: mov (X), %eaxmov (Y), %ebxclflush (X) clflush (Y) jmp hammertime
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
ConsequencesofRowHammer
v Rowhammeringprocesscanaffectanotherprocessviamemory§ Circumventsvirtualmemoryprotectionscheme§ MemoryneedstobeinanadjacentrowofDRAM
v Worse:privilegeescalation§ Pagetablesliveinmemory!§ HopetochangePPNtoaccessotherpartsofmemory,orchangepermissionbits
§ Goal: gainread/writeaccesstoapagecontainingapagetable,hencegrantingprocessread/writeaccesstoallofphysicalmemory
56
11/23/2016
29
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Effectiveness?
v Doesn’tseemsobad– randombitflipinarowofphysicalmemory§ Vulnerabilityaffectedbysystemsetupandphysicalconditionofmemorycells
v Improvements:§ Double-sidedrowhammeringincreasesspeed&chance§ Dosystemidentificationfirst(e.g.Lab4)
• Usetimingtoinfermemoryrowlayout&find“bad”rows• Allocateahugechunkofmemoryandtrymanyaddresses,lookingforareliable/repeatablebitflip
§ Fillupmemorywithpagetablesfirst• Fork extraprocesses;hopetoelevateprivilegesinanypagetable
57
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
What’sDRAMMER?
v Noonepreviouslymadeahugefuss§ Prevention: error-correctingcodes,targetrowrefresh,higherDRAMrefreshrates
§ Oftenreliedonspecialmemorymanagementfeatures§ Oftencrashedsysteminsteadofgainingcontrol
v Researchgroupfoundadeterministicwaytoinducerowhammerexploitinanon-x86system(ARM)§ Reliesonpredictablereusepatternsofstandardphysicalmemoryallocators
§ Universiteit Amsterdam,GrazUniversityofTechnology,andUniversityofCalifornia,SantaBarbara
58
11/23/2016
30
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
DRAMMERDemoVideov It’sashell,sonotthatsexy-looking,butstillinteresting
§ Apologiesthatthetextissosmallonthevideo
59
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Howdidwegethere?
v Computingindustrydemandsmoreandfasterstoragewithlowerpowerconsumption
v Abilityofusertocircumventthecachingsystem§ clflush isanunprivilegedinstructioninx86§ Othercommandsexistthatskipthecache
v Availabilityofvirtualtophysicaladdressmapping§ Example: /proc/self/pagemap onLinux(nothuman-readable)
v GooglepatchforAndroid(Nov.8,2016)§ PatchedtheIONmemoryallocator
60
11/23/2016
31
CSE369, Autumn 2016L01: Intro, Combinational Logic CSE351, Winter 2017L22: Virtual Memory III
Morereadingforthoseinterested
v DRAMMERpaper:https://vvdveen.com/publications/drammer.pdf
v GoogleProjectZero:https://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
v Firstrowhammerpaper:https://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf
v Wikipedia:https://en.wikipedia.org/wiki/Row_hammer
61