CA Unit4 Notes Without Diagrams

Upload: shivalkar-j

Post on 05-Apr-2018


  • 7/31/2019 CA Unit4 Notes Without Diagrams


    Unit IV - Memory System

    5.1) Basics:

    Memory is designed to store and retrieve data/information.

    The maximum memory size is determined by the addressing scheme used.

    Ex: a 16-bit address gives 2^16 = 64K memory locations.

    Modern computers are byte addressable.
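    The addressing arithmetic above is easy to check with a couple of lines of Python (an illustrative sketch, not part of the original notes):

```python
# n address bits -> 2**n addressable locations
def addressable_locations(address_bits: int) -> int:
    return 2 ** address_bits

print(addressable_locations(16))  # 65536 locations = 64K, as in the example
```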

    Connection of Mem to the Processor:

    The processor contains two registers, MAR (Memory Address Register) and MDR (Memory Data Register).

    To read data from the memory, MAR supplies the address of the location in the memory and the data are read from that location (R/W = 1).

    To write data into the memory, MAR supplies the address and MDR supplies the data bits to be stored at that address (R/W = 0).

    The MFC (Memory Function Completed) signal indicates whether the memory operation has completed.

    Operations involving consecutive address locations are Block Transfers.

    Basic Measures of Memory:

    Memory Access Time: the time between the initiation and the completion of a memory operation.

    Memory Cycle Time: the minimum time delay between the initiations of two successive memory operations.

    Types Of Memory:

    RAM (Random Access Memory / Read-Write Memory)

    o Static RAM, CMOS Chip

    o Dynamic RAM

    Asynchronous DRAM

    Synchronous DRAM

    o Rambus Memory

    ROM (Read Only Memory)

    o PROM

    o EPROM

    Figure 5.1. Connection of the memory to the processor.


    o EEPROM

    o Flash Memory

    Cache Memory

    Virtual Memory

    Secondary Storage

    o Magnetic Hard Disks

    o Optical Disks

    o Magnetic Tape Systems

    5.2) Semiconductor RAM Memories

    If any location in a memory can be accessed for read/write in a fixed amount of time that is independent of the location's address, then the memory is a Random Access Memory.

    5.2.1 Internal Organization of Memory Chips:

    Memory cells are organized in the form of arrays.

    Each cell is capable of storing one bit of information.

    Each row of cells constitutes a memory word.

    The cells of a row are connected to a common line referred to as the Word Line.

    The word line is driven by the address decoder.

    The cells in each column are connected to a Sense/Write circuit by two bit lines.

    In a Read operation, this circuit reads the information from the cells of the selected word line and transmits it to the output lines.

    In a Write operation, these circuits receive the input information and store it in the cells of the selected word line.

    The control line CS (Chip Select) selects one chip among multiple chips.

    The chip of Figure 5.2 stores 128 bits.

    It needs 14 external connections for the address, data and control lines,

    plus two lines for power supply and ground.

    Figure 5.2. Organization of bit cells in a memory chip.

    Figure 5.3. Organization of a 1K 1 memory chip.


    A slightly larger memory circuit has 1K (1024) memory cells.

    It can be organized as a 128 x 8 memory, or as the 1K x 1 memory of Figure 5.3, which requires only 15 external connections.

    A 10-bit address is needed and is divided into a 5-bit row address and a 5-bit column address.

    Only one data line is needed.

    Each row contains 32 cells.

    The input is demultiplexed and the output is multiplexed from 32 bits to 1 bit.

    5.2.2 Static Memories:

    These are memory circuits that are capable of retaining their state as long as power is applied.

    Two inverters are cross-connected to form a latch.

    The latch is connected to two bit lines (b and b') by transistors T1 and T2.

    The transistors act as switches under the control of the word line.

    Read operation: to read the state of the SRAM cell, the word line turns T1 and T2 on. If the cell is in state 1, the signal on bit line b is high and the signal on b' is low.

    Write operation: to set the state of the SRAM cell, place the appropriate values on b and b' and activate the word line. This forces the cell into the corresponding state.

    CMOS Cell:

    Complementary Metal Oxide Semiconductor chip.

    The transistor pairs (T3, T5) and (T4, T6) form the inverters in the latch.

    In state 1, point X has a high voltage, which turns T3 and T6 on and T4 and T5 off, so the voltage at bit line b is high.

    A continuous power supply is needed; an interruption of the power supply loses the cell contents.

    Figure 5.4. A Static RAM Cell.

    Figure 5.5. An example of a CMOS memory cell.


    Even when power is restored, the cell state may not be the same; hence this is Volatile Memory.

    Advantages:

    o Low power consumption

    o Dissipates less heat

    Disadvantage:

    o Needs a continuous power supply

    5.2.3 Asynchronous DRAMs:

    Static RAMs are fast, but they need more space and more transistors, and are expensive.

    DRAMs are cheaper and denser, but they cannot retain their state indefinitely and need to be refreshed periodically.

    DRAMs store information in the form of charge on capacitors.

    The charge is maintained for only tens of milliseconds.

    Hence the capacitor charge must be restored to its full value by refreshing it periodically.

    Read operation: a sense amplifier connected to the bit line detects whether the charge stored on the capacitor is above a threshold value.

    o Above the threshold value -> logic 1 on the bit line

    o Below the threshold value -> logic 0 on the bit line

    Reading the contents of a cell automatically refreshes all the cells of the selected row.

    Hence refreshing is done at the row level.

    To store information in the cell, transistor T is turned on and an appropriate voltage is applied to the bit line, which allows capacitor C to charge to its full value.

    A Dynamic Memory Chip:

    A 16-megabit DRAM chip configured as 2M x 8.

    The cells are organized in the form of a 4K x 4K array.

    The 4096 cells in each row are divided into 512 groups of 8.

    Figure 5.6. A single-transistor dynamic memory cell.

    Figure 5.7. Internal organization of a 2M * 8 dynamic memory chip.


    Each row can store 512 bytes of data.

    12 address bits are needed to select a row, and 9 address bits to specify a group of 8 bits in the selected row.
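    The 12-bit/9-bit address split described above can be sketched in Python (the convention that the row bits are the high-order bits matches the RAS-before-CAS sequence described below; the function name is illustrative):

```python
ROW_BITS, COL_BITS = 12, 9   # 4096 rows; 512 groups of 8 bits per row

def split_dram_address(addr: int):
    row = addr >> COL_BITS              # high-order 12 bits: row address
    col = addr & ((1 << COL_BITS) - 1)  # low-order 9 bits: byte group in the row
    return row, col

print(split_dram_address(0x1FFFFF))  # (4095, 511): the last byte of the 2M chip
```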

    R/W operation:

    o The row address is applied first.

    o On the RAS (Row Address Strobe) signal, the address is loaded into the row address latch.

    o The column address is loaded and decoded on the CAS (Column Address Strobe) signal.

    o Read operation: the sense circuits read the data from the selected cells and transmit them to the data lines D7-D0.

    o Write operation: the sense circuits take the input from D7-D0 and load it into the selected cells.

    RAS and CAS are active-low and provide asynchronous control signals.

    A DRAM chip is organized to read/write a number of bits in parallel.

    Chips are available in sizes ranging from 1M to 256M bits.

    Advantages: high density and low cost.

    Fast Page Mode:

    When the DRAM in Fig 5.7 is accessed, the contents of all 4096 cells in the selected row are sensed, but only 8 bits are placed on the data lines D7-D0.

    Fast Page Mode makes it possible to access the other bytes in the same row without having to reselect the row.

    A latch is added at the output of the sense amplifier in each column.

    This mode is well suited to bulk transfers.

    5.2.4 Synchronous DRAMs:

    As technology developed, DRAM operations came to be controlled directly by a clock signal; such devices are called Synchronous DRAMs.

    The cell array is the same as in an asynchronous DRAM.

    The address and data connections are buffered by means of registers.

    The output of each sense amplifier is connected to a latch.

    Fig 5.8 Synchronous DRAM


    A Read operation loads the contents of all cells of the selected row into the latches, and the data in the latches are transferred to the data output register.

    A Refresh operation maintains the latch values while it refreshes the cell contents.

    SDRAMs have different modes of operation, which can be selected by writing control information into a Mode register.

    There is no need for an external column signal such as CAS; instead, an internal column counter and the clock signal provide this control.

    All actions are triggered by the rising edge of the clock.

    The row address is latched under control of the RAS signal.

    The memory typically takes 2 or 3 clock cycles to activate the selected row.

    Then the column address is latched by the CAS signal.

    After one clock cycle, the data are placed on the data lines.

    The SDRAM automatically increments the column address to select the next data values.

    A refresh counter provides the addresses of the rows to be selected for refreshing.

    The refresh circuits refresh every row within 64 ms. Clock frequencies are above 100 MHz.

    Latency & Bandwidth:

    Transfers between the memory and the processor involve single words of data or small blocks of words.

    The speed and efficiency of these transfers have a significant impact on the performance of a computer.

    Performance is characterized by 2 parameters:

    o Latency:

      For a word transfer - the amount of time it takes to transfer a word of data to or from the memory.

      For a block transfer - the amount of time it takes to transfer the first word of data.

    o Bandwidth:

      The number of bits or bytes that can be transferred in one second.

      It is used to measure how much time is needed to transfer an entire block of data.

    Figure 5.9. Burst read of length 4 in an SDRAM.


    Bandwidth is determined by the speed of the memory, the transfer capability of the links between the memory and the processor, and the speed of the bus.

    Double Data Rate SDRAM:

    A standard SDRAM performs all actions on the rising edge of the clock signal.

    A DDR SDRAM accesses the cell array in the same way, but transfers data on both edges of the clock.

    The latency is the same as for standard SDRAMs, but the bandwidth is doubled for long burst transfers.

    The cell array is organized in two banks, and each can be accessed separately.

    DDR SDRAMs and standard SDRAMs are used most efficiently in block transfer applications.

    5.2.5 Structure of larger Memories:

    Connecting memory chips to form a larger memory.

    Static Memory Systems:

    Consider a memory of 2M words of 32 bits each.

    This can be implemented using 512K x 8 static memory chips.

    CS allows selecting a chip; when it is logic 1, the selected chip can read/write data on the data lines.

    A 19-bit address is used to access a specific byte location inside each chip of the selected row.

    The 2 higher-order address bits select which of the 4 CS control signals should be activated.

    Dynamic Memory Systems:

    PCs use at least 32M bytes of memory.

    Workstations use 128M bytes of memory.

    A larger memory leads to better performance but occupies more space on the motherboard.

    This led to the development of larger memory units known as SIMMs (Single In-line Memory Modules) and DIMMs (Dual In-line Memory Modules).


    Figure 5.10 Organization of a 2M *32 memory module using 512K*8 static memory chips.


    5.2.6 Memory System Considerations:

    The choice of a RAM chip for a given application depends on several factors: cost, speed, power, size, etc.

    SRAMs are faster, more expensive and larger, and are used in cache memories.

    DRAMs are slower, cheaper and smaller, and are used in main memory.

    Memory Controller:

    The same address pins are used for the row and the column address.

    The RAS and CAS signals indicate whether the address pins currently carry a row or a column address.

    The R/W signal specifies a read or a write operation.

    The memory controller acts as the address multiplexer.

    DRAM chips do not have self-refreshing capability.

    Refresh Overhead:

    All DRAMs have to be refreshed, every 64 ms.

    5.2.7 Rambus Memory:

    Rambus technology is a fast signaling method used to transfer information between chips.

    Instead of signals that swing between 0 and Vsupply, it uses a reference voltage Vref of about 2 V.

    The two logic values are represented by 0.3 V swings above and below Vref.

    This type of signaling is differential signaling, and it provides short transition times.

    The communication links used for this signaling are called the Rambus Channel.

    Communication between the master (for example, the processor) and the slaves (the RDRAM modules) is carried out by means of packets transmitted on the data lines.

    3 types of packets:

    o Request: issued by the master to indicate the type of operation; it contains the address of the desired memory location and an 8-bit count that specifies the number of bytes involved in the transfer.

    o Acknowledge: the slave responds by returning a positive or negative acknowledgement packet.

    Fig 5.11 Use of a memory controller.


    o Data packet

    RDRAM chips can be assembled into larger modules, analogous to SIMMs and DIMMs, called RIMMs.

    5.3 ROM (Read Only Memory)

    A memory that supports only the operation of reading stored data is a Read Only Memory.

    5.3.1 ROM Cell:

    Data are written into a ROM when it is manufactured.

    When the transistor is connected to ground at point P, the cell stores logic 0; otherwise it stores logic 1.

    To read the state of the cell, the word line is activated, T turns on, and the sense circuit reads the cell's state from the bit line.

    5.3.2 PROM (Programmable ROM):

    Some ROM designs allow the data to be loaded by the user; such a chip is called a Programmable ROM.

    Initially, the memory contains all 0s. To insert a logic 1 at a location, the fuse at point P is burned with a high-current pulse.

    5.3.3. EPROM(Erasable Programmable ROM):

    A ROM chip that allows the stored data to be erased and new data to be loaded is an EPROM.

    EPROMs are capable of retaining stored information for a long time.

    Erasure is done by exposing the chip to ultraviolet light.

    5.3.4 EEPROM (Electrically Erasable Programmable ROM):

    To erase its contents, an EPROM has to be physically removed from the circuit and erased by UV light.

    An EEPROM allows the chip contents to be erased without removing the chip physically.

    An EEPROM needs different voltages for erasing, writing and reading.

    5.3.5 Flash Memory:

    In an EEPROM it is possible to read and write the contents of every single cell.

    In a Flash device, it is possible to read the contents of a single cell, but writes can only be done to an entire block of cells.

    Flash offers higher density, lower cost per bit and lower power consumption.

    Fig.5.12 A ROM cell


    Flash is used in portable, battery-powered equipment.

    Larger modules can be implemented in two ways:

    o Flash Cards

    o Flash Drives

    5.4 Speed, Size and Cost:

    SRAMs are faster, more expensive and larger, and are used in cache memories.

    DRAMs are slower, cheaper and smaller, and are used in main memory.

    5.5 Cache Memories:

    Since the main memory is slow, a cache memory is used; it effectively makes the contents of main memory available to the processor in a shorter time.

    The effectiveness of the cache mechanism is based on a property of programs called locality of reference.

    Locality of reference - many instructions in localized areas of the program are executed repeatedly during some period of time, while the remainder of the program is accessed relatively infrequently.

    It takes 2 forms:

    o Temporal - a recently executed instruction is likely to be executed again very soon.

    o Spatial - instructions close to a recently executed instruction are likely to be executed soon.

    Data are cached a block at a time; a cache block is also referred to as a cache line.

    When a read request comes from the processor, the cache supplies the data if the contents of the given address are present; otherwise the data are read from main memory.

    The cache holds a number of blocks, but far fewer than the number of blocks in main memory.

    The correspondence between main memory blocks and cache blocks is specified by a mapping function.

    Fig.5.13 Memory hierarchy

    Figure 5.14. Use of a cache memory.


    When the cache is full and a block from main memory has to be read in, the control hardware has to decide which block should be removed to create space for the new block - this decision is made by the replacement algorithm.

    When a R/W operation can be carried out in the cache, a read/write hit is said to have occurred.

    A read hit does not involve main memory.

    A write hit is handled by one of 2 techniques:

    o Write-through protocol: the cache location and the main memory location are updated simultaneously.

    o Write-back or copy-back protocol: update only the cache location and mark it as updated with an associated flag bit (dirty or modified bit). Main memory is updated later.

    Read miss - occurs when the addressed word in a read operation is not in the cache.

    Write miss - occurs when the addressed word in a write operation is not in the cache.

    o If the write-through protocol is used, the data are written directly into main memory.

    o If the write-back protocol is used, the block is first brought into the cache and then updated.
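    The two write-hit policies can be contrasted with a toy sketch (this models the protocols only, not any specific cache hardware; all names are illustrative):

```python
class Cache:
    def __init__(self, write_through: bool):
        self.write_through = write_through
        self.data = {}      # cached copies: address -> value
        self.dirty = set()  # addresses modified under write-back

    def write(self, memory: dict, addr: int, value: int):
        self.data[addr] = value
        if self.write_through:
            memory[addr] = value   # update main memory at the same time
        else:
            self.dirty.add(addr)   # defer; mark the block as modified

    def flush(self, memory: dict):
        for addr in self.dirty:    # write-back: copy dirty data later
            memory[addr] = self.data[addr]
        self.dirty.clear()

mem = {0: 7}
wb = Cache(write_through=False)
wb.write(mem, 0, 42)
print(mem[0])   # still 7: main memory is updated only on flush
wb.flush(mem)
print(mem[0])   # 42
```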

    5.5.1 Mapping Functions:

    A mapping function is the method used to determine the cache location in which to store a memory block.

    3 types: Direct, Associative and Set-Associative mapping.

    As a running example, consider a cache of 128 blocks of 16 words each,

    and a main memory of 4K blocks of 16 words each.

    Direct Mapping:

    Block j of main memory maps onto block (j modulo 128) of the cache.

    Thus blocks 0, 128, 256, ... are all loaded into block 0 of the cache.

    The replacement algorithm is trivial, but contention can arise: for example, if a program uses both block 1 and block 129, each has to be loaded into cache block 1, displacing the other.

    The low-order 4 bits of the main memory address select one of the 16 words in a block.

    The 7-bit block field determines the cache position in which the block must be stored.

    The high-order 5 bits are compared with the tag bits associated with that cache location; if they match, the desired word is in the cache, otherwise it is read from main memory.

    Figure 5.15. Direct-mapped cache.


    Ex:

    Tag: 11101

    Block: 1111111 = 127, i.e. the 127th block of the cache

    Word: 1100 = 12, i.e. the 12th word of the 127th block in the cache
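    The field extraction for this direct-mapped example can be sketched in Python (the 5/7/4 bit widths come from the cache described above; the function name is hypothetical):

```python
# 16-bit main memory address: 5-bit tag | 7-bit block | 4-bit word
def direct_mapped_fields(addr: int):
    word  = addr & 0xF          # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F  # next 7 bits: cache block position (j mod 128)
    tag   = addr >> 11          # high 5 bits: tag
    return tag, block, word

addr = 0b1110111111111100      # the example address above
print(direct_mapped_fields(addr))  # (29, 127, 12): tag 11101, block 127, word 12
```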

    Associative Mapping:

    A main memory block can be placed into any cache block position.

    12 tag bits are required to identify a memory block.

    The tag bits of an address from the processor are compared with the tags of all cache blocks to locate the block; this is called associative mapping.

    12: the 12 tag bits identify which of the 4096 memory blocks is resident in a cache block.

    4: the 4 word bits select one of 16 words (each block has 16 words).

    Ex:

    Tag: 111011111111

    Word: 1100 = 12, i.e. the 12th word of a block in the cache

    Set-Associative Mapping:

    A combination of the direct and associative mapping techniques.

    The blocks of the cache are grouped into sets.

    Ex: with 2 blocks per set, memory block 0 may occupy either cache block 0 or cache block 1.

    One more control bit, called the Valid bit, is provided for each block to indicate whether the block contains valid data; Valid bit = 0 means it does not.

    Direct-mapped main memory address format: Tag (5 bits) | Block (7 bits) | Word (4 bits), e.g. 11101 | 1111111 | 1100.

    Associative-mapped main memory address format: Tag (12 bits) | Word (4 bits), e.g. 111011111111 | 1100.

    Figure 5.16. Associative-mapped cache.


    Transfers from the disk to the main memory are carried out by a DMA (Direct Memory Access) mechanism.

    When a cache block is loaded for the first time from main memory, its valid bit is set to 1.

    A transfer from main memory to disk uses the write-back protocol.

    A flush mechanism forces the dirty data from the cache to main memory before the DMA transfer takes place.

    The need to ensure that two different entities (the processor and DMA) use the same copies of data is referred to as Cache Coherence.

    Fields of the main memory address under set-associative mapping:

    4: the word bits select one of 16 words.

    6: the set bits determine which set of the cache might contain the desired block (128/2 = 64 sets).

    6: the tag bits are used to check whether the desired block is present in that set.

    Tag: 111011

    Set: 111111 = 63, i.e. the 63rd set of the cache

    Word: 1100 = 12, i.e. the 12th word of a block in the 63rd set of the cache.
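    The same address, split with the 6/6/4 set-associative format above (an illustrative sketch; the function name is hypothetical):

```python
# 16-bit main memory address: 6-bit tag | 6-bit set | 4-bit word
def set_assoc_fields(addr: int):
    word = addr & 0xF          # low 4 bits: word within the block
    s    = (addr >> 4) & 0x3F  # next 6 bits: one of 64 sets of 2 blocks
    tag  = addr >> 10          # high 6 bits: tag, compared within the set
    return tag, s, word

addr = 0b1110111111111100     # the example address above
print(set_assoc_fields(addr))  # (59, 63, 12): tag 111011, set 63, word 12
```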

    5.5.2Replacement Algorithms:

    In a direct-mapped cache, no replacement strategy is needed.

    In associative and set-associative mapping, when a new block is to be brought into the cache and all the positions that it may occupy are full, the cache controller must decide which old block to overwrite.

    Programs usually stay in localized areas for reasonable periods of time, so recently used blocks are likely to be referenced again.

    Hence the least recently used (LRU) block is replaced by the new block.

    Usage is tracked with counters that are incremented or cleared as hits and misses occur.

    The performance of LRU can be improved by introducing a small amount of randomness in deciding which block to replace.
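    A minimal LRU replacement sketch for one group of blocks that a new block may occupy (the capacity and the access trace are illustrative; real hardware uses per-block counters rather than an ordered dictionary):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> data, oldest first

    def access(self, block: int) -> str:
        if block in self.blocks:              # hit: mark as most recently used
            self.blocks.move_to_end(block)
            return "hit"
        if len(self.blocks) >= self.capacity: # full: evict the LRU block
            self.blocks.popitem(last=False)
        self.blocks[block] = None
        return "miss"

c = LRUCache(2)
print([c.access(b) for b in [1, 2, 1, 3, 2]])
# 1: miss, 2: miss, 1: hit, 3: miss (evicts 2), 2: miss (evicts 1)
```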

    Set-associative main memory address format: Tag (6 bits) | Set (6 bits) | Word (4 bits), e.g. 111011 | 111111 | 1100.

    Figure 5.17. Set-associative-mapped cache with two blocks per set.


    5.6 Performance Considerations:

    Two key factors: performance and cost; the best measure is the price/performance ratio.

    Performance depends on how fast machine instructions can be brought into the processor for execution and how fast they can be executed.

    In the memory hierarchy, it is beneficial if transfers to and from a slower unit can be done at a rate equal to that of the faster unit.

    This is not possible if both the slow and the fast units are accessed in the same manner.

    However, it can be achieved when parallelism is used in the organization of the slower unit.

    An effective way to introduce parallelism is to use an interleaved organization.

    5.6.1 Interleaving:

    If the main memory is structured as a collection of physically separate modules, each with its own ABR (Address Buffer Register) and DBR (Data Buffer Register), memory access operations may proceed in more than one module at the same time.

    First case (consecutive words in one module):

    The high-order k bits specify one of the n modules.

    The low-order m bits name a particular word in that module.

    Second case (consecutive words in consecutive modules):

    This is called memory interleaving.

    The low-order k bits select a module.

    The high-order m bits name a location within that module.

    To implement the interleaved structure, there must be 2^k modules.
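    The two address layouts can be contrasted with a small sketch (4 modules are assumed for illustration; the function names are hypothetical):

```python
K = 2  # 2**K = 4 modules

def interleaved(addr: int):
    # low-order k bits pick the module, so consecutive addresses
    # land in different modules and can be accessed in parallel
    return addr & (2**K - 1), addr >> K       # (module, word within module)

def consecutive(addr: int, m_bits: int):
    # high-order bits pick the module: consecutive words stay in one module
    return addr >> m_bits, addr & (2**m_bits - 1)

print([interleaved(a)[0] for a in range(8)])  # modules 0,1,2,3,0,1,2,3
```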

    5.6.2 Hit Rate and Miss Penalty:

    The success rate in accessing information at the various levels of the memory hierarchy is expressed as the hit rate / miss rate.

    A successful access to data in a cache is called a hit.

    The number of hits stated as a fraction of all attempted accesses is called the hit rate.

    The number of misses stated as a fraction of all attempted accesses is called the miss rate.

    Figure 5.25. Addressing multiple-module memory systems.


    Ideally, the entire memory hierarchy would appear to the processor as a single memory unit that has the access time of a cache on the processor chip and the size of a magnetic disk; how closely this ideal is approached depends on the hit rate (which should be well above 0.9).

    The extra time needed to bring the desired information into the cache after a miss is called the miss penalty.

    The impact of the cache on the overall performance of the computer:

    tave = hC + (1-h)M

    o tave: average access time experienced by the processor

    o h: hit rate

    o M: miss penalty- the time to access information in the main memory

    o C: the time to access information in the cache

    Example:

    o Assume that 30 percent of the instructions in a typical program perform a

    read/write operation, which means that there are 130 memory accesses for every

    100 instructions executed.

    o h = 0.95 for instructions, h = 0.9 for data

    o Cache access time = 1 clock cycle, main memory access time = 10 clock cycles, miss penalty M = 17 clock cycles

    Time without cache = 130 x 10 = 1300 cycles

    Time with cache = 100(0.95 x 1 + 0.05 x 17) + 30(0.9 x 1 + 0.1 x 17) = 258 cycles

    o The computer with the cache performs five times better
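    The example works out as follows (a direct transcription of the numbers above into Python):

```python
C, M = 1, 17                 # cache access time and miss penalty, in cycles
h_inst, h_data = 0.95, 0.90  # hit rates for instructions and data

time_without_cache = 130 * 10   # 130 accesses at 10 cycles each
time_with_cache = 100 * (h_inst * C + (1 - h_inst) * M) \
                + 30 * (h_data * C + (1 - h_data) * M)

print(time_with_cache)                       # about 258 cycles
print(time_without_cache / time_with_cache)  # about 5
```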

    How to Improve Hit Rate?

    Use a larger cache - but this increases cost.

    Increase the block size while keeping the total cache size constant.

    However, if the block size is too large, some items may not be referenced before the block is replaced, and the miss penalty increases.

    Using the load-through approach reduces the miss penalty.

    5.6.3 Caches on the Processor Chip:

    The optimal place for a cache is on the processor chip, but space on the processor chip is needed for many other functions.


    All high-performance chips include some form of cache.

    Some manufacturers use two separate caches for instructions and data, as in the 68040, Pentium III and Pentium 4 processors.

    Others use a single cache for both, as in the ARM710T processor.

    Which one has the better hit rate? -- the single cache.

    What is the advantage of separate caches? -- parallelism, and hence better performance.

    High-performance processors use 2 levels of caches.

    L1 cache - faster, smaller, located on the processor chip. It can access more than one word simultaneously and let the processor use them one at a time.

    L2 cache - slower, larger, implemented externally using SRAM chips.

    What is the average access time then?

    Average access time: tave = h1C1 + (1-h1)h2C2 + (1-h1)(1-h2)M

    o where h1 and h2 are the hit rates of the L1 and L2 caches, C1 and C2 are their access times, and M is the time to access information in the main memory.
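    The two-level formula can be evaluated directly; the hit rates and access times below are assumed for illustration only (they are not from the notes):

```python
def tave(h1, h2, C1, C2, M):
    # average access time with an L1 cache, an L2 cache and main memory
    return h1 * C1 + (1 - h1) * h2 * C2 + (1 - h1) * (1 - h2) * M

# assumed numbers: L1 hits 95% at 1 cycle, L2 hits 90% at 10, memory = 100
print(tave(0.95, 0.90, 1, 10, 100))  # about 1.9 cycles on average
```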

    5.6.4 Other Enhancements:

    Write buffer

    o A write operation in the write-through protocol results in writing the new value into main memory, which makes the processor wait for the memory function to be completed.

    o A write buffer is used for temporary storage of write requests.

    o The write buffer sends these requests to main memory whenever main memory is not servicing read requests.

    o The write buffer also works with the write-back protocol.

    Prefetching

    o Prefetch data into the cache before they are needed.

    o A special prefetch instruction may be provided in the instruction set of the processor to achieve prefetching.

    o Executing this instruction causes the addressed data to be loaded into the cache, as in the case of a read miss.


    o Prefetch instructions can be inserted into a program either by the programmer or by the compiler.

    o Prefetching can also be done in hardware, by adding circuitry that discovers a pattern in memory references and prefetches data according to that pattern.

    o Intel's Pentium 4 processor uses both software and hardware prefetching.

    Lockup-free cache

    o The processor is able to access the cache while a miss is being serviced.

    o A cache that can support multiple outstanding misses is called a lockup-free cache.

    o Since it can service only one miss at a time, it includes circuitry that keeps track of all outstanding misses.

    5.7 Virtual Memories:

    Overview

    Physical main memory is not as large as the address space spanned by an address issued by the processor.

    When a program does not completely fit into the main memory, the parts of it not currently being executed are stored on secondary storage devices.

    Techniques that automatically move program and data blocks into the physical main memory when they are required for execution are called virtual-memory techniques.

    The binary addresses that the processor issues for either instructions or data are called virtual or logical addresses.

    These addresses are translated into physical addresses by a combination of hardware and software components.

    A special hardware unit called the Memory Management Unit (MMU) translates virtual addresses into physical addresses.

    If the desired data is in the main memory, it is accessed via the cache mechanism; otherwise it is brought in from a storage device by the DMA mechanism.

    5.7.1 Address Translation:

    To translate virtual addresses into physical addresses, assume that all programs and data are composed of fixed-length units called pages, each of which consists of a block of words that occupy contiguous locations in the main memory.

    Figure 5.26 Virtual memory organization.


    Pages commonly range from 2K to 16K bytes in length.

    A page cannot be too small, because the access time of a magnetic disk is much longer than the access time of main memory.

    A page cannot be too large, because a substantial portion of a page may go unused, yet the unnecessary data will occupy valuable space in the main memory.

    The virtual memory mechanism bridges the size and speed gaps between the main memory and secondary storage.

    Each virtual address generated by the processor, whether to fetch or store an instruction or operand, is interpreted as a virtual page number (high-order bits) and an offset (low-order bits) that specifies the location of a particular byte within the page.

    Information about the main memory location of each page is kept in a page table.

    An area in main memory that can hold one page is called a page frame. The starting address of the page table is kept in a page table base register.
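    A minimal sketch of the translation just described (the page size, table contents and names are assumed for illustration; a real MMU consults the TLB and page table in hardware):

```python
PAGE_SIZE = 4096    # assumed 4K pages; any power of two works
OFFSET_BITS = 12

# page table: virtual page number -> page frame number in main memory
page_table = {0: 5, 1: 2}   # illustrative entries

def translate(vaddr: int) -> int:
    vpn = vaddr >> OFFSET_BITS          # high-order bits: virtual page number
    offset = vaddr & (PAGE_SIZE - 1)    # low-order bits: byte within the page
    if vpn not in page_table:
        raise LookupError("page fault") # page is not in main memory
    return (page_table[vpn] << OFFSET_BITS) | offset

print(hex(translate(0x1ABC)))  # virtual page 1 -> frame 2: 0x2abc
```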

    Control bits describe the status of the page in main memory.

    One bit indicates the validity of the page, i.e. whether the page is actually loaded in the main memory.

    Another control bit indicates whether the page has been modified during its residency in the main memory.

    Other control bits indicate various restrictions on accessing the page, such as full read and write permission or read access only.

    The page table information is used by the MMU for every access, so ideally it would be held within the MMU.

    However, since the MMU is on the processor chip and the page table is rather large, only a small portion of it - the page table entries that correspond to the most recently accessed pages - can be accommodated within the MMU.

    A small cache, called the Translation Lookaside Buffer (TLB), is incorporated in the MMU for this purpose.

    TLB

    The contents of the TLB must be coherent with the contents of the page tables in memory.

    When the OS updates a page table, the corresponding TLB contents must be updated as well.

    Translation procedure:

    o If the page table entry is found in the TLB, the physical address is obtained immediately.

    o If there is a miss in the TLB, the required entry is obtained from the page table in main memory and the TLB is updated.

    When a program generates an access request to a page that is not in the main memory, a page fault is said to have occurred.

    If a new page is brought from the disk when the main memory is full, it must replace one of the resident pages.

    Figure 5.27 Virtual-memory address translation.


    An LRU replacement algorithm is applicable; it is supported by a control bit that is set to 1 whenever the corresponding page is accessed, which makes it possible to determine which pages have not been used recently.

    Write-through is not suitable for virtual memory; the write-back protocol is used instead.

    5.8 Memory Management Requirements:

    Memory management routines are part of the OS of the computer.

    The virtual address space in which the OS routines reside is called the system space.

    Each user program has its own page table, defining a separate virtual address space called the user space.

    No program should be allowed to destroy either the data or the instructions of other programs, so protection has to be provided (supervisor/user state, privileged instructions).

    A shared page has an entry in two different page tables.

    5.9 Secondary Storage:

    Semiconductor memories are limited by their cost per bit of stored information. This leads to large-capacity storage devices: magnetic disks, optical disks and magnetic tapes.

    5.9.1 Magnetic Hard Disks

    Magnetic disk system consists of one or more disks mounted on a common spindle.

    A thin magnetic film is deposited on each disk, usually on both sides.

    The disks are placed in a rotary drive so that the magnetized surfaces move in closeproximity to read/write heads.

    The disks rotate at a uniform speed.

    Each head consist of a magnetic yoke and a magnetizing coil.

    Digital information can be stored on the magnetic film by applying current pulses of

    suitable polarity to the magnetizing coil.

    Using a clock signal as a reference, the data stored on the tracks can be read back correctly; to provide this reference, the clock signal is encoded together with the data.

    Manchester or phase encoding is used, in which the data sent are broken down into a series of long and short signals.

    Manchester encoding is a self-clocking data encoding method that divides the time

    required to define the bit into two cycles. The first cycle is the data value (0 or 1) and the

    second cycle provides the timing by shifting to the opposite state.
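The two-cycle scheme described above can be sketched directly: each bit becomes a pair of half-cycles, the first carrying the data value and the second shifting to the opposite state. The function name and the 0/1 signal-level convention are assumptions for the illustration.

```python
# Sketch of Manchester (phase) encoding as described: every bit cell contains
# a transition, so the reader can recover the clock from the data stream itself.
def manchester_encode(bits):
    signal = []
    for b in bits:
        signal += [b, 1 - b]   # first half: data value; second half: opposite level
    return signal

print(manchester_encode([1, 0, 1]))   # -> [1, 0, 0, 1, 1, 0]
```

Because a mid-cell transition occurs for every bit, long runs of identical data values can never leave the signal flat, which is what makes the code self-clocking.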

    Organization and Accessing of Data on a Disk:

    Each surface is divided into concentric tracks. Each track is divided into sectors.

    The set of corresponding tracks on all surfaces of a stack of disks forms a logical

    cylinder.

    The data are accessed by specifying the surface number, track number, and sector number, without moving the read/write heads.
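As a worked illustration of surface/track/sector addressing, the sketch below maps a flat logical block number onto the three coordinates. The geometry constants and the ordering of the coordinates are assumptions chosen for the example; real drives use their own layouts.

```python
# Hypothetical disk geometry for illustrating surface/track/sector addressing.
SURFACES, TRACKS, SECTORS = 4, 500, 64   # assumed numbers, not a real drive

def block_to_address(block):
    """Map a logical block number to (surface, track, sector)."""
    sector = block % SECTORS
    track = (block // SECTORS) % TRACKS
    surface = block // (SECTORS * TRACKS)
    return surface, track, sector

print(block_to_address(70000))   # -> (2, 93, 48)
```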

    Figure 5.29 Magnetic disk Principles.

    Figure 5.28. Use of an associative-mapped TLB.

    Figure 5.30. Organization of one surface of a disk.


    Following the data, there is an error-correction code (ECC).

    The stored information is packed more densely on inner tracks than on outer tracks.

    Access time

    o Seek time: the time required to move the read/write head to the proper track.

    o Rotational delay (latency time): the amount of time that elapses, after the head is positioned over the correct track, until the starting position of the addressed sector passes under the read/write head.
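A short worked example of combining these two components: on average the rotational delay is half a revolution. The seek time and rotational speed below are assumed figures for illustration only.

```python
# Worked average-access-time example (assumed drive parameters).
avg_seek_ms = 9.0          # assumed average seek time
rpm = 7200                 # assumed rotational speed

rotation_ms = 60_000 / rpm      # one full revolution: ~8.33 ms at 7200 rpm
latency_ms = rotation_ms / 2    # average rotational delay: half a revolution
access_ms = avg_seek_ms + latency_ms
print(round(access_ms, 2))      # -> 13.17
```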

    Data buffer/cache

    o A disk drive is connected to the rest of a computer system using some standard

    interconnection scheme.

    o The SCSI bus is capable of transferring data at much higher rates than the rate at

    which data can be read from the disk tracks.

    o A data buffer in the disk unit is a semiconductor memory, capable of storing a few megabytes of data.

    o When a read request arrives at the disk, the controller can first check to see if the

    desired data are already available in the buffer. If so, the data can be accessed and

    placed on the SCSI bus in microseconds rather than milliseconds.
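The controller's buffer check described above can be sketched as a simple lookup before the mechanical read. The cache structure, function names, and the placeholder platter read are all assumptions made for this illustration.

```python
# Sketch of the disk controller's buffer check: serve a read from the
# semiconductor data buffer when possible, else fall back to the slow surface.
buffer_cache = {}   # sector number -> data already read into the buffer

def read_from_platter(sector):
    # stand-in for the millisecond-scale mechanical read from the disk surface
    return b"data-%d" % sector

def read_sector(sector):
    if sector in buffer_cache:
        return buffer_cache[sector], "buffer (microseconds)"
    data = read_from_platter(sector)
    buffer_cache[sector] = data
    return data, "disk (milliseconds)"

print(read_sector(5)[1])   # -> disk (milliseconds)
print(read_sector(5)[1])   # -> buffer (microseconds)
```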

    Disk Controller:

    Operation of disk drives is controlled by a disk controller circuit.

    It uses a DMA scheme to transfer data between the disk and the main memory.

    Main memory address: the address of the first main memory location of the block of words involved in the transfer.

    Disk address: the location of the sector containing the beginning of the desired block of words.

    Word count: the number of words in the block to be transferred.

    The disk controller's major functions are:

    o Seek: causes the disk drive to move the read/write head from its current position to the desired track.

    o Read

    o Write

    o Error checking: computes the error-correcting code (ECC) value for the data read from a given sector and compares it with the corresponding ECC value read from the disk.

    Floppy Disks:

    Floppy disks are smaller, simpler, and cheaper disk units consisting of a flexible, removable plastic diskette coated with magnetic material.

    The diskette is enclosed in a plastic jacket, which has an opening where the read/write head makes contact with the diskette.

    A hole in the center of the diskette allows a spindle mechanism in the disk drive to

    position & rotate the diskette.

    RAID Disk Arrays:

    Redundant Array of Inexpensive Disks

    Figure 5.31 Disks Connected to the system bus


    Using multiple inexpensive disks makes huge storage cheaper and also makes it possible to improve the reliability of the overall system.

    RAID 0: data striping

    RAID 1: identical copies of data on two disks (mirroring)

    RAID 2, 3, 4: increased reliability

    RAID 5: parity-based error recovery
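The parity-based recovery behind RAID 5 can be sketched with plain XOR: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. The block contents below are arbitrary example bytes; striping and distribution across disks are omitted.

```python
# Sketch of RAID 5 style parity recovery: parity = XOR of all data blocks,
# so XOR-ing the remaining blocks with the parity rebuilds a lost block.
def parity(blocks):
    p = bytes(len(blocks[0]))
    for blk in blocks:
        p = bytes(a ^ b for a, b in zip(p, blk))
    return p

data = [b"\x0f\x10", b"\xf0\x01", b"\xaa\xaa"]   # example stripe contents
p = parity(data)

# simulate losing data[1]; rebuild it from the surviving blocks plus parity
rebuilt = parity([data[0], data[2], p])
print(rebuilt == data[1])   # -> True
```

The same XOR property is why a RAID 5 array tolerates the failure of exactly one disk per stripe.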

    5.9.2 Optical Disks:

    Large storage devices can also be implemented using optical means.

    CD Technology:

    A cross section of a small portion of a CD is shown in Fig 5.32 a.

    The bottom layer is polycarbonate plastic.

    The surface of this plastic is programmed to store data by indenting it with pits.

    The unindented parts are called lands.

    Fig 5.32 b shows what happens as the laser beam scans across the disk and encounters a transition from a pit to a land.

    Three different positions of the laser source and the detector are shown, as would occur

    when the disk is rotating.

    When the light reflects solely from the pit, or from the land, the detector will see the

    reflected beam as a bright spot.

    At the pit-land and land-pit transitions the detector will not see a reflected beam and will

    detect a dark spot.

    Fig 5.32 c depicts several transitions between lands and pits. Each transition, detected as a dark spot, is taken to denote the binary value 1, and the flat portions represent 0s.
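The read-out rule just described can be sketched directly: a pit-to-land or land-to-pit transition yields a 1, and a flat stretch yields a 0. The 'L'/'P' string representation of the surface is an assumption made for the example.

```python
# Sketch of the pit/land read-out: a transition between adjacent surface
# cells is detected as a dark spot (bit 1); no transition is a bit 0.
def decode_surface(levels):
    """levels: sampled surface states per cell, e.g. 'L' (land) or 'P' (pit)."""
    bits = []
    for prev, cur in zip(levels, levels[1:]):
        bits.append(1 if prev != cur else 0)
    return bits

print(decode_surface("LLPPPLL"))   # -> [0, 1, 0, 0, 1, 0]
```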

    CD-ROM

    CD-Recordable (CD-R)

    CD-Re Writables (CD-RW)

    DVD Technology

    DVD-RAM

    5.9.3 Magnetic Tape systems:

    Magnetic tapes are suited for off-line storage of large amounts of data.

    They are typically used for hard disk backup purposes and for archival storage.

    Data on the tape are organized in the form of records separated by gaps.

    A group of related records is called a file.

    The beginning of a file is identified by a file mark.

    The control commands used are:

    o Rewind tape

    Figure 5.32. Optical disk.


    o Rewind and unload tape

    o Erase tape

    o Write tape mark

    o Forward space one record

    o Backspace one record

    o Forward space one file

    o Backspace one file.