Download - Dr. John P. Abraham, University of Texas Pan American Components of a Personal Computer Hardware -1 Dr. John P. Abraham

Dr. John P. Abraham, University of Texas Pan American

Components of a Personal Computer

Hardware -1

Dr. John P. Abraham


Motherboard

• A printed circuit board having a slot for the CPU, memory and interface cards, and buses for the data, control and address signals.

• More about this in the next lecture.


CPU

• Microprocessor manufacturers: Intel, Motorola, IBM, Sun, and AMD

• Intel is the leader: 8088, 8086, 286, 386, 486, Pentium, Itanium, and Xeon.

– Also the SX and Celeron versions


CPU (2)

• Motorola chips for Apple

• AMD chips (Duron and Athlon)

• Sun produces chips for their proprietary workstations.

• All CPUs need a heat sink and fan


CPU Speed

• Hertz defined. 0.01 seconds for a clock cycle is same as 100 hertz.

• 1000 hertz is a kilohertz, 1000 kilohertz is a megahertz, and 1000 megahertz is a gigahertz.

• The occurrence of events in the CPU is determined by a clock that transmits a regular sequence of alternating 1s and 0s of equal duration. A single transmission of the pair (0,1) is called a clock cycle.


Clock Speed

• The processor and the buses on the motherboard operate at different speeds. The motherboard clock is slower.

• A clock multiplier is applied to the system bus clock to obtain the CPU speed.

– For example, if the motherboard runs at 533 megahertz and the CPU runs at roughly 2 gigahertz, a multiplier of 4 is used (533 MHz × 4 = 2132 MHz).
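The relationship between clock period, frequency, and the clock multiplier can be checked with a few lines of Python. This is only a sketch; the function names are made up for illustration.

```python
def clock_period_seconds(frequency_hz):
    """Duration of one clock cycle for a given frequency in hertz."""
    return 1.0 / frequency_hz


def cpu_speed_mhz(bus_mhz, multiplier):
    """CPU clock = motherboard (bus) clock times the clock multiplier."""
    return bus_mhz * multiplier


print(clock_period_seconds(100))   # 0.01 seconds per cycle at 100 hertz
print(cpu_speed_mhz(533, 4))       # 2132 MHz, i.e. roughly 2 gigahertz
```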


Performance of a CPU

• Depends upon the clock speed of the CPU and buses, width of the buses, stages in pipelining, cache performance, addressable memory, miss rate, hit rate, instruction mix of the program, and many other such variables.

• Performance measurement: benchmark suites such as SPEC 2000.


Performance of a CPU (2)

• Moore’s Law: how to improve performance.

– Increase the number of transistors on a chip.

• Amdahl’s Law: how to measure performance.

– The performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.


Amdahl’s law Calculation

• Assume it takes 100 microseconds to complete a program on the old CPU. On the new, improved CPU, 80% of the program runs at the old speed and still takes 80 microseconds, while the remaining 20% runs twice as fast and takes 10 microseconds instead of 20, for a total of 90 microseconds.

• Overall speedup = execution time on the unimproved CPU / execution time on the improved CPU = 100 microseconds / 90 microseconds ≈ 1.11
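The same calculation can be written in the usual form of Amdahl's law, speedup = 1 / ((1 - f) + f / s), where f is the fraction of the time that is improved and s is how much faster that fraction runs. A minimal sketch (the function and variable names are illustrative only):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the run time gets faster."""
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


old_time_us = 100.0
speedup = amdahl_speedup(0.20, 2.0)      # 20% of the program runs twice as fast
new_time_us = old_time_us / speedup

print(round(new_time_us, 2), round(speedup, 2))   # 90.0 1.11
```

Note that even if the 20% became infinitely fast, the speedup could not exceed 1 / 0.8 = 1.25.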


Cache Memory

• Level 1: inside the CPU

• Level 2: in the CPU or on the motherboard

• Intel Pentium 4 CPUs: 8K of Level 1 cache and 512K of Level 2.

• Level 3: if L1 and L2 are incorporated into the CPU, the cache on the motherboard is referred to as L3.

• Apple PowerMac has 3 levels of cache in the CPU.


Cache Memory (2)

• Principle: spatial locality and temporal locality.

• Spatial locality: data that are physically close together in memory tend to be accessed close together in time.

• Temporal locality: recently accessed data are likely to be accessed again in the near future.

• A program spends 90% of its execution time in only 10% of the code, as in the case of loops and function calls.


Cache Memory (3)

• L1 is smaller than L2, L2 is smaller than L3, L3 is smaller than RAM, and so forth.

• Speed (fastest to slowest)

– Registers (1/2 clock cycle)

– L1 (1 clock cycle)

– L2

– L3

– Memory

– HD


Memory Mapping

• Take a computer with 256KB of cache and 256MB of RAM.

– 1 to 1024 ratio. Of the 1024 blocks of RAM that share a line of cache, only one can be held in that line at any given time.

• The RAM is divided into blocks of a fixed number of bytes (say 64 bytes), and the cache is divided into lines of the same size (64 bytes).


Memory Mapping (2)

• 256MB of RAM / 64 bytes = 4,194,304 blocks of RAM

• 256KB of cache / 64 bytes = 4096 lines of cache

• 4,194,304 / 4096 = 1024

• How do we fit the larger number of blocks into the smaller number of cache lines? There are three different ways to map the blocks of memory to lines of cache: direct, fully associative, and set-associative mapping.
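The block and line counts above can be reproduced directly. A short sketch, with variable names chosen only for this example:

```python
KB = 1024
MB = 1024 * KB

ram_bytes = 256 * MB
cache_bytes = 256 * KB
block_bytes = 64                     # block size = cache line size

ram_blocks = ram_bytes // block_bytes
cache_lines = cache_bytes // block_bytes
blocks_per_line = ram_blocks // cache_lines

print(ram_blocks, cache_lines, blocks_per_line)   # 4194304 4096 1024
```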


Direct memory mapping

• Direct mapping assigns every block to one particular line, determined by modulo arithmetic (block number mod number of lines), so 1024 different blocks share each line.

• If a certain block of memory needs to be accessed, the predetermined line can be accessed to verify if that line contains the required block. To make this verifying process easy and fast, the memory address issued by the CPU is divided into three fields, tag, index, and offset.

• The tag and the index fields together make up the block address and the offset indicates the byte number in the block.

• There are many tutorials on the web. I suggest trying some.


Direct memory mapping (2)

• 256 MB of RAM is 268,435,456 bytes, so the highest address is 268,435,455 (binary 1111 1111 1111 1111 1111 1111 1111), which requires 28 bits.

• The 28-bit address issued by the CPU is separated into three fields. Let us work from right to left. The rightmost field, the offset, is 6 bits (2^6 is 64; recall that the block is 64 bytes).


Direct memory mapping (3)

• Total 28 bits, of which 6 are used for the offset.

• The remaining 22 bits are used for the block address, which can be further divided into tag and index. We have 4,194,304 blocks, or 2^22.

• The index tells us which cache line may hold the information. Since there are 4096, or 2^12, cache lines, we need 12 bits for the index.

• The remaining 10 bits are used for the tag. This indicates which of the 1024 blocks is held by that line (as indicated by the index).
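Splitting a 28-bit address into the three fields is just shifting and masking. The sketch below uses the 10/12/6 split worked out above; the function name and the sample address are arbitrary choices for illustration.

```python
OFFSET_BITS = 6     # 2^6  = 64 bytes per block
INDEX_BITS = 12     # 2^12 = 4096 cache lines
TAG_BITS = 10       # 28 - (6 + 12)


def split_address(address):
    """Return (tag, index, offset) for a 28-bit address."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


address = 0xABCDEF
print(split_address(address))               # (42, 3895, 47)
# An address exactly 256K (2^18 bytes) higher lands on the same line with a different tag.
print(split_address(address + (1 << 18)))   # (43, 3895, 47)
```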


Example of Direct Mapped Cache

256 Meg RAM, 256 K Cache, 64 byte block

Quantity | Value

Number of bits needed to address 256M RAM | 28 (2^28 = 268,435,456)

Number of blocks in 256M RAM (64 bytes = 2^6 per block) | 2^28 / 2^6 = 2^22 = 4,194,304

Number of bits required to address 4,194,304 memory blocks | 22

Number of bits required to address each byte in a block | 6

Number of lines of cache available for 256K, 64 bytes/line | 2^18 / 2^6 = 2^12 = 4,096

Number of bits required to address the 4096 cache lines | 12

Number of bits for tag | 28 – (6+12) = 10

Number of blocks of memory represented by each line of cache | 2^10 = 1024


Associative Caching

• Fully Associative

– A block from the memory may reside in any line in the cache

– All tags are searched in parallel to find the desired block address.

• Set associative: 2-way, 4-way, and 8-way

– A block can only be placed within a specified set

– Intel Pentium 4 implements an 8-way associative L2 cache.


Replacement Policy

• Hit vs. miss. The miss penalty depends upon memory speed, bus speed and bus width.

• A cache miss causes the CPU to stall and replace an existing block.

• In direct mapped the new block can only be placed in one pre-ordained line.

• In case of the fully associative and set associative, a decision has to be made as to which line of data to replace.

• Some algorithms that could be used are first-in, first-out and least-recently used.
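To make the replacement decision concrete, here is a toy set-associative cache that tracks only block numbers and evicts the least-recently-used block in a set. It is a teaching sketch, not a model of any real CPU; the class and method names are invented for this example.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy cache that tracks only block numbers, with LRU replacement."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set; keys run from least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        lines = self.sets[block % self.num_sets]
        if block in lines:
            lines.move_to_end(block)       # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)      # evict the least recently used block
        lines[block] = True
        return False


cache = SetAssociativeCache(num_sets=1, ways=2)
print([cache.access(b) for b in [0, 1, 0, 2, 1]])
# [False, False, True, False, False]
```

With ways=1 the same class behaves as a direct-mapped cache, and with num_sets=1 it is fully associative.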


Write Policy

• Three options:

– only update the RAM, only update the cache, or update both memory and cache (write through). Since reads are more important for the performance of the CPU, updating RAM without updating the cache is simply not done.

• Updating only one of the two leaves the other copy stale and unusable.

• When only the cache is updated (write back), a dirty bit indicates that the line must be written back to RAM before it is replaced.
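The role of the dirty bit can be shown with a single-line toy cache that updates only the cache on a write and defers the RAM update until the line is replaced. This is a simplified sketch; the class name and the dictionary standing in for RAM are assumptions made for the example.

```python
class WriteBackLine:
    """One cache line holding a single value, with a dirty bit."""

    def __init__(self):
        self.address = None
        self.value = None
        self.dirty = False

    def write(self, ram, address, value):
        if self.address != address:
            self.evict(ram)               # make room for the new address
            self.address = address
        self.value = value
        self.dirty = True                 # RAM is now stale for this address

    def evict(self, ram):
        if self.dirty:
            ram[self.address] = self.value    # the deferred write to RAM
        self.address, self.value, self.dirty = None, None, False


ram = {0: 10, 1: 20}
line = WriteBackLine()
line.write(ram, 0, 99)
print(ram[0])            # 10: RAM has not been updated yet
line.write(ram, 1, 77)   # replaces the line, so address 0 is written back
print(ram[0])            # 99
```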


Main Memory

• Memory speed has increased only by a factor of 2 over the past 15 years, whereas CPU speed has increased almost 1,000 times.

• It takes more than 60 clock cycles for the CPU to access the data from RAM.

• With increased cache availability, what is needed is high bandwidth for the RAM.


Main Memory (2)

• Dual Inline Pin Package (DIPP)

• Single Inline Pin Package (SIPP)

• Single Inline Memory Module (SIMM)

• Dual Inline Memory Module (DIMM)

– Synchronous Dynamic RAM (SDRAM)

• Double Data Rate RAM (DDR SDRAM)

• Parity explained.
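As a reminder of how a parity check works, the sketch below computes an even-parity bit for one byte and shows that a single flipped bit is detected. Even parity is assumed here; odd parity works the same way with the bit inverted. The function names are illustrative.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of 1 bits even."""
    return bin(byte).count("1") % 2


def parity_ok(byte, parity_bit):
    """True if the stored parity bit still matches the data."""
    return even_parity_bit(byte) == parity_bit


stored = 0b10110010                   # four 1 bits
parity = even_parity_bit(stored)
print(parity)                         # 0

corrupted = stored ^ 0b00000100       # flip one bit
print(parity_ok(stored, parity), parity_ok(corrupted, parity))   # True False
```

Parity detects any odd number of flipped bits but cannot say which bit changed, and two flipped bits go unnoticed.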


Main Memory(3)

• Overlays

• Virtual memory

• Paging


Virtual Memory

• Handled by the memory manager of the OS

• Used when program memory requirements exceed actual memory.

• Required space is divided into pages and stored on the hard disk drive.

• Pages are brought in as needed, replacing another page that is presumed to be no longer required.
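A small simulation shows pages being brought in on demand and replacing older ones. The slides do not say how the victim page is chosen; first-in, first-out is assumed here purely for simplicity, and the function name is made up.

```python
from collections import deque


def count_page_faults(references, num_frames):
    """Count page faults for a reference string, using FIFO replacement."""
    frames = deque()                  # pages currently in physical memory
    faults = 0
    for page in references:
        if page in frames:
            continue                  # page already in memory: no fault
        faults += 1                   # page must be brought in from disk
        if len(frames) == num_frames:
            frames.popleft()          # replace the oldest resident page
        frames.append(page)
    return faults


print(count_page_faults([1, 2, 3, 1, 4, 2], num_frames=3))   # 4
```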


Secondary Storage

• Floppy Drives

• Hard Drives

– MFM, RLL, ESDI, IDE, EIDE, SCSI, FireWire, ATA.

– RAID levels

• Removable Disks


IEEE 1394 FireWire

• High performance serial bus

• Fast

• Low cost

• Easy to implement

• Also being used in digital cameras, VCRs and TVs


SCSI (small computer system interface)

• SCSI-1 was standardized by ANSI in 1986. Parallel Interface.

• The industry, then, decided to agree on a minimum set of 18 basic commands. This command set was called the Common Command Set (CCS). CCS became the basis for SCSI-2

• SCSI-2 also provided extra speed with options called Fast SCSI and a 16-bit version called Wide SCSI.


SCSI (2)

• This also means that SCSI-1 adapters will work with SCSI-2 hardware. SCSI-1 and SCSI-2 compliant hardware is the same.

• Fast SCSI delivers a 10 MB/sec transfer rate. When combined with the 16-bit bus, this doubles to 20 MB/sec. This is called Fast-Wide SCSI.

• Ultra-Wide SCSI incorporates the 16-bit bus, and the speed rises to 40 MB/sec.


SCSI (3)

• Each device must have a SCSI ID, 0-7. (Newer ones 0-15)

• The host adapter takes one ID. Most are usually factory-set to ID 7, which is the highest-priority ID.

• Besides configuring the proper ID, proper termination must be ensured. Install termination at each end of the bus.


FireWire Configuration

• Daisy chain

• Up to 63 devices on a single port

– Really 64, of which one is the interface itself

• Up to 1022 buses can be connected with bridges

• Automatic configuration

• No bus terminators

• May be tree structure


Simple FireWire Configuration


FireWire 3 Layer Stack

• Physical

– Transmission medium, electrical and signaling characteristics

• Link

– Transmission of data in packets

• Transaction

– Request-response protocol


FireWire Protocol Stack


FireWire - Physical Layer

• Data rates from 25 to 400 Mbps

• Two forms of arbitration

– Based on tree structure

– Root acts as arbiter

– First come first served

– Natural priority controls simultaneous requests

• i.e. who is nearest to root

– Fair arbitration

– Urgent arbitration


FireWire - Link Layer

• Two transmission types

– Asynchronous

• Variable amount of data and several bytes of transaction data transferred as a packet

• To explicit address

• Acknowledgement returned

– Isochronous

• Variable amount of data in sequence of fixed size packets at regular intervals

• Simplified addressing

• No acknowledgement


RAID

• Redundant Array of Independent Disks

• Redundant Array of Inexpensive Disks

• 6 levels in common use

• Not a hierarchy

• Set of physical disks viewed as single logical drive by O/S

• Data distributed across physical drives

• Can use redundant capacity to store parity information


RAID 0

• No redundancy

• Data striped across all disks

• Round Robin striping

• Increase speed

– Multiple data requests probably not on same disk

– Disks seek in parallel

– A set of data is likely to be striped across multiple disks
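Round-robin striping means consecutive strips of the logical drive go to consecutive disks. A minimal sketch of that mapping, with a hypothetical four-disk array and an invented function name:

```python
def locate_strip(strip_number, num_disks):
    """Map a logical strip number to (disk, position on that disk)."""
    disk = strip_number % num_disks          # round robin across the disks
    position = strip_number // num_disks     # how far down that disk
    return disk, position


for strip in range(6):
    print(strip, locate_strip(strip, num_disks=4))
# strips 0-3 land on disks 0-3; strip 4 wraps around to disk 0, position 1
```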


RAID 1

• Mirrored Disks

• Data is striped across disks

• 2 copies of each stripe on separate disks

• Read from either

• Write to both

• Recovery is simple

– Swap faulty disk & re-mirror

– No down time

• Expensive


RAID 2

• Disks are synchronized

• Very small stripes

– Often single byte/word

• Error correction calculated across corresponding bits on disks

• Multiple parity disks store Hamming code error correction in corresponding positions

• Lots of redundancy

– Expensive

– Not used


RAID 3

• Similar to RAID 2

• Only one redundant disk, no matter how large the array

• Simple parity bit for each set of corresponding bits

• Data on failed drive can be reconstructed from surviving data and parity info

• Very high transfer rates
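Reconstruction from parity relies on exclusive OR: the parity is the XOR of the corresponding bits on every data disk, so XOR-ing the survivors with the parity gives back the missing data. A sketch with three made-up two-byte "disks":

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR the corresponding bytes of several equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]   # contents of three data disks
parity_disk = parity(data)

# Suppose disk 1 fails: rebuild it from the surviving disks plus the parity.
rebuilt = parity([data[0], data[2], parity_disk])
print(rebuilt == data[1])   # True
```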


RAID 4

• Each disk operates independently

• Good for high I/O request rate

• Large stripes

• Bit by bit parity calculated across stripes on each disk

• Parity stored on parity disk


RAID 5

• Like RAID 4

• Parity striped across all disks

• Round robin allocation for parity stripe

• Avoids RAID 4 bottleneck at parity disk

• Commonly used in network servers

• N.B. DOES NOT MEAN 5 DISKS!!!!!
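Round-robin allocation of the parity stripe can be pictured as the parity moving to the next disk on each successive stripe. Real controllers use several different rotation orders; the simple "stripe number mod number of disks" rule below is an assumption for illustration only.

```python
def parity_disk_for_stripe(stripe, num_disks):
    """Disk that holds the parity for a given stripe (simple rotation)."""
    return stripe % num_disks


num_disks = 4    # a RAID 5 array does not need five disks
for stripe in range(6):
    p = parity_disk_for_stripe(stripe, num_disks)
    layout = ["P" if disk == p else "D" for disk in range(num_disks)]
    print(stripe, " ".join(layout))
# Each disk takes its turn holding parity, so no single disk becomes the bottleneck.
```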


RAID 6

• Two parity calculations

• Stored in separate blocks on different disks

• User requirement of N disks needs N+2

• High data availability

– Three disks need to fail for data loss

– Significant write penalty


Input/Output Interfaces

• There is a wide variety of devices with differing methods of operation, and their data formats and block transfer modes are different.

• The I/O module has a processor on it that handles all the operations of the attached devices, satisfies requests of the CPU, and may have local memory for buffering the I/O stream.


Bus transfer rate

• Bus transfer rate or Throughput in MBps = Speed of the bus in MHz * (Width of data transferred in bytes/ Cycles to transfer these bytes)


Here is a chart showing some calculations.

Bus Type | Bus Speed in MHz | Data Width in bytes | Cycles to transfer | Calculation | Throughput in MB/sec

ISA | 8.33 | 2 | 2 | 8.33*(2/2) | 8.33

EISA | 8.33 | 4 | 1 | 8.33*(4/1) | 33.32

MCA | 10.0 | 4 | 1 | 10.0*(4/1) | 40

VESA | 33 | 4 | 1 | 33*(4/1) | 132

SUN | 50 | 4 | 1 | 50*(4/1) | 200

PCI | 33 | 8 | 1 | 33*(8/1) | 264

AGP | 66 | 8 | 1 | 66*(8/1) | 528
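The chart can be regenerated from the throughput formula. A short sketch using the bus figures listed above (the function and dictionary names are illustrative):

```python
def throughput_mb_per_sec(bus_mhz, width_bytes, cycles):
    """Bus speed in MHz times bytes moved per cycle."""
    return bus_mhz * (width_bytes / cycles)


# bus type: (speed in MHz, data width in bytes, cycles to transfer)
buses = {
    "ISA": (8.33, 2, 2),
    "EISA": (8.33, 4, 1),
    "MCA": (10.0, 4, 1),
    "VESA": (33, 4, 1),
    "SUN": (50, 4, 1),
    "PCI": (33, 8, 1),
    "AGP": (66, 8, 1),
}

for name, (mhz, width, cycles) in buses.items():
    print(name, round(throughput_mb_per_sec(mhz, width, cycles), 2))
```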


Computer Case

• Case

– The design of the case will determine the number of bays for removable and non-removable drives, and the type of motherboard.

– rack mount systems do not require individual cases.

– Fans


Power Supply

• Voltage and Wattage

• Converts the AC to DC and steps down the voltage from 120 or 220 volts to 12, 5 and 3.3 volts.

• Converts AC to DC using full wave rectifiers and filters.

– Rectifiers convert AC to DC by using power diodes or by controlling the firing angles of thyristors.

– At the component level a rectifier uses diodes and capacitors.


Power Supply (2)

• Step down transformer

– First convert the 60 Hz AC to a much higher frequency in order to make the stepping down much easier and more accurate.

– Such a power supply is called a switching power supply and provides a more uniform voltage to the computer.

– Voltage is stepped up or down by varying number of secondary windings in the transformer.


Power Supply (3)

• Wattage

– The wattage rating of the power supply should be sufficient to handle all components plugged into it.

– For example, drives generally consume 5 to 15 watts, motherboard 30 watts, CPU 50 watts and an AGP video card 30 watts. These specifications should be obtained from the manufacturer of the components.
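Adding up the example figures gives a rough lower bound for the power supply rating. In the sketch below the two drives at 15 watts each and the 25% headroom are assumptions made for illustration, not figures from any manufacturer.

```python
# Worst-case draw per component, in watts (example figures from the slide).
components_watts = {
    "drive 1": 15,
    "drive 2": 15,
    "motherboard": 30,
    "CPU": 50,
    "AGP video card": 30,
}

HEADROOM = 1.25   # assumed safety margin, not a standard value

total_watts = sum(components_watts.values())
recommended_watts = total_watts * HEADROOM

print(total_watts, recommended_watts)   # 140 175.0
```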


The following slides are from Dale, your textbook


Memory

Memory: A collection of cells, each with a unique physical address; both addresses and contents are in binary


Arithmetic/Logic Unit

Performs basic arithmetic operations such as adding

Performs logical operations such as AND, OR, and NOT

Most modern ALUs have a small number of special storage units called registers


Input/Output Units

Input unit: A device through which data and programs from the outside world are entered into the computer

Can you name three?

Output unit: A device through which results stored in the computer memory are made available to the outside world

Can you name two?


Control Unit

Control unit

The organizing force in the computer

Instruction register (IR)

Contains the instruction that is being executed

Program counter (PC)

Contains the address of the next instruction to be executed

Central Processing Unit (CPU)

The ALU and the control unit together are called the Central Processing Unit, or CPU


Flow of Information

Bus: A set of wires that connects all major sections

Figure 5.2 Data flow through a von Neumann architecture


The Fetch-Execute Cycle

Fetch the next instruction

Decode the instruction

Get data if needed

Execute the instruction

Why is it called a cycle?
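The four steps can be traced in a toy machine. The instruction set below (LOAD, ADD, STORE, HALT) and the single accumulator register are invented for this sketch and do not correspond to any real processor.

```python
def run(program, data):
    """Run a tiny program using the fetch-execute cycle."""
    pc = 0       # program counter: address of the next instruction
    acc = 0      # accumulator register
    while True:
        ir = program[pc]              # fetch into the instruction register
        pc += 1                       # PC now points at the next instruction
        opcode, operand = ir          # decode
        if opcode == "LOAD":
            acc = data[operand]       # get data, then execute
        elif opcode == "ADD":
            acc += data[operand]
        elif opcode == "STORE":
            data[operand] = acc
        elif opcode == "HALT":
            return data


program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, [2, 3, 0]))   # [2, 3, 5]
```

The loop is the cycle: after each instruction finishes, control returns to the fetch step.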


The Fetch-Execute Cycle

Figure 5.3 The Fetch-Execute Cycle


RAM and ROM

Random Access Memory (RAM): Memory in which each location can be accessed and changed

Read Only Memory (ROM): Memory in which each location can be accessed but not changed

RAM is volatile, ROM is not

What does volatile mean?


Secondary Storage Devices

Why is it necessary to have secondary storage devices?

Can you name some of these devices?


Magnetic Tape

The first truly mass auxiliary storage device was the magnetic tape drive

Tape drives have a major problem; can you describe it?

Figure 5.4 A magnetic tape


Magnetic Disks

Figure 5.5 The organization of a magnetic disk


Magnetic Disks

History

Floppy disks (Why "floppy"?)

1970: 8" in diameter; late 1970s: 5 1/4"; now: 3 1/2"

Zip drives


Magnetic Disks

Seek time: Time it takes for the read/write head to be over the right track

Latency: Time it takes for the sector to be in position

Access time: Seek time plus latency
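Taking access time as seek time plus latency (the usual definition, assumed here), a rough estimate follows from the rotation speed, since on average the disk must turn half a revolution before the sector arrives. The 9 ms seek time and 7200 RPM figures are hypothetical.

```python
def average_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def access_time_ms(seek_ms, rpm):
    """Access time = seek time + average rotational latency."""
    return seek_ms + average_latency_ms(rpm)


print(round(access_time_ms(9.0, 7200), 2))   # 13.17
```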


Compact Disks

CD A compact disk that uses a laser to read information stored optically on a plastic disk; data is evenly distributed around track

CD-ROM: read-only memory

CD-DA: digital audio

CD-WORM: write once, read many

RW or RAM: both read from and written to

DVD Digital Versatile Disk, used for storing audio and video


Flash Drives

Flash memory

Nonvolatile

Can be erased and rewritten


Touch Screens

Touch screen

A computer monitor that can respond to the user touching the screen with a stylus or finger

There are four types

– Resistive

– Capacitive

– Infrared

– Surface acoustic wave (SAW)


Touch Screens

Resistive touch screen A screen made up of two layers of electrically conductive material

– One layer has vertical lines, the other has horizontal lines

– When the top layer is pressed, it comes in contact with the second layer which allows electrical current to flow

– The specific vertical and horizontal lines that make contact dictate the location on the screen that was touched


Touch Screens

Capacitive touch screen

A screen made up of a laminate applied over a glass screen

– Laminate conducts electricity in all directions; a very small current is applied equally on the four corners

– When the screen is touched, current flows to the finger or stylus

– The location of the touch on the screen is determined by comparing how strong the flow of electricity is from each corner


Touch Screens

Infrared touch screen

A screen with crisscrossing horizontal and vertical beams of infrared light

– Sensors on opposite sides of the screen detect the beams

– When the user breaks the beams by touching the screen, the location of the break can be determined


Touch Screens

Surface acoustic wave (SAW)

A screen with crisscrossing high frequency sound waves across the horizontal and vertical axes

– When a finger touches the surface, corresponding sensors detect the interruption and determine location of the touch


Synchronous processing

One approach to parallelism is to have multiple processors apply the same program to multiple data sets

Figure 5.8 Processors in a synchronous computing environment


Pipelining

Arranges processors in tandem, where each processor contributes one part to an overall computation

Figure 5.9 Processors in a pipeline


Shared Memory Parallel Processor

Communicate through shared memory

Figure 5.10 Shared memory configuration of processors


Embedded Systems

Embedded systems

Computers that are dedicated to perform a narrow range of functions as part of a larger system

Empty your pockets or backpacks.

How many embedded systems do you have?
