Dr. John P. Abraham, University of Texas Pan American
Components of a Personal Computer
Hardware -1
Dr. John P. Abraham
Motherboard
• A printed circuit board having a slot for the CPU, memory and interface cards, and buses for the data, control and address signals.
• More about this in the next lecture.
CPU
• Microprocessor manufacturers: Intel, Motorola, IBM, Sun, and AMD
• Intel is the leader: 8088, 8086, 286, 386, 486, Pentium, Itanium, and Xeon.
– Also the SX and Celeron versions
CPU (2)
• Motorola chips for Apple
• AMD chips (Duron and Athlon)
• Sun produces chips for their proprietary workstations.
• All CPUs need a heat sink and fan
CPU Speed
• Hertz (Hz) means cycles per second: a clock cycle lasting 0.01 seconds corresponds to 100 hertz.
• 1000 hertz is a kilohertz, 1000 kilohertz is a megahertz, and 1000 megahertz is a gigahertz.
• The occurrence of events in the CPU is determined by a clock that transmits regular sequences of alternating 1s and 0s of equal duration. A single transmission of the set (0,1) is called a clock cycle.
Clock Speed
• The processor and the buses on the motherboard operate at different speeds. The motherboard clock is slower.
• A clock multiplier is applied to the system bus speed to obtain the CPU speed.
– For example, if the motherboard runs at 533 megahertz and the CPU runs at about 2 gigahertz, a multiplier of 4 is used (533 MHz × 4 = 2132 MHz).
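The clock arithmetic on these two slides can be tried out with a short sketch. The function names are made up for illustration; the numbers are the ones used on the slides.

```python
def cycle_time_seconds(frequency_hz: float) -> float:
    """Length of one clock cycle for a clock running at frequency_hz."""
    return 1.0 / frequency_hz


def cpu_speed_mhz(bus_mhz: float, multiplier: float) -> float:
    """CPU clock obtained by multiplying the motherboard (bus) clock."""
    return bus_mhz * multiplier


print(cycle_time_seconds(100))   # 0.01 seconds per cycle at 100 Hz
print(cpu_speed_mhz(533, 4))     # 2132 MHz, a little over 2 GHz
```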
Performance of a CPU
• Depends upon the clock speed of the CPU and buses, width of the buses, stages in pipelining, cache performance, addressable memory, miss rate, hit rate, instruction mix of the program, and many other such variables.
• Performance measurement: benchmark suites such as SPEC CPU2000.
Performance of a CPU (2)
• Moore's Law: how to improve performance.
– Increase the number of transistors on a chip (the count has doubled roughly every two years).
• Amdahl's Law: how to measure performance.
– The performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.
Amdahl's Law Calculation
• Assume a program takes 100 microseconds on the old CPU. On the improved CPU, 80% of the program runs at the old speed and still takes 80 microseconds, while the remaining 20% runs twice as fast and takes 10 microseconds instead of 20, for a total of 90 microseconds.
• Overall speedup = execution time on the unimproved CPU / execution time on the improved CPU = 100 microseconds / 90 microseconds ≈ 1.11
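The same calculation can be written as a small function. This is a sketch of the usual form of Amdahl's Law; the parameter names are chosen here for illustration and do not come from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the run time gets faster.

    fraction_enhanced: share of the original run time that benefits (0..1)
    speedup_enhanced:  how many times faster that share runs
    """
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# 20% of the program runs twice as fast, as in the example above.
print(round(amdahl_speedup(0.20, 2.0), 2))   # 1.11
```

Even if the 20% portion became infinitely fast, the speedup could not exceed 1 / 0.8 = 1.25.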
Cache Memory
• Level 1: inside the CPU
• Level 2: in the CPU or on the motherboard
• Intel Pentium 4 CPUs: 8 KB of Level 1 cache and 512 KB of Level 2.
• Level 3: if L1 and L2 are incorporated into the CPU, the cache on the motherboard is referred to as L3.
• Apple PowerMac has 3 levels of cache in the CPU.
Cache Memory (2)
• Principle: spatial locality and temporal locality.
• Spatial locality: data that are physically close together are accessed close together in time.
• Temporal locality: recently accessed data will be accessed again in the near future.
• A program spends about 90% of its execution time in only 10% of its code, as in loops and function calls.
Cache Memory (3)
• L1 is smaller than L2, L2 is smaller than L3, L3 is smaller than RAM, and so forth.
• Speed (fastest to slowest)
– Registers (1/2 clock cycle)
– L1 (1 clock cycle)
– L2
– L3
– Main memory
– Hard disk
Memory Mapping
• Take a computer with 256 KB of cache and 256 MB of RAM.
– The ratio is 1 to 1024. Only one out of 1024 blocks of RAM can be held in one line of cache at any given time.
• The RAM is divided into blocks of a fixed number of bytes (say 64 bytes), and the cache is divided into lines of the same size (64 bytes).
Memory Mapping (2)
• 256 MB of RAM / 64 bytes = 4,194,304 blocks of RAM
• 256 KB of cache / 64 bytes = 4096 lines of cache
• 4,194,304 / 4096 = 1024
• How do we fit the larger number of blocks into the smaller number of cache lines? There are three ways to map blocks of memory to lines of cache: direct, fully associative, and set-associative mapping.
Direct memory mapping
• Direct mapping assigns each block to exactly one line, determined by modulo arithmetic: block b goes to line b mod 4096, so the 1024 blocks that lie 4096 blocks apart share one line.
• If a certain block of memory needs to be accessed, the predetermined line can be accessed to verify if that line contains the required block. To make this verifying process easy and fast, the memory address issued by the CPU is divided into three fields: tag, index, and offset.
• The tag and the index fields together make up the block address and the offset indicates the byte number in the block.
• There are many tutorials on the web. I suggest trying some.
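The modulo arithmetic behind direct mapping can be sketched in a few lines. The constants follow the running example (64-byte blocks, 4096 cache lines); the function name is hypothetical.

```python
BLOCK_SIZE = 64      # bytes per block, as in the running example
NUM_LINES = 4096     # 256 KB of cache / 64-byte lines


def cache_line_for(address: int) -> int:
    """Return the only cache line that may hold the block containing address."""
    block_number = address // BLOCK_SIZE
    return block_number % NUM_LINES


print(cache_line_for(0))                 # 0
print(cache_line_for(64 * 4096))         # 0, since block 4096 wraps around to line 0
print(cache_line_for(64 * 4097))         # 1
```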
Direct memory mapping (2)
• Addressing 256 MB (268,435,456 bytes) of RAM requires 28 bits: the highest address is 268,435,455, which is binary 1111,1111,1111,1111,1111,1111,1111.
• The 28-bit address issued by the CPU is separated into three fields. Let us work from right to left. The rightmost field, the offset, is 6 bits (2^6 is 64; recall that the block is 64 bytes).
Direct memory mapping (3)
• Total 28 bits, of which 6 are used for the offset.
• The remaining 22 bits are used for the block address, which can be further divided into tag and index. We have 4,194,304 blocks, or 2^22.
• The index tells us which cache line may have the information. Since there are 4096, or 2^12, cache lines, we need 12 bits for the index.
• The remaining 10 bits are used for the tag. This indicates which of the 1024 blocks is held by that line (as indicated by the index).
Example of Direct Mapped Cache
256 MB RAM, 256 KB cache, 64-byte block
Quantity                                                          Value
Number of bits needed to address 256 MB of RAM                    28 (2^28 = 268,435,456)
Number of blocks in 256 MB of RAM (64 bytes = 2^6 per block)      2^28 / 2^6 = 2^22 = 4,194,304
Number of bits required to address 4,194,304 memory blocks        22
Number of bits required to address each byte in a block           6
Number of lines of cache available for 256 KB, 64 bytes/line      2^18 / 2^6 = 2^12 = 4,096
Number of bits required to address the 4096 cache lines           12
Number of bits for tag                                            28 – (6 + 12) = 10
Number of blocks of memory represented by each line of cache      2^10 = 1024
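As a sketch of how the 28-bit address breaks down, the following splits an address into the 10-bit tag, 12-bit index, and 6-bit offset worked out above. The function and constant names are illustrative only.

```python
OFFSET_BITS = 6    # 2^6 = 64 bytes per block
INDEX_BITS = 12    # 2^12 = 4096 cache lines
TAG_BITS = 10      # 28 - (6 + 12)


def split_address(address: int) -> tuple[int, int, int]:
    """Split a 28-bit address into (tag, index, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


# Highest address in 256 MB: every field is all ones.
print(split_address(268_435_455))   # (1023, 4095, 63)
```

On a lookup, the index selects the line and the stored tag is compared with the tag field of the address; a match is a hit.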
Associative Caching
• Fully associative
– A block from memory may reside in any line in the cache.
– All tags are searched in parallel to find the desired block address.
• Set-associative: 2-way, 4-way, and 8-way
– A block can only be placed within a specified set.
– The Intel Pentium 4 implements an 8-way associative L2 cache.
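A rough sketch of set-associative placement follows. It assumes the 4096-line cache from the running example is organized 8-way, which is an assumption made here for illustration and not a description of any particular CPU.

```python
NUM_LINES = 4096               # lines in the example cache
WAYS = 8                       # assumed associativity for this sketch
NUM_SETS = NUM_LINES // WAYS   # 512 sets of 8 lines each


def set_for_block(block_number: int) -> int:
    """A block may live only in this set (chosen by modulo arithmetic)."""
    return block_number % NUM_SETS


def candidate_lines(block_number: int) -> list[int]:
    """All lines within the set; the block may occupy any one of them."""
    first = set_for_block(block_number) * WAYS
    return list(range(first, first + WAYS))


print(NUM_SETS)               # 512
print(candidate_lines(513))   # [8, 9, 10, 11, 12, 13, 14, 15]
```

With WAYS = 1 this reduces to direct mapping, and with WAYS = NUM_LINES it becomes fully associative.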
Replacement Policy
• Hit vs. miss: the miss penalty depends upon memory speed, bus speed, and bus width.
• A cache miss causes the CPU to stall and replace an existing block.
• In a direct-mapped cache the new block can only be placed in one pre-ordained line.
• In fully associative and set-associative caches, a decision has to be made as to which line of data to replace.
• Some algorithms that could be used are first-in, first-out (FIFO) and least recently used (LRU).
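A least-recently-used policy for one cache set might be sketched as below. This is a software illustration using Python's OrderedDict; real hardware tracks recency with a few status bits per line, and the class name is hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set that evicts the least recently used tag when full."""

    def __init__(self, ways: int) -> None:
        self.ways = ways
        self.lines: OrderedDict[int, None] = OrderedDict()  # oldest entry first

    def access(self, tag: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used
        self.lines[tag] = None
        return False


cache_set = LRUSet(ways=2)
print([cache_set.access(t) for t in (1, 2, 1, 3, 2)])
# [False, False, True, False, False]
```

In the run above, tag 2 is evicted when tag 3 arrives because tag 1 was touched more recently. Dropping the `move_to_end` call would turn the policy into FIFO.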
Write Policy
• Three options:
– update only the RAM, update only the cache (write back), or update both memory and cache (write through). Since reads are more important for CPU performance, updating RAM without updating the cache is simply not done.
• Each of the first two options leaves the other copy out of date.
• A dirty bit marks a cache line that must be written back to RAM before it is replaced.
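The role of the dirty bit can be shown with a toy model. The class and function names are invented for this sketch, and RAM is modeled as a plain dictionary.

```python
class WriteBackLine:
    """A single cache line with a dirty bit."""

    def __init__(self) -> None:
        self.data = None
        self.dirty = False


def write(line: WriteBackLine, value: int) -> None:
    """Update only the cache and remember that RAM is now stale."""
    line.data = value
    line.dirty = True


def evict(line: WriteBackLine, ram: dict, address: int) -> bool:
    """Write the line back to RAM only if it was modified; report whether we did."""
    wrote_back = line.dirty
    if line.dirty:
        ram[address] = line.data
        line.dirty = False
    return wrote_back


ram = {}
line = WriteBackLine()
write(line, 7)
print(ram)                      # {} because RAM has not been touched yet
print(evict(line, ram, 0x40))   # True, the dirty line is written back
print(ram)                      # {64: 7}
```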
Main Memory
• Memory speed has only increased by a factor of 2 over the past 15 years, whereas CPU speed has increased almost 1,000 times.
• It takes more than 60 clock cycles for the CPU to access the data from RAM.
• With increased cache availability, what is needed is high bandwidth for the RAM.
Main Memory (2)
• Dual Inline Package (DIP)
• Single Inline Pin Package (SIPP)
• Single Inline Memory Module (SIMM)
• Dual Inline Memory Module (DIMM)
– Synchronous Dynamic RAM (SDRAM)
• Double Data Rate SDRAM (DDR SDRAM)
• Parity explained.
Main Memory (3)
• Overlays
• Virtual memory
• Paging
Virtual Memory
• Handled by the memory manager of the OS
• Used when program memory requirements exceed actual memory.
• The required space is divided into pages and stored on the hard disk drive.
• Pages are brought in as needed, replacing another page that is presumed to be no longer required.
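To see pages being brought in and replaced, here is a minimal sketch that counts page faults. It assumes first-in, first-out replacement, which is only one possible policy and is not specified on the slide; the function name is hypothetical.

```python
from collections import deque


def count_page_faults(references: list[int], num_frames: int) -> int:
    """Count how often a referenced page must be fetched from disk (FIFO policy)."""
    frames: deque[int] = deque()    # pages currently in RAM, oldest first
    faults = 0
    for page in references:
        if page in frames:
            continue                # page already in memory
        faults += 1                 # page fault: bring the page in from disk
        if len(frames) == num_frames:
            frames.popleft()        # replace the page that has been resident longest
        frames.append(page)
    return faults


print(count_page_faults([1, 2, 3, 1, 4, 1], num_frames=3))   # 5
```

In this run page 1 is evicted to make room for page 4 and then needed again right away, which shows why the choice of replacement policy matters.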
Secondary Storage
• Floppy Drives
• Hard Drives
– MFM, RLL, ESDI, IDE, EIDE, SCSI, FireWire, ATA.
– RAID levels
• Removable Disks
IEEE 1394 FireWire
• High performance serial bus
• Fast
• Low cost
• Easy to implement
• Also being used in digital cameras, VCRs, and TVs
SCSI (small computer system interface)
• SCSI-1 was standardized by ANSI in 1986. Parallel Interface.
• The industry then agreed on a minimum set of 18 basic commands, called the Common Command Set (CCS). CCS became the basis for SCSI-2.
• SCSI-2 also provided extra speed with options called Fast SCSI and a 16-bit version called Wide SCSI.
SCSI
• This also means that SCSI-1 adapters will work with SCSI-2 hardware. SCSI-1 and SCSI-2 compliant hardware is the same.
• Fast SCSI delivers a 10 MB/sec transfer rate. When combined with the 16-bit bus, this doubles to 20 MB/sec. This is called Fast-Wide SCSI.
• Ultra-Wide SCSI also uses the 16-bit bus, and the speed rises to 40 MB/sec.
SCSI
• Each device must have a SCSI ID, 0-7. (Newer ones 0-15)
• The host adapter takes one ID. Most are usually factory-set to ID 7, which is the highest-priority ID.
• Besides configuring the proper ID, proper termination must be ensured: install termination at each end of the bus.
FireWire Configuration
• Daisy chain
• Up to 63 devices on a single port
– Really 64, of which one is the interface itself
• Up to 1022 buses can be connected with bridges
• Automatic configuration
• No bus terminators
• May be a tree structure
Figure: Simple FireWire Configuration
FireWire 3 Layer Stack
• Physical
– Transmission medium, electrical and signaling characteristics
• Link
– Transmission of data in packets
• Transaction
– Request-response protocol
Figure: FireWire Protocol Stack
FireWire - Physical Layer
• Data rates from 25 to 400 Mbps
• Two forms of arbitration
– Based on tree structure
– Root acts as arbiter
– First come, first served
– Natural priority controls simultaneous requests (i.e., who is nearest to root)
– Fair arbitration
– Urgent arbitration
FireWire - Link Layer
• Two transmission types
– Asynchronous
  • Variable amount of data and several bytes of transaction data transferred as a packet
  • To explicit address
  • Acknowledgement returned
– Isochronous
  • Variable amount of data in sequence of fixed-size packets at regular intervals
  • Simplified addressing
  • No acknowledgement
RAID
• Redundant Array of Independent Disks
• Redundant Array of Inexpensive Disks
• 6 levels in common use
• Not a hierarchy
• Set of physical disks viewed as single logical drive by the O/S
• Data distributed across physical drives
• Can use redundant capacity to store parity information
RAID 0
• No redundancy
• Data striped across all disks
• Round-robin striping
• Increases speed
– Multiple data requests probably not on same disk
– Disks seek in parallel
– A set of data is likely to be striped across multiple disks
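Round-robin striping can be sketched as a simple mapping from a logical strip number to a disk and a position on that disk. The function name and the 4-disk array are assumptions for illustration.

```python
def locate_strip(strip_number: int, num_disks: int) -> tuple[int, int]:
    """Return (disk, position on that disk) for a logical strip under round-robin striping."""
    return strip_number % num_disks, strip_number // num_disks


# Consecutive strips land on different disks, so they can be read in parallel.
for strip in range(6):
    print(strip, locate_strip(strip, num_disks=4))
```

Strips 0 to 3 go to disks 0 to 3 at position 0, and strip 4 wraps around to disk 0 at position 1.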
RAID 1
• Mirrored disks
• Data is striped across disks
• 2 copies of each stripe on separate disks
• Read from either
• Write to both
• Recovery is simple
– Swap faulty disk and re-mirror
– No down time
• Expensive
RAID 2
• Disks are synchronized
• Very small stripes
– Often single byte/word
• Error correction calculated across corresponding bits on disks
• Multiple parity disks store Hamming code error correction in corresponding positions
• Lots of redundancy
– Expensive
– Not used
RAID 3
• Similar to RAID 2
• Only one redundant disk, no matter how large the array
• Simple parity bit for each set of corresponding bits
• Data on failed drive can be reconstructed from surviving data and parity info
• Very high transfer rates
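Parity-based reconstruction relies on exclusive-OR: the parity strip is the XOR of the data strips, so XOR-ing the survivors with the parity gives back the missing strip. A minimal sketch, with made-up data and helper names:

```python
from functools import reduce


def parity(strips: list[bytes]) -> bytes:
    """XOR corresponding bytes of equal-length strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def reconstruct(surviving_strips: list[bytes], parity_strip: bytes) -> bytes:
    """Rebuild the one missing data strip from the survivors and the parity."""
    return parity(surviving_strips + [parity_strip])


data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]   # three data disks
p = parity(data)                                  # contents of the parity disk

# Pretend disk 1 failed and rebuild it from disks 0 and 2 plus parity.
print(reconstruct([data[0], data[2]], p) == data[1])   # True
```

The same idea carries over to the parity used in RAID 4 and RAID 5.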
RAID 4
• Each disk operates independently
• Good for high I/O request rate
• Large stripes
• Bit by bit parity calculated across stripes on each disk
• Parity stored on parity disk
RAID 5
• Like RAID 4
• Parity striped across all disks
• Round robin allocation for parity stripe
• Avoids RAID 4 bottleneck at parity disk
• Commonly used in network servers
• N.B. DOES NOT MEAN 5 DISKS!!!!!
RAID 6
• Two parity calculations
• Stored in separate blocks on different disks
• User requirement of N disks needs N+2
• High data availability
– Three disks need to fail for data loss
– Significant write penalty
Input/Output Interfaces
• There is a wide variety of devices with differing methods of operation, and their data formats and block transfer modes differ.
• The I/O module has a processor on it that handles all the operations of the attached devices, satisfies requests of the CPU, and may have local memory for buffering the I/O stream.
Bus transfer rate
• Bus transfer rate or Throughput in MBps = Speed of the bus in MHz * (Width of data transferred in bytes/ Cycles to transfer these bytes)
Here is a chart showing some calculations.
Bus Type   Bus Speed (MHz)   Data Width (bytes)   Cycles to Transfer   Calculation    Throughput (MB/sec)
ISA        8.33              2                    2                    8.33*(2/2)     8.33
EISA       8.33              4                    1                    8.33*(4/1)     33.32
MCA        10.0              4                    1                    10.0*(4/1)     40
VESA       33                4                    1                    33*(4/1)       132
SUN        50                4                    1                    50*(4/1)       200
PCI        33                8                    1                    33*(8/1)       264
AGP        66                8                    1                    66*(8/1)       528
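The throughput formula can be checked against the chart with a few lines of code. This is a sketch; the function name is chosen here, and the bus figures are the ones listed in the chart.

```python
def bus_throughput_mbps(bus_mhz: float, width_bytes: int, cycles: int) -> float:
    """Throughput in MB/sec = bus speed in MHz * (bytes per transfer / cycles per transfer)."""
    return bus_mhz * (width_bytes / cycles)


buses = {
    "ISA": (8.33, 2, 2),
    "EISA": (8.33, 4, 1),
    "PCI": (33, 8, 1),
    "AGP": (66, 8, 1),
}
for name, (mhz, width, cycles) in buses.items():
    print(name, round(bus_throughput_mbps(mhz, width, cycles), 2))
```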
Computer Case
• Case
– The design of the case determines the number of bays for removable and non-removable drives, and the type of motherboard.
– Rack-mount systems do not require individual cases.
– Fans
Power Supply
• Voltage and wattage
• Converts AC to DC and steps down the voltage from 120 or 220 volts to 12, 5, and 3.3 volts.
• Converts AC to DC using full-wave rectifiers and filters.
– Rectifiers convert AC to DC by using power diodes or by controlling the firing angles of thyristors.
– At the component level a rectifier uses diodes and capacitors.
Power Supply (2)
• Step-down transformer
– First converts the 60 Hz AC to a much higher frequency in order to make the stepping down much easier and more accurate.
– Such a power supply is called a switching power supply and provides a more uniform voltage to the computer.
– Voltage is stepped up or down by varying the number of secondary windings in the transformer.
Power Supply (3)
• Wattage
– The wattage rating of the power supply should be sufficient to handle all components plugged into it.
– For example, drives generally consume 5 to 15 watts, a motherboard 30 watts, a CPU 50 watts, and an AGP video card 30 watts. These specifications should be obtained from the manufacturers of the components.
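A quick way to size a power supply is to add up the component loads. The sketch below uses the example wattages from the slide; the particular mix of two drives and the 30% headroom factor are assumptions for illustration, not a recommendation from the lecture.

```python
def required_watts(loads: dict[str, float], headroom: float = 1.3) -> float:
    """Sum the component loads and multiply by a safety margin."""
    return sum(loads.values()) * headroom


components = {
    "hard drive": 15,       # upper end of the 5 to 15 watt range
    "optical drive": 15,
    "motherboard": 30,
    "CPU": 50,
    "AGP video card": 30,
}
print(sum(components.values()))          # 140 watts of raw load
print(required_watts(components))        # raw load with the assumed margin applied
```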
The following slides are from Dale, your textbook
Memory
Memory
A collection of cells, each with a unique physical address; both addresses and contents are in binary
Arithmetic/Logic Unit
Performs basic arithmetic operations such as adding
Performs logical operations such as AND, OR, and NOT
Most modern ALUs have a small number of special storage units called registers
Input/Output Units
Input unit
A device through which data and programs from the outside world are entered into the computer
Can you name three?
Output unit
A device through which results stored in the computer memory are made available to the outside world
Can you name two?
Control Unit
Control unit
The organizing force in the computer
Instruction register (IR)
Contains the instruction that is being executed
Program counter (PC)
Contains the address of the next instruction to be executed
Central Processing Unit (CPU)
The ALU and the control unit together are called the central processing unit, or CPU
Flow of Information
Bus
A set of wires that connects all major sections
Figure 5.2 Data flow through a von Neumann architecture
The Fetch-Execute Cycle
Fetch the next instruction
Decode the instruction
Get data if needed
Execute the instruction
Why is it called a cycle?
The Fetch-Execute Cycle
Figure 5.3 The Fetch-Execute Cycle
RAM and ROM
Random Access Memory (RAM)
Memory in which each location can be accessed and changed
Read Only Memory (ROM)
Memory in which each location can be accessed but not changed
RAM is volatile, ROM is not
What does volatile mean?
Secondary Storage Devices
Why is it necessary to have secondary storage devices?
Can you name some of these devices?
Magnetic Tape
The first truly mass auxiliary storage device was the magnetic tape drive
Tape drives have a major problem; can you describe it?
Figure 5.4 A magnetic tape
Magnetic Disks
Figure 5.5 The organization of a magnetic disk
Magnetic Disks
History
Floppy disks (Why "floppy"?)
1970: 8" in diameter; late 1970s: 5 1/4"; now: 3 1/2"
Zip drives
Magnetic Disks
Seek time
Time it takes for read/write head to be over right track
Latency
Time it takes for sector to be in position
Access time
Seek time plus latency
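As a rough numerical illustration, access time is commonly taken as seek time plus latency, and the average rotational latency is half of one revolution. The function names and the 9 ms seek time and 7200 RPM drive below are assumed example values, not figures from the slides.

```python
def average_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time taken as seek time plus average rotational latency."""
    return seek_ms + average_latency_ms(rpm)


print(round(average_latency_ms(7200), 2))      # 4.17
print(round(access_time_ms(9.0, 7200), 2))     # 13.17
```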
Compact Disks
CD
A compact disk that uses a laser to read information stored optically on a plastic disk; data is evenly distributed around track
CD-ROM: read-only memory
CD-DA: digital audio
CD-WORM: write once, read many
RW or RAM: both read from and written to
DVD
Digital Versatile Disk, used for storing audio and video
Flash Drives
Flash memory
Nonvolatile
Can be erased and rewritten
Touch Screens
Touch screen
A computer monitor that can respond to the user touching the screen with a stylus or finger
There are four types
– Resistive
– Capacitive
– Infrared
– Surface acoustic wave (SAW)
Touch Screens
Figure 5.7 A touch screen (Randy Allbritton/Photodisc/Getty Images © 2003)
Touch Screens
Resistive touch screen A screen made up of two layers of electrically conductive material
– One layer has vertical lines, the other has horizontal lines
– When the top layer is pressed, it comes in contact with the second layer which allows electrical current to flow
– The specific vertical and horizontal lines that make contact dictate the location on the screen that was touched
Touch Screens
Capacitive touch screen
A screen made up of a laminate applied over a glass screen
– Laminate conducts electricity in all directions; a very small current is applied equally on the four corners
– When the screen is touched, current flows to the finger or stylus
– The location of the touch on the screen is determined by comparing how strong the flow of electricity is from each corner
Touch Screens
Infrared touch screen
A screen with crisscrossing horizontal and vertical beams of infrared light
– Sensors on opposite sides of the screen detect the beams
– When the user breaks the beams by touching the screen, the location of the break can be determined
Touch Screens
Surface acoustic wave (SAW)
A screen with crisscrossing high frequency sound waves across the horizontal and vertical axes
– When a finger touches the surface, corresponding sensors detect the interruption and determine location of the touch
Synchronous processing
One approach to parallelism is to have multiple processors apply the same program to multiple data sets
Figure 5.8 Processors in a synchronous computing environment
Pipelining
Arranges processors in tandem, where each processor contributes one part to an overall computation
Figure 5.9 Processors in a pipeline
Shared Memory Parallel Processor
Communicate through shared memory
Figure 5.10 Shared memory configuration of processors
Embedded Systems
Embedded systems
Computers that are dedicated to perform
a narrow range of functions as part of a
larger system
Empty your pockets or backpacks.
How many embedded systems do you have?