advanced concepts related to computer parts and types

7/31/2019 Advanced Concepts Related to Computer Parts and Types

1/14

Introduction to Computer Administration

Week-2

Advanced Concepts related to Computer Parts and Types

1. System Clock

2. MPIS / TFLOPS

3. Cache

4. DMA (Direct Memory Access)

5. Pipeline

6. SMP (symmetric multiprocessing)

7. Single-Point-Of-Failure

System Clock ( Clock Rate , MHz , GHz )

The clock rate is the fundamental rate in cycles per second (measured

in hertz) for the frequency of the clock in any synchronous circuit. For

example, a crystal oscillator frequency reference typically is

synonymous with a fixed sinusoidal waveform, a clock rate is that

frequency reference translated by electronic circuitry into a

corresponding square wave pulse [typically] for digital electronicsapplications. In this context the use of the word, speed (physical

movement), should not be confused with frequency or its corresponding

clock rate. Thus, the term "clock speed" is a misnomer.

CPU manufacturers typically charge premium prices for CPUs that

operate at higher clock rates. For a given CPU, the clock rates are

determined at the end of the manufacturing process through actualtesting of each CPU. CPUs that are tested as complying with a given set

of standards may be labeled with a higher clock rate, e.g., 1.50 GHz,

while those that fail the standards of the higher clock rate yet pass the


2/14

standards of a lesser clock rate may be labeled with the lesser clock

rate, e.g., 1.33 GHz, and sold at a relatively lower price

Limits to clock rate

The clock rate of a CPU is normally determined by the frequency of an

oscillator crystal. The first commercial PC, the Altair 8800 (by MITS),

used an Intel 8080 CPU with a clock rate of 2 MHz (2 million

cycles/second). The original IBM PC (c. 1981) had a clock rate of 4.77

MHz (4,772,727 cycles/second). In 1995, Intel's Pentium chip ran at

100 MHz (100 million cycles/second), and in 2002, an Intel Pentium 4

model was introduced as the first CPU with a clock rate of 3 GHz (three

billion cycles/second corresponding to ~3.3 10-10seconds per cycle).

With any particular CPU, replacing the crystal with another crystal that

oscillates half the frequency ("underclocking") will generally make the

CPU run at half the performance. It will also make the CPU produce

roughly half as much waste heat.

The clock rate of a computer is only useful for providing

comparisons between computer chips in the same processor

family. An IBM PC with an Intel 80486 CPU running at 50 MHz will be

about twice as fast as one with the same CPU, memory and display

running at 25 MHz, while the same will not be true for MIPS R4000

running at the same clock rate as the two are different processors withdifferent functionality. Furthermore, there are many other factors to

consider when comparing the performance of entire computers, like the

clock rate of the computer's front-side bus (FSB), the clock rate of the

RAM, the width in bits of the CPU's bus and the amount of Level 1, Level


3/14

2 and Level 3 cache. Further, in many cases a computer's performance

depends on factors outside of the CPU, such as the speed of access to

storage devices such as hard drives.

Clock rates should not be used when comparing different

computers or different processor families. Rather, some software

benchmark should be used. Clock rates can be very misleading since

the amount of work different computer chips can do in one cycle varies.

For example, RISC CPUs tend to have simpler instructions than CISC

CPUs (but higher clock rates), and superscalar processors can execute

more than one instruction per cycle (on average), yet it is not

uncommon for them to do "less" in a clock cycle. In addition, subscalar

CPUs or use of parallelism can also affect the quality of the computer

regardless of clock rate.

Clock rates:

1 Hertz = 1 Cycle Per Second

1 KHz = 1024 Cycles Per Second1 MHz = 106 Cycle Per Second

1 GHz = 109 Cycle Per Second

1 THz = 1012 Cycle Per Second

MIPS

Processor IPS IPS/MHz Year Source

Intel 486DX54 MIPS at66 MHz

0.818 MIPS/MHz 1992

DEC Alpha 21064 EV4300 MIPS at

150 MHz2 MIPS/MHz 1992 [4]

Motorola 68060 88 MIPS at 1.33 MIPS/MHz 1994
http://en.wikipedia.org/wiki/Intel_80486http://en.wikipedia.org/wiki/Alpha_21064http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-3%23cite_note-3http://en.wikipedia.org/wiki/Motorola_68060http://en.wikipedia.org/wiki/Instructions_per_secondhttp://en.wikipedia.org/wiki/Instructions_per_secondhttp://en.wikipedia.org/wiki/Instructions_per_secondhttp://en.wikipedia.org/wiki/Instructions_per_secondhttp://en.wikipedia.org/wiki/Instructions_per_secondhttp://en.wikipedia.org/wiki/Intel_80486http://en.wikipedia.org/wiki/Alpha_21064http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-3%23cite_note-3http://en.wikipedia.org/wiki/Motorola_68060


4/14

66 MHz

Intel Pentium Pro541 MIPS at

200 MHz2.705 MIPS/MHz 1996 [5]

ARM 7500FE35.9 MIPS at

40 MHz0.897 MIPS/MHz 1996

PowerPC G3 525 MIPS at233 MHz

2.253 MIPS/MHz 1997

Zilog eZ8080 MIPS at

50 MHz1.6 MIPS/MHz 1999 [6]

Intel Pentium III1,354 MIPS at500 MHz

2.708 MIPS/MHz 1999

Freescale MPC8272760 MIPS at

400 MHz1.9 MIPS/MHz 2000

[7]Integrated

CommunicationsProcessors

AMD Athlon3,561 MIPS at

1.2 GHz2.967 MIPS/MHz 2000

AMD Athlon XP 2400+5,935 MIPS at2.0 GHz

2.967 MIPS/MHz 2002

Pentium 4 Extreme Edition9,726 MIPS at

3.2 GHz3.039 MIPS/MHz 2003

ARM Cortex A82,000 MIPS at1.0 GHz

2.0 MIPS/MHz 2005 [8]

AMD Athlon FX-5712,000 MIPS

at 2.8 GHz4.285 MIPS/MHz 2005

AMD Athlon 64 3800+ X2

(Dual Core)

14,564 MIPS

at 2.0 GHz7.282 MIPS/MHz 2005 [9]

Xbox360 IBM "Xenon"Triple Core

19,200 MIPSat 3.2 GHz

2.0 MIPS/MHz 2005

PS3 Cell BE (PPE only)10,240 MIPS

at 3.2 GHz3.2 MIPS/MHz 2006

AMD Athlon FX-60 (Dual

Core)

18,938 MIPS

at 2.6 GHz7.283 MIPS/MHz 2006 [9]

Intel Core 2 Extreme X680027,079 MIPS

at 2.93 GHz9.242 MIPS/MHz 2006 [9]

Intel Core 2 ExtremeQX6700


18.481MIPS/MHz

2006 [10]

P.A. Semi PA6T-1682M8,800 MIPS at

2.0 GHz 4.4 MIPS/MHz 2007[11]

Intel Core 2 ExtremeQX9770


18.580MIPS/MHz

2008 [12]

Intel Core i7 Extreme

965EE

76,383 MIPS

at 3.2 GHz

23.860

MIPS/MHz2008 [13]

AMD Phenom II X4 940

Black Ed ition

42,820 MIPS

at 3.0 GHz

14.273

MIPS/MHz2009 [14]
http://en.wikipedia.org/wiki/Intel_Pentium_Prohttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-4%23cite_note-4http://en.wikipedia.org/wiki/Acorn_Network_Computerhttp://en.wikipedia.org/wiki/PowerPC_G3http://en.wikipedia.org/wiki/Zilog_eZ80http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-5%23cite_note-5http://en.wikipedia.org/wiki/Intel_Pentium_IIIhttp://en.wikipedia.org/wiki/PowerQUICChttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-6%23cite_note-6http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-6%23cite_note-6http://en.wikipedia.org/wiki/AMD_K7http://en.wikipedia.org/wiki/AMD_K7http://en.wikipedia.org/wiki/Pentium_4http://en.wikipedia.org/wiki/ARM_architecturehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-7%23cite_note-7http://en.wikipedia.org/wiki/Athlon_64http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Xenon_CPUhttp://en.wikipedia.org/wiki/Xenon_CPUhttp://en.wikipedia.org/wiki/Cell_(microprocessor)http://en.wikipedia.org/wiki/Cell_(microprocessor)#Power_Processor_Elementhttp://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Intel_Core_2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-9%23cite_note-9http://en.wikipedia.org/wiki/PWRficienthttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-10%23cite_note-10http://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-11%23cite_note-11http://en.wikipedia.org/wiki/Intel_Core_i7http://en.wikipedia.org/wiki/Intel_Core_i7http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-12%23cite_note-12http://en.wikipedia.org/wiki/Phenom_(processor)http://en.wikipedia.org/wiki/Phenom_(processor)http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-13%23cite_note-13http://en.wikipedia.org/wiki/Intel_Pentium_Prohttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-4%23cite_note-4http://en.wikipedia.org/wiki/Acorn_Network_Computerhttp://en.wikipedia.org/wiki/PowerPC_G3http://en.wikipedia.org/wiki/Zilog_eZ80http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-5%23cite_note-5http://en.wikipedia.org/wiki/Intel_Pentium_IIIhttp://en.wikipedia.org/wiki/PowerQUICChttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-6%23cite_note-6http://en.wikipedia.org/wiki/AMD_K7http://en.wikipedia.org/wiki/AMD_K7http://en.wikipedia.org/wiki/Pentium_4http://en.wikipedia.org/wiki/ARM_architecturehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-7%23cite_note-7http://en.wikipedia.org/wiki/Athlon_64http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Xenon_CPUhttp://en.wikipedia.org/wiki/Xenon_CPUhttp://en.wikipedia.org/wiki/Cell_(microprocessor)http://en.wikipedia.org/wiki/Cell_(microprocessor)#Power_Processor_Elementhttp://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Athlon_64_X2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Intel_Core_2http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-autogenerated1-8%23cite_note-autogenerated1-8http://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-9%23cite_note-9http://en.wikipedia.org/wiki/PWRficienthttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-10%23cite_note-10http://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Intel_Core_2_Extremehttp://en.wikipedia.org/wiki/Instructions_per_second#cite_note-11%23cite_note-11http://en.wikipedia.org/wiki/Intel_Core_i7http://en.wikipedia.org/wiki/Intel_Core_i7http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-12%23cite_note-12http://en.wikipedia.org/wiki/Phenom_(processor)http://en.wikipedia.org/wiki/Phenom_(processor)http://en.wikipedia.org/wiki/Instructions_per_second#cite_note-13%23cite_note-13


5/14

TFLOPS (1012 FLoating point Operations Per Second)

In computing, FLOPS (or flops or flop/s) is an acronym meaning

FLoating point Operations Per Second. The FLOPS is a measure of a

computer's performance, especially in fields of scientific calculations that

make heavy use of floating point calculations, similar to the older,

simpler, instructions per second.

Computer Performance

Name FLOPS

yottaFLOPS 1024

zettaFLOPS 1021

exaFLOPS 1018

petaFLOPS 1015

teraFLOPS 1012

gigaFLOPS 109

megaFLOPS 106

kiloFLOPS 103
http://en.wikipedia.org/wiki/Yotta-http://en.wikipedia.org/wiki/Zetta-http://en.wikipedia.org/wiki/Exa-http://en.wikipedia.org/wiki/Peta-http://en.wikipedia.org/wiki/Tera-http://en.wikipedia.org/wiki/Giga-http://en.wikipedia.org/wiki/Mega-http://en.wikipedia.org/wiki/Kilo-http://en.wikipedia.org/wiki/Yotta-http://en.wikipedia.org/wiki/Zetta-http://en.wikipedia.org/wiki/Exa-http://en.wikipedia.org/wiki/Peta-http://en.wikipedia.org/wiki/Tera-http://en.wikipedia.org/wiki/Giga-http://en.wikipedia.org/wiki/Mega-http://en.wikipedia.org/wiki/Kilo-


6/14

Cache Memory

(Key words: Cache Hit, Cache Miss, Hit Rate , Latency , Cache Types)

A CPU cache is a cache used by the central processing unit of a

computer to reduce the average time to access memory. The cache is a

smaller, faster memory which stores copies of the data from the most

frequently used main memory locations. As long as most memory

accesses are cached memory locations, the average latency of memory

accesses will be closer to the cache latency than to the latency of main

memory.

When the processor needs to read from or write to a location in main

memory, it first checks whether a copy of that data is in the cache. If

so, the processor immediately reads from or writes to the cache, which

is much faster than reading from or writing to main memory.

The diagram on the right (above) shows two memories. Each location in

each memory has a datum (a cache line), which in different designs

ranges in size from 8 to 512 bytes. The size of the cache line is usually

larger than the size of the usual access requested by a CPU instruction,which ranges from 1 to 16 bytes. Each location in each memory also has

an index, which is a unique number used to refer to that location. The

index for a location in main memory is called an address. Each location

in the cache has a tag that contains the index of the datum in main
http://en.wikipedia.org/wiki/File:Cache,basic.svg


7/14

memory that has been cached. In a CPU's data cache these entries are

called cache lines or cache blocks.

Most modern desktop and server CPUs have at least three independent

caches: an instruction cache to speed up executable instruction fetch,

a data cache to speed up data fetch and store, and a translation

lookaside buffer used to speed up virtual-to-physical address

translation for both executable instructions and data.

When the processor needs to read or write a location in main memory,

it first checks whether that memory location is in the cache. This is

accomplished by comparing the address of the memory location to all

tags in the cache that might contain that address. If the processor finds

that the memory location is in the cache, we say that a cache hit has

occurred; otherwise, we speak of a cache miss. In the case of a cache

hit, the processor immediately reads or writes the data in the cache

line. The proportion of accesses that result in a cache hit is known as

the hit rate, and is a measure of the effectiveness of the cache.

In the case of a cache miss, most caches allocate a new entry, which

comprises the tag just missed and a copy of the data from memory. The

reference can then be applied to the new entry just as in the case of a

hit. Misses are comparatively slow because they require the data to be

transferred from main memory. This transfer incurs a delay since main

memory is much slower than cache memory, and also incurs theoverhead for recording the new data in the cache before it is delivered

to the processor.


8/14

DMA (Direct Memory Access)

Direct memory access (DMA) is a feature of modern computers and

microprocessors that allows certain hardware subsystems within the

computer to access system memory for reading and/or writing

independently of the central processing unit. Many hardware systems

use DMA including disk drive controllers, graphics cards, network cards

and sound cards. DMA is also used for intra-chip data transfer in multi-

core processors, especially in multiprocessor system-on-chips, where its

processing element is equipped with a local memory (often called

scratchpad memory) and DMA is used for transferring data between the

local memory and the main memory. Computers that have DMA

channels can transfer data to and from devices with much less CPU

overhead than computers without a DMA channel. Similarly a processing

element inside a multi-core processor can transfer data to and from its

local memory without occupying its processor time and allowing

computation and data transfer concurrency.

Without DMA, using programmed input/output (PIO) mode for

communication with peripheral devices, or load/store instructions in the

case of multicore chips, the CPU is typically fully occupied for the entire

duration of the read or write operation, and is thus unavailable to

perform other work. With DMA, the CPU would initiate the transfer, do

other operations while the transfer is in progress, and receive an

interrupt from the DMA controller once the operation has been done.This is especially useful in real-time computing applications where not

stalling behind concurrent operations is critical. Another and related

application area is various forms of stream processing where it is


9/14

essential to have data processing and transfer in parallel, in order to

achieve sufficient throughput.

A typical usage of DMA is copying a block of memory from system RAM

to or from a buffer on the device. Such an operation does not stall the

processor, which as a result can be scheduled to perform other tasks


10/14

Pipeline

An instruction pipeline is a technique used in the design of computersand other digital electronic devices to increase their instruction

throughput (the number of instructions that can be executed in aunit of time).

The fundamental idea is to split the processing of a computer instruction

into a series of independent steps, with storage at the end of eachstep. This allows the computer's control circuitry to issue

instructions at the processing rate of the slowest step, which ismuch faster than the time needed to perform all steps at once.

The term pipeline refers to the fact that each step is carrying data

at once (like water), and each step is connected to the next (likethe links of a pipe.)

Generic pipeline

Generic 4-stage pipeline; the colored boxes represent instructions independent of each

other

To the right is a generic pipeline with four stages:

1. Fetch2. Decode

3. Execute

4. Write-back

(for lw and sw memory is accessed after execute stage)
http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg


11/14

The top gray box is the list of instructions waiting to be executed; the bottom gray box isthe list of instructions that have been completed; and the middle white box is the pipeline.

Execution is as follows:

Time

Clock

Tick

Execution

0 Four instructions are awaiting to be executed

1 the green instruction is fetched from memory

2

the green instruction is decoded

the purple instruction is fetched from memory

3 the green instruction is executed (actual operation is performed)

the purple instruction is decoded
http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg


12/14

the blue instruction is fetched

4

the green instruction's results are written back to the register file ormemory

the purple instruction is executed

the blue instruction is decoded

the red instruction is fetched

5

the green instruction is completed

the purple instruction is written back

the blue instruction is executed

the red instruction is decoded

6

The purple instruction is completed

the blue instruction is written back

the red instruction is executed

7

the blue instruction is completed

the red instruction is written back

8 the red instruction is completed

9 All instructions are executed


13/14

SMP (symmetric multiprocessing)

In computing, symmetric multiprocessing or SMP involves a

multiprocessor computer architecture where two or more identical

processors can connect to a single shared main memory. Most common

multiprocessor systems today use an SMP architecture. In the case of

multi-core processors, the SMP architecture applies to the cores,

treating them as separate processors.

SMP systems allow any processor to work on any task no matter where

the data for that task are located in memory; with proper operating

system support, SMP systems can easily move tasks between

processors to balance the workload efficiently.

Single Point of Failure

A Single Point of Failure, (SPOF), is a part of a system which, if it fails,

will stop the entire system from working [1]. They are undesirable inany system whose goal is high availability, be it a network, software

application or other industrial system.

The assessment of a potentially single location of failure identifies the

critical components of a complex system that would provoke a total

systems failure in case of malfunction. Highly reliable systems may not

rely on any such individual component.


14/14

The strategy to prevent total system failure is

Reduced Complexity

Complex systems shall be designed according to principles decomposing

complexity to the required level.

Redundancy

Redundant systems include a double instance for any critical component

with an automatic and robust switch or handle to turn control over to

the other well functioning unit (failover)

Diversity

Diversity design is a special redundancy concept that cares for the

doubling of functionality in completely different design setups of

components to decrease the probability that redundant components

might fail both at the same time under identical conditions.

Transparency

Whatever systems design will deliver, long term reliability is based on

transparent and comprehensive documentation.
http://upload.wikimedia.org/wikipedia/commons/f/f6/SPOF.png

advanced concepts related to computer parts and types

Documents