![Page 1: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/1.jpg)
ETM 555 1
ETM 555
Supplementary Lecture Notes
Version 5. / 2011
Contents:
Part 1: Hardware/Software Systems, Grid / Cloud Computing
Part 1: Hardware/Software Systems, Grid Computing
• Hardware
• Parallel/Distributed Processing
• High Performance Computing
• Top 500 list
• Grid computing
[Picture: ASCI White, the most powerful computer in the world (2001)]
Von Neumann Architecture
• sequential computer
[Diagram: CPU, RAM, and devices connected by a BUS]
History of Computer Architecture
• 4 Generations (identified by logic technology)
1. Tubes
2. Transistors
3. Integrated Circuits
4. VLSI (very large scale integration)
PERFORMANCE TRENDS
• Traditional mainframe/supercomputer performance: 25% increase per year
• But … microprocessor performance: 50% increase per year since the mid 80’s.
Moore’s Law
• “Transistor density doubles every 18 months”
• Moore is co-founder of Intel.
• Corresponds to roughly 60% increase per year
• Exponential growth
• PC costs decline.
• PCs are the building blocks of all future systems.
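The link between an 18-month doubling period and a yearly growth rate can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `annual_growth` and the ten-year horizon are illustrative assumptions, while the 25% and 50% rates are the ones quoted under Performance Trends.

```python
# Annual growth factor implied by a doubling period given in months.
def annual_growth(doubling_months: float) -> float:
    return 2.0 ** (12.0 / doubling_months)

moore = annual_growth(18)        # growth factor per year for an 18-month doubling

# Compound growth over ten years (horizon chosen only for illustration).
decade_mainframe = 1.25 ** 10    # 25% per year
decade_micro = 1.50 ** 10        # 50% per year
decade_moore = moore ** 10       # transistor density

print(f"18-month doubling = {(moore - 1) * 100:.1f}% per year")
print(f"10-year factors: {decade_mainframe:.1f}x, {decade_micro:.1f}x, {decade_moore:.1f}x")
```

The yearly rate comes out slightly below 60%, which is why the rounded figure is commonly quoted.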
VLSI Generation
Bit Level Parallelism (up to mid 80’s)
• 4-bit microprocessors were replaced by 8-bit, 16-bit, 32-bit, etc.
• Doubling the width of the datapath reduces the number of cycles required to perform a full 32-bit operation
• By the mid 80’s the benefits of this kind of parallelism had been reaped (full 32-bit word operations combined with the use of caches)
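The effect of datapath width can be sketched under a deliberately simple assumption: one pass through the datapath handles exactly `datapath_bits` bits, and passes do not overlap. The function name `passes_for_operation` is hypothetical and real processors differ in detail.

```python
import math

def passes_for_operation(operand_bits: int, datapath_bits: int) -> int:
    """Number of datapath passes needed for one full-width operation (simplified model)."""
    return math.ceil(operand_bits / datapath_bits)

for width in (4, 8, 16, 32):
    print(f"{width:2d}-bit datapath: {passes_for_operation(32, width)} pass(es) for a 32-bit operation")
```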
Instruction Level Parallelism (mid 80’s to mid 90’s)
• Basic steps in instruction processing (instruction decode, integer arithmetic, address calculations) could be performed in a single cycle
• Pipelined instruction processing
• Reduced instruction set (RISC)
• Superscalar execution
• Branch prediction
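The benefit of pipelined instruction processing can be illustrated with the usual idealized model. The assumptions are mine, not the slides': k equal-length stages, one instruction entering per cycle, and no stalls or mispredicted branches. The function names are hypothetical.

```python
def unpipelined_cycles(n_instructions: int, stages: int) -> int:
    # Each instruction passes through all stages before the next one starts.
    return n_instructions * stages

def pipelined_cycles(n_instructions: int, stages: int) -> int:
    # The first instruction needs `stages` cycles; each later one completes one cycle after.
    return stages + n_instructions - 1

n, k = 100, 5
speedup = unpipelined_cycles(n, k) / pipelined_cycles(n, k)
print(f"{n} instructions, {k} stages: speedup {speedup:.2f} (ideal limit is {k})")
```

For long instruction sequences the speedup approaches the number of stages. Branches break this ideal, which is where branch prediction comes in.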
Thread/Process Level Parallelism (mid 90’s to present)
• On average control transfers occur roughly once in five instructions, so exploiting instruction level parallelism at a larger scale is not possible
• Use multiple independent “threads” or processes
• Concurrently running threads, processes
Evolution of the Infrastructure
• Electronic Accounting Machine Era: 1930-1950
• General Purpose Mainframe and Minicomputer Era: 1959-Present
• Personal Computer Era: 1981 – Present
• Client/Server Era: 1983 – Present
• Enterprise Internet Computing Era: 1992- Present
Memory Hierarchy
From fast to slow:
• Registers
• Cache
• Real Memory
• Disk
• CD
Sequential vs Parallel Processing
Sequential:
• physical limits reached
• easy to program
• expensive supercomputers
Parallel:
• “raw” power unlimited
• more memory, multiple cache
• made up of COTS, so cheap
• difficult to program
Amdahl’s Law
• The serial fraction s of a program is fixed, so the speed-up obtained by employing parallel processing on P processors is bounded.
• Led to pessimism in the parallel processing community and prevented development of parallel machines for a long time.
$$\text{Speedup} = \frac{1}{s + \frac{1-s}{P}}$$
• In the limit (P → ∞):
$$\text{Speedup} = \frac{1}{s}$$
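As a numerical illustration of this bound, the sketch below evaluates the speedup formula for a serial fraction s and P processors. The function name `amdahl_speedup` and the sample value s = 0.1 are illustrative assumptions, not values from the lecture notes.

```python
def amdahl_speedup(s: float, p: int) -> float:
    """Speedup on p processors when a fraction s of the work is strictly serial."""
    return 1.0 / (s + (1.0 - s) / p)

s = 0.1  # hypothetical serial fraction
for p in (1, 10, 100, 1024):
    print(f"P = {p:5d}: speedup = {amdahl_speedup(s, p):.2f}")
print(f"limit as P grows: {1.0 / s:.2f}")
```

However many processors are added, the speedup stays below 1/s, which is the source of the pessimism mentioned above.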
Gustafson’s Law
• Serial percentage is dependent on the number of processors/input.
• Demonstrated achieving more than 1000 fold speedup using 1024 processors.
• Justified parallel processing
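Gustafson’s argument is usually summarized by the scaled speedup formula, Speedup = s + P(1 - s), where s is the serial fraction measured on the parallel run and the problem size grows with P. The formula is not stated on the slide, and the value s = 0.004 below is a hypothetical choice made only to show how a speedup above 1000 on 1024 processors becomes plausible.

```python
def gustafson_speedup(s: float, p: int) -> float:
    """Scaled speedup when the parallel part of the workload grows with p."""
    return s + p * (1.0 - s)

s = 0.004  # hypothetical serial fraction of the scaled workload
print(f"P = 1024: scaled speedup = {gustafson_speedup(s, 1024):.1f}")
```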
Grand Challenge Applications
• Important scientific & engineering problems identified by U.S. High Performance Computing & Communications Program (’92)
Flynn’s Taxonomy
• classifies computer architectures according to:
1. Number of instruction streams it can process at a time
2. Number of data elements on which it can operate simultaneously
| Instruction Streams \ Data Streams | Single | Multiple |
|---|---|---|
| Single | SISD | SIMD |
| Multiple | MISD | MIMD |
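The two classification criteria map directly onto a small lookup. The sketch below is illustrative only, and the function name `flynn_class` is hypothetical.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"

print(flynn_class(1, 1))    # a sequential (Von Neumann) computer
print(flynn_class(1, 64))   # one instruction stream applied to many data elements
print(flynn_class(8, 8))    # independent processors working on independent data
```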
SPMD Model (Single Program Multiple Data)
• Each processor executes the same program asynchronously
• Synchronization takes place only when processors need to exchange data
• SPMD is an extension of SIMD (it relaxes synchronized instruction execution)
• SPMD is a restriction of MIMD (it uses only one source/object program)
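The structure of an SPMD computation can be sketched with Python threads: every worker runs the same `program` function, works on its own slice of the data without coordination, and synchronizes only at the point where partial results are exchanged. This is an illustration of the model, not a performance example (CPython threads do not run bytecode in parallel), and the names `run_spmd` and `program` are hypothetical.

```python
import threading

def run_spmd(data, nprocs=4):
    partial = [0] * nprocs            # one slot per worker for its partial result
    totals = [0] * nprocs
    barrier = threading.Barrier(nprocs)

    def program(rank):
        # Same program everywhere; only the rank, and thus the data slice, differs.
        local = data[rank::nprocs]
        partial[rank] = sum(x * x for x in local)   # independent, unsynchronized work
        barrier.wait()                              # synchronize only to exchange data
        totals[rank] = sum(partial)                 # every worker combines all partial sums

    workers = [threading.Thread(target=program, args=(r,)) for r in range(nprocs)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return totals

print(run_spmd(list(range(10))))
```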
Parallel Processing Terminology
• Embarrassingly Parallel:
- applications which are trivial to parallelize
- large amounts of independent computation
- little communication
• Data Parallelism:
- model of parallel computing in which a single operation can be applied to all data elements simultaneously
- amenable to SIMD or SPMD style of computation
• Control Parallelism:
- many different operations may be executed concurrently
- requires MIMD/SPMD style of computation
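The difference between data parallelism and control parallelism can be shown with the standard-library thread pool. In this sketch `pool.map` applies one operation to every element (data parallel), while the three `submit` calls run different operations concurrently (control parallel). The sample values and the helper `square` are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

values = [1, 2, 3, 4, 5, 6]

with ThreadPoolExecutor(max_workers=3) as pool:
    # Data parallelism: a single operation applied to all data elements.
    squares = list(pool.map(square, values))

    # Control parallelism: different operations executed concurrently.
    f_min = pool.submit(min, values)
    f_max = pool.submit(max, values)
    f_sum = pool.submit(sum, values)
    summary = (f_min.result(), f_max.result(), f_sum.result())

print(squares, summary)
```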
Parallel Processing Terminology
• Scalability:
- If the size of the problem is increased, the number of processors that can be effectively used can be increased (i.e. there is no limit on parallelism).
- The cost of a scalable algorithm grows slowly as the input size and the number of processors are increased.
- Data parallel algorithms are more scalable than control parallel algorithms
• Granularity:
- fine grain machines: employ a massive number of weak processors, each with small memory
- coarse grain machines: employ a smaller number of powerful processors, each with large amounts of memory
Shared Memory Machines
[Diagram: several processes (threads) attached to one Shared Address Space]
• Memory is globally shared, therefore processes (threads) see a single address space
• Coordination of accesses to locations is done by use of locks provided by thread libraries
• Example Machines: Sequent, Alliant, SUN Ultra, Dual/Quad Board Pentium PC
• Example Thread Libraries: POSIX threads, Linux threads.
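Lock-based coordination in a shared address space can be sketched with Python's `threading` module, used here as a stand-in for the POSIX thread libraries named on the slide. All threads update the same counter, and the lock makes each read-modify-write step exclusive. The function name `count_with_lock` and the thread counts are illustrative.

```python
import threading

def count_with_lock(nthreads=4, increments=10000):
    counter = {"value": 0}        # one location shared by every thread
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:            # only one thread at a time may update the shared location
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]

print(count_with_lock())
```

With the lock in place the result always equals the number of threads times the number of increments.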
Shared Memory Machines
• can be classified as:
- UMA: uniform memory access
- NUMA: nonuniform memory access
based on the amount of time a processor takes to access local and global memory.
[Diagram: three organizations, (a), (b) and (c), of processors (P) and memories (M) connected by an interconnection network or BUS]
Distributed Memory Machines
[Diagram: processes, each with its own memory (M), connected by a Network]
• Each processor has its own local memory (not directly accessible by others)
• Processors communicate by passing messages to each other
• Example Machines: IBM SP2, Intel Paragon, COWs (cluster of workstations)
• Example Message Passing Libraries: PVM, MPI
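The message-passing style can be imitated in a few lines of Python. This sketch does not use PVM or MPI: threads stand in for processors, each worker touches only the data it receives, and `queue.Queue` objects play the role of the network. The names `worker` and `distributed_sum` are hypothetical.

```python
import queue
import threading

def worker(inbox, outbox):
    chunk = inbox.get()          # receive a message containing this worker's data
    outbox.put(sum(chunk))       # send the local result back as a message

def distributed_sum(data, nworkers=3):
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(nworkers)]
    threads = [threading.Thread(target=worker, args=(ib, results)) for ib in inboxes]
    for t in threads:
        t.start()
    for rank, ib in enumerate(inboxes):
        ib.put(data[rank::nworkers])     # scatter: one message per worker
    total = sum(results.get() for _ in range(nworkers))   # gather the partial sums
    for t in threads:
        t.join()
    return total

print(distributed_sum(list(range(1, 101))))
```

No worker reads another worker's data directly. All sharing happens through explicit send and receive operations, which is the defining property of the distributed memory model.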
Beowulf Clusters
• Use COTS (commercial off-the-shelf) components: ordinary PCs and networking equipment
• Have the best price/performance ratio
[Picture: PC cluster]
Multi-Core Computing
• A multi-core microprocessor is one which combines two or more independent processors into a single package, often a single integrated circuit.
• A dual-core device contains exactly two independent microprocessors.
Comparison of Different Architectures
• Single Core Architecture: one CPU state, one execution unit, one cache
• Multiprocessor: two processors, each with its own CPU state, execution unit, and cache
• Hyper-Threading Technology: two CPU states sharing one execution unit and one cache
• Multi-Core Architecture: two cores, each with its own CPU state, execution unit, and cache
• Multi-Core Architecture with Shared Cache: two cores, each with its own CPU state and execution unit, sharing one cache
• Multi-Core with Hyper-Threading Technology: two cores, each with two CPU states sharing that core's execution unit and cache
Graphics Processing Units (GPUs)
• Compared with a CPU, a GPU devotes more transistors to data processing
Hillis’ Thesis ’85 (back to the future !)
[Diagram: a piece of silicon, sequential computer vs. parallel computer]
• Proposed “The Connection Machine”, with a massive number of processors, each with small memory, operating in SIMD mode.
• The CM-1 and CM-2 machines from Thinking Machines Corporation (TMC) were examples of this architecture, with 32K-128K processors.
Floating Point Operations for the CPU and the GPU
Memory Bandwidth for the CPU and the GPU
NVIDIA GPU Supports Various Languages or Application Programming Interfaces
Automatic Scalability
A multithreaded program is partitioned into blocks of threads that execute independently from each other, so that a GPU with more cores will automatically execute the program in less time than a GPU with fewer cores.
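The independence of thread blocks can be imitated in plain Python (a real GPU program would be written in a language such as CUDA). In this sketch each "thread" computes one output element from its block index and its index within the block, and the blocks are executed in a shuffled order to show that the result does not depend on how blocks are scheduled. The function name `vector_add_by_blocks` and the block size are illustrative assumptions.

```python
import random

def vector_add_by_blocks(a, b, block_size=4, seed=0):
    n = len(a)
    out = [None] * n
    num_blocks = (n + block_size - 1) // block_size   # enough blocks to cover all elements
    order = list(range(num_blocks))
    random.Random(seed).shuffle(order)                # blocks may run in any order
    for block in order:
        for thread in range(block_size):
            i = block * block_size + thread           # global index of this thread
            if i < n:                                 # the last block may be partly idle
                out[i] = a[i] + b[i]
    return out

print(vector_add_by_blocks([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], block_size=2))
```

Because no block depends on another, a device with more cores can simply run more blocks at once, with no change to the program.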
Grid of Thread Blocks
Memory Hierarchy
GPU Programming Model
• Heterogeneous Programming
• Serial code executes on the host while parallel code executes on the device.
Top 500 Most Powerful Computers List
• http://www.top500.org/list/2011/06
Grid Computing
• Provides access to computing power and various resources, just like accessing electrical power from the electrical grid
• Allows coupling of geographically distributed resources
• Provide inexpensive access to resources irrespective of their physical location or access point
• Internet & dedicated networks can be used to interconnect distributed computational resources and present them as a single unified resource
• Resources: supercomputers, clusters, storage systems, data resources, special devices
Grid Computing
• The GRID is, in effect, a set of software tools which, when combined with hardware, would let users tap processing power off the Internet as easily as electrical power can be drawn from the electricity grid.
• Examples of Grids:
- TeraGrid (USA)
- EGEE Grid (Europe)
- TR-Grid (Turkey)
GRID COMPUTING
[Figure: Power Grid vs. Compute Grid]
Application areas: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
Scale: >250 sites, 48 countries, >50,000 CPUs, >20 PetaBytes, >10,000 users, >150 VOs, >150,000 jobs/day
Virtualization
• Virtualization is the abstraction of computer resources.
• It makes a single physical resource, such as a server, an operating system, an application, or a storage device, appear to function as multiple logical resources
• It may also mean making multiple physical resources, such as storage devices or servers, appear as a single logical resource
• Server virtualization enables companies to run more than one operating system at the same time on a single machine
Advantages of Virtualization
• Most servers run at just 10-15% capacity; virtualization can increase server utilization to 70% or higher.
• Higher utilization means fewer computers are required to process the same amount of work. Fewer machines means less power consumption.
• Legacy applications can also be run on older versions of an operating system
• Other advantages: easier administration, fault tolerance, security
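The consolidation argument can be made concrete with a rough calculation. The sketch assumes identical servers, a workload that can be repacked freely, and no virtualization overhead. The fleet size of 100 is hypothetical, while the utilization figures are the ones quoted above.

```python
import math

def servers_needed(current_servers: int, current_util: float, target_util: float) -> int:
    """Servers required to carry the same total work at a higher utilization."""
    total_work = current_servers * current_util       # in units of "fully busy servers"
    return math.ceil(total_work / target_util)

fleet = 100  # hypothetical number of physical servers before consolidation
for util in (0.10, 0.125, 0.15):
    print(f"{fleet} servers at {util:.1%} -> {servers_needed(fleet, util, 0.70)} servers at 70%")
```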
VMware Virtual Platform
[Diagram: Virtual machine 1 (Apps 1, OS 1) and Virtual machine 2 (Apps 2, OS 2) each see virtual hardware (x86, motherboard, disks, display, net ..); both run on the VMware Virtual Platform, which runs on the real machine (x86, motherboard, disks, display, net ..)]
• VMware is now a 40 billion dollar company!
Cloud Computing
• Style of computing in which IT-related capabilities are provided “as a service”, allowing users to access technology-enabled services from the Internet ("in the cloud") without knowledge of, expertise with, or control over the technology infrastructure that supports them.
• General concept that incorporates software as a service (SaaS), Web 2.0 and other recent, well-known technology trends, in which the common theme is reliance on the Internet for satisfying the computing needs of the users.
Cloud Computing
• Virtualisation provides separation between infrastructure and user runtime environment
• Users specify virtual images as their deployment building blocks
• Pay-as-you-go allows users to use the service when they want and only pay for what they use
• Elasticity of the cloud allows users to start simple and explore more complex deployment over time
• Simple interface allows easy integration with existing systems
Cloud: Unique Features
• Ease of use
– REST and HTTP(S)
• Runtime environment
– Hardware virtualisation
– Gives users full control
• Elasticity
– Pay-as-you-go
– Cloud providers can buy hardware faster than you!
“Cloud computing is about much more than technological capabilities. Technology is the mechanism, but, as in any shift in business, the driver is economics.”
Nicholas Carr, author of “The Big Switch”
Better Economics
We want to pay only for what we use, and we want to control it accurately.
Facing New Challenges
• Complexity of modern IT infrastructures: physical servers, virtual machines, clusters, Grids, geographical distribution
• Cost of electricity
• Credit crunch
• Further pressures to reduce costs
• Openness to the acceptable security concept
http://www....
The 70/30 Switch
Finding Solutions
• Improving utilisation rates through market based algorithms for resource allocation
• Accessing external infrastructures on-demand
• Using a single management platform for all computing resources
Cloud vs Grid
From the customer’s/end user’s point of view, they are the same
Grid/cloud market structure
[Diagram labels: Hardware (owned), Hardware (service), Middleware, Applications, Customer, Network]
The Grid/Cloud
Advantages
• Lower cost
• Access to larger infrastructure
– Faster calculations
– More storage
• Speed
– Faster calculations
– Easier provisioning
Disadvantages
• Very complicated
• Security
• Lack of confidence
– Trust
– Compatibility
Improving Utilization
[Chart: stages placed along a utilisation axis from 0% to 100%]
• Hardware Servers: (-) low utilisation rates, scalability problems
• Virtual Servers: (+) improved utilisation rates, better scalability, easy disaster recovery; (-) increased number of servers to manage, incompatible virtualization platforms
• Enterprise/Departmental Grid: (+) improves utilisation rates of physical servers, enables collaboration; (-) limited scalability, lack of interoperability between vendors, limited efficiency of policy based mechanisms
• Cloud Computing: (+) no need to own hardware, shared access, improved utilisation through pay-as-you-use; (-) incompatible platforms, ‘fair price’ is dubious to users
Grid and Clouds

| Issue | Classic Grid Computing | Cloud computing |
|---|---|---|
| Why we need it? (The Problem) | To enable the R&D community to achieve its research goals in reasonable time. Computation over large data sets, or of parallelizable compute-intensive applications. | Reduce IT costs. On-demand scalability for all applications, including research, development and business applications. |
| Main Target Market | First: Academia. Second: certain industries. | Mainly Industry |
| Business Model: Where the money comes from? | Academia: sponsor-based (mainly government money). Industry pays: internal implementations. | Hosted by commercial companies, paid for by users. Based on the economies of scale and expertise. Only pay for what you need, when you need it (On-Demand + Pay per Use). |
Competition
[Diagram: layers Hardware, Operating System, Virtualisation, Enterprise Grid, Enterprise Cloud, Cloud, with Constellation Technologies; annotations: Incompatible Standards (twice), Creates pools of resources, Higher utilisation rates, Interfaces, Interfaces and Market Mechanisms]
Key differentiators:
• Open source – no vendor lock-in
• Scalability
Challenges
Security and Trust
Customer SLA – compare Cost/Performance
Dynamic VM migration – Unique Universal IP
Clouds Interoperability
Data Protection & Recovery
Standards: Security
Management Tools
Integration with Internal Infrastructure
Small compact economical applications
Cost/Performance prediction and measurement
Keep it Transparent and Simple
Cloud Market
"The future is about having a platform in the cloud," Microsoft Chief Steve Ballmer said of the trend in a July 2008 e-mail to employees.
Cloud Market
“By 2012, 80 percent of Fortune 1000 companies will pay for some cloud computing service, and 30 percent of them will pay for cloud computing infrastructure.”
Gartner, 2008
Why Now? (Economy)
- CIOs -> Do more with less (energy costs / recession will boost it)
- Lower cost for scalability
- Enterprise IT budget: spending 80% on MAINTENANCE
- On average, we utilize only 15% of our computing resources capacity
- Peak times economy
- IT is not the enterprise’s core business
- Psychology of Internet/Cloud trust (SalesForce, Gmail, Internet banking, etc.)
- Ideal for developers
Why Now? (Benefits)
Cost savings, leveraging economies of scale
Pay only for what you use
Resource flexibility
Rapid prototyping and market testing
Increased speed to market
Improved service levels and availability
Self-service deployment
Reduce lock-in and switching costs
Cloud Types
VM Based (EC2, GoGrid)
Storage Based (EMC, S3)
Customers Applications based (Google)
Cloud Applications based (SalesForce)
Grid Computing/HPC Applications
Mobile Clouds (iPhone UI, WEB APPS)
Private Clouds
Cloud of Clouds
Summary
Cloud Computing - The New IT Economy
Pay-per-Use for On-Demand Scalability
All major vendors are investing in Clouds
Cloud Trading Market will evolve
VM will be mobile across clouds
Mobile phones (iPhone) cloud users
International implications (Access to Data)
Example Cloud: Amazon Web Services
• EC2 (Elastic Compute Cloud) is the computing service of Amazon
– Based on hardware virtualisation
– Users request virtual machine instances, pointing to an image (public or private) stored in S3
– Users have full control over each instance (e.g. access as root, if required)
– Requests can be issued via SOAP and REST
• S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud
– From a user’s point of view, S3 is independent from the other Amazon services
– Data is organized in a hierarchical fashion, grouped in buckets (i.e. containers) and objects
– Data is accessible via various protocols
• Elastic Block Store
– Locally mounted storage
– Highly available
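The bucket/object organization can be pictured with a small in-memory model. This is only a sketch of the naming scheme, not of the S3 API: the class `ToyObjectStore`, its methods, and the bucket and key names are hypothetical and invented for illustration.

```python
class ToyObjectStore:
    """In-memory stand-in for a bucket/object store (hypothetical, not the S3 API)."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {object key -> bytes}

    def create_bucket(self, bucket):
        self._buckets.setdefault(bucket, {})

    def put_object(self, bucket, key, data):
        self._buckets[bucket][key] = data

    def get_object(self, bucket, key):
        return self._buckets[bucket][key]

    def list_objects(self, bucket, prefix=""):
        # Keys are flat strings; a "/" inside a key only looks like a folder.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))

store = ToyObjectStore()
store.create_bucket("etm555-notes")
store.put_object("etm555-notes", "slides/part1.pdf", b"part 1")
store.put_object("etm555-notes", "slides/part2.pdf", b"part 2")
store.put_object("etm555-notes", "readme.txt", b"hello")
print(store.list_objects("etm555-notes", prefix="slides/"))
```

Every object is addressed by the pair (bucket, key), which is why the storage service can be used on its own, without any compute instance.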
![Page 78: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/78.jpg)
Example Cloud: Amazon Web Services
• Other AWS services:
– SQS (Simple Queue Service)
– SimpleDB
– Billing services: DevPay
– Elastic IP (Static IPs for Dynamic Cloud Computing)
– Multiple Locations
![Page 79: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/79.jpg)
Example Cloud: Amazon Web Services
• Pricing information
http://aws.amazon.com/ec2/
![Page 80: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/80.jpg)
EC2 – “Google of the Clouds”
According to Vogels (Amazon CTO), 370,000 developers have registered for Amazon Web Services since their start in 2002, and the company now spends more bandwidth on the developers than it does on e-commerce. http://www.theregister.co.uk/2008/06/26/amazon_trumpets_web_services/
In the last two months of 2007 usage of Amazon Web Services grew by 40%
$131 million in Q1 revenue from AWS
60,000 customers
The majority of usage comes from banks, pharmaceuticals and other large corporations
![Page 81: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/81.jpg)
![Page 82: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/82.jpg)
Hadoop
![Page 83: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/83.jpg)
Data Explosion
• An IDC estimate put the size of the “digital universe” at 0.18 zettabytes in 2006, forecasting tenfold growth to 1.8 zettabytes by 2011.
• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• The Internet Archive stores around 2 petabytes of data and is growing at a rate of 20 terabytes per month.
• The Large Hadron Collider near Geneva, Switzerland, produces about 15 petabytes of data per year.
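To put these figures on a common scale, the following sketch converts them to terabytes and derives a few rates. It assumes decimal units (1 PB = 1,000 TB) and constant growth over time, neither of which is stated on the slide.

```python
TB_PER_PB = 1_000                 # assumption: decimal units
EB_2006, EB_2011 = 180, 1_800     # 0.18 ZB and 1.8 ZB expressed in exabytes

universe_growth = EB_2011 / EB_2006               # growth factor 2006 -> 2011
nyse_tb_per_year = 1 * 365                        # 1 TB per day, every day
archive_months_to_double = (2 * TB_PER_PB) / 20   # 2 PB growing by 20 TB/month
lhc_tb_per_day = 15 * TB_PER_PB / 365             # 15 PB spread over a year

print(f"Digital universe growth: {universe_growth:.0f}x")
print(f"NYSE: {nyse_tb_per_year} TB per year")
print(f"Internet Archive doubles in {archive_months_to_double:.0f} months")
print(f"LHC: {lhc_tb_per_day:.1f} TB per day")
```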
![Page 84: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/84.jpg)
Hadoop Projects
• Common: A set of components and interfaces for distributed filesystems and general I/O (serialization, Java RPC, persistent data structures).
• Avro: A serialization system for efficient, cross-language RPC and persistent data storage.
• MapReduce: A distributed data processing model and execution environment that runs on large clusters of commodity machines.
• HDFS: A distributed filesystem that runs on large clusters of commodity machines.
• Pig: A data flow language and execution environment for exploring very large datasets. Pig runs on HDFS and MapReduce clusters.
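The MapReduce processing model can be illustrated on a single machine with plain Python. The sketch below is not Hadoop code: it only mimics the three phases (map, shuffle, reduce) for a word count, and all function names are made up for this example.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["the grid and the cloud", "the cloud"])
print(counts)
```

In a real cluster the map calls run in parallel on the machines that hold the input blocks, and the shuffle moves data across the network; here everything runs in one process.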
![Page 85: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/85.jpg)
Hadoop Projects
• Hive: A distributed data warehouse. Hive manages data stored in HDFS and provides a SQL-based query language, which the runtime engine translates to MapReduce jobs, for querying the data.
• HBase: A distributed, column-oriented database. HBase uses HDFS for its underlying storage, and supports both batch-style computations using MapReduce and point queries (random reads).
• ZooKeeper: A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building distributed applications.
• Sqoop: A tool for efficiently moving data between relational databases and HDFS.
![Page 86: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/86.jpg)
RDBMS Compared to MapReduce
• MapReduce can be seen as a complement to an RDBMS
• MapReduce is a good fit for problems that need to analyze the whole dataset, in a batch fashion, particularly for ad hoc analysis.
• An RDBMS is good for point queries or updates, where the dataset has been indexed to deliver low-latency retrieval and update times of a relatively small amount of data.
• MapReduce suits applications where the data is written once, and read many times, whereas a relational database is good for datasets that are continually updated.
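The contrast between the two access patterns can be sketched in a few lines. In this illustration a Python dict stands in for a database index (point query and in-place update), while a full pass over every record stands in for a batch MapReduce job. The records and field names are invented for the example.

```python
from collections import defaultdict

records = [
    {"id": 1, "customer": "ann", "amount": 30},
    {"id": 2, "customer": "bob", "amount": 20},
    {"id": 3, "customer": "ann", "amount": 50},
]

# RDBMS-style: an index on "id" makes a single-record lookup or update cheap.
index_by_id = {r["id"]: r for r in records}
index_by_id[2]["amount"] = 25          # point update of one record
point_result = index_by_id[3]          # point query for one record

# MapReduce-style: scan the whole dataset once and aggregate per key.
def total_per_customer(rows):
    totals = defaultdict(int)
    for row in rows:                               # every record is read
        totals[row["customer"]] += row["amount"]   # sum the amounts per customer
    return dict(totals)

batch_result = total_per_customer(records)
print(point_result, batch_result)
```

The point query touches one record regardless of dataset size, whereas the batch job always reads everything, which is acceptable when the question concerns the whole dataset.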
![Page 87: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/87.jpg)
RDBMS Compared to MapReduce
![Page 88: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/88.jpg)
Amazon’s Cloud Load Balancing Service
• Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances.
• http://docs.amazonwebservices.com/ElasticLoadBalancing/latest/DeveloperGuide/
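The idea of spreading incoming requests over several instances can be sketched with a simple round-robin balancer. The slide does not say which policy Elastic Load Balancing uses, so round robin is an assumption chosen for simplicity; the class name and instance names are hypothetical.

```python
from collections import Counter
from itertools import cycle

class RoundRobinBalancer:
    """Hand each incoming request to the next instance in turn."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._next_instance = cycle(instances)

    def route(self, request):
        # Return the chosen instance together with the request it will serve.
        return next(self._next_instance), request

balancer = RoundRobinBalancer(["i-a", "i-b", "i-c"])
assignments = [balancer.route(f"GET /page/{n}")[0] for n in range(6)]
load = Counter(assignments)
print(assignments, load)
```

Round robin ignores how busy each instance is; a load-aware policy would need feedback from the instances.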
![Page 89: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/89.jpg)
Dynamic Load Balancing Web-Server Systems
[Figure: N clients (Client1 ... ClientN) and M replicated Web servers (Server1 ... ServerM)]
![Page 90: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/90.jpg)
Dynamic Load Balancing Web-Server Systems
• Client-Based Approach
– Web clients that are aware of the Web-server system’s replicated servers can actively route requests
– After receiving a request from the user, the Web client selects a node of the Web-server cluster, resolves the address mapping, and submits the request to the selected node, which is then responsible for responding to the client
– Approaches:
• Netscape: wwwi.netscape.com, where i selects one of the replicated servers
• Via a smart client, e.g. a Java applet
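The client-side selection step can be sketched as follows. The sketch assumes the client knows how many replicas exist and picks one at random; the replica count of 4 is an arbitrary example value, the function name is made up, and no DNS lookup or HTTP request is actually performed.

```python
import random

def pick_server(num_servers, rng=random):
    """Choose one replica at random and return its host name."""
    i = rng.randint(1, num_servers)    # inclusive on both ends
    return f"www{i}.netscape.com"

rng = random.Random(42)                # seeded so the run is repeatable
hosts = [pick_server(4, rng) for _ in range(8)]
print(hosts)
```

Random selection needs no coordination between clients, but it cannot react to an overloaded or failed server.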
![Page 91: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/91.jpg)
Dynamic Load Balancing Web-Server Systems
• DNS-Based Approach
– The cluster DNS—the authoritative DNS server for the distributed Web system’s nodes—translates the symbolic site name (URL) to the IP address of one server
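One simple policy for the cluster DNS is to rotate through the server addresses (round-robin DNS). The slide does not prescribe a policy, so the rotation below is an assumption; the class name, the site name, and the addresses (taken from the 192.0.2.0/24 documentation range) are placeholders. A real DNS answer would also carry a TTL that lets clients and intermediate name servers cache the mapping.

```python
from itertools import cycle

class ClusterDNS:
    """Toy authoritative DNS: maps one site name to rotating server IPs."""

    def __init__(self, site_name, server_ips):
        self._site_name = site_name
        self._ips = cycle(server_ips)

    def resolve(self, name):
        if name != self._site_name:
            raise KeyError(f"unknown name: {name}")
        return next(self._ips)         # each lookup gets the next server

dns = ClusterDNS("www.example.org", ["192.0.2.1", "192.0.2.2", "192.0.2.3"])
answers = [dns.resolve("www.example.org") for _ in range(4)]
print(answers)
```

Because answers are cached, the cluster DNS controls only a fraction of the requests directly, which limits how evenly this approach can balance load.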
![Page 92: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/92.jpg)
Dynamic Load Balancing Web-Server Systems
• Dispatcher-Based Approach
![Page 93: ETM 555 Supplementary Lecture Notes Version 5. / 2011 Contents:](https://reader030.vdocument.in/reader030/viewer/2022013011/56812e7a550346895d941ccb/html5/thumbnails/93.jpg)
Dynamic Load Balancing Web-Server Systems
• Server-Based Approach