

University of Tartu, Institute of Computer Science

Basics of Grid and Cloud Computing (Gridi ja pilvetehnoloogia alused)

(http://courses.cs.ut.ee/2012/cloud)

eero.vainikko@ut.ee

2011/12 Spring

Practical Information

Lectures: Wed 10:15, Liivi 2 - 111

1-8: Eero Vainikko – Grid Computing

9-16: Satish Narayana Srirama – Clouds

Computer Classes:

• group 4: Mon 10:15 Liivi 2 - 205;

– Grid: Hardi Teder hardi.teder@eenet.ee

– Cloud: Reimo Rebane reimo.rebane@ut.ee

• group 3: Tue 8:15 Liivi 2 - 205; Pelle Jakovits pelle.jakovits@ut.ee

• group 1: Thu 10:15 Liivi 2 - 205; Pelle Jakovits pelle.jakovits@ut.ee

• group 2: Thu 14:15 Liivi 2 - 205; Riivo Talviste riivo.talviste@ut.ee


• Final grade:

– Active participation at lectures (ca 10%)

* Devising questions for the online study questionnaire within 24h after each lecture

– Solution of Computer Class exercises

– Cloud project

– Written exam (Wed, 30 May 2012) 50%

NB! It is crucial to meet the deadlines for all home assignments!

Syllabus

Lectures (1-8):

• Introduction to the subject (HPC history, supercomputers, clusters, Grid; examples, visions, projects...)

• Grid architecture

• Grid Security concepts (PKI, Authorisation, CA, etc.)

• Globus Toolkit (what a virtual organisation is, how to achieve it using GT, etc.), OGSA, WSRF

• Other Grids (UNICORE, LCG2, SunGE, ...)

• Condor, OpenPBS, Sun GE, LSF

• NorduGrid, BalticGrid, Estonian Grid.

• Desktop-Grids (MiG, F2F)

• Examples of different grid solutions in the world


Computer Classes (preliminary) schedule:

Exercises on Grid computing (Hardi Teder, Pelle Jakovits, Riivo Talviste)

• 6-9.02 Python intro

• 13-16.02 Hello, Grid! Grid information systems, submitting first grid job

• 20-23.02 Grid security. Breaking RSA code

• 27.02-1.03 Data management on grid

• 5-8.03 Job management on grid

• 12-15.03 Grid user interfaces and tools. POV-Ray rendering

• 19-22.03 Grids and clouds, the road ahead.

• 26.03-29.04 TBA


Cloud Lectures (9-16):

• by: Dr Satish Narayana Srirama

Exercises on Cloud computing (Pelle Jakovits, Riivo Talviste, Reimo Rebane)

• 5-9.04 Amazon EC2, Amazon S3, Elasticfox, Google App Engine

• 12-16.04 Eucalyptus, SciCloud, Auto Scaling & special features in EC2

• 19-23.04 Hadoop

• 26-30.04 Hadoop continued & Selecting the mini project topic

• 3-7.05

• 10-14.05 Preliminary results of project

• 17-21.05

• 24-28.05 Project delivery

Literature

1. Fran Berman, Geoffrey C. Fox and Anthony J. G. Hey, Grid Computing. Making the Global Infrastructure a Reality, John Wiley & Sons, 2003. Grid Computing (http://www.grid2002.org/).

2. Ian Foster and Carl Kesselman (eds.), The Grid: Blueprint for a New Computing Infrastructure, 2nd edition, Morgan Kaufmann Publishers, 2004.

3. Michael Di Stefano, Distributed Data Management for Grid Computing, John Wiley & Sons, 2005.

4. F. Travostino, J. Mambretti, G. Karmous-Edwards (eds.), Grid Networks: Enabling Grids with Advanced Communication Technology, John Wiley & Sons, 2006.

5. Vladimir Silva, Grid Computing For Developers, Charles River Media, 2006.

6. A. Chakrabarti, Grid Computing Security, Springer, 2007.

7. R. Prodan, T. Fahringer, Grid Computing: Experiment Management, Tool Integration, and Scientific Workflows, Springer, 2007.

8. Yang Xiao, Security in Distributed, Grid, Mobile, and Pervasive Computing, Auerbach Publications, 2007.

9. Introduction to Grid Computing (http://www.redbooks.ibm.com/redbooks/pdfs/sg246778.pdf).

10. Open Grid Forum (http://www.ogf.org).

11. The Globus Alliance (http://www.globus.org/).

12. Nordugrid (http://www.nordugrid.org/).

13. Estonian Grid (http://grid.eenet.ee/).

14. Baltic Grid (http://www.balticgrid.org).

Python:

1. Jeffrey Elkner, Allen B. Downey, and Chris Meyers, How to Think Like a Computer Scientist. Learning with Python, 2nd edition. Book homepage (http://openbookproject.net/thinkcs/python/english2e/).

2. Hans Petter Langtangen, A Primer on Scientific Programming with Python, Springer, 2009. Book webpage (http://vefur.simula.no/intro-programming/).

3. Hans Petter Langtangen, Python Scripting for Computational Science, Third Edition, Springer, 2008. Book homepage (http://folk.uio.no/hpl/scripting/).

4. Neeme Kahusk, Sissejuhatus Pythonisse (Introduction to Python) (http://www.cl.ut.ee/inimesed/nkahusk/sissejuhatus-pythonisse/).

5. Python Documentation (http://www.python.org/doc/); for a start, Python Tutorial (http://docs.python.org/tut/tut.html).

6. Mark Lutz and David Ascher, Learning Python, O’Reilly Media Inc., 2004.

7. Mark Lutz, Learning Python (4th edition), O’Reilly Media, Inc. (and Safari Books), 2009.

8. Travis E. Oliphant, Guide to NumPy (http://www.tramy.us), Trelgol Publishing, 2006.

Some lecture slides:

1. Kent Engström, Python Introduction (slides) (http://www.nsc.liu.se/ngssc-grid/python-engstrom.pdf), NGSSC course in grid computing, January 10-18, 2005.

2. Chris Meers, An introduction to Python, with application to scientific computing (slides (http://hughm.cs.ukzn.ac.za/~murrellh/bio/lit/pysci.pdf)), Cornell Theory Center.

3. Introduction to Scientific Computing with Python (slides (http://www.physics.rutgers.edu/grad/509/python1.pdf))

Past and future of the course; related courses

About this course:

• First time Spring 2005: Basics of Grid and Cluster Computing

• Since 2009 (Spring): Basics of Grid and Cloud Computing

– Second part: Basics of Cloud Computing (3 ECTS)

Other related courses:

• MTAT.08.022 Parallel Programming Languages (6 ECTS) (2011)

• Distributed Systems Seminar Wed 14:15 (Fri 14:15) Liivi 2 - 315

– MSc students: MTAT.08.024 12 ECTS (3+3+3+3)

– Bachelor students: MTAT.08.014 8 ECTS (2+2+2+2)

– PhD students: Distributed Systems Research Seminar MTAT.08.019 20 ECTS (5+5+5+5)

• Parallel Computing: MTAT.08.007 6 ECTS Autumn 2012

• Scientific Computing: MTAT.08.010 6 ECTS Spring 2014

• Introduction to Scientific Computing: MTAT.08.025 3 ECTS April-May 2012 (University-wide elective course for PhD students)


1 Introduction

1.1 Driving forces of computational science

High Performance Computing (HPC)

• Environment simulation; some examples:

– Climate changes

– Prediction of amount of fish in Norwegian fjords

– Ice glacier flow simulation

• Solving fluid dynamic problems

– Weather predictions

– Design of hypersonic airplanes

– Design of more efficient cars


– Extremely quiet submarines

– Design of efficient and safe nuclear power stations

* solution bisection, turbulence

• Simulation of nuclear explosions

• Satellite data analysis

• Data analysis of DNA-sequences

• Simulation of 3D protein molecules

• Simulation of global economic processes

• etc. in more and more fields


Common to all examples: the need for a larger than usual set of resources:

• CPU cycles

• data volume

• special devices producing data

=> parallel processing

=> questions:

• how to store data?

– Data repositories

– Data repository services

• how to move data?

– Networks

– Internet and private networks

• which algorithms can be used?

– Theory and practice of parallel algorithms


1.2 History of HPC

Pre-history (human arrays):

1929 – parallelisation of weather predictions

A bit similar:

≈1940 – Russian war defense: parallel computing (tank T40 calculations)

Some experts’ predictions:

1947 – computer engineer Howard Aiken: the USA will need at most 6 computers in the future!

1977 – Seymour Cray: the Cray-1 computer will potentially attract only ca 100 clients

Reality: how many Cray-1s' worth of computing power do you carry with you today?


Gordon E. Moore’s law:

1965: the number of switches (transistors) on a chip doubles every year

1975: refinement of the above: the number of switches doubles every second year

Commonly quoted variant (attributed to David House of Intel): the performance of a CPU doubles every 18 months
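To see what a fixed doubling period means in practice, the growth after t years with doubling period T is 2^(t/T). The following is a minimal sketch; the function name `growth_factor` and the 10-year horizon are illustrative choices, not part of the lecture.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Factor by which a quantity grows if it doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Compare two doubling periods over an (arbitrarily chosen) 10-year horizon.
for label, period in [("every 2 years", 2.0), ("every 18 months", 1.5)]:
    factor = growth_factor(10, period)
    print(f"doubling {label}: about x{factor:.0f} in 10 years")
```

Doubling every second year gives a factor of 32 per decade, while doubling every 18 months gives roughly 100, so the seemingly small difference in the period matters a lot over time.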


Computer class                   Flops    Unit
first processors                 10^2     100 Flops
modern desktop computers         10^9     Gigaflops (GFlops)
modern supercomputers            10^12    Teraflops (TFlops)
we are about to achieve soon     10^15    Petaflops (PFlops)
next step                        10^18    Exaflops (EFlops)
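To get a feel for these orders of magnitude, the sketch below estimates how long a fixed amount of work would take at each performance level. The workload of 10^15 floating-point operations is a hypothetical figure chosen for illustration, and the estimate assumes the machine sustains its full rate, which real applications rarely do.

```python
# Performance levels from the table above, in floating-point operations per second.
RATES = {
    "desktop (GFlops)": 1e9,
    "supercomputer (TFlops)": 1e12,
    "PFlops machine": 1e15,
    "EFlops machine": 1e18,
}


def runtime_seconds(operations: float, flops: float) -> float:
    """Idealised runtime: total operations divided by sustained rate."""
    return operations / flops


WORKLOAD = 1e15  # hypothetical job size, not taken from the lecture

for name, rate in RATES.items():
    seconds = runtime_seconds(WORKLOAD, rate)
    print(f"{name}: {seconds:g} s ({seconds / 86400:.3g} days)")
```

Under these assumptions the same job takes more than 11 days on a GFlops desktop and one second on a PFlops machine.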

History of Computers (http://smashinghub.com/history-of-computers.htm)

Supercomputers → Clusters → Grids → Clouds


1.3 Data Challenges

How large is 1 petabyte?

• Some high-resolution pictures of each person on Earth

• (5 years ago: an example of a petabyte storage device:

– a train wagon full of high-resolution magnetic tapes

– about 3 years to read through with a fast tape reader)

• Today: the largest tape drives store 5 TB of uncompressed data (StorageTek T10000) => 1 PB takes ca 205 tapes. Reading one tape takes ca 6.1 h => 52 days to read them all. These tapes would pile up into a tower about 5.2 m high, weighing less than 60 kg; by volume, ca 40% of them would fit into hand baggage on a plane.

* Largest commercial databases today ≈ a few terabytes (10^12 bytes)

Science’s needs in the near future (an example): particle physics experiments produce around 10 petabytes a year
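The tape arithmetic above can be checked in a few lines. This is a sketch under one assumption that is not stated on the slide: 1 PB is taken as 1024 TB, which is what reproduces the quoted figure of about 205 tapes (with 1000 TB the result would be 200).

```python
import math

TAPE_CAPACITY_TB = 5        # uncompressed capacity per tape, from the slide
PETABYTE_IN_TB = 1024       # assumption: binary petabyte
READ_HOURS_PER_TAPE = 6.1   # time to read one full tape, from the slide

# Number of tapes needed to hold 1 PB (rounded up to whole tapes).
tapes = math.ceil(PETABYTE_IN_TB / TAPE_CAPACITY_TB)

# Time to read all tapes one after another on a single drive.
total_days = tapes * READ_HOURS_PER_TAPE / 24

print(f"tapes needed: {tapes}")
print(f"sequential read time: {total_days:.1f} days")
```

Both results agree with the slide: about 205 tapes and about 52 days when read on a single drive.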


Prediction for the needs: around the year 2015 there will be a need for exabyte (10^18 bytes) storage databanks and petaflops processing power

How large is an exabyte?

All the information generated in 1999: 2 exabytes

All words ever spoken by all people: 5 exabytes!

One of the most challenging problems – data updates!


1.4 Classification of Parallel Computers

• Architecture

Flynn’s classification:

                              Single data stream   Multiple data stream
Single instruction stream     SISD                 SIMD
Multiple instruction stream   (MISD)               MIMD

Abbreviations:

S - Single

M - Multiple

I - Instruction

D - Data

For example: SIMD = Single Instruction, Multiple Data stream

– Single processor computer

– Multicore processor

– distributed system

– shared memory system

• Network

– topology

* ring, array, hypercube...

– properties

* bandwidth, latency

• Memory access

– shared , distributed , hybrid


• operating system

– UNIX,

– LINUX,

– (OpenMosix)

– WIN*

• Algorithm realisation

– using only hardware modules

– mixed modules (hardware and software)

• Control type

– synchronous

– dataflow-driven

– asynchronous

• scope

– supercomputing

– distributed computing

– real-time systems

– mobile systems

– grid and cloud computing

– etc


But it is impossible to ignore (implicit or explicit) parallelism in a computer or a set of computers
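Flynn's classification from the architecture item above depends only on whether there is one or more than one instruction stream and data stream, so it can be written as a small lookup. This is an illustrative sketch: the function name `flynn_class` and the example stream counts are assumptions made here, not part of the lecture.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class (SISD, SIMD, MISD or MIMD) for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instruction_part = "S" if instruction_streams == 1 else "M"
    data_part = "S" if data_streams == 1 else "M"
    return f"{instruction_part}I{data_part}D"


# Illustrative stream counts (assumptions, not measurements of real hardware).
examples = {
    "single scalar processor": (1, 1),
    "vector unit applying one instruction to 8 values": (1, 8),
    "4-core processor running independent programs": (4, 4),
}
for name, (n_instr, n_data) in examples.items():
    print(f"{name}: {flynn_class(n_instr, n_data)}")
```

The loop classifies the three examples as SISD, SIMD and MIMD respectively; MISD is the combination that rarely occurs in practice, which is why the slide puts it in parentheses.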

1.5 Supercomputers

• Last word in computer hardware

– one step ahead in technology

– Note: today’s supercomputers are tomorrow’s commodity systems!

• expensive

• shipped with OS

• Supercomputers → Clusters


1.6 Computer Clusters

Workstation groups connected with a LAN, with uniform software

Example: Linux clusters (Beowulf clusters), University of Tartu HPC aurumasin

Special network solutions

• Myrinet (Clos-networks)

• Scali

• *-Ethernet

• Infiniband


Top 500

Top500 (http://www.top500.org)

• Also, http://www.bbc.co.uk/news/10187248
