Seminar Map/Reduce

20.10.2010

Prof. Johann-Christoph Freytag, Ph. D.

Rico Bergmann

intro · talks · paper · Clouds · Map/Reduce · themes

contact

• Prof. Johann-Christoph Freytag Ph.D.

– Professor, Chair of Databases and Information Systems (DBIS), RUD25

• Rico Bergmann

– research assistant, Chair of Databases and Information Systems (DBIS), RUD25

– room 4‘222 (please make an appointment)

– Mail: [email protected]

organisation

• weekly

• first talk: 3 Nov. 2010

• RUD25 room 4‘112

• Wednesdays, 13:00-15:00

• conditions for a certificate

– presentation

– term paper

– regular attendance

organisation

• presentation of a theme

– 60-minute presentation

– 30 minutes of questions and feedback

– send your slides by the Monday before your talk to [email protected]

• term paper

– due by 13 Feb. 2011

questions?

some recommendations

How to give a good talk (very briefly)

How to write a good term paper (even more briefly)

presentations

• a good talk

– is interesting

– has a logical and visible structure

– has a take-home message

bad talks …

• have too much text or too many pictures

• have no colors or too many colors

• have no substance or too much substance

• can be found nearly everywhere

the perfect talk …

• does not exist

• but: try to get close to it

• you can make every topic interesting (business people know this)

• you may even lie if it helps your audience understand you

talk - introduction

• first slide: title, name of the speaker

• motivation for the talk

– this is the key to attention

WHOOOMP

– it is the appetizer for your audience

research talks

• are NOT business talks

• don't sell

• present information

– not too much (OutOfMemoryError)

– not too little (under-utilization)

• guide your audience

– from known things

– step by step

– to your key message(-s)

understanding your talk

• use examples

• use graphics, diagrams, pictures

– they should be intuitive

– and must be explained

• pause from time to time

• …

• and have a go

slides

• big font (28 pt. or more)

• sans serif

• no sentences

• page numbers

• high-contrast

effect of your talk

• what determines how much your audience retains

– 30% content

– 30% facial expression

– 40% gesture

source: http://www.ifi.uzh.ch/groups/req/ftp/wap/WAP-Praesentationstechnik.pdf

term paper

• is scientific ("antiseptic")

• structure (recommended):

– introduction (motivation, definitions …)

– main part (describe the solution)

– conclusion (discussion, open issues …)

• describe in your own words

style

• use short sentences

• be concise

– but don't leave out important information

• cite correctly

• give it a clear structure

• visualize (and explain each graphic)

• give examples

questions?

About Clouds

Cloud Computing

CC - definition

• combination of clusters and Grids

• on-demand computing

• mostly virtualized nodes (VMs)

• parallel and distributed

• dynamically provisioned

• presented as one or more unified computing resource(s)

source: Buyya et al., "Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility", Future Generation Computer Systems, Vol. 25, Issue 6, 2009

Why Clouds?

• datacenter utilization is normally around 5% – 20% [AFG+10]

• sell compute power

– and storage

• industry needs a new hype after SOA

[AFG+10] Armbrust, M., Fox, A., Griffith, R. et al., "Above the Clouds: A Berkeley View of Cloud Computing", UC Berkeley RAD Lab, 2010

CC ingredients

• heterogeneous computers

• commodity hardware

• virtualisation

• high-speed network

Cloud stack

source: http://www.saasblogs.com/images/uploads/2008/12/cloud_stack.gif

Cloud service providers

• Google App Engine

• Amazon Elastic Compute Cloud (EC2)

• Microsoft Azure

• force.com

• Google Docs

• … and many more

questions?

MapReduce

MapReduce (logical)

[figure: file → Map (M) → partitions → Reduce (R) → files; several map tasks (M) feed several reduce tasks (R)]
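The logical flow on this slide (file, Map, partitions, Reduce, files) can be illustrated with a small single-process sketch. It is only a toy word count under stated assumptions: the names `map_fn`, `partition`, `reduce_fn` and `run_job` are hypothetical, the partitioner is a made-up deterministic one, and nothing here is distributed or taken from Hadoop or Google's implementation.

```python
from collections import defaultdict


def map_fn(line):
    """Map: emit one (word, 1) pair per word of an input line."""
    return [(word.lower(), 1) for word in line.split()]


def partition(pairs, num_reducers):
    """Assign every key to one partition and group its values there."""
    parts = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Toy partitioner (assumption): sum of character codes modulo R.
        index = sum(map(ord, key)) % num_reducers
        parts[index][key].append(value)
    return parts


def reduce_fn(key, values):
    """Reduce: collapse all values of one key into a single count."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Run map, partition and reduce in sequence; one dict per output 'file'."""
    mapped = [pair for line in lines for pair in map_fn(line)]
    parts = partition(mapped, num_reducers)
    return [dict(reduce_fn(k, v) for k, v in sorted(part.items()))
            for part in parts]


outputs = run_job(["a rose is a rose", "is it"])
for number, part in enumerate(outputs):
    print(f"part-{number}: {part}")
```

Each key lands in exactly one partition, so the reducers can work independently; that is what makes the reduce phase parallelisable.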

MapReduce attributes

• fault tolerance

• implicit parallelisation

• data locality

• schema free

• robustness (skips "bad records")
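The "skips bad records" attribute can be pictured with a minimal sketch. This is a deliberate simplification and an assumption on my part: real MapReduce systems detect failing records across task re-executions, whereas this toy catches a parse error in-process. The names `parse_record`, `map_with_skipping` and the `max_skipped` limit are hypothetical.

```python
def parse_record(line):
    """Parse a 'user,amount' line; raises ValueError on malformed input."""
    user, amount = line.split(",")
    return user, float(amount)


def map_with_skipping(lines, max_skipped=2):
    """Apply parse_record to every line, skipping records that fail to parse.

    The job only fails when more than max_skipped records are bad
    (hypothetical threshold, chosen for illustration).
    """
    emitted, skipped = [], []
    for line in lines:
        try:
            emitted.append(parse_record(line))
        except ValueError:
            skipped.append(line)
            if len(skipped) > max_skipped:
                raise RuntimeError("too many bad records, giving up")
    return emitted, skipped


emitted, skipped = map_with_skipping(["ann,3.5", "garbage", "bob,x", "cy,2"])
print(emitted)
print(skipped)
```

Tolerating a few bad records trades exactness for progress, which is often acceptable for statistical analyses over very large inputs.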

questions?

themes for your talk

themes

1. MapReduce programming model

2. distributed file systems (GFS, HDFS)

3. Hadoop (without HDFS)

4. MapReduce vs. PDBMS (a comparison of MapReduce and Parallel DBMS)

5. HadoopDB (architectural hybrid of MapReduce and DBMS)

6. Hive (a data warehouse on top of MapReduce with an SQL-like QL)

7. Pig/PigLatin (dataflow system with SQL-like QL)

8. PACT/Nephele (project Stratosphere - a database system in the Cloud – work in progress)

9. MapReduce Online (extension of the MapReduce model for online aggregation)

10. Map-Reduce-Merge (extension of the MapReduce model for joins)

11. SQL/MapReduce (uses MapReduce model for UDF-programming inside a DBMS - Aster Data nCluster)

12. MapReduce for Multi-Cores / Multi-Processors (Evaluation of the MapReduce model on Multi-Core and Multi-Processor systems – project Phoenix)

13. Dryad/DryadLINQ (the Microsoft approach to Cloud Computing – execution system and a QL)

14. MapReduce and functional programming (the MapReduce model discussed from a functional programming perspective)
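As a pointer for theme 14, the functional reading of the model can be sketched with ordinary higher-order functions. This is a rough illustration only: `map_reduce` is a hypothetical helper, and it assumes the reducer is a binary function folded over the values of one key, which is narrower than the general reduce of the MapReduce model (a function of a key and all of its values).

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter


def map_reduce(mapper, reducer, records):
    """Apply mapper to every record, group the pairs by key, fold each group."""
    pairs = [pair for record in records for pair in mapper(record)]
    pairs.sort(key=itemgetter(0))  # groupby needs equal keys to be adjacent
    return {
        key: reduce(reducer, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    }


# Example data (made up): maximum temperature per city.
readings = [("berlin", 12), ("bonn", 9), ("berlin", 17)]
hottest = map_reduce(lambda reading: [reading], max, readings)
print(hottest)
```

Because `mapper` and `reducer` are plain function arguments, the same skeleton serves any job; only the two functions change.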