lecture 3 cs492 special topics in computer science distributed algorithms and systems

Lecture 3 CS492 Special Topics in Computer Science Distributed Algorithms and Systems

Upload: rosamund-nelson

Post on 12-Jan-2016

212 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Lecture 3

CS492 Special Topics in Computer ScienceDistributed Algorithms and Systems

2Fall 2008 CS492

MapReduce

Simplified data processing on large clusters A programming model and an associated implementation

for processing and generating large data sets Inspired by “map” and “reduce” primitives present in

Lisp and many other functional languages Simple and powerful interface

that enables automatic paralization and distribution of large-scale computations

Page 3: Lecture 3 CS492 Special Topics in Computer Science Distributed Algorithms and Systems

3Fall 2008 CS492

Map and Reduce

Map map <k1, v1> list (k2, v2) example

k1 = document, v1 = contents k2 = words, v2 = 1

Reduce reduce (k2, list(v2)) list (v2)

k2 = words, v2 = 1 list (v2) = total count of v2 (or k2)

Page 4: Lecture 3 CS492 Special Topics in Computer Science Distributed Algorithms and Systems

4Fall 2008 CS492

MapReduce Execution Overview

Page 5: Lecture 3 CS492 Special Topics in Computer Science Distributed Algorithms and Systems

5Fall 2008 CS492

Examples

Distributed grep Count of URL access frequency Reverse web-link graph Term-vector per host

Page 6: Lecture 3 CS492 Special Topics in Computer Science Distributed Algorithms and Systems

6Fall 2008 CS492

Pig platform for analyzing large data sets that consists of a

high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets

Dryad Distributed data-parallel programs from sequential build-

ing blocks (EuroSys 2007)

6.006 Lecture 24 Original: Algorithms research topics · Title: 6.006 Lecture 24 Original: Algorithms research topics Author: Demaine, Erik | Devadas, Srini Created Date: 12/13/2011

1 Parallel Algorithms IV Topics: image analysis algorithms

in Computer Science NLP in a Nutshell - KAIST …nlpcl.kaist.ac.kr/~cs492/lecture02.pdfSpecial Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 JongC. Park

08 Genetic Algorithms & Modeling - WordPress.com 08, 2015 · Genetic algorithms & Modeling, topics : Introduction, why genetic algorithm? search optimization methods, evolutionary

Caching Parallel Computational Models Other Topics in Algorithms Wednesday, August 13 th

Advanced Topics in Algorithms Shamir

Genetic Algorithms & Modeling - Myreaders.info · 2015. 12. 23. · Genetic Algorithms & Modeling Soft Computing Genetic algorithms & Modeling, topics : Introduction, why genetic

Topics in Algorithms 2007

CS 6293 Advanced Topics: Current Bioinformatics Lecture 5 Exact String Matching Algorithms

Algorithms for Maximum independent Setsgasarch/TOPICS/sat/robson.pdf · 2013-08-13 · JOURNAL OF ALGORITHMS 7,425-440 (1986) Algorithms for Maximum independent Sets J.M. ROBSON Depurtment