
CPE411 Parallel and Distributed Computing

Week 1: Introduction

Pruet Boonma

pruet@eng.cmu.ac.th

Department of Computer Engineering

Faculty of Engineering, Chiang Mai University

2

In this class

• Parallel Computing
  • Architecture, paradigm and issues
  • Shared memory vs. message passing
  • Operating systems and middleware
  • Algorithm model and complexity

• Distributed Computing
  • Architecture, paradigm and issues
  • Tier vs. peer-to-peer architectures
  • Operating systems and middleware
  • Algorithm model and complexity

• Hardware support

• Advanced topics

3

What I want from you

• Your original work.
  • Plagiarism gets zero tolerance in my class.
  • That means if you copy your friend's work, you get an F.

• It's OK to submit incomplete or buggy work.
  • We can discuss how to make it better.

• Your attention.
  • There is no attendance score, so it's OK not to attend my class.
  • But you have to submit your work.

• If you're in class, please respect me and your friends.
  • No cell phones ringing, no snoring (sleeping quietly is OK), no smelly food (food/drink is OK in the lecture room).

4

What do you want from me?

• Let me know.
  • pruet@eng.cmu.ac.th

5

Grading

● Homework 30%

● Report/Presentation 40%

● Midterm 10%

● Final 20%

● A – [85%, 100%]

● B+ – [80%, 85%)

● B – [75%, 80%)

● C+ – [70%, 75%)

● C – [65%, 70%)

● You don't want anything below C... believe me.

6

Introduction

Let's start with some terminology

Parallel vs. Distributed vs. Concurrent Computing

7

Introduction

Parallel computing is a computational approach in which a problem is decomposed into smaller subproblems that are solved simultaneously using multiple processors.

Think of it as multiple workers laying bricks on the same house.
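As a minimal sketch of this idea (using Python's multiprocessing module; the four-way split and the sum problem are arbitrary choices for illustration), a problem is decomposed into subproblems that separate worker processes solve simultaneously:

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker solves one small subproblem independently.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Decompose the problem into `workers` smaller subproblems.
    chunks = [data[i::workers] for i in range(workers)]
    with Pool(processes=workers) as pool:
        # Solve the subproblems simultaneously, then combine the results.
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    total = parallel_sum(list(range(100_000)))
```

The result equals the sequential `sum(data)`; any speedup depends on how many processors are actually available.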

8

Introduction

Distributed computing is a computational approach in which a collection of autonomous computers, communicating through a computer network, works together to solve a problem.

Think of it as soccer players playing together as a team.
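A toy sketch of the idea (using Python's socket module, with both "computers" simulated on one machine: a server node that sums numbers, and a client node that sends them over the network):

```python
import socket
import threading

def serve_once(srv):
    # The "remote" autonomous node: receives numbers, replies with their sum.
    conn, _ = srv.accept()
    nums = [int(x) for x in conn.recv(1024).split()]
    conn.sendall(str(sum(nums)).encode())
    conn.close()

def ask_remote_sum(nums):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))      # OS picks a free port for the "server" node
    srv.listen(1)
    threading.Thread(target=serve_once, args=(srv,), daemon=True).start()

    cli = socket.socket()           # the "local" node
    cli.connect(("127.0.0.1", srv.getsockname()[1]))
    cli.sendall(" ".join(map(str, nums)).encode())
    result = int(cli.recv(1024))    # the two nodes cooperate via messages only
    cli.close()
    srv.close()
    return result
```

In a real distributed system the two endpoints would be separate machines with no shared memory; here the network round-trip is the only channel between them, which is the defining constraint.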

9

Introduction

Concurrent computing is a computational approach in which programs are designed as collections of interacting computational processes that may be executed in parallel.

Think of it as a student trying to listen to a lecture and browse Facebook at the same time.
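A minimal sketch in Python (using the threading module; the task names just echo the analogy): two tasks whose steps interleave, whether or not they truly run in parallel:

```python
import threading
import time

log = []

def task(name, steps):
    for _ in range(steps):
        log.append(name)    # record one "step" of this activity
        time.sleep(0.01)    # give the other task a chance to run

lecture = threading.Thread(target=task, args=("listen", 3))
facebook = threading.Thread(target=task, args=("scroll", 3))
lecture.start(); facebook.start()
lecture.join(); facebook.join()
# `log` is now some interleaving of three "listen" and three "scroll"
# steps; the exact order depends on the scheduler.
```

The nondeterministic ordering in `log` is exactly what distinguishes concurrency from a plain sequential program.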

10

Introduction

Then, what are the differences?

11

Introduction

Parallel computing is a computational approach in which a problem is decomposed into smaller subproblems that are solved simultaneously using multiple processors.

Distributed computing is a computational approach in which a collection of autonomous computers, communicating through a computer network, works together to solve a problem.

Concurrent computing is a computational approach in which programs are designed as collections of interacting computational processes that may be executed in parallel.


13

Parallel Architectures

There are many ways to classify parallel architectures.

One of the most frequently used is Flynn's taxonomy of computer architectures, which classifies parallelism based on instruction and data streams.

What are the instruction and data streams?

14

Parallel Architectures

This is a simple Von Neumann architecture:

[Diagram: memory (program/data) connected to a control unit by separate instruction and data paths, with input and output units attached.]

15

Parallel Architectures

Flynn's taxonomy: SISD, SIMD, MISD, MIMD

S = Single, M = Multiple
I = Instruction, D = Data

16

SISD

Single instruction, single data stream. AKA a synchronous architecture or sequential computer.

[Diagram: a single instruction pool and a single data pool feeding one processing unit.]

17

SIMD

Single instruction, multiple data streams. For example, a GPU.

[Diagram: one instruction pool driving multiple processing units, each with its own data stream.]

18

MISD

Multiple instructions, single data stream. Used in fault-tolerant systems, e.g., the Space Shuttle flight computers.

[Diagram: multiple processing units applying different instructions to the same data stream.]

19

MIMD

Multiple instructions, multiple data streams. E.g., distributed systems.

[Diagram: multiple processing units, each with its own instruction stream and its own data stream.]

20

Parallel Architecture

The other way to classify is by looking at the level of parallelism, from fine-grained to coarse-grained:

• Bit-level parallelism
• Instruction-level parallelism
• Data parallelism
• Task parallelism

21

Bit-level parallelism

Well, most computers now have bit-level parallelism.

For example, an 8-bit CPU can process 8 bits of data simultaneously.

Increasing the number of bits per word can speed up computation.

But what are the limitations and trade-offs?
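To see what word width buys you, here is a sketch (pure Python, with the narrow machine simulated by hand) of adding two 64-bit numbers the way an 8-bit CPU must: one byte at a time, propagating the carry, i.e., eight operations where a 64-bit CPU needs one:

```python
def add64_as_8bit_cpu(a, b):
    """Add two 64-bit values using eight 8-bit additions with carry,
    as a hypothetical 8-bit CPU would have to."""
    result, carry = 0, 0
    for i in range(8):                                 # one pass per 8-bit chunk
        s = ((a >> (8 * i)) & 0xFF) + ((b >> (8 * i)) & 0xFF) + carry
        result |= (s & 0xFF) << (8 * i)                # keep the low 8 bits
        carry = s >> 8                                 # carry into the next chunk
    return result & ((1 << 64) - 1)                    # wrap at 64 bits
```

A 64-bit CPU performs the same addition in a single instruction; the trade-off is wider datapaths, registers, and buses, which cost silicon area and power.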

22

Instruction-level parallelism

Modern computers can execute multiple instructions simultaneously using a pipeline.

For example, the Pentium 4 has a 35-stage pipeline, so up to 35 instructions can be in flight at a time.

The Intel Core architecture reduced the length of the pipeline to 14 stages: the penalty cost of a very long pipeline (e.g., flushing it all on a branch misprediction) is too high.
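A back-of-the-envelope sketch of both effects (idealized model: one instruction completes per cycle once the pipeline is full, and hazards are ignored):

```python
def pipelined_cycles(n_instructions, stages):
    # The first instruction takes `stages` cycles to drain through;
    # after that, one instruction completes every cycle.
    return stages + (n_instructions - 1)

def unpipelined_cycles(n_instructions, stages):
    # Without overlap, every instruction pays the full latency.
    return n_instructions * stages

# 1000 instructions on a 14-stage pipeline vs. no pipelining:
fast = pipelined_cycles(1000, 14)    # 1013 cycles
slow = unpipelined_cycles(1000, 14)  # 14000 cycles
# A branch misprediction flushes the in-flight instructions, wasting
# roughly `stages` cycles -- the deeper the pipeline, the bigger the hit.
```

This is why shortening the pipeline can be a win even though it lowers the number of instructions in flight: the flush penalty shrinks with the depth.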

23

Data Parallelism

Data parallelism is expressed at the application-code level, especially in loops.

If, inside a loop, the same instruction is applied to different data on each iteration, then the iterations can be performed concurrently on different data.

It's a kind of SIMD.
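A sketch of that loop transformation (using Python's multiprocessing; `f` is a stand-in for an arbitrary loop body): because each iteration touches only its own element, the iterations can run concurrently without changing the result:

```python
from multiprocessing import Pool

def f(x):
    # Stand-in for a loop body that depends only on its own element.
    return x * x

def squares_sequential(data):
    return [f(x) for x in data]        # one iteration after another

def squares_parallel(data, workers=4):
    with Pool(processes=workers) as pool:
        return pool.map(f, data)       # same instruction, different data, concurrently
```

The transformation is only valid when iterations are independent; a loop whose iteration i reads what iteration i-1 wrote cannot be parallelized this way.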

24

Task Parallelism

Think of it as distributed computing: different tasks (processes or threads) run at the same time, possibly on different data.
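A sketch of the contrast with data parallelism (using Python's concurrent.futures; the two counting tasks are arbitrary examples): here it is different *tasks*, not different data, that run at the same time:

```python
from concurrent.futures import ThreadPoolExecutor

def count_words(text):
    return len(text.split())

def count_chars(text):
    return len(text)

text = "parallel and distributed computing"
# Task parallelism: two *different* computations run concurrently
# over the same input.
with ThreadPoolExecutor(max_workers=2) as pool:
    words = pool.submit(count_words, text)
    chars = pool.submit(count_chars, text)
    result = (words.result(), chars.result())
```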

25

Current Trends in Parallel Architecture

Multi-core processor

Massively Parallel Processing (MPP)

General-Purpose Computing on Graphics Processing Units (GPGPU)

Vector Processing

26

Multi-core Processor

Multiple processing units (cores) in a single package/die.

They share main memory (RAM) but can have separate cache memory.

The cores can all share the same characteristics or differ, e.g., Intel multi-core processors vs. the PS3 Cell processor.

Multi-core processors can also support simultaneous multithreading (e.g., Hyper-Threading) to increase parallelism.

27

MPP

MPP is a computer with many networked processors.

Many == 100++ processors.

Each processor has its own memory and its own copy of the OS and application, and connects to the others through a high-speed interconnect network (e.g., 100Gbps++).

An example of such a system is IBM's Deep Blue (30 × 120 MHz RISC CPUs + 480 chess chips).

28

GPGPU

Computer graphics processing (3D rendering, texturing, shading) is, by nature, well suited to data-parallel operations.

GPUs are heavily optimized for those kinds of tasks.

So, GPGPU utilizes the GPU to perform non-graphics parallel operations.

Examples of this technology: CUDA, OpenCL.

29

Vector Processing

Well, it's SIMD.

For example, you can perform A = B × C, where A, B, and C are vectors, with only one instruction.

Instead of 2|B| instructions.

Examples: the Cray-1 supercomputer, Intel Streaming SIMD Extensions (SSE).
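Plain Python has no vector instructions, but the one-operation-per-vector idea can be imitated by packing several small lanes into one big integer so that a single addition updates every lane at once (a trick sometimes called SWAR; addition is used here rather than the multiplication above because packed lanes don't multiply cleanly, and each lane is assumed not to overflow 16 bits):

```python
def pack(lanes):
    # Pack four 16-bit values into one 64-bit integer.
    return lanes[0] | (lanes[1] << 16) | (lanes[2] << 32) | (lanes[3] << 48)

def unpack(word):
    return [(word >> (16 * i)) & 0xFFFF for i in range(4)]

B = [1, 2, 3, 4]
C = [10, 20, 30, 40]
A = unpack(pack(B) + pack(C))   # ONE addition updates all four lanes
```

A real vector unit does the same thing in hardware, with dedicated wide registers and per-lane carry isolation, so overflow in one lane cannot corrupt its neighbor.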

30

Quad-Core Processor

31

Cell Processor

32

Deep Blue/BlueGene

33

Cray-1

34

What's next?

Paradigms and issues.
