Parallel Computers
Organizations and Architecture
Department of Computer Science
Southern Illinois University Edwardsville
Summer, 2015
Dr. Hiroshi Fujinoki
E-mail: [email protected]
CS 312 Computer Organization and Architecture
Four hardware architectures for “parallel computers”
Tightly-Coupled Multi-Processor System
Functionally-Specialized Multi-Processor System
Loosely-Coupled Multi-Processor System
Distributed Systems (“most loosely coupled systems”)
Tightly-Coupled Multi-Processor System
• Multi-Processor System (multi-processor motherboard)
• Single-Processor System with a multi-core processor
[Figure: Multi-Processor System next to Single-Processor System with multi-core processor. Labels: Motherboard, Processor, Processor Core (ALU and others)]
[Photo: two processors on a motherboard (multi-processor system)]
[Photo: CPU cores (single-processor system with a multi-core processor)]
Functionally-Specialized Multi-Processor System
Examples:
• GPU on graphics card (GPU = “Graphics Processing Unit”)
• Built-in processor on high-speed disk controllers or NICs (especially those using DMA)
[Figure labels: Motherboard, Processor, Graphic Interface, GPU, Video RAM (“VRAM”), DAC, Monitor (CRT, Flat Panel)]
(1) The processor sends a graphics command to the GPU.
(2) The GPU processes image data in the graphics card memory.
(3) The graphics card performs D/A conversion using the DAC.
(4) The graphics card sends analog image signals (RGB signals) to the monitor.
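The D/A conversion step on the graphics card can be sketched roughly as follows. This is only an illustration, not the card's actual circuitry: the function names `dac` and `scan_out`, the 8-bit colour depth, and the 0.7 V full-scale level are all assumptions that do not come from the slides.

```python
FULL_SCALE_VOLTS = 0.7  # assumed full-scale analog level; not from the slides


def dac(level, bits=8, full_scale=FULL_SCALE_VOLTS):
    """Convert one digital colour level into an analog voltage (hypothetical model)."""
    max_level = (1 << bits) - 1
    if not 0 <= level <= max_level:
        raise ValueError(f"level must be in 0..{max_level}")
    return full_scale * level / max_level


def scan_out(vram):
    """Turn every (R, G, B) pixel held in video memory into three analog levels."""
    return [tuple(dac(channel) for channel in pixel) for pixel in vram]


# Two pixels in a toy "VRAM": pure red and mid grey.
signals = scan_out([(255, 0, 0), (128, 128, 128)])
```

The main processor never runs this loop itself. In the architecture above it only issues the command, and the specialized hardware on the card does the per-pixel work.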
[Photo: DMA SCSI I/O card. Labels: CPU, Control Program (in ROM)]
Loosely-Coupled Multi-Processor System
• Multi-Systemboard (multiple motherboard) computers
• A computer with multiple motherboards (“blades”)
• Blades communicate through the bus
• Each blade is a computer
• Communication delay over the bus is at least on the order of “μs”
[Figure labels: Computer System “Bus”, System Board (Motherboard), Processor, Memory]
Distributed Systems (“most loosely coupled systems”)
Each of the four systems (AS 1, AS 2, AS 3, AS 4) on the network has its own:
• Processor
• Local Memory
• Secondary Storage
• Other I/O
Two kinds of migration take place over the network:
• Process Migration: a process (executable code) moves from one system to another
• Data Migration: a file (data) moves from one system to another
Three different types of tightly-coupled multi-processor systems
(1) “Fine-grained” multi-processor parallel computers
(2) “Medium-grained” multi-processor parallel computers
(3) “Coarse-grained” multi-processor parallel computers
Fine-Grained Multi-Processing
• Fine-grained = instruction-level multi-processing
• Granularity: 1~20 instructions
Your program (binary executable), executed on two CPUs:
A = B + C;   X = Y + Z;   (independent of each other, so each can run on its own CPU)
(synchronization)
W = A + X;   (dependency: needs both A and X)
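The dependency in the fine-grained example can be sketched as below. This is only an analogy: Python threads are far too heavyweight to stand in for real instruction-level parallelism, and the function name `fine_grained` is made up for illustration. What the sketch does show is the shape of the computation: two independent additions, a synchronization point, then the dependent addition.

```python
from concurrent.futures import ThreadPoolExecutor


def fine_grained(b, c, y, z):
    """Compute W = (B + C) + (Y + Z) with the two inner sums running concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(lambda: b + c)  # A = B + C  (first worker)
        future_x = pool.submit(lambda: y + z)  # X = Y + Z  (second worker)
        # Synchronization: W cannot be computed until both A and X exist.
        a = future_a.result()
        x = future_x.result()
    return a + x  # W = A + X


w = fine_grained(1, 2, 3, 4)
```

Waiting on both results before the final addition plays the role of the "synchronization" step on the slide.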
Medium-Grained Multi-Processing
• Medium-grained = thread-level multi-processing
[Figure: Your program (binary executable) divided into ThreadA, ThreadB, ThreadC, and ThreadD, executed by two processors]
• Example: Web Browser
ThreadA -- Display thread (text output & JPEG image processing)
ThreadB -- Taking user inputs (edit boxes, radio buttons in the browser window)
ThreadC -- Network input (receiving data from network)
ThreadD -- Network output (sending data to network)
[Figure: timeline of ThreadA, ThreadB, ThreadC, and ThreadD. Labels: receiving data, displaying data, user makes inputs, receiving data, transmit data]
• Running ThreadA through ThreadD of the web browser as separate threads: browser execution with better responses
• Granularity: 20~200 instructions
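The four-thread browser example might look roughly like this in code. It is a minimal sketch under stated assumptions: the network and the user are simulated with plain lists, the two queues and every function name are hypothetical, and a real browser is organized very differently.

```python
import queue
import threading


def network_input(incoming, pages):
    """ThreadC: receive data from the (simulated) network."""
    for page in pages:
        incoming.put(page)
    incoming.put(None)  # sentinel: nothing more to receive


def display(incoming, shown):
    """ThreadA: display each item as soon as ThreadC has received it."""
    for page in iter(incoming.get, None):
        shown.append(f"rendered {page}")


def user_input(outgoing, clicks):
    """ThreadB: take (simulated) user inputs."""
    for click in clicks:
        outgoing.put(click)
    outgoing.put(None)  # sentinel: the user is done


def network_output(outgoing, sent):
    """ThreadD: send each user request to the (simulated) network."""
    for request in iter(outgoing.get, None):
        sent.append(f"sent {request}")


def run_browser(pages, clicks):
    incoming, outgoing = queue.Queue(), queue.Queue()
    shown, sent = [], []
    threads = [
        threading.Thread(target=display, args=(incoming, shown)),        # ThreadA
        threading.Thread(target=user_input, args=(outgoing, clicks)),    # ThreadB
        threading.Thread(target=network_input, args=(incoming, pages)),  # ThreadC
        threading.Thread(target=network_output, args=(outgoing, sent)),  # ThreadD
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shown, sent


shown, sent = run_browser(["index.html", "logo.jpg"], ["click: search"])
```

Because each thread blocks only on its own queue, the display thread can keep rendering while user input is still arriving, which is the "better responses" effect described above.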
Coarse-Grained Multi-Processing
• Coarse-grained = process-level multi-tasking
• Process assignment to multiple processors in a multi-tasking environment
[Figure labels: Memory, Processor, Time]
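Process assignment to multiple processors can be illustrated with a small scheduler sketch. The slides do not say which policy is used, so the rule below (each process, in arrival order, goes to whichever processor becomes free first) is an assumption, as are the name `assign_processes` and the burst times.

```python
import heapq


def assign_processes(burst_times_ms, num_processors):
    """Give each process, in order, to the processor that becomes free first.

    Returns a list of (process id, processor id, start ms, end ms) tuples.
    """
    # Min-heap of (time at which the processor becomes free, processor id).
    free_at = [(0, cpu) for cpu in range(num_processors)]
    heapq.heapify(free_at)
    schedule = []
    for pid, burst in enumerate(burst_times_ms):
        start, cpu = heapq.heappop(free_at)
        schedule.append((pid, cpu, start, start + burst))
        heapq.heappush(free_at, (start + burst, cpu))
    return schedule


# Three processes with bursts of 100 ms, 50 ms and 30 ms on two processors.
schedule = assign_processes([100, 50, 30], num_processors=2)
```

In this run the third process lands on the second processor at 50 ms, because that processor finishes its first process before the other one does.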
[Figure labels: Memory, Processor Pool, Time]
• Granularity = ms order
• 1 ms (@ 1 GHz) = 1 million instructions (assuming about one instruction per clock cycle)
• 100 ms (@ 1 GHz) = 100 M instructions
• Granularity: 1~100 M instructions
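The instruction counts above follow from clock rate times time. The sketch below reproduces the arithmetic, assuming one instruction per clock cycle. That rate is a simplification the slides appear to imply but do not state, and the function name `instructions_in` is made up for illustration.

```python
def instructions_in(interval_ms, clock_hz=1_000_000_000, instructions_per_cycle=1):
    """Number of instructions executed in interval_ms milliseconds."""
    cycles = clock_hz * interval_ms // 1000  # clock cycles in the interval
    return cycles * instructions_per_cycle


one_ms = instructions_in(1)          # 1,000,000 instructions
hundred_ms = instructions_in(100)    # 100,000,000 instructions
```

A processor that retires more or fewer instructions per cycle would shift these figures proportionally, but the granularity stays on the order of millions of instructions.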