sathya final review

ARRAY PROCESSOR FEATURING AN EFFECTIVE FIFO BASED DATA STREAM MANAGEMENT. PROJECT INTERNAL GUIDE: Mrs. I. VATSALAPRIYA, M.E. PROJECT MEMBERS: S. SATHIYA SAINATHAN, P. SRIBALAMURUGAN

Upload: sathiyasainathan-soundararajan

Post on 19-Jan-2017


TRANSCRIPT

Page 1: Sathya Final review

ARRAY PROCESSOR FEATURING AN EFFECTIVE FIFO BASED DATA STREAM MANAGEMENT

PROJECT INTERNAL GUIDE: Mrs. I. VATSALAPRIYA, M.E.

PROJECT MEMBERS: S. SATHIYA SAINATHAN, P. SRIBALAMURUGAN

Page 2: Sathya Final review

SYNOPSIS

1. ABSTRACT

2. NEED FOR PARALLEL COMPUTING

3. INTRODUCTION TO PARALLEL PROCESSOR AND ITS FEATURES

4. ARRAY PROCESSOR

5. SYSTOLIC ARRAY PROCESSOR

6. BASE PAPER ARCHITECTURE FOR MATRIX CALCULATION

7. PROJECT THEME IMAGE ROTATION AND IMAGE TRANSPOSE

8. COMPARISON BETWEEN MATLAB AND ARRAY PROCESSOR

9. PROPOSED ARCHITECTURE

10. OUTPUT AND OTHER APPLICATIONS

11. CONCLUSION

Page 3: Sathya Final review

ABSTRACT

• In array processors, data I/O management is the key to realizing high-speed matrix operations that are often required in image processing.

• In this project, we propose an array processor utilizing an effective data I/O mechanism featuring external FIFOs.

• FIFOs are used as buffers to store initial matrix data and partially processed results. Therefore, matrix operations, including the algorithm to solve the Algebraic Path Problem (APP), can be performed without any data I/Os.

• In addition, we can eliminate register files from the processing elements (PEs) if we construct the PE array by controlling the external FIFOs systematically and transferring the data from the FIFOs to the PE array and vice versa.

• This enables us to simplify each PE structure and realize a large array processor with limited hardware resources.

• The FIFOs themselves can be easily realized using conventional discrete FIFO or memory chips.
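The role of the FIFOs as buffers for both initial data and partial results can be illustrated with a minimal software sketch. This is not the project's hardware design: the names `column_sums`, `input_fifo`, and `result_fifo` are hypothetical, and column summation is chosen only as a simple stand-in for a matrix operation. Each simulated PE keeps no state of its own; its operands arrive from the FIFOs and its result goes straight back into a FIFO.

```python
from collections import deque


def column_sums(matrix):
    """Sum each column by streaming rows through FIFOs (illustrative only)."""
    width = len(matrix[0])
    # Input FIFO holds the initial matrix data, one row per entry.
    input_fifo = deque(matrix)
    # Result FIFO holds the partially processed result between steps.
    result_fifo = deque([[0] * width])

    while input_fifo:
        row = input_fifo.popleft()       # next row of initial data
        partial = result_fifo.popleft()  # partial result from the previous step
        # One stateless PE per column: each adds its two operands.
        result_fifo.append([p + x for p, x in zip(partial, row)])

    return result_fifo.popleft()


print(column_sums([[1, 2], [3, 4]]))  # [4, 6]
```

Because the partial result lives in `result_fifo` rather than inside the PEs, the per-PE logic reduces to a single adder in this sketch.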

Page 4: Sathya Final review

Need for Parallel Computing

• Development in every future field depends on digital computing.
• Controlling applications by means of digital circuits is simple and cost-effective.
• The increase in complex computational steps in digital processing results in performance degradation.
• To address this problem, we propose a highly efficient architectural design for parallel computing.

Page 5: Sathya Final review

Parallel vs. Serial Computing

Serial Computing
• Traditionally, software has been written for serial computation.
• To be run on a single computer having a single Central Processing Unit (CPU).
• A problem is broken into a discrete series of instructions.
• Instructions are executed one after another.
• Only one instruction may execute at any moment in time.

Parallel Computing
• Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.
• To be run using multiple CPUs.
• A problem is broken into discrete parts that can be solved concurrently.
• Each part is further broken down to a series of instructions.
• Instructions from each part execute simultaneously on different CPUs.
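The contrast between the two models can be sketched in a few lines. The function names and the sum-of-squares workload are hypothetical and chosen only for illustration. Worker threads stand in for the multiple CPUs, so in CPython this shows how the problem is decomposed rather than delivering a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Serial model: one instruction stream, elements handled one after another."""
    total = 0
    for v in values:
        total += v * v
    return total


def parallel_sum_of_squares(values, workers=4):
    """Parallel model: split the problem into parts and solve them concurrently."""
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each part is itself a series of instructions run by one worker.
        partial_sums = list(pool.map(serial_sum_of_squares, chunks))
    return sum(partial_sums)


data = list(range(10))
print(serial_sum_of_squares(data), parallel_sum_of_squares(data))  # 285 285
```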

Page 6: Sathya Final review

Features of Parallel Computing

• It processes multiple data items simultaneously.
• It reduces the computation time.
• The cost of the extended architecture design is traded off to achieve accuracy and speed of execution.
• Complexity is reduced.
• It offers numerous advantages.

Page 7: Sathya Final review

Array processor

• A multiprocessor composed of a set of identical central processing units.
• A processor that is capable of performing simultaneous computations on elements of an array of data in some number of dimensions.
• The CPUs act synchronously (in parallel) under the control of a common unit.
• Designed specifically for matrix calculations.

Page 8: Sathya Final review

Systolic Array Processor

• It is the existing processor architecture.
• A systolic array is a pipe network arrangement of processing units called cells.
• It operates in parallel.
• Cells compute and store data independently of each other.
• Cells consist of data processing units (DPUs).
• DPUs are connected with each other in a mesh-like arrangement.

Page 9: Sathya Final review

Block diagram of systolic array

Page 10: Sathya Final review

Drawbacks of systolic array processor

• Expensive.
• Highly specialized for particular applications.
• Difficult to build.
• Limited memory.
• A larger number of registers is required.

Page 11: Sathya Final review

Features of array processor

• High-speed matrix operation.
• We can eliminate register files from processing units.
• This is achieved here by means of FIFOs.
• Control and scalar type instructions are executed in the control unit.
• Vector instructions are performed in the processing elements.

Page 12: Sathya Final review

Base paper architecture

• A 2D array processor architecture is proposed that eliminates the ALU and external RAM, since all the calculations can be performed by rotating and shifting the matrix data.

• Consists of individual Processing Elements.

• Supports a simple instruction set.

• Avoids Algebraic Path Problem.

Figure: 2D toroidal structure of our proposed array processor
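Shifting data on a toroidal structure means that a value leaving one edge of the array re-enters at the opposite edge. The sketch below models that wraparound in software under simple assumptions; `shift_rows_right` and `shift_cols_down` are hypothetical helper names, not instructions from the base paper.

```python
def shift_rows_right(grid, steps=1):
    """Shift every row right; values leaving the right edge wrap to the left."""
    width = len(grid[0])
    return [[row[(c - steps) % width] for c in range(width)] for row in grid]


def shift_cols_down(grid, steps=1):
    """Shift every column down; values leaving the bottom edge wrap to the top."""
    height = len(grid)
    width = len(grid[0])
    return [[grid[(r - steps) % height][c] for c in range(width)]
            for r in range(height)]


grid = [[1, 2, 3],
        [4, 5, 6]]
print(shift_rows_right(grid))  # [[3, 1, 2], [6, 4, 5]]
print(shift_cols_down(grid))   # [[4, 5, 6], [1, 2, 3]]
```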

Page 13: Sathya Final review

Our Objective

• The project aim is to rotate and transpose an image, represented as a matrix of image coefficients.

• To study how image rotation and transpose work in both Matlab and the array processor.

• To show the difference in time and registers required between the two methods.

Page 14: Sathya Final review

Image rotation in Matlab

• This is considered the normal method of image rotation.

• It needs more clock cycles.

• It requires more memory.

• It needs more internal registers to store data.

• It is a time-consuming process.

Page 15: Sathya Final review

Image rotation in Matlab

Time taken for rotation = 0.128728 seconds.

Page 16: Sathya Final review

Example

• Taking the 2x2 matrix below, let us calculate how much time and memory it is going to consume in both systems.

• Aim: to achieve 90° rotation and transpose using Matlab and the array processor.

location  [0]  [1]
[0]        1    2
[1]        3    4

Page 17: Sathya Final review

Matlab algorithm for image rotation

Required variables:
• Temporary variables: s, t
• Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]
• Total variables required: 6

Procedure for rotation:
• s = a[0][0], t = a[0][1]; (1st clock cycle)
• a[0][0] = a[1][0]; (2nd clock cycle)
• a[1][0] = a[1][1]; (3rd clock cycle)
• a[0][1] = s; (4th clock cycle)
• a[1][1] = t; (5th clock cycle)
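The five-step procedure above can be traced in software. The sketch below is a Python rendering of the slide's steps, not Matlab code; the function name `rotate_2x2_sequential` and the `steps` counter are hypothetical additions that count one step per line of the procedure.

```python
def rotate_2x2_sequential(a):
    """Rotate a 2x2 matrix in place, one assignment step at a time."""
    steps = 0
    s, t = a[0][0], a[0][1]  # step 1: save the top row in temporaries
    steps += 1
    a[0][0] = a[1][0]        # step 2
    steps += 1
    a[1][0] = a[1][1]        # step 3
    steps += 1
    a[0][1] = s              # step 4
    steps += 1
    a[1][1] = t              # step 5
    steps += 1
    return a, steps


matrix, steps = rotate_2x2_sequential([[1, 2], [3, 4]])
print(matrix, steps)  # [[3, 1], [4, 2]] 5
```

The temporaries `s` and `t` are needed because the top row is overwritten before its old values have been copied to their new locations.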

Page 18: Sathya Final review

Drawbacks in Matlab rotation

• More variables are required.
• It takes 5 clock cycles to rotate the 2x2 example matrix.
• It takes 0.128 s to rotate an image.
• More memory (registers) is required.
• From a design standpoint, more gates are also needed.

Page 19: Sathya Final review

Array processor algorithm for image rotation

For the same example, the algorithm for rotation in the array processor is:

• a[0][0] <- a[1][0];
• a[0][1] <- a[0][0];
• a[1][0] <- a[1][1];
• a[1][1] <- a[0][1];

All four transfers take place in the 1st clock cycle (in parallel).

("No need of temporary variables")
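The four parallel transfers all read the old matrix and write the new one in the same step, which is why no temporaries are needed. The sketch below models this by building the output only from the input; the name `rotate_parallel` and the generalization to an n x n matrix are assumptions added for illustration. In software the comprehension still runs sequentially, so it models the simultaneous read rather than the hardware timing.

```python
def rotate_parallel(a):
    """Rotate a square matrix 90 degrees clockwise, reading only old values."""
    n = len(a)
    # new[i][j] <- old[n-1-j][i]; for n = 2 this gives the four transfers above.
    return [[a[n - 1 - j][i] for j in range(n)] for i in range(n)]


print(rotate_parallel([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```

Every output element depends on exactly one input element, so each transfer could be assigned to its own processing element.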

Page 20: Sathya Final review

Advantages of array processor Rotation

• It takes only one clock cycle.
• It takes 150 ns to rotate the image.
• No need of temporary variables.
• Less memory (registers).
• Fewer gates are required.
• The design is also simple.

Page 21: Sathya Final review

Matlab algorithm for image Transpose

Required variables:
• Temporary variables: s, t
• Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]
• Total variables required: 6

Procedure for transpose:
• s = a[1][0], t = a[1][1]; (1st clock cycle)
• a[0][0] = a[0][0]; (2nd clock cycle)
• a[1][0] = a[0][1]; (3rd clock cycle)
• a[0][1] = s; (4th clock cycle)
• a[1][1] = t; (5th clock cycle)

Page 22: Sathya Final review

Matlab algorithm for image Transpose

Time taken for transpose = 0.082730 seconds

Page 23: Sathya Final review

Drawbacks in Matlab transpose

• It takes 5 clock cycles to transpose the 2x2 example matrix.

• It takes 0.0827 s to transpose an image.

• More memory (registers) is required.

• From a design standpoint, more gates are also needed.

Page 24: Sathya Final review

Array processor algorithm for image transpose

For the same example, the algorithm for transpose in the array processor is:

• a[0][0] <- a[0][0];
• a[0][1] <- a[1][0];
• a[1][0] <- a[0][1];
• a[1][1] <- a[1][1];

All four transfers take place in the 1st clock cycle (in parallel).

("No need of temporary variables")
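As with rotation, the parallel transpose reads every source location before any destination is written. A minimal sketch of that mapping follows; the name `transpose_parallel` is hypothetical, and handling non-square matrices is an assumption beyond the 2x2 example.

```python
def transpose_parallel(a):
    """Transpose a matrix by reading only old values: new[i][j] <- old[j][i]."""
    rows, cols = len(a), len(a[0])
    return [[a[j][i] for j in range(rows)] for i in range(cols)]


print(transpose_parallel([[1, 2], [3, 4]]))  # [[1, 3], [2, 4]]
```

For the 2x2 case the diagonal elements map to themselves, matching the first and last transfers in the list above.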

Page 25: Sathya Final review

Advantages of array processor Transpose

• It takes only one clock cycle.
• It takes 100 ns to transpose the image.
• No need of temporary variables.
• Less memory (registers).
• Fewer gates are required.
• The design is also simple.

Page 26: Sathya Final review

Proposed architecture for image rotation

• Internally, the PEs and FIFOs are nothing but registers.
• They have no fixed behavior of their own; they simply obey the coded program of the proposed system.

Page 27: Sathya Final review

Proposed architecture for image transpose

• Internally, the PEs and FIFOs are nothing but registers.
• They have no fixed behavior of their own; they simply obey the coded program of the proposed system.

Page 28: Sathya Final review

Operation of proposed system

• The Rotate and Transpose commands are activated.
• The rotation and the transpose are each done synchronously in a single clock cycle.
• All the processing elements are capable of reading as well as writing data.
• Read and write operations are performed synchronously (in parallel).
• Buses and FIFOs between the PEs play a major role in reducing the number of registers.

Page 29: Sathya Final review

Output of the rotated image coefficients

Time taken for rotation = 150 ns.

Page 30: Sathya Final review

Output of the transposed image coefficients

Time taken for transposition = 100 ns.

Page 31: Sathya Final review

Comparison between matlab and array processor operations
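The figures reported on the earlier slides can be put side by side with a short calculation. The sketch below uses only numbers stated in this deck (0.128728 s and 0.082730 s for Matlab, 150 ns and 100 ns for the array processor, 5 clock cycles versus 1); the constant and function names are hypothetical.

```python
# Times reported on the slides, in seconds.
MATLAB_ROTATION_S = 0.128728
MATLAB_TRANSPOSE_S = 0.082730
ARRAY_ROTATION_S = 150e-9
ARRAY_TRANSPOSE_S = 100e-9

# Clock cycles for the 2x2 example.
SEQUENTIAL_CYCLES = 5
PARALLEL_CYCLES = 1


def ratio(slow, fast):
    """How many times larger the slow figure is than the fast one."""
    return slow / fast


rotation_ratio = ratio(MATLAB_ROTATION_S, ARRAY_ROTATION_S)
transpose_ratio = ratio(MATLAB_TRANSPOSE_S, ARRAY_TRANSPOSE_S)
cycle_ratio = ratio(SEQUENTIAL_CYCLES, PARALLEL_CYCLES)

print(f"rotation time ratio:  {rotation_ratio:.3g}")
print(f"transpose time ratio: {transpose_ratio:.3g}")
print(f"clock cycle ratio:    {cycle_ratio:.0f}")
```

On these reported figures, both time ratios are on the order of 8 x 10^5, and the cycle count for the 2x2 example drops from 5 to 1.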

Page 32: Sathya Final review

OTHER APPLICATIONS OF ARRAY PROCESSOR

Source: https://computing.llnl.gov/tutorials/parallel_comp/

Page 33: Sathya Final review

CONCLUSION

• Thus, image processing on the array processor is shown to be more efficient than the Matlab implementation it was compared against.

• In future, the number of registers used can be reduced by using more buses in the PEs.

• The processing time can then also be reduced by reducing the usage of registers.

• From this project, we have learnt one aspect of chip design.