enhancing the role of inlining in effective interprocedural parallelization

Post on 22-Feb-2016


Enhancing the Role of Inlining in Effective Interprocedural Parallelization

Jichi Guo, Mike Stiles, Qing Yi, Kleanthis Psarris

Problem
• Inter-procedural parallelization
  o Parallelize after inlining
    • Gain more parallelizable loops
    • Loss of parallelized loops
  o Inlining messes up caller / callee
    • Missed parallel opportunities
  o Inlining increases code complexity

Goal
• Keep the gained parallelizable loops
• Prevent the lost parallelism
• Discover the missed opportunities

Solution
• Summarize the code using annotation
  o Express the underlying information
• Inline the annotation before parallelization
  o Pass the summarized information to the compiler
• Reverse-inline after parallelization
  o Revert inlining side effects
  o Maintain equivalence

Outline
• Innovations
• Problems of parallel + inline strategy
• Annotation language
• Annotation-based inlining technique
• Experiments
• Summary


Problems of parallel + inlining

• Parallel + inlining
  o Conventional inlining with heuristics and pre-transformations
    • Heuristics: code size
    • Transformations: linearization, forward substitution
  o Intra-procedural loop parallelization
    • Fortran do-all loop
• Goal
  o Gain loops in caller
• Problems
  o Lost loops in caller / callee
  o Missed loops in caller
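
As a reminder of what "do-all" means here, the following is a minimal Python sketch (not from the slides) of the independence condition a parallelizer has to prove: no location written by one iteration may be read or written by another. The function name `is_doall` and the read/write-set encoding are assumptions made for illustration; a real compiler reasons symbolically rather than enumerating iterations.

```python
from itertools import combinations

def is_doall(iterations):
    """iterations: list of (reads, writes) pairs, each a set of memory locations.

    Returns True when no location written by one iteration is read or
    written by any other iteration (no loop-carried dependence).
    """
    for (reads_a, writes_a), (reads_b, writes_b) in combinations(iterations, 2):
        if writes_a & (reads_b | writes_b) or writes_b & reads_a:
            return False
    return True

n = 5
# DO I = 1, N:  A(I) = B(I) + 1    (each iteration touches only its own A(I))
independent = [({("B", i)}, {("A", i)}) for i in range(1, n + 1)]
# DO I = 1, N:  A(I) = A(I-1) + 1  (iteration I reads what iteration I-1 wrote)
carried = [({("A", i - 1)}, {("A", i)}) for i in range(1, n + 1)]

print(is_doall(independent), is_doall(carried))
```

Inlining helps or hurts depending on whether it makes these read and write sets easier or harder for the compiler to determine.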

Problems of parallel + inlining

• Loss of parallelizable loops in caller / callee
  o Transformations that cause the loss
    • Forward substitution
    • Linearization
• Forward substitution of non-linear subscripts
  o Creates indirect array references
• Linearization of array dimensions
  o Messes up array shapes

Problems of parallel + inlining

• Forward substitution of non-linear subscripts
  o Creates indirect array references
    X2(I) ⇒ T(IX(7) + I)
    Y2(I) ⇒ T(IX(8) + I)
    Z2(I) ⇒ T(IX(9) + I)
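
To illustrate why the substituted form is harder to analyze, here is a small Python sketch (not from the slides). It assumes X2 and Y2 each span `n` elements and that the offsets IX(7) and IX(8) are only known at run time; the function name `sections_overlap` and the sample offsets are hypothetical.

```python
def sections_overlap(off_a: int, off_b: int, n: int) -> bool:
    """True if T(off_a+1 : off_a+n) and T(off_b+1 : off_b+n) share an element."""
    return abs(off_a - off_b) < n

# Before substitution, X2(I) and Y2(I) are separate names, so the compiler
# can treat them as independent. After substitution both are sections of T,
# and independence depends on the run-time values stored in IX.
n = 100
print(sections_overlap(0, 100, n))  # offsets 0 and 100: disjoint sections
print(sections_overlap(0, 50, n))   # offsets 0 and 50: overlapping sections
```

Because the compiler cannot see the values in IX, it has to assume the overlapping case and give up on the loop.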

Problems of parallel + inlining

• Linearization of array dimensions
  o Messes up array shapes
    PP(i, j, k) ⇒ PP(i + j*4 + k*16)
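
The effect of the PP example can be sketched in a few lines of Python (an illustration, not code from the slides). It assumes zero-based indices and extents of 4 for the first two dimensions, which is what the factors 4 and 16 suggest; the helper names are hypothetical.

```python
def linearize(i: int, j: int, k: int, n1: int = 4, n2: int = 4) -> int:
    """Map PP(i, j, k) to the flat subscript i + j*n1 + k*n1*n2."""
    return i + j * n1 + k * n1 * n2

def delinearize(idx: int, n1: int = 4, n2: int = 4):
    """Recover (i, j, k); only valid if 0 <= i < n1 and 0 <= j < n2 held."""
    k, rem = divmod(idx, n1 * n2)
    j, i = divmod(rem, n1)
    return i, j, k

print(linearize(1, 2, 3), delinearize(linearize(1, 2, 3)))
# Without the per-dimension bounds, distinct index triples can collide:
print(linearize(4, 0, 0) == linearize(0, 1, 0))
```

Once the array is flat, the per-dimension bounds are no longer part of the reference, so the dependence test has to rediscover them before it can prove that two subscripts differ.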

Problems of parallel + inlining

• Missed parallelizable loops in caller
  o Coding styles that cause loops to be missed
    • Opaque compositional subroutines
      o A calls B, B calls C, C calls D, …
    • Array access
      o When it is difficult to determine which part is killed
    • Debugging and error checking
      o A statement that breaks the dependency is never executed
    • I/O statements
    • Indirect array references
      o ID = IDX(I), X = A(ID)

Problems of parallel + inlining

• Opaque compositional subroutines
  o A calls B, B calls C, C calls D, …

Problems of parallel + inlining

• Array access
  o Difficult to determine which part is killed
    CTR computed at runtime
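
The slide's code figure is not in the transcript, so the following Python sketch only illustrates the general idea under stated assumptions: a work array W is written over W(1:CTR) and then read over W(1:N) in each iteration, with CTR computed at run time. The names `W`, `N`, `write_covers_read`, and `privatizable` are hypothetical.

```python
def write_covers_read(write_hi: int, read_hi: int) -> bool:
    """True when writing W(1:write_hi) defines every element later read from W(1:read_hi)."""
    return write_hi >= read_hi

def privatizable(ctr_per_iteration, n: int) -> bool:
    """W can be private to each iteration only if every iteration's write
    W(1:CTR) kills (fully covers) that iteration's read W(1:N)."""
    return all(write_covers_read(ctr, n) for ctr in ctr_per_iteration)

print(privatizable([8, 8, 8], 8))  # every read is covered by the preceding write
print(privatizable([8, 5, 8], 8))  # one iteration reads elements it did not write
```

A compiler that cannot bound CTR has to assume the second case, so the loop is left sequential even when the first case always holds in practice.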

Problems of parallel + inlining

• Debugging and error checking
  o A statement that breaks the dependency is never executed
• I/O statements

Problems of parallel + inlining

• Indirect array references
    IN ⇒ NODE
    NODE ⇒ IREL
    IREL ⇒ RHSB


The annotation language

• Goal
  o Summarize information
  o Avoid ambiguity

The annotation language

• Restricted grammar
• Special operators
• Writing annotations

The annotation language

• Restricted grammar
  o Do-all loop only
  o No goto

The annotation language

• Special operators
    y = operator(x1, x2, …, xn)
    Purpose: abstract relation
  o Unknown operator
    • Relation is unknown
      o Generic functions
  o Unique operator
    • Relation is one-to-one, from X to Y
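
Why a one-to-one relation matters can be seen with a small Python sketch (an illustration, not the authors' implementation). It models a loop body of the form A(IDX(I)) = V(I) and checks whether the result depends on the order in which iterations run; the function names are hypothetical.

```python
def scatter(size, idx, values, order):
    """Run A(IDX(I)) = V(I) for I in the given iteration order."""
    a = [0] * size
    for i in order:
        a[idx[i]] = values[i]
    return a

def order_independent(size, idx, values) -> bool:
    """Compare forward and reverse iteration orders of the same loop."""
    n = len(idx)
    forward = scatter(size, idx, values, range(n))
    backward = scatter(size, idx, values, reversed(range(n)))
    return forward == backward

# One-to-one index map: every iteration writes a distinct element.
print(order_independent(3, [2, 0, 1], [10, 20, 30]))
# Two iterations write A(0): the final value depends on the order.
print(order_independent(3, [0, 0, 1], [10, 20, 30]))
```

Declaring the relation as unique tells the parallelizer that the first situation holds, which it generally cannot prove from the index array alone.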

The annotation language

• Writing annotations
  o Eliminating adverse side effects
    • Preserve caller and callee if inlining breaks the dependency
  o Summarize opaque subroutines
    • Eliminate nested function calls
  o Array access
    • Specify the exact range that gets read/modified
  o Debugging and error handling
    • Aggressive strategy: ignore checking statements
  o Indirect array references
    • Discover unique relation

The annotation language

• Summarize opaque subroutines
  o Eliminate nested function calls

The annotation language

• Array access
  o Specify the exact range that gets read/modified

The annotation language

• Debugging and error handling
  o Aggressive strategy: ignore checking statements

The annotation language

• Indirect array references
  o Discover unique relation


Annotation-based inlining

• Goal
  o Pass annotated information to the compiler
  o Eliminate inlining side effects
• Flow
  o Inline before parallelization
  o Reverse-inline after parallelization
  o Verify and evaluate at the end
• Implementation
  o POLARIS compiler for parallelization
  o ROSE compiler for parsing
  o POET transformer
  o PERFECT benchmark

Annotation-based inlining

• Workflow
  o Annotation inlining ⇒ Parallelization ⇒ Reverse-inlining
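
The three-stage workflow can be sketched with a toy model in Python. This is not the authors' tool chain: programs are plain lists of statement strings, the `! BEGIN` / `! END` markers are a hypothetical way to remember where a call was inlined, and `parallelize` is a stand-in that tags every DO loop without performing any dependence analysis.

```python
def inline_annotation(program, annotations):
    """Replace each 'CALL name' with the annotation body, wrapped in markers."""
    out = []
    for stmt in program:
        name = stmt[len("CALL "):] if stmt.startswith("CALL ") else None
        if name in annotations:
            out.append(f"! BEGIN {name}")
            out.extend(annotations[name])
            out.append(f"! END {name}")
        else:
            out.append(stmt)
    return out

def parallelize(program):
    """Stand-in for the parallelizer: tags every DO loop (no real analysis)."""
    out = []
    for stmt in program:
        if stmt.startswith("DO "):
            out.append("!$OMP PARALLEL DO")
        out.append(stmt)
    return out

def reverse_inline(program):
    """Collapse each marked region back into the original call."""
    out, inside = [], None
    for stmt in program:
        if inside is None and stmt.startswith("! BEGIN "):
            inside = stmt[len("! BEGIN "):]
        elif inside is not None:
            if stmt == f"! END {inside}":
                out.append(f"CALL {inside}")
                inside = None
        else:
            out.append(stmt)
    return out

program = ["DO I = 1, N", "CALL F", "END DO"]
annotations = {"F": ["A(I) = unknown(B(I))"]}  # hypothetical annotation body
result = reverse_inline(parallelize(inline_annotation(program, annotations)))
print(result)
```

The point of the round trip is that the parallelizer only ever sees the annotation body, while the final program still calls the original, untouched subroutine.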

Annotation-based inlining

• Inlining annotation
  o Steps
    • Annotation ⇒ source language
      o Translating special operators
    • Inlining generated source language
      o Avoiding linearization
  o Translating special operators
    • Unknown: using uninitialized global arrays
    • Unique: using linear expression
  o Avoiding linearization

Annotation-based inlining

• Inlining annotation

Annotation-based inlining

• Parallelize do-all loops

Annotation-based inlining

• Reverse inlining

Annotation-based inlining

• Reverse inlining is indispensable
  o Inlining is restored to function call
    • Avoid loss of parallelism in caller / callee
    • Enable abstraction operators (unknown, unique)

Annotation-based inlining

• Verification and evaluation
  o Correctness, efficiency, and generality


Experiment
• Purpose
  o What does conventional inlining bring to parallelization?
    • Gain?
    • Lost?
    • Missed?
  o How well does annotation-based inlining avoid the above issues?
• Design
  o PERFECT benchmarks (except SPEC77)
  o Two machines
    • 8-core Intel Mac
    • 4-core AMD Opteron
  o End compiler
    • GFortran 4.2.1
    • IFort 11.1
• Result
  o Count of loops
  o Performance

Experiment
• Result: Loops
  o Conventional inlining
    • Having loss
  o Annotation-based inlining
    • No loss, more gain

Experiment
• Result: Performance
  o Average speedup: limited
  o Annotation-based inlining: always better

Summary
• Inter-procedural parallelization
• Summarize effects of conventional inlining
  o Gain
  o Lost
  o Missed
• Propose annotation-based inlining
  o Annotation summary
  o Enhanced inlining strategy
  o Reverse inlining

Thanks! Questions?
