Optimizing Sorting With Genetic Algorithms Xiaoming Li, María Jesús Garzarán, and David Padua University of Illinois at Urbana-Champaign

Post on 31-Dec-2015

Optimizing Sorting With Genetic Algorithms

Xiaoming Li, María Jesús Garzarán, and David Padua

University of Illinois at Urbana-Champaign

ESSL on Power3

ESSL on Power4

Outline

• Our Solution
• Primitives & Selection mechanisms
• Genetic Algorithm
• Performance results
• Classifier System
• Conclusion

Motivation

• There is no universally best sorting algorithm.
• Can we automatically GENERATE and tune sorting algorithms for each platform, in the manner of FFTW and Spiral?
– The performance of sorting depends on the platform and on the input characteristics.
• Algorithm selection alone may not be enough.

Algorithm Selection (CGO’04)

Select the best algorithm from Quicksort, Multiway Merge Sort and CC-radix.

Relevant input characteristics: number of keys, entropy vector.

Algorithm Selection (CGO’04)

Proposed Solution

• We need different algorithms for different partitions.
• The best sorting algorithm should be the composition of these different best algorithms.
• Build composite sorting algorithms:
– Identify primitives from the sorting algorithms.
– Design a general method to select an appropriate sorting primitive at runtime.
– Design a mechanism to combine the primitives and the selection methods to generate the composite sorting algorithm.

Sorting Primitives

Divide-by-Value
– A step in Quicksort
– Select one or multiple pivots and sort the input array around these pivots
– Parameter: number of pivots

Divide-by-Position (DP)
– Divide input into same-size sub-partitions
– Use heap to merge the multiple sorted sub-partitions
– Parameters: size of sub-partitions, fan-out and size of the heap
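As a rough illustration of these two primitives, the Python sketch below partitions keys around a list of pivots (Divide-by-Value) and merges fixed-size sorted runs through a heap (Divide-by-Position). The function names, the use of `heapq.merge`, and the built-in sort applied to each run are assumptions made for the example; the fan-out and heap-size parameters from the slide are not modeled.

```python
import heapq
from bisect import bisect_right


def divide_by_value(keys, pivots):
    """Split keys into len(pivots) + 1 partitions around the given pivots."""
    pivots = sorted(pivots)
    parts = [[] for _ in range(len(pivots) + 1)]
    for key in keys:
        # bisect_right counts the pivots <= key, which is the partition index.
        parts[bisect_right(pivots, key)].append(key)
    return parts


def divide_by_position(keys, part_size):
    """Cut keys into runs of part_size, sort each run, then heap-merge them."""
    runs = [sorted(keys[i:i + part_size])
            for i in range(0, len(keys), part_size)]
    # heapq.merge is only a stand-in for the tuned heap described on the slide.
    return list(heapq.merge(*runs))
```

With pivots 4 and 8, `divide_by_value([5, 1, 9, 3, 7], [4, 8])` yields three partitions whose concatenation is ordered by range; each partition would then be handed to another primitive.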

Sorting Primitives

Divide-by-Radix (DR)
– Non-comparison-based sorting algorithm
– Parameter: radix (r bits)
– Step 1: Scan the input to get the distribution array, which records how many elements fall in each of the 2^r sub-partitions.
– Step 2: Compute the accumulative distribution array, which is used as the index when copying the input to the destination array.
– Step 3: Copy the input to the 2^r sub-partitions.

[Figure: Divide-by-Radix example showing the src. array, the counter and accum. arrays (indexed 0 to 3), and the dest. array]
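The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: keys are non-negative integers, the `shift` argument (which r-bit digit to use) and the function name are hypothetical, and plain lists stand in for the arrays of the slide.

```python
def divide_by_radix(src, r, shift=0):
    """Partition src into 2**r sub-partitions on r bits starting at bit `shift`."""
    buckets = 1 << r
    mask = buckets - 1

    # Step 1: distribution array (how many keys fall in each sub-partition).
    counter = [0] * buckets
    for key in src:
        counter[(key >> shift) & mask] += 1

    # Step 2: accumulative distribution (start index of each sub-partition).
    accum = [0] * buckets
    for b in range(1, buckets):
        accum[b] = accum[b - 1] + counter[b - 1]

    # Step 3: copy every key to the next free slot of its sub-partition.
    dest = [None] * len(src)
    next_slot = list(accum)
    for key in src:
        b = (key >> shift) & mask
        dest[next_slot[b]] = key
        next_slot[b] += 1
    return dest, counter, accum
```

For `src = [1, 1, 2, 3, 3, 0, 1, 2]` and `r = 2`, the counter is `[1, 3, 2, 2]`, the accumulative array is `[0, 1, 4, 6]`, and the destination is `[0, 1, 1, 1, 2, 2, 3, 3]`.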

Sorting Primitives

Divide-by-radix-assuming-uniform-distribution (DU)
– Step 1 and Step 2 in DR are expensive.
– If the input elements are distributed nearly evenly among the 2^r sub-partitions, the input can be copied into the destination array directly, assuming every partition has the same number of elements.
– Overhead: partition overflow
– Parameter: radix (r bits)

[Figure: Divide-by-radix-assuming-uniform-distribution example showing the src., accum. and dest. arrays (indexed 0 to 3)]
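A possible reading of DU in Python is sketched below. The slide does not say how overflow is handled, so collecting overflowing keys in a side list is purely an assumption of this example, as are the function name, the `shift` argument, and the ceiling-based partition capacity.

```python
def divide_uniform(src, r, shift=0):
    """Copy keys straight into equal-size sub-partitions, skipping the counting pass."""
    buckets = 1 << r
    mask = buckets - 1
    capacity = -(-len(src) // buckets)   # ceil(len(src) / buckets)
    dest = [None] * (capacity * buckets)
    fill = [0] * buckets                 # keys stored so far in each sub-partition
    overflow = []                        # assumed handling: keys that did not fit

    for key in src:
        b = (key >> shift) & mask
        if fill[b] < capacity:
            dest[b * capacity + fill[b]] = key
            fill[b] += 1
        else:
            overflow.append(key)
    return dest, fill, overflow
```

On a perfectly uniform input the overflow list stays empty and only one pass over the data is needed; a skewed input such as `[1, 1, 1, 0]` with `r = 1` leaves one key in the overflow list, which is the overhead the slide refers to.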

Selection Primitives

• Branch-by-Size
• Branch-by-Entropy
– Parameters: number of branches, threshold vector of the branches
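One way to picture a selection primitive is as a lookup into a sorted threshold vector. The sketch below assumes a single scalar measurement (a partition size, or one entropy value) and a hypothetical `select_branch` helper; how the real system compares an entropy vector against thresholds is not specified on the slide.

```python
from bisect import bisect_right


def select_branch(value, thresholds):
    """Return the branch index for value, given ascending thresholds.

    len(thresholds) + 1 branches exist: branch i handles values in
    [thresholds[i - 1], thresholds[i]).
    """
    return bisect_right(thresholds, value)
```

With thresholds `[1000, 100000]`, a partition of 500 keys takes branch 0, one of 1000 keys takes branch 1, and one of a million keys takes branch 2.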

Leaf Primitives

When the size of a partition is small, we stick to one algorithm to sort the partition fully.

Two methods are used in the cleanup operation:
– Quicksort
– CC-Radix

Composite Sorting Algorithms

• The composite sorting algorithms are built from these primitives.

• The algorithms have the shape of a tree.
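To make the tree shape concrete, here is a small Python sketch of a composite algorithm as a tree of nodes. The class names, the `hi_bit` argument (number of key bits not yet consumed), and the use of the built-in sort in place of the Quicksort and CC-Radix leaves are all assumptions of the example. Keys are assumed to be non-negative integers below `2**hi_bit`.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class Leaf:
    """Leaf primitive: sort the partition fully (built-in sort as stand-in)."""

    def sort(self, keys, hi_bit):
        return sorted(keys)


@dataclass
class DivideByRadix:
    """Partition on the next r most significant bits, then recurse into child."""
    r: int
    child: object

    def sort(self, keys, hi_bit):
        if hi_bit <= 0 or len(keys) <= 1:
            return sorted(keys)
        r = min(self.r, hi_bit)
        shift = hi_bit - r
        parts = [[] for _ in range(1 << r)]
        for key in keys:
            parts[(key >> shift) & ((1 << r) - 1)].append(key)
        out = []
        for part in parts:  # sub-partitions are already ordered by range
            out.extend(self.child.sort(part, shift))
        return out


@dataclass
class BranchBySize:
    """Selection primitive: pick a child according to the partition size."""
    thresholds: List[int]
    children: List[object]

    def sort(self, keys, hi_bit):
        child = self.children[bisect_right(self.thresholds, len(keys))]
        return child.sort(keys, hi_bit)
```

A tree such as `BranchBySize([4], [Leaf(), DivideByRadix(2, Leaf())])` sends partitions of fewer than 4 keys directly to a leaf and splits larger ones by radix first; calling `tree.sort(keys, 4)` on 4-bit keys returns them in order.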

Search Strategy

• Search for the best tree
• Search for the best parameter values of the primitives
– Good solutions for small problems should be retained for use in the solutions for larger problems.
• Genetic algorithms are a natural solution that satisfies the requirements:
– Preserve good sub-trees
– Give good sub-trees more chances to propagate

Composite Sorting Algorithms

• Search for the best parameter values to adapt
– To the architectural features
– To the input characteristics

Genetic Algorithm

• Mutation
– Mutate the structure of the algorithm.
– Change the parameter values of primitives.

Crossover

• Propagate good sub-trees
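The two genetic operators can be sketched on a generic tree genome as below. The `Node` class, the pre-order helper, the step size of the parameter mutation, and the choice of uniformly random crossover points are illustrative assumptions; structural mutation and the actual operator probabilities are not modeled.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str                                  # primitive label, e.g. "dr" or "lq"
    params: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def all_nodes(tree):
    """Return every node of the tree in pre-order."""
    found = [tree]
    for child in tree.children:
        found.extend(all_nodes(child))
    return found


def mutate_parameter(tree, rng, max_step=2):
    """Return a copy of tree with one random parameter shifted (kept >= 1)."""
    clone = copy.deepcopy(tree)
    candidates = [n for n in all_nodes(clone) if n.params]
    if candidates:
        node = rng.choice(candidates)
        i = rng.randrange(len(node.params))
        step = rng.randint(-max_step, max_step)
        node.params[i] = max(1, node.params[i] + step)
    return clone


def crossover(a, b, rng):
    """Return copies of a and b with one randomly chosen sub-tree exchanged."""
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    na, nb = rng.choice(all_nodes(a)), rng.choice(all_nodes(b))
    # Swapping node contents in place exchanges the two sub-trees
    # without needing parent pointers.
    na.name, nb.name = nb.name, na.name
    na.params, nb.params = nb.params, na.params
    na.children, nb.children = nb.children, na.children
    return a, b
```

Because both operators work on deep copies, the parents survive unchanged, which lets a good sub-tree stay in the population while also propagating into offspring.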

Fitness Function

A fitness function measures the relative performance of the genomes in a population.

The average performance of a genome on the training inputs is the basis for the fitness of the genome.

A genome that performs well across inputs is preferred:
– Fitness is penalized when performance varies across the test inputs.
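The slide does not give the formula, so the following Python sketch only illustrates the idea of "average performance, penalized for variation". Subtracting a multiple of the population standard deviation, the `penalty` weight, and treating higher scores as better are all assumptions of this example.

```python
from statistics import mean, pstdev


def fitness(scores, penalty=1.0):
    """Average score across inputs, reduced by how much the scores vary.

    scores: one performance figure per training input (higher is better).
    penalty: hypothetical weight of the variation term.
    """
    return mean(scores) - penalty * pstdev(scores)
```

Two genomes with the same average are ranked by stability under this form: `fitness([10, 10])` is higher than `fitness([5, 15])`.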

Library Generation

Installation phase: Use the genetic algorithm to search for the sorting genome.
– Set of genomes in the initial population
– Test the genomes on a set of inputs with different characteristics

Platforms

• AMD Athlon MP
• Sun UltraSparcIII
• SGI R12000
• IBM Power3
• IBM Power4
• Intel Itanium2
• Intel Xeon

AMD Athlon MP

Power3

Multiple-peak Performance

The best genomes in different regions

Problems of Genetic Adaptation

• The fitness function is the average performance of the genome on the test inputs.
• The fitness function in our genetic algorithm prefers genomes with stable performance.
• The genetic algorithm is not powerful enough to evolve a complex genome that chooses the best genome in each small region.

Using Classifier System

Search for the best genomes for different regions of the input characteristics.
– Selects the regions
– Selects the best algorithm for each region

Nice feature: The fitness of a genome in a region is not affected by its fitness in other regions.

Map sorting composition into a classifier system

• The input characteristics (number of keys and entropy vector) are encoded into bit strings.
• A rule in the classifier system has two parts:
– Condition: A string consisting of ‘0’, ‘1’, and ‘*’. The condition string is matched against the encoded input characteristics.
– Action: Sorting genomes without branch primitives

Example for Classifier Sorting

• Example:
– For inputs of up to 16M keys
– Encode number of keys with 4 bits.
– 0000: 0~1M, 0001: 1~2M, …
– Number of keys = 10.5M. Encoded into “1100”

The encoded input 1100 is matched against the condition of each rule:

Condition   Action                Fitness   Accuracy
01**        (dr 5 (lq 1 16))      …         …
1010        (dp 4 2 (lr 5 16))    …         …
110*        (dv 2 (lr 6 16))      …         …
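Rule matching can be sketched in a few lines of Python. The helper names are hypothetical, and the linear 1M-per-bucket encoding is an assumption extrapolated from the first two buckets listed on the slide (0000: 0~1M, 0001: 1~2M); the entropy vector is not encoded here.

```python
def encode_size(num_keys, bits=4, step=1_000_000):
    """Encode the number of keys as a fixed-width bit string, one bucket per step."""
    bucket = min(int(num_keys // step), (1 << bits) - 1)
    return format(bucket, f"0{bits}b")


def matches(condition, encoded):
    """True if every condition character is '*' or equals the encoded bit."""
    return len(condition) == len(encoded) and all(
        c in ("*", e) for c, e in zip(condition, encoded)
    )


def matching_conditions(conditions, encoded):
    """Return the conditions whose rule would fire for the encoded input."""
    return [c for c in conditions if matches(c, encoded)]
```

Among the three conditions of the slide, only `110*` matches the encoded input `1100`, so only that rule's action would be a candidate for sorting this input.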

Performance of Classifier Sorting

• Power3

Power4

Conclusions

Replace the complexity of finding an efficient algorithm with the task of defining a set of generic primitives.

Design methods to search in the space of the composition of the primitives.

– Genetic algorithms
– Classifier system