Dr. Christopher Henry, P. Eng. Dept. of Applied Computer Science University of Winnipeg 515 Portage Ave. Winnipeg, MB, Canada R3B 2E9 GTC 2014 A Parallel GPU Solution to the Maximal Clique Enumeration Problem for CBIR


  • Dr. Christopher Henry, P. Eng.
    Dept. of Applied Computer Science
    University of Winnipeg
    515 Portage Ave.
    Winnipeg, MB, Canada R3B 2E9

    GTC 2014

    A Parallel GPU Solution to the Maximal Clique Enumeration Problem for CBIR

  • Introduction
    • Goal: Quantify the nearness (or apartness) of sets of objects based on their descriptions
    • Tolerance near set theory provides a framework for this assessment
    • Approach is dependent on finding all the tolerance classes on a set of objects
    • The problem of finding tolerance classes has recently been mapped to performing Maximal Clique Enumeration (MCE)
    • Focus: An efficient method for finding all tolerance classes on a set of objects
    • Application: Content-Based Image Retrieval

  • Tolerance Near Sets
    • Goal: Determine the perceptual nearness of disjoint sets of objects
    • Near sets provide:
      • Framework
      • A systematic method to formally answer the question “Are these sets similar?”
        • To what extent
    • Traces its origins to contributions of Weber, Fechner, Poincaré, Zeeman, Sossinsky, Riesz, Pawlak, Orłowska, and Peters

  • Tolerance Relation and Tolerance Classes
    • Tolerance near sets are defined by a description-based tolerance relation
    • Tolerance relations provide a view of the world without transitivity
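A minimal sketch of a description-based tolerance relation. It assumes each object is described by a real-valued feature vector and that two objects are related when the Euclidean distance between their descriptions is at most ε. The function name `tolerant`, the dictionary `phi`, and the sample values are illustrative and do not come from the slides. The example shows how the relation can hold for (a, b) and (b, c) but fail for (a, c), which is the lack of transitivity noted above.

```python
import math

def tolerant(phi_x, phi_y, eps):
    """Return True when two descriptions differ by at most eps (Euclidean norm)."""
    return math.dist(phi_x, phi_y) <= eps

# Hypothetical one-dimensional descriptions of three objects.
phi = {"a": (0.00,), "b": (0.08,), "c": (0.16,)}
eps = 0.1

a_b = tolerant(phi["a"], phi["b"], eps)  # distance 0.08, within eps
b_c = tolerant(phi["b"], phi["c"], eps)  # distance 0.08, within eps
a_c = tolerant(phi["a"], phi["c"], eps)  # distance 0.16, outside eps
print(a_b, b_c, a_c)  # True True False
```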

  • Nearness Measure
    • The nearness measure was created out of a need to determine the degree to which near sets resemble each other
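The slide does not give the formula, so the following is only a sketch under an assumed form: each tolerance class C found on the union of two sets X and Y contributes the ratio min(|C∩X|, |C∩Y|) / max(|C∩X|, |C∩Y|), and the contributions are averaged with weights |C|. The function name `nearness` and the toy sets are hypothetical.

```python
def nearness(classes, X, Y):
    """Size-weighted average of how evenly each class is split between X and Y.

    Assumed form (not stated on the slide): a class C contributes
    min(|C & X|, |C & Y|) / max(|C & X|, |C & Y|), weighted by |C|.
    """
    total = sum(len(C) for C in classes)
    score = 0.0
    for C in classes:
        in_x, in_y = len(C & X), len(C & Y)
        if max(in_x, in_y) > 0:
            score += len(C) * min(in_x, in_y) / max(in_x, in_y)
    return score / total if total else 0.0

X = {"x1", "x2"}
Y = {"y1", "y2"}

mixed = [{"x1", "y1"}, {"x2", "y2"}]   # every class holds objects from both sets
apart = [{"x1", "x2"}, {"y1", "y2"}]   # every class holds objects from one set only

print(nearness(mixed, X, Y))  # 1.0
print(nearness(apart, X, Y))  # 0.0
```

Under this assumed form, a value near 1 means the classes are evenly shared between the two sets, and a value near 0 means the sets rarely share a class.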

  • Application: Content-Based Image Retrieval
    • Goal: Retrieve images based on the content of an image
      • Rather than on a semantic string or keyword associated with the image

  • Perceptual Image Analysis
    • Subimages are the perceptual objects used in this work
    • Aim: find tolerance classes (sets) contained in the union of subimages from two images
    • MCE is the approach used to find these tolerance classes

  • Image Dataset
    • Results are based on images from the SIMPLIcity image dataset

  • Maximal Clique Enumeration
    • Observation: The problem of finding classes can be mapped to the Maximal Clique Enumeration Problem
      • Classes can be found using an algorithm with reduced complexity
      • Based in graph theory
    • MCE problem consists of finding all the maximal cliques in an undirected graph
      • Let G=(V,E) denote an undirected graph
        • V: set of vertices
        • E: set of edges that connect pairs of distinct vertices from V
    • A clique is a set of vertices where each pair of vertices in the clique is connected by an edge in E
    • A maximal clique in G is a clique whose vertices are not all contained in some larger clique
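The two definitions above can be checked directly on a small graph. This is an illustrative sketch: the adjacency-set representation and the names `is_clique` and `is_maximal_clique` are assumptions made for the example, not part of the presentation.

```python
from itertools import combinations

def is_clique(adj, vertices):
    """Every pair of the given vertices must be joined by an edge."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

def is_maximal_clique(adj, vertices):
    """A clique is maximal when no outside vertex is adjacent to all its members."""
    vs = set(vertices)
    if not is_clique(adj, vs):
        return False
    return not any(vs <= adj[w] for w in adj if w not in vs)

# Undirected graph: triangle 0-1-2 plus the edge 2-3 (symmetric adjacency sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

print(is_clique(adj, {0, 1}))             # True
print(is_maximal_clique(adj, {0, 1}))     # False: vertex 2 extends it
print(is_maximal_clique(adj, {0, 1, 2}))  # True
print(is_maximal_clique(adj, {2, 3}))     # True
```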

  • Bron-Kerbosch MCE Algorithm
    • First serial algorithm for MCE was developed by Harary and Ross
    • Since then, two main approaches have been established to solve the MCE problem
      • Greedy approach by Bron and Kerbosch
        • Concurrently discovered by Akkoyunlu
      • Output-sensitive approaches
    • General idea: find maximal cliques through a depth-first search
      • Branches are formed based on candidate cliques
      • Backtracking occurs once a maximal clique has been discovered
    • Algorithm essentially marks new nodes and processes them
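For reference, the basic (non-pivoting) form of the Bron-Kerbosch depth-first search can be sketched as follows. This is the textbook recursion, not necessarily the exact variant used in the talk; the set named `excluded` plays the role usually called "not", which is a reserved word in Python.

```python
def bron_kerbosch(adj, clique, cands, excluded, out):
    """Depth-first search that appends every maximal clique to `out`."""
    if not cands and not excluded:
        out.append(sorted(clique))   # nothing can extend the clique: it is maximal
        return
    for v in sorted(cands):
        # Branch: add v and keep only vertices adjacent to v in both sets.
        bron_kerbosch(adj, clique | {v}, cands & adj[v], excluded & adj[v], out)
        # Backtrack: v has been fully explored, so move it to the excluded set.
        cands = cands - {v}
        excluded = excluded | {v}

# Triangle 0-1-2 plus the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = []
bron_kerbosch(adj, set(), set(adj), set(), cliques)
print(cliques)  # [[0, 1, 2], [2, 3]]
```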

  • Maximal Clique Enumeration

  • Visualization of Tree Structure

  • GPU Algorithm
    • Each block of threads processes one pair of images
      • Performs comparison between query image and candidate image
    • Each thread executes the MCE algorithm on one node in the tree
      • i.e. each thread finds all child nodes for a given node
    • One level of the tree is processed per iteration
    • Threads and blocks are arranged in one dimension
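The level-per-iteration scheme can be emulated serially to see how the work is divided. In this sketch the inner loop over nodes stands in for the threads of one block, and `expand` is the work of a single thread. The function names, the tuple layout of a node, and the "stop when a level is empty" test are assumptions for illustration; the actual kernel and its stopping condition are not shown in the transcript.

```python
def expand(adj, node):
    """Work of one 'thread': return all child nodes of one search-tree node."""
    clique, cands, excluded = node
    children = []
    for v in sorted(cands):
        children.append((clique | {v}, cands & adj[v], excluded & adj[v]))
        cands = cands - {v}
        excluded = excluded | {v}
    return children

def mce_by_levels(adj):
    """Process the search tree one level per iteration."""
    level = [(frozenset(), frozenset(adj), frozenset())]  # root node
    maximal = []
    while level:                       # assumed stopping condition: no nodes left
        next_level = []
        for node in level:             # on a GPU: one thread per node
            clique, cands, excluded = node
            if not cands and not excluded:
                maximal.append(sorted(clique))
            else:
                next_level.extend(expand(adj, node))
        level = next_level             # output nodes become the next input
    return maximal

# Triangle 0-1-2 plus the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(mce_by_levels(adj)))  # [[0, 1, 2], [2, 3]]
```

Because each node carries its own three sets, nodes on the same level can be expanded independently, which is what makes the one-thread-per-node mapping possible.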

  • GPU Algorithm

  • GPU Algorithm
    • Each block is assigned memory space for processing a different pair of images
    • Stopping condition

  • Algorithm Iteration
    • Block-level Sorting
    • Device-level Sorting

  • Node Data Structure
    • Recall, each node contains three data structures
      1. clique: A list of vertices in the current search path
      2. cands: A list of candidate vertices that are not in clique
         • But are connected to every vertex in clique
      3. not: A list of vertices that are connected to every vertex in clique
         • But would form a redundant path if combined with vertices in clique
    • Each of these is stored as a bit set
      • Contained in an array of type unsigned char
    • Bit set representation allows for efficient intersection and counting operations
    • These nodes are stored in global memory
      • Too large to store in shared memory
      • CGMA (compute to global memory access ratio) is 1 since each node is only used by one thread
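A bit set held in a byte array can be sketched as below, with a Python `bytearray` standing in for the `unsigned char` array. The helper names and the bit ordering (vertex v lives in byte v // 8, bit v % 8) are assumptions for this example. Intersection is a byte-wise AND and counting is a population count over the bytes.

```python
def make_bitset(n_vertices, members=()):
    """Allocate one bit per vertex and set the bits of the given members."""
    bits = bytearray((n_vertices + 7) // 8)
    for v in members:
        bits[v >> 3] |= 1 << (v & 7)
    return bits

def intersect(a, b):
    """Set intersection as a byte-wise AND of two equally sized bit sets."""
    return bytearray(x & y for x, y in zip(a, b))

def count(bits):
    """Number of members, i.e. the number of set bits."""
    return sum(bin(byte).count("1") for byte in bits)

cands = make_bitset(12, [1, 4, 9, 11])
neighbours = make_bitset(12, [4, 5, 9])   # hypothetical adjacency row of some vertex
common = intersect(cands, neighbours)     # candidates that stay after adding it
print(count(common))  # 2
```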

  • Global Memory
    • The input and output of each kernel iteration are nodes
    • Each thread can generate multiple nodes during each kernel iteration
      • The number of output nodes is not known a priori
    • After each iteration a radix sort is used to move the output to contiguous input locations
      • Sort is performed on offsets using the CUB library
    • Each thread also updates its portion of the nearness measure after each iteration
      • Can be calculated as a weighted average
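One way to picture the compaction step is the serial sketch below. It assumes every thread writes into a fixed number of reserved output slots, that unused slots are tagged with a sentinel key, and that sorting the keys brings the offsets of valid nodes to the front. The names `MAX_CHILDREN`, `EMPTY`, and `compact` are hypothetical, and Python's `list.sort` merely stands in for a device-wide radix sort; the transcript does not describe the actual key layout.

```python
MAX_CHILDREN = 3     # assumed upper limit on nodes generated per thread
EMPTY = 2**32 - 1    # assumed sentinel key marking an unused output slot

def compact(slots):
    """Return the offsets of valid nodes, packed to the front by sorting keys."""
    keys = [i if node is not None else EMPTY for i, node in enumerate(slots)]
    keys.sort()      # stand-in for the radix sort on offsets
    n_valid = sum(k != EMPTY for k in keys)
    return keys[:n_valid]

# Two threads, each with MAX_CHILDREN reserved slots; None marks an unused slot.
slots = ["A", None, None, "B", "C", None]
offsets = compact(slots)
next_input = [slots[i] for i in offsets]  # contiguous input for the next iteration
print(offsets, next_input)  # [0, 3, 4] ['A', 'B', 'C']
```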

  • Texture Memory
    • Each node requires the adjacency matrix during MCE
    • For this CBIR application the adjacency matrix is ~26 KB
      • Too large for shared and constant memory
    • Also not possible to know adjacency matrix access patterns a priori
    • Thus, the adjacency matrix is stored in texture memory
      • One for each block (each pair of images)

  • Timing Results

    ε     CPU MCE (sec)   GPU MCE (sec)      GPU MCE (sec)   K20 Memory (MB)
          i7-930          GeForce GTX 460    K20
    0.1   0.85            0.1                0.009           585
    0.2   0.84            0.06               0.023           724
    0.3   2.45            1.42               0.035           862
    0.4   28.71           3.39               0.078           1277
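The speedups implied by the timings can be worked out with a few lines of arithmetic. The numbers are copied from the table above; the dictionary layout and variable names are just for this example.

```python
# eps: (CPU i7-930, GeForce GTX 460, K20), all in seconds, from the timing table.
timings = {
    0.1: (0.85, 0.10, 0.009),
    0.2: (0.84, 0.06, 0.023),
    0.3: (2.45, 1.42, 0.035),
    0.4: (28.71, 3.39, 0.078),
}

# Ratio of CPU time to K20 time for each tolerance value.
speedups = {eps: cpu / k20 for eps, (cpu, _gtx, k20) in timings.items()}

for eps, s in speedups.items():
    print(f"eps={eps}: K20 is about {s:.0f}x faster than the CPU")
```

On these figures the K20 speedup over the CPU ranges from roughly 37x (ε = 0.2) to roughly 368x (ε = 0.4).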

  • CBIR Results


  • Conclusion
    • The algorithm presented here is tailored to a specific dataset
    • Approach can be adapted for use on larger graphs
    • The key constraint is the upper limit on the number of nodes each thread can generate
      • Enough storage must be available for these nodes

  • Acknowledgements
    • This research has been supported by:
      • The Natural Sciences and Engineering Research Council of Canada (NSERC) grant 418413
      • NVIDIA Corporation
    • Special thanks to:
      • Keith Massey
      • Tariq Alusaifeer
      • Sheela Ramanna
      • James F. Peters

  • References
    • J. F. Peters and P. Wasilewski, “Foundations of near sets,” Information Sciences, vol. 179, no. 18, pp. 3091–3109, 2009.
    • C. J. Henry, “Near Sets: Theory and Applications,” Ph.D. dissertation, University of Manitoba, CAN, 2010, Available at: https://mspace.lib.umanitoba.ca/handle/1993/4267.
    • C. J. Henry and S. Ramanna, “Maximal clique enumeration in finding near neighbourhoods,” Transactions on Rough Sets, vol. LNCS 7736, pp. 103–124, 2013.
    • T. Alusaifeer, S. Ramanna, C. J. Henry, and J. Peters, “GPU implementation of MCE approach to finding near neighbourhoods,” Lecture Notes in Artificial Intelligence, vol. LNAI 8171, pp. 251–262, 2013.
    • G. T. Fechner, Elements of Psychophysics, vol. I. London, UK: Holt, Rinehart & Winston, 1966, H. E. Adler’s trans. of Elemente der Psychophysik, 1860.
    • H. Poincaré, Science and Hypothesis. Brock University: The Mead Project, 1905, L. G. Ward’s translation.
    • E. C. Zeeman, “The topology of the brain and the visual perception,” in Topology of 3-manifolds and selected topics, K. M. Fort, Ed. New Jersey: Prentice Hall, 1965, pp. 240–256.
    • A. B. Sossinsky, “Tolerance space theory and some applications,” Acta Applicandae Mathematicae: An International Survey Journal on Applying Mathematics and Mathematical Applications, vol. 5, no. 2, pp. 137–167, 1986.
    • F. Riesz, “Stetigkeitsbegriff und abstrakte mengenlehre,” Atti del IV Congresso Internazionale dei Matematici, vol. II, pp. 18–24, 1908.
    • Z. Pawlak, “Classification of objects by means of attributes,” Institute for Computer Science, Polish Academy of Sciences, Tech. Rep. PAS 429, 1981.
    • E. Orłowska, “Semantics of vague concepts. Applications of rough sets,” Institute for Computer Science, Polish Academy of Sciences, Tech. Rep. 469, 1982.
    • J. F. Peters, “Near sets. General theory about nearness of objects,” Applied Mathematical Sciences, vol. 1, no. 53, pp. 2609–2629, 2007.
    • J. F. Peters and P. Wasilewski, “Tolerance spaces: Origins, theoretical aspects and applications,” Information Sciences, vol. 195, no. 0, pp. 211–225, 2012.

  • References
    • C. J. Henry, “Perceptual indiscernibility, rough sets, descriptively near sets, and image analysis,” Transactions on Rough Sets, vol. LNCS 7255, pp. 41–121, 2012.
    • J. Z. Wang, J. Li, and G. Wiederhold, “SIMPLIcity: Semantics-sensitive integrated matching for picture libraries,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947–963, 2001.
    • F. Harary and I. C. Ross, “A procedure for clique detection using the group matrix,” Sociometry, vol. 20, no. 3, pp. 205–215, 1957.
    • M. C. Schmidt, N. F. Samatova, K. Thomas, and P. Byung-Hoon, “A scalable, parallel algorithm for maximal clique enumeration,” Journal of Parallel and Distributed Computing, vol. 69, pp. 417–428, 2009.
    • C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communications of the ACM, vol. 16, no. 9, pp. 575–577, 1973.
    • E. A. Akkoyunlu, “The enumeration of maximal cliques of large graphs,” SIAM Journal on Computing, vol. 2, no. 1, pp. 1–6, 1973.
    • S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, pp. 505–517, 1977.
    • K. Makino and T. Uno, “New algorithms for enumerating all maximal cliques,” Algorithm Theory: SWAT 2004, LNCS, vol. 3111, pp. 260–272, 2004.

  • Thank You