
Lecture Notes in Artificial Intelligence 7301

Subseries of Lecture Notes in Computer Science

LNAI Series Editors

Randy Goebel, University of Alberta, Edmonton, Canada

Yuzuru Tanaka, Hokkaido University, Sapporo, Japan

Wolfgang Wahlster, DFKI and Saarland University, Saarbrücken, Germany

LNAI Founding Series Editor

Joerg Siekmann, DFKI and Saarland University, Saarbrücken, Germany

Pang-Ning Tan, Sanjay Chawla, Chin Kuan Ho, James Bailey (Eds.)

Advances in Knowledge Discovery and Data Mining

16th Pacific-Asia Conference, PAKDD 2012
Kuala Lumpur, Malaysia, May 29 – June 1, 2012
Proceedings, Part I


Series Editors

Randy Goebel, University of Alberta, Edmonton, Canada
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany

Volume Editors

Pang-Ning Tan
Michigan State University, Department of Computer Science and Engineering
428 S. Shaw Lane, 48824-1226 East Lansing, MI, USA
E-mail: [email protected]

Sanjay Chawla
University of Sydney, School of Information Technologies
1 Cleveland St., 2006 Sydney, NSW, Australia
E-mail: [email protected]

Chin Kuan Ho
Multimedia University, Faculty of Computing and Informatics
Jalan Multimedia, 63100 Cyberjaya, Selangor, Malaysia
E-mail: [email protected]

James Bailey
The University of Melbourne, Department of Computing and Information Systems
111 Barry Street, 3053 Melbourne, VIC, Australia
E-mail: [email protected]

ISSN 0302-9743 e-ISSN 1611-3349
ISBN 978-3-642-30216-9 e-ISBN 978-3-642-30217-6
DOI 10.1007/978-3-642-30217-6
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2012937031

CR Subject Classification (1998): I.2, H.3, H.4, H.2.8, C.2, J.1

LNCS Sublibrary: SL 7 – Artificial Intelligence

© Springer-Verlag Berlin Heidelberg 2012

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

PAKDD 2012 was the 16th conference of the Pacific Asia Conference series on Knowledge Discovery and Data Mining. For the first time, the conference was held in Malaysia, which has a vibrant economy and an aspiration to transform itself into a knowledge-based society. Malaysians are also known to be very active in social media such as Facebook and Twitter. Many private companies and government agencies in Malaysia are already adopting database and data warehousing systems, which over time will accumulate massive amounts of data waiting to be mined. Having PAKDD 2012 organized in Malaysia was therefore very timely as it created a good opportunity for the local data professionals to acquire cutting-edge knowledge in the field through the conference talks, tutorials and workshops.

The PAKDD conference series is a meeting place for both university researchers and data professionals to share the latest research results. The PAKDD 2012 call for papers attracted a total of 241 submissions from 32 countries in all six continents (Asia, Europe, Africa, North America, South America, and Australasia), of which 20 (8.3%) were accepted for full presentation and 66 (27.4%) were accepted for short presentation. Each submitted paper underwent a rigorous double-blind review process and was assigned to at least four Program Committee (PC) members. Every paper was reviewed by at least three PC members, with nearly two-thirds of them receiving four reviews or more. One of the changes in the review process this year was the adoption of a two-tier approach, in which a senior PC member was appointed to oversee the reviews for each paper. In the case where there was significant divergence in the review ratings, the senior PC members also initiated a discussion phase before providing the Program Co-chairs with their final recommendation. The Program Co-chairs went through each of the senior PC members’ recommendations, as well as the submitted papers and reviews, to come up with the final selection. We thank all reviewers (Senior PC, PC and external invitees) for their efforts in reviewing the papers in a timely fashion (altogether, more than 94% of the reviews were completed by the time the notification was sent). Without their hard work, we would not have been able to see such a high-quality program.

The three-day conference program included three keynote talks by world-renowned data mining experts, namely, Chandrakant D. Patel from HP Labs (Joules of Available Energy as the Global Currency: The Role of Knowledge Discovery and Data Mining); Charles Elkan from the University of California at San Diego (Learning to Make Predictions in Networks); and Ian Witten from the University of Waikato (Semantic Document Representation: Do It with Wikification). The program also included four workshops, three tutorials, a doctoral symposium, and several paper sessions. Other than these intellectually inspiring events, participants of PAKDD 2012 were able to enjoy several social events throughout the conference. These included a welcome reception on day one, a banquet on day two and a free city tour on day three. Finally, PAKDD 2012 organized a data mining competition for those who wanted to lay their hands on mining some real-world datasets.

Putting a conference together with a scale like PAKDD 2012 requires tremendous efforts from the organizing team as well as financial support from the sponsors. We thank Takashi Washio, Jun Luo and Hui Xiong for organizing the workshops and tutorials, and coordinating with the workshop/tutorial organizers/speakers. We also owe James Bailey a big thank you for preparing the conference proceedings. Finally, we had a great team of Publicity Co-chairs, Local Organization Co-chairs, and helpers. They ensured the conference attracted many local and international participants, and the conference program proceeded smoothly.

We would like to express our gratitude to SAS, AFOSR/AOARD (Air Force Office of Scientific Research/Asian Office of Aerospace Research and Development), MDeC (Multimedia Development Corporation), PIKOM (Computer Industry Association of Malaysia) and other organizations for their generous sponsorship and support. We also wish to thank the PAKDD Steering Committee for offering the student travel support grant and the grant for the best student paper award(s), and UTAR and MMU for providing the administrative support.

Philip Yu
Ee-Peng Lim
Hong-Tat Ewe
Pang-Ning Tan
Sanjay Chawla
Chin-Kuan Ho

Organization

Organizing Committee

Conference Co-chairs

Philip Yu, University of Illinois at Chicago, USA
Hong-Tat Ewe, Universiti Tunku Abdul Rahman, Malaysia
Ee-Peng Lim, Singapore Management University, Singapore

Program Co-chairs

Pang-Ning Tan, Michigan State University, USA
Sanjay Chawla, The University of Sydney, Australia
Chin-Kuan Ho, Multimedia University, Malaysia

Workshop Co-chairs

Takashi Washio, Osaka University, Japan
Jun Luo, Shenzhen Institute of Advanced Technology, China

Tutorial Co-chair

Hui Xiong, Rutgers University, USA

Local Organization Co-chairs

Victor Tan, Universiti Tunku Abdul Rahman, Malaysia
Wen-Cheong Chin, Multimedia University, Malaysia
Soung-Yue Liew, Universiti Tunku Abdul Rahman, Malaysia

Publicity Co-chairs

Rui Kuang, University of Minnesota, USA
Ming Li, Nanjing University, China
Myra Spiliopoulou, University of Magdeburg, Germany

Publication Chair

James Bailey, University of Melbourne, Australia


Local Arrangements Committee

Soung Yue Liew (Co-chair) Victor Tan (Co-chair)
Wen Cheong Chin (Co-chair) Kok Why Ng (Co-chair)
Nadim Jahangir Choo Yee Ting
Chee Onn Wong Chiung Ching Ho
Chong Pei Fen Hau Lee Tong
Timothy Yap James Ooi
Kok Leong Chan Yong Haur Tay
Azurawati Chian Wen Too
Khong Leng Lim Mariam
Michelle Meei Hao Hoo
Kean Vee Sor Priya
Madhavan Simon Lau
Chin Chwee Wong Swee Ling Chean

Steering Committee

Co-chairs

Graham Williams, Australian National University, Australia
Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan

Life Members

Hiroshi Motoda, AFOSR/AOARD and Osaka University, Japan
Rao Kotagiri, University of Melbourne, Australia
Ning Zhong, Maebashi Institute of Technology, Japan
Masaru Kitsuregawa, Tokyo University, Japan
David Cheung, University of Hong Kong, China
Graham Williams, Australian National University, Australia
Ming-Syan Chen, National Taiwan University, Taiwan, ROC

Members

Huan Liu, Arizona State University, USA
Kyu-Young Whang, Korea Advanced Institute of Science and Technology, Korea
Chengqi Zhang, University of Technology Sydney, Australia
Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan
Ee-Peng Lim, Singapore Management University, Singapore
Jaideep Srivastava, University of Minnesota, USA
Zhi-Hua Zhou, Nanjing University, China
Takashi Washio, Institute of Scientific and Industrial Research, Osaka University
Thanaruk Theeramunkong, Thammasat University, Thailand


P. Krishna Reddy, International Institute of Information Technology, Hyderabad (IIIT-H), India
Joshua Z. Huang, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China

Senior Program Committee

Anirban Dasgupta, Yahoo! Research Silicon Valley, USA
Arno Siebes, Universiteit Utrecht, The Netherlands
Bart Goethals, University of Antwerp, Belgium
Bernhard Pfahringer, The University of Waikato, New Zealand
Dacheng Tao, Nanyang Technological University, Singapore
Ee-Peng Lim, Singapore Management University, Singapore
Haixun Wang, Microsoft Research Asia, China
Hisashi Kashima, University of Tokyo, Japan
Jeffrey Xu Yu, The Chinese University of Hong Kong, Hong Kong
Jian Pei, Simon Fraser University, Canada
Jianyong Wang, Tsinghua University, China
Jiuyong Li, University of South Australia, Australia
Kyuseok Shim, Seoul National University, Korea
Masashi Sugiyama, Tokyo Institute of Technology, Japan
Ng Wee Keong, Nanyang Technological University, Singapore
Nitesh V. Chawla, University of Notre Dame, USA
Osmar R. Zaiane, University of Alberta, Canada
Panagiotis Karras, Rutgers University, USA
Peter Christen, The Australian National University, Australia
Sameep Mehta, IBM Research, India
Sanjay Ranka, University of Florida, USA
Shivani Agarwal, Indian Institute of Science, India
Wei Wang, University of North Carolina at Chapel Hill, USA
Yu Zheng, Microsoft Research Asia, China

Program Committee

Aditya Krishna Menon, University of California, USA
Aixin Sun, Nanyang Technological University, Singapore
Akihiro Inokuchi, Osaka University, Japan
Albrecht Zimmerman, Katholieke Universiteit Leuven, Belgium
Alexandre Termier, Universite Joseph Fourier, France
Alfredo Cuzzocrea, ICAR-CNR and University of Calabria, Italy
Amol Ghoting, IBM T.J. Watson Research Center, USA
Andreas Hotho, University of Kassel, Germany
Andrzej Skowron, University of Warsaw, Poland
Annalisa Appice, Universita degli Studi di Bari, Italy


Anne Denton, North Dakota State University, USA
Anne Laurent, Montpellier 2 University, France
Aoying Zhou, East China Normal University, Shanghai, China
Arbee Chen, National Chengchi University, Taiwan
Aristides Gionis, Yahoo! Research, Spain
Aryya Gangopadhyay, University of Maryland, USA
Atsuhiro Takasu, National Institute of Informatics, Japan
Atsuyoshi Nakamura, Hokkaido University, Japan
Benjamin C.M. Fung, Concordia University, Canada
Bettina Berendt, Katholieke Universiteit Leuven, Belgium
Bo Zhang, Tsinghua University, China
Bradley Malin, Vanderbilt University, USA
Bruno Cremilleux, Universite de Caen, France
Chandan Reddy, Wayne State University, USA
Chang-Tien Lu, Virginia Polytechnic Institute and State University, USA
Charles Ling, The University of Western Ontario, Canada
Chengkai Li, The University of Texas at Arlington, USA
Chengqi Zhang, University of Technology, Australia
Chiranjib Bhattachar, Indian Institute of Science, India
Choochart Haruechaiy, National Electronics and Computer Technology Center (NECTEC), Thailand
Chotirat Ratanamatan, Chulalongkorn University, Thailand
Chunsheng Yang, Institute for Information Technology, Canada
Clement Yu, University of Illinois at Chicago, USA
Daisuke Ikeda, Kyushu University, Japan
Dan Simovici, University of Massachusetts Boston, USA
Dao-Qing Dai, Sun Yat-Sen University, China
Daoqiang Zhang, Nanjing University of Aeronautics and Astronautics, China
David Albrecht, Monash University, Australia
David Taniar, Monash University, Australia
David Lo, Singapore Management University, Singapore
David F. Gleich, Purdue University, USA
Davood Rafiei, University of Alberta, Canada
Deept Kumar, Virginia Polytechnic Institute and State University, USA
Dejing Dou, University of Oregon, USA
Di Wu, Polytechnic Institute of NYU, USA
Diane Cook, Washington State University, USA
Diansheng Guo, University of South Carolina, USA
Dragan Gamberger, Rudjer Boskovic Institute, Croatia
Du Zhang, California State University, USA
Efstratios Gallopoulos, University of Patras, Greece
Elena Baralis, Politecnico di Torino, Italy


Eyke Huellermeier, University of Marburg, Germany
Fabrizio Silvestri, Istituto di Scienza e Tecnologie dell’Informazione (ISTI), Italy
Feifei Li, Florida State University, USA
Florent Masseglia, INRIA, France
Fosca Giannotti, Universita di Pisa, Italy
Francesco Bonchi, Yahoo! Research, Spain
Frans Coenen, University of Liverpool, UK
Gang Li, Deakin University, Australia
Gao Cong, Nanyang Technological University, Singapore
George Karypis, University of Minnesota, USA
Giuseppe Manco, Universita della Calabria, Italy
Graham Williams, Australian Taxation Office, Australia
Hady Lauw, Institute for Infocomm Research, Singapore
Haibin Cheng, Yahoo! Labs, USA
Haimonti Dutta, Columbia University, USA
Hanghang Tong, IBM T.J. Watson Research Center, USA
Harry Zhang, University of New Brunswick, Canada
Hassab Elgawi Osman, University of Tokyo, Japan
Hideo Bannai, Kyushu University, Japan
Hiroyuki Kawano, Nanzan University, Japan
Hong Cheng, The Chinese University of Hong Kong, Hong Kong
Hua Lu, Aalborg University, Denmark
Huan Liu, Arizona State University, USA
Hui Wang, University of Ulster, UK
Huidong Jin, Chinese University of Hong Kong, Hong Kong
Ioannis Androulakis, Rutgers University, USA
Irena Koprinska, University of Sydney, Australia
Ivor Tsang, The Hong Kong University of Science and Technology, Hong Kong
Jaakko Hollmen, Aalto University, Finland
James Caverlee, Texas A&M University, USA
Jason Wang, New Jersey’s Science and Technology University, USA
Jean-Francois Boulicaut, Universite de Lyon, France
Jean-Marc Petit, Universite de Lyon, France
Jeffrey Ullman, Stanford University, USA
Jialie Shen, Singapore Management University, Singapore
Jian Yin, Sun Yat-Sen University, China
Jieping Ye, Arizona State University, USA
Jinze Liu, University of Kentucky, USA
John Keane, The University of Manchester, UK
Josep Domingo-Ferrer, Universitat Rovira i Virgili, Spain
Junbin Gao, Charles Sturt University, Australia
Junping Zhang, Fudan University, China


Kamalika Das, NASA Ames Research Center, USA
Kanishka Bhaduri, NASA, USA
Keith Marsolo, Cincinnati Children’s Hospital Medical Center, USA
Keith Chan, The Hong Kong Polytechnic University, Hong Kong
Kennichi Yoshida, University of Tsukuba, Japan
Kitsana Waiyamai, Kasetsart University, Thailand
Konstantinos Kalpakis, University of Maryland Baltimore County, USA
Kouzou Ohara, Aoyama-Gakuin University, Japan
Krishnamoorthy Sivakumar, Washington State University, USA
Kun Liu, Yahoo! Labs, USA
Kuo-Wei Hsu, National Chengchi University, Taiwan
Larry Hall, University of South Florida, USA
Larry Holder, Washington State University, USA
Latifur Khan, University of Texas at Dallas, USA
Liang Wang, NLPR, Institute of Automation Chinese Academy of Science, China
Lim Chee Peng, Universiti Sains Malaysia, Malaysia
Lisa Singh, Georgetown University, USA
Maguelonne Teisseire, Maison de la Teledetection, France
Manabu Okumura, Japan Advanced Institute of Science and Technology, Japan
Marco Maggini, Universita degli Studi di Siena, Italy
Marian Vajtersic, University of Salzburg, Austria
Marut Buranarach, National Electronics and Computer Technology Center, Thailand
Mary Elaine Califf, Illinois State University, USA
Marzena Kryszkiewicz, Warsaw University of Technology, Poland
Masayuki Numao, Osaka University, Japan
Masoud Makrehchi, University of Waterloo, Canada
Matjaz Gams, J. Stefan Institute, Slovenia
Mengjie Zhang, Victoria University of Wellington, New Zealand
Michael Hahsler, Southern Methodist University, USA
Michael Bruckner, University of Potsdam, Germany
Michalis Vazirgianni, INRIA/FUTURS, France
Min Song, New Jersey Institute of Technology, USA
Min Yao, Zhejiang University, China
Ming-Syan Chen, National Taiwan University, Taiwan
Mingli Song, Zhejiang University, China
Mirco Nanni, Istituto di Scienza e Tecnologie dell’Informazione (ISTI), Italy
Murali Mani, Worcester Polytechnic Institute, USA
Murat Kantarcioglu, University of Texas at Dallas, USA
Nagaraj Kota, Yahoo! Labs, India
Ngoc-Thanh Nguyen, Wroclaw University of Technology, Poland


Olivia Sheng, University of Utah, USA
Pabitra Mitra, Indian Institute of Technology Kharagpur, India
Panagiotis Papadimitriou, Stanford University, USA
Philippe Lenca, Telecom Bretagne, France
Ping Li, Cornell University, USA
Qi Li, Western Kentucky University, USA
Qi He, IBM Research, USA
Qingshan Liu, NLPR, Institute of Automation Chinese Academy of Science, China
Richi Nayak, Queensland University of Technologies, Australia
Robert Hilderman, University of Regina, Canada
Roberto Bayardo, Google, Inc, USA
Rohan Baxter, Australian Taxation Office, Australia
Rui Camacho, Universidade do Porto, Portugal
Ruoming Jin, Kent State University, USA
Sachindra Joshi, IBM Research, India
Sanjay Jain, National University of Singapore, Singapore
Scott Sanner, Australian National University, Australia
See-Kiong Ng, Singapore University of Technology and Design, Singapore
Selcuk Candan, Arizona State University, USA
Shashi Shekhar, University of Minnesota, USA
Shen-Shyang Ho, California Institute of Technology, USA
Sheng Zhong, State University of New York at Buffalo, USA
Shichao Zhang, University of Technology, Australia
Shiguang Shan, Institute of Computing Technology Chinese Academy of Sciences, China
Shoji Hirano, Shimane University, Japan
Shu-Ching Chen, Florida International University, USA
Shuigeng Zhou, Fudan University, China
Shusaku Tsumoto, Shimane University, Japan
Shyam-Kumar Gupta, Indian Institute of Technology, India
Silvia Chiusano, Politecnico di Torino, Italy
Songcan Chen, Nanjing University of Aeronautics and Astronautics, China
Sourav S. Bhowmick, Nanyang Technological University, Singapore
Srikanta Tirthapura, Iowa State University, USA
Srivatsan Laxman, Microsoft Research, India
Stefan Rueping, Fraunhofer IAIS, Germany
Sung-Ho Ha, Kyungpook National University, Korea
Szymon Jaroszewicz, University of Massachusetts Boston, USA
Tadashi Nomoto, National Institute of Japanese Literature, Japan
Takehisa Yairi, University of Tokyo, Japan


Takeshi Fukuda, IBM, Japan
Tamir Tassa, The Open University, Israel
Tao Li, Florida International University, USA
Tapio Elomaa, Tampere University of Technology, Finland
Tetsuya Yoshida, Hokkaido University, Japan
Thepchai Supnithi, National Electronics and Computer Technology Center, Thailand
Thomas Seidl, RWTH Aachen University, Germany
Tom Croonenborghs, Katholieke Hogeschool Kempen, Belgium
Toon Calders, Eindhoven University of Technology, The Netherlands
Toshihiro Kamishima, National Institute of Advanced Industrial Science and Technology, Japan
Toshiro Minami, Kyushu University Library, Japan
Tru Cao, Ho Chi Minh City University of Technology, Vietnam
Tsuyoshi Murata, Tokyo Institute of Technology, Japan
Tu-Bao Ho, Japan Advanced Institute of Science and Technology, Japan
Varun Chandola, Oak Ridge National Laboratory, USA
Vincent S. Tseng, National Cheng Kung University, Taiwan
Vincenzo Piuri, Universita degli Studi di Milano, Italy
Vladimir Estivill-Castro, Griffith University, Australia
Wagner Meira, Universidade Federal de Minas Gerais, Brazil
Wai Lam, The Chinese University of Hong Kong, Hong Kong
Walter Kosters, Universiteit Leiden, The Netherlands
Wanpracha Chaovalitw, The State University of New Jersey Rutgers, USA
Wei Fan, IBM T.J. Watson Research Center, USA
Weining Qian, East China Normal University, China
Wen-Chih Peng, National Chiao Tung University, Taiwan
Wilfred Ng, Hong Kong University of Science and Technology, Hong Kong
Woong-Kee Loh, Sungkyul University, South Korea
Xiaofang Zhou, The University of Queensland, Australia
Xiaohua Hu, Drexel University, USA
Xiaohui Liu, Brunel University, UK
Xiaoli Li, Institute for Infocomm Research, Singapore
Xin Wang, University of Calgary, Canada
Xindong Wu, University of Vermont, USA
Xingquan Zhu, Florida Atlantic University, USA
Xintao Wu, University of North Carolina at Charlotte, USA
Xu Sun, Cornell University, USA
Xuan Vinh Nguyen, Monash University, Australia
Xue Li, The University of Queensland, Australia


Xuelong Li, University of London, UK
Xuemin Lin, The University of New South Wales, Australia
Xueyi Wang, Northwest Nazarene University, USA
Yan Liu, IBM Research, USA
Yan Jia, National University of Defense Technology, China
Yang Zhou, Yahoo!, USA
Yang-Sae Moon, Kangwon National University, Korea
Yasuhiko Morimoto, Hiroshima University, Japan
Yi-Dong Shen, Institute of Software, Chinese Academy of Sciences, China
Yi-Ping Chen, La Trobe University, Australia
Yifeng Zeng, Aalborg University, Denmark
Yiu-ming Cheung, Hong Kong Baptist University, Hong Kong
Yong Guan, Iowa State University, USA
Yonghong Peng, University of Bradford, UK
Yue Lu, University of Illinois at Urbana-Champaign, USA
Yun Chi, NEC Laboratories America, Inc., USA
Yunhua Hu, Microsoft Research Asia, China
Zheng Chen, Microsoft Research Asia, China
Zhi-Hua Zhou, Nanjing University, China
Zhiyuan Chen, University of Maryland Baltimore County, USA
Zhongfei Zhang, Binghamton University, USA
Zili Zhang, Deakin University, Australia

External and Invited Reviewers

Aurelie Bertaux
Antonio Bruno
Tianyu Cao
Rui Chen
Zhiyong Cheng
Patricia Lopez Cueva
Jeremiah Deng
Stephen Guo
Raymond Heatherly
Lam Hoang
Peter Karsmakers
Sofiane Lagraa
Stephane Lallich
Ivan Lee
Peipei Li
Zhao Li
Lin Liu
Corrado Loglisci
Zhenyu Lu
Marc Mertens
Benjamin Negrevergne
Marc Plantevit
Jing Ren
Yelong Sheng
Arnaud Soulet
Vassilios Verykios
Petros Venetis
Guan Wang
Lexing Xie
Yintao Yu
Yan Zhang


Sponsors

Table of Contents – Part I

Supervised Learning: Active, Ensemble, Rare-Class and Online

Time-Evolving Relational Classification and Ensemble Methods
Ryan Rossi and Jennifer Neville

Active Learning for Hierarchical Text Classification
Xiao Li, Da Kuang, and Charles X. Ling

TeamSkill Evolved: Mixed Classification Schemes for Team-Based Multi-player Games
Colin DeLong and Jaideep Srivastava

A Novel Weighted Ensemble Technique for Time Series Forecasting
Ratnadip Adhikari and R.K. Agrawal

Techniques for Efficient Learning without Search
Houssam Salem, Pramuditha Suraweera, Geoffrey I. Webb, and Janice R. Boughton

An Aggressive Margin-Based Algorithm for Incremental Learning
JuiHsi Fu and SingLing Lee

Two-View Online Learning
Tam T. Nguyen, Kuiyu Chang, and Siu Cheung Hui

A Generic Classifier-Ensemble Approach for Biomedical Named Entity Recognition
Zhihua Liao and Zili Zhang

Neighborhood Random Classification
Djamel Abdelkader Zighed, Diala Ezzeddine, and Fabien Rico

SRF: A Framework for the Study of Classifier Behavior under Training Set Mislabeling Noise
Katsiaryna Mirylenka, George Giannakopoulos, and Themis Palpanas

Building Decision Trees for the Multi-class Imbalance Problem
T. Ryan Hoens, Qi Qian, Nitesh V. Chawla, and Zhi-Hua Zhou

Scalable Random Forests for Massive Data
Bingguo Li, Xiaojun Chen, Mark Junjie Li, Joshua Zhexue Huang, and Shengzhong Feng


Hybrid Random Forests: Advantages of Mixed Trees in Classifying Text Data
Baoxun Xu, Joshua Zhexue Huang, Graham Williams, Mark Junjie Li, and Yunming Ye

Learning Tree Structure of Label Dependency for Multi-label Learning
Bin Fu, Zhihai Wang, Rong Pan, Guandong Xu, and Peter Dolog

Multiple Instance Learning for Group Record Linkage
Zhichun Fu, Jun Zhou, Peter Christen, and Mac Boot

Incremental Set Recommendation Based on Class Differences
Yasuyuki Shirai, Koji Tsuruma, Yuko Sakurai, Satoshi Oyama, and Shin-ichi Minato

Active Learning for Cross Language Text Categorization
Yue Liu, Lin Dai, Weitao Zhou, and Heyan Huang

Evasion Attack of Multi-class Linear Classifiers
Han Xiao, Thomas Stibor, and Claudia Eckert

Foundation of Mining Class-Imbalanced Data
Da Kuang, Charles X. Ling, and Jun Du

Active Learning with c-Certainty
Eileen A. Ni and Charles X. Ling

A Term Association Translation Model for Naive Bayes Text Classification
Meng-Sung Wu and Hsin-Min Wang

A Double-Ensemble Approach for Classifying Skewed Data Streams
Chongsheng Zhang and Paolo Soda

Generating Balanced Classifier-Independent Training Samples from Unlabeled Data
Youngja Park, Zijie Qi, Suresh N. Chari, and Ian M. Molloy

Nystrom Approximate Model Selection for LSSVM
Lizhong Ding and Shizhong Liao

Exploiting Label Dependency for Hierarchical Multi-label Classification
Noor Alaydie, Chandan K. Reddy, and Farshad Fotouhi

Diversity Analysis on Boosting Nominal Concepts
Nida Meddouri, Hela Khoufi, and Mondher Sadok Maddouri


Extreme Value Prediction for Zero-Inflated Data
Fan Xin and Zubin Abraham

Learning to Diversify Expert Finding with Subtopics
Hang Su, Jie Tang, and Wanling Hong

An Associative Classifier for Uncertain Datasets
Metanat Hooshsadat and Osmar R. Zaïane

Unsupervised Learning: Clustering, Probabilistic Modeling

Neighborhood-Based Smoothing of External Cluster Validity Measures
Ken-ichi Fukui and Masayuki Numao

Sequential Entity Group Topic Model for Getting Topic Flows of Entity Groups within One Document
Young-Seob Jeong and Ho-Jin Choi

Topological Comparisons of Proximity Measures
Djamel Abdelkader Zighed, Rafik Abdesselam, and Asmelash Hadgu

Quad-tuple PLSA: Incorporating Entity and Its Rating in Aspect Identification
Wenjuan Luo, Fuzhen Zhuang, Qing He, and Zhongzhi Shi

Clustering-Based k-Anonymity
Xianmang He, HuaHui Chen, Yefang Chen, Yihong Dong, Peng Wang, and Zhenhua Huang

Unsupervised Ensemble Learning for Mining Top-n Outliers
Jun Gao, Weiming Hu, Zhongfei (Mark) Zhang, and Ou Wu

Towards Personalized Context-Aware Recommendation by Mining Context Logs through Topic Models
Kuifei Yu, Baoxian Zhang, Hengshu Zhu, Huanhuan Cao, and Jilei Tian

Mining of Temporal Coherent Subspace Clusters in Multivariate Time Series Databases
Hardy Kremer, Stephan Gunnemann, Arne Held, and Thomas Seidl

A Vertex Similarity Probability Model for Finding Network Community Structure
Kan Li and Yin Pang

Hybrid-ε-greedy for Mobile Context-Aware Recommender System
Djallel Bouneffouf, Amel Bouzeghoub, and Alda Lopes Gancarski


Unsupervised Multi-label Text Classification Using a World KnowledgeOntology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480

Xiaohui Tao, Yuefeng Li, Raymond Y.K. Lau, and Hua Wang

Semantic Social Network Analysis with Text Corpora . . . . . . . . . . . . . . . . . 493Dong-mei Yang, Hui Zheng, Ji-kun Yan, and Ye Jin

Visualizing Clusters in Parallel Coordinates for Visual KnowledgeDiscovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505

Yang Xiang, David Fuhry, Ruoming Jin, Ye Zhao, and Kun Huang

Feature Enriched Nonparametric Bayesian Co-clustering . . . . . . . . . . . . . . 517Pu Wang, Carlotta Domeniconi, Huzefa Rangwala, andKathryn B. Laskey

Shape-Based Clustering for Time Series Data . . . . . . . . . . . . . . . . . . . . . . . . 530Warissara Meesrikamolkul, Vit Niennattrakul, andChotirat Ann Ratanamahatana

Privacy-Preserving EM Algorithm for Clustering on Social Network . . . . 542Bin Yang, Issei Sato, and Hiroshi Nakagawa

Named Entity Recognition and Identification for Finding the Owner ofa Home Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554

Vassilis Plachouras, Matthieu Riviere, and Michalis Vazirgiannis

Clustering and Understanding Documents via DiscriminationInformation Maximization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566

Malik Tahir Hassan and Asim Karim

A Semi-supervised Incremental Clustering Algorithm for Streaming Data . . . . . . . . . . 578

Maria Halkidi, Myra Spiliopoulou, and Aikaterini Pavlou

Unsupervised Sparse Matrix Co-clustering for Marketing and Sales Intelligence . . . . . . . . . . 591

Anastasios Zouzias, Michail Vlachos, and Nikolaos M. Freris

Expectation-Maximization Collaborative Filtering with Explicit and Implicit Feedback . . . . . . . . . . 604

Bin Wang, Mohammadreza Rahimi, Dequan Zhou, and Xin Wang

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617

Table of Contents – Part II

Pattern Mining: Networks, Graphs, Time-Series and Outlier Detection

Heterogeneous Ensemble for Feature Drifts in Data Streams . . . . . . . . . . 1

Hai-Long Nguyen, Yew-Kwong Woon, Wee-Keong Ng, and Li Wan

OMC-IDS: At the Cross-Roads of OLAP Mining and Intrusion Detection . . . . . . . . . . 13

Hanen Brahmi, Imen Brahmi, and Sadok Ben Yahia

Towards Linear Time Overlapping Community Detection in Social Networks . . . . . . . . . . 25

Jierui Xie and Boleslaw K. Szymanski

WeightTransmitter: Weighted Association Rule Mining Using Landmark Weights . . . . . . . . . . 37

Yun Sing Koh, Russel Pears, and Gillian Dobbie

Co-occurring Cluster Mining for Damage Patterns Analysis of a Fuel Cell . . . . . . . . . . 49

Daiki Inaba, Ken-ichi Fukui, Kazuhisa Sato, Junichirou Mizusaki, and Masayuki Numao

New Exact Concise Representation of Rare Correlated Patterns: Application to Intrusion Detection . . . . . . . . . . 61

Souad Bouasker, Tarek Hamrouni, and Sadok Ben Yahia

Life Activity Modeling of News Event on Twitter Using Energy Function . . . . . . . . . . 73

Rong Lu, Zhiheng Xu, Yang Zhang, and Qing Yang

Quantifying Reciprocity in Large Weighted Communication Networks . . . . . . . . . . 85

Leman Akoglu, Pedro O.S. Vaz de Melo, and Christos Faloutsos

Hierarchical Graph Summarization: Leveraging Hybrid Information through Visible and Invisible Linkage . . . . . . . . . . 97

Rui Yan, Zi Yuan, Xiaojun Wan, Yan Zhang, and Xiaoming Li

Mining Mobile Users’ Activities Based on Search Query Text and Context . . . . . . . . . . 109

Bingyue Peng, Yujing Wang, and Jian-Tao Sun

Spread of Information in a Social Network Using Influential Nodes . . . . . . . . . . 121

Arpan Chaudhury, Partha Basuchowdhuri, and Subhashis Majumder

Discovering Coverage Patterns for Banner Advertisement Placement . . . . . . . . . . 133

P. Gowtham Srinivas, P. Krishna Reddy, S. Bhargav, R. Uday Kiran, and D. Satheesh Kumar

Discovering Unknown But Interesting Items on Personal Social Network . . . . . . . . . . 145

Juang-Lin Duan, Shashi Prasad, and Jen-Wei Huang

The Pattern Next Door: Towards Spatio-sequential Pattern Discovery . . . . . . . . . . 157

Hugo Alatrista Salas, Sandra Bringay, Frederic Flouvat, Nazha Selmaoui-Folcher, and Maguelonne Teisseire

Accelerating Outlier Detection with Uncertain Data Using Graphics Processors . . . . . . . . . . 169

Takazumi Matsumoto and Edward Hung

Finding Collections of k-Clique Percolated Components in Attributed Graphs . . . . . . . . . . 181

Pierre-Nicolas Mougel, Christophe Rigotti, and Olivier Gandrillon

Reciprocal and Heterogeneous Link Prediction in Social Networks . . . . . . . . . . 193

Xiongcai Cai, Michael Bain, Alfred Krzywicki, Wayne Wobcke, Yang Sok Kim, Paul Compton, and Ashesh Mahidadia

Detecting Multiple Stochastic Network Motifs in Network Data . . . . . . . . . . 205

Kai Liu, William K. Cheung, and Jiming Liu

Scalable Similarity Matching in Streaming Time Series . . . . . . . . . . 218

Alice Marascu, Suleiman A. Khan, and Themis Palpanas

Scalable Mining of Frequent Tri-concepts from Folksonomies . . . . . . . . . . 231

Chiraz Trabelsi, Nader Jelassi, and Sadok Ben Yahia

SHARD: A Framework for Sequential, Hierarchical Anomaly Ranking and Detection . . . . . . . . . . 243

Jason Robinson, Margaret Lonergan, Lisa Singh, Allison Candido, and Mehmet Sayal

Instant Social Graph Search . . . . . . . . . . 256

Sen Wu, Jie Tang, and Bo Gao

Data Manipulation: Pre-processing and Dimension Reduction

Peer Matrix Alignment: A New Algorithm . . . . . . . . . . 268

Mohammed Kayed

Domain Transfer Dimensionality Reduction via Discriminant Kernel Learning . . . . . . . . . . 280

Ming Zeng and Jiangtao Ren

Prioritizing Disease Genes by Bi-Random Walk . . . . . . . . . . 292

Maoqiang Xie, Taehyun Hwang, and Rui Kuang

Selecting Feature Subset via Constraint Association Rules . . . . . . . . . . 304

Guangtao Wang and Qinbao Song

RadialViz: An Orientation-Free Frequent Pattern Visualizer . . . . . . . . . . 322

Carson Kai-Sang Leung and Fan Jiang

Feature Weighting by RELIEF Based on Local Hyperplane Approximation . . . . . . . . . . 335

Hongmin Cai and Michael Ng

Towards Identity Disclosure Control in Private Hypergraph Publishing . . . . . . . . . . 347

Yidong Li and Hong Shen

EWNI: Efficient Anonymization of Vulnerable Individuals in Social Networks . . . . . . . . . . 359

Frank Nagle, Lisa Singh, and Aris Gkoulalas-Divanis

A Pruning-Based Approach for Searching Precise and Generalized Region for Synthetic Minority Over-Sampling . . . . . . . . . . 371

Kamthorn Puntumapon and Kitsana Waiyamai

Towards More Efficient Multi-label Classification Using Dependent and Independent Dual Space Reduction . . . . . . . . . . 383

Eakasit Pacharawongsakda and Thanaruk Theeramunkong

Automatic Identification of Protagonist in Fairy Tales Using Verb . . . . . . . . . . 395

Hui-Ngo Goh, Lay-Ki Soon, and Su-Cheng Haw

CD: A Coupled Discretization Algorithm . . . . . . . . . . 407

Can Wang, Mingchun Wang, Zhong She, and Longbing Cao

Co-embedding of Structurally Missing Data by Locally Linear Alignment . . . . . . . . . . 419

Takehisa Yairi

Relevant Feature Selection from EEG Signal for Mental Task Classification . . . . . . . . . . 431

Akshansh Gupta and R.K. Agrawal

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443