trends in computer science research · 2014-01-23 · kip irvine trends in computer science...

35
Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor * Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed J. Zaki, Communications of the ACM, 56(10): 74-83, 2013 Apirak Hoonlor Faculty of ICT, Mahidol University at E-Leader 2014, Bangkok

Upload: others

Post on 06-Aug-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Kip Irvine

TRENDS IN COMPUTER SCIENCE RESEARCH*

Present byApirak Hoonlor

*Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed J. Zaki, Communications of the ACM, 56(10): 74-83, 2013

Apirak HoonlorFaculty of ICT, Mahidol University

atE-Leader 2014, Bangkok

Page 2: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Computer science is an expanding research field.

� The number of research paper published increased over the past two decades.

� Trend analysis

1

1/2012 <Title> <Name LastName> Seminar in Computer ScienceFrog Design, “20 Tech Trends for 2013”, http://frogdesign.com

Page 3: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Web of Science – Thomson Reuters

Trend Analysis on Science datasets

2

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 4: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Trend Analysis on Science datasets

2

1/2012 <Title> <Name LastName> Seminar in Computer ScienceThomson Reuters, “Web Of Science”, http://thomsonreuters.com/web-of-science/

Page 5: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Web of Science – Thomson Reuters

� MapEquation -- Rosvall and Bergstrom

Trend Analysis on Science datasets

2

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 6: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Trend Analysis on Science datasets

2

1/2012 <Title> <Name LastName> Seminar in Computer Science“MapEquation”, http://www.mapequation.org [1]

Page 7: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

� Web of Science – Thomson Reuters

� MapEquation -- Rosvall and Bergstrom

� Other works

� “Is science becoming more interdisciplinary?” –Porter and Rafols

Trend Analysis on Science datasets

2

1/2012 <Title> <Name LastName> Seminar in Computer Science

Porter and Rafols

� “Gender and computing in conference papers” –Cohoon et al.

� “The Claremont report on database research” –Agrawal et al.

Page 8: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Introduction

“What drives a research topic and turns it into a trend?”

3

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 9: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Analysis

� ACM Dataset: ACM Digital Library

� IEEE Dataset: IEEE Xplore Digital Library

� NSF Dataset: Publicly available awarded grants from www.nsf.gov

Datasets: Collected from 1990 - 2010

4

1/2012 <Title> <Name LastName> Seminar in Computer ScienceImages were taken respectively from ACM Digital Library, IEEE Xplore, and NSF sites.

Page 10: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 11: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 12: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 13: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 14: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 15: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

4Analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 16: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Result

Datasets

5

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 17: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Landscapes of Computer Science research 6

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 18: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Top rankings

Landscapes of Computer Science research

DF – Document Frequency

TFIDF - Term frequency inverse document frequency

nk,d

nw,d�⋅ log

D

j : k ∈ d j

7

1/2012 <Title> <Name LastName> Seminar in Computer Science

where |D| is the number of documents and nk,d is the number of times k appears in d

nw,dw∈d� j : k ∈ d j

Page 19: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Landscapes of Computer Science research

� Most publications in collaboration, data mining, information retrieval, machine learning, privacy, and XML appeared 2000–2010

� Internet and World Wide Web did not appear in any publication until 1995

Interesting findings

8

1/2012 <Title> <Name LastName> Seminar in Computer Science

publication until 1995

- NSFNET - Net

- Point-to-point - Internetworking

- TCP/IP

Page 20: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Networks of Computer Science Research 9

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 21: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Bursty-period Analysis

� Used to evaluate the influence of research funding on publications and vice versa

� “Bursty keywords” := The keywords appearing with uncommonly high frequency during some intervals as defined below

The beginning point

10

1/2012 <Title> <Name LastName> Seminar in Computer Science

defined below

where w is the keyword/topic of interest, t is a time period, dt is a document created during time t, d is any document, and T is the total time over which all documents were created.

Burst(w,t) =dt : w ∈ dt

d : w ∈ d−

1

T

Page 22: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

� If a keyword became burstyin ACM data first, it became bursty in NSF 2.4 years later on average

� In the reverse case, the average delay was 4.8 years

Bursty-period Analysis

ACM vs. NSF

� IEEE became bursty in NSF 3.4 years later on average

� In the reverse case, the average delay was 5.7 years.

� 67% of keyword became

IEEE vs. NSF

11

1/2012 <Title> <Name LastName> Seminar in Computer Science

average delay was 4.8 years

� 75% of keyword became bursty in the NSF dataset before it did in the ACM dataset

� 16% of cases, it was the reverse

bursty in the NSF dataset before it did in the IEEE dataset

� 16% of cases, it was the reverse

Page 23: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Trend analysis

� The linear regression trend line.

� Plot a line based on the fraction of papers to see the trends.

� Look at two trends:

(i) 1990-2010, and

12

1/2012 <Title> <Name LastName> Seminar in Computer Science

(i) 1990-2010, and

(ii) 2006 - 2010

Page 24: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Trend analysis

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 25: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Venue of publications 14

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 26: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

CS Researchers

� Find the sequences of publications in the same major research category by the same author

� Frequent sequence mining

� cSpade sequence mining algorithm

� A one-year gap in publication dates

How often do you publish in a certain area?

15

1/2012 <Title> <Name LastName> Seminar in Computer Science

� A one-year gap in publication dates

� Min. Freq. at least 1% of documents

� Look at the half-life

� How long does it take for the # of publication by the authors in certain topic to reduce by half.

Page 27: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

CS Researchers

ACM Dataset

16

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 28: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

CS Researchers

IEEE Dataset

17

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 29: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

CS Researchers

Why is there a significant drops?

Why can’t one person publish in

Putting things in perspective

18

1/2012 <Title> <Name LastName> Seminar in Computer Science

Why can’t one person publish in any given topics?

Page 30: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

CS Researchers

� Jack Dongarra – University of Tennessee, Knoxville

� High-performance linear algebra software

� His research interests have evolved continuously in response to challenges created by new computer technologies

� George Cybenko – Dartmouth College, USA

Putting things in perspective

19

1/2012 <Title> <Name LastName> Seminar in Computer Science

� George Cybenko – Dartmouth College, USA

� High-performance computing and classification by neural networks

� Investigate each subject “in five-year (more or less) phases” then “discovers an open field often related to previous work.” One exception was a major shift in 1992 when moving from one university to another.

Page 31: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Communities of CS researchers 20

1/2012 <Title> <Name LastName> Seminar in Computer Science

Page 32: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Conclusion

� CS continues to experience continuous and fundamental transformation;

� A burst of new keywords in grants generally precedes their burst in publications; less than 1/3 of new keywords burst in publications first, reflecting the

Key findings

21

1/2012 <Title> <Name LastName> Seminar in Computer Science

keywords burst in publications first, reflecting the importance of funding for success of new CS fields.

� A typical scientist’s research focus changes in roughly a 10-year cycle and often includes a once-in-a-career dramatic shift, likely in response to evolving technology creating new CS fields.

Page 33: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

Key References

[1] Edler, D. and M. Rosvall, M. Map Generator software package. MapEquation, Umeå, Sweden, 2010; http://www.mapequation.org

[2] Goldberg, M., Magdon-Ismail, M., Nambirajan, S., and Thompson, J. Tracking and predicting evolution of social communities. In Proceedings of the Third IEEE

22

1/2012 <Title> <Name LastName> Seminar in Computer Science

social communities. In Proceedings of the Third IEEE International Conference on Social Computing (Boston, Oct. 9–11), IEEE, 2011, 780–783.

[3] Hoonlor, A., Szymanski, B.K., Zaki, M.J., and Thompson, J. An Evolution of Computer Science Research. RPI Technical Report 12-01, Rensselaer Polytechnic Institute, Troy, NY, 2012; http://www.cs.rpi.edu/research/tr.html

Page 34: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

[4] Lappas, T. et al. On Burstiness-Aware Search for Document Sequences. In Proceedings of the 15th ACM SIGKDD (Paris, France, June 28–July 1). ACM Press, New York, 2009, 477-486.

[5] Zaki, M.J. Sequences mining in categorical domains:

Key References 22

1/2012 <Title> <Name LastName> Seminar in Computer Science

[5] Zaki, M.J. Sequences mining in categorical domains: Incorporating constraints. In Proceedings of the Ninth ACM Int. Conf. on Information and Knowledge Management (Washington D.C., Nov. 6–11).

Page 35: TRENDS IN COMPUTER SCIENCE RESEARCH · 2014-01-23 · Kip Irvine TRENDS IN COMPUTER SCIENCE RESEARCH * Present by Apirak Hoonlor *Apirak Hoonlor, Boleslaw K. Szymanski, and Mohammed

1/2012 <Title> <Name LastName> Seminar in Computer Science