Towards Mining Software Repositories Research that Matters. Tao Xie, Department of Computer Science, University of Illinois at Urbana-Champaign, USA. [email protected]

Post on 27-Jun-2015

DESCRIPTION

Towards Mining Software Repositories Research that Matters. Talk slides at Next Generation of Mining Software Repositories '14 (Pre-FSE 2014 Event), Nov 15–16. HKUST, Hong Kong http://ng2014.msrworld.org/

TRANSCRIPT

Page 1: Towards Mining Software Repositories Research that Matters

Towards Mining Software Repositories Research that Matters

Tao Xie

Department of Computer Science
University of Illinois at Urbana-Champaign, USA

[email protected]

Page 2

Machine Learning that Matters

“The basic argument in her paper is that machine learning might be in danger of losing its impact because the community as a whole has become quite self-referential. People are probably solving real-world problems using ML methods, but there is little sharing of these results within the community. Instead, people focus on existing benchmarks which might have originally had some connection to real-world problems which has been long forgotten, however.”

“She proposes a number of tasks like $100M solved through ML based decision making or a human life saved through a diagnosis or an intervention recommended by an ML system to get ML back on track.”

ICML’12

http://icml.cc/2012/papers/298.pdf

http://blog.mikiobraun.de/2012/06/is-machine-learning-losing-impact.html

Page 3

Redwine and Riddle Study (1985)

• From idea to “the point it can be popularized and disseminated to the technical community at large”
– Worst case: 23 years
– Best case: 11 years
– Mean: 17 years

• 7.5 years from developed technology to wide availability

Source: © S. L. Pfleeger

Sam Redwine Jr. and William Riddle. Software Technology Maturation. In Proceedings of ICSE 1985.

Page 4

Technology Maturation: Middleware

Source: © A. Wolf

http://www.sigsoft.org/impact/docs/ImpactWolfBCS2008.pdf

15-20 years between first publication of an idea and widespread availability in products

Page 5

Technology Maturation: Middleware

Source: © A. Wolf

http://www.sigsoft.org/impact/docs/ImpactWolfBCS2008.pdf

15-20 years between first publication of an idea and widespread availability in products

Shall we just stay in our comfort zone and wait 15-20 years for our research to produce (or not produce) practice impact?

How about the research that we did 15-20 years ago?

[Caveat: don’t forget the need for long-term/blue-sky research!]

Page 6

2012 NSF Workshop on Formal Methods

• Goal: to identify the future directions in research in formal methods and its transition to industrial practice.

• Success examples mentioned by the attendees
– SLAM/SDV
– ASTREE
– SMT-based tools
– …

http://goto.ucsd.edu/~rjhala/NSFWorkshop/

Page 7

“What Happened to the Promise of Software Tools?” – Jim Larus

http://www.srl.inf.ethz.ch/workshop2014/eth-larus.pdf

https://www.youtube.com/watch?v=kO9OYnkeRTM

Page 8

Impacts, Impacts, Impacts, …

Image source: http://engage.synecoretech.com/marketing-technology-for-growth/bid/155279/How-Online-Content-Impacts-Your-Social-Media-Marketing-Strategy

Page 9

Research Impacts

[Figure: three numbers shown on the slide: 99319, 22786, 32987]

Page 10

Research Impacts: SIGSOFT Impact Paper Awards, ICSE MIP awards, …

Page 11

Practice Impacts: ACM Software System Awards

31 Awardees

http://awards.acm.org/software_system/

Page 12

Practice Impacts: ACM Software System Awards

• Development Environments/Tools
– 2013: Coq
– 2012: LLVM
– 2011: Eclipse
– 2007: Statemate
– 2006: Eiffel
– 2005: The Boyer-Moore Theorem Prover (ACL2)
– 2003: MAKE
– 2001: SPIN
– 1992: Interlisp

• Languages
– 2002: Java
– 1998: The S System (R statistical analysis)
– 1997: Tcl/Tk
– 1987: SMALLTALK

Page 13

2012: LLVM, born at Illinois

• The openness of the LLVM technology and the quality of its architecture and engineering design are key factors in understanding the success it has had both in academia and industry.

Vikram Adve, Chris Lattner, Evan Cheng

http://llvm.org/

Page 14

Practice Impacts: commercialization/industrial adoption

SAGE

ASTRÉE

Statechart

XIAO

StackMine

SPIN

Moles

SAS

Microsoft Research

Page 15

Practice Impacts: research publications → industrial adoption done by others

• ICSE 00 Daikon paper by Ernst et al. → Agitar Agitator
– https://homes.cs.washington.edu/~mernst/pubs/invariants-relevance-icse2000.pdf

• ASE 04 Rostra paper by Xie et al. → Parasoft Jtest improvement
– http://web.engr.illinois.edu/~taoxie/publications/ase04.pdf

• PLDI/FSE 05 DART/CUTE papers by Sen et al. → MSR SAGE, Pex
– http://srl.cs.berkeley.edu/~ksen/papers/dart.pdf
– http://srl.cs.berkeley.edu/~ksen/papers/C159-sen.pdf

Page 16

HOW???

• Are these impact goals too far from you?
• Can you plan for that?
• What if you are in a university research group?
• …

Page 17

(How) Can A University Group Do It?

• Aim for research impacts more commonly
– but impact sometimes/often cannot be predicted well, e.g., Whyper [USENIX SEC 13] http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf

• Start a startup
– but desirable to have the right people (e.g., former students) to start it
– but software engineering tools may not sell crazily

• Collaborate with industrial research labs
– but many research lab projects may look like university projects

• Collaborate with industrial product groups
– but many problems faced by product groups may not be “researchy”

• At least focus on problems that matter (now or in the future)!

Page 18

(How) Can A University Group Do It? (cont.)

• Need to balance/unify producing great students vs./and great (high practice-impact) research

http://www.cs.washington.edu/people/faculty/notkin/students

Page 19

Experience Reports on Successful Tool Transfer

• Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE 2014, Experience Papers. http://web.engr.illinois.edu/~taoxie/publications/ase14-pexexperiences.pdf

• Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software Analytics for Incident Management of Online Services: An Experience Report. In Proceedings of ASE 2013, Experience Papers. http://web.engr.illinois.edu/~taoxie/publications/ase13-sas.pdf

• Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie. Software Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software Analytics, 2013. http://web.engr.illinois.edu/~taoxie/publications/ieeesoft13-softanalytics.pdf

• Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012. http://web.engr.illinois.edu/~taoxie/publications/acsac12-xiao.pdf

Page 20

Q & A

http://www.cs.illinois.edu/homes/taoxie/

Contact: [email protected]

Supported in part by a Microsoft Research Award, NSF grants CCF-1349666, CNS-1434582, CCF-1434596, CCF-1434590, CNS-1439481, and the USA National Security Agency (NSA) Science of Security Lablet.

Discussion

Page 21

Discussion Topics: HOW???

• Are these impact goals too far from you?
• Can you plan for that?
• What if you are in a university research group?
• …