hire a machine to code - michael arthur bucko & aurélien nicolas

Upload: withthebest

Post on 15-Apr-2017

Category:

Technology


TRANSCRIPT

Page 1: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Hire a machine to code

Michael Arthur Bucko, Aurélien Nicolas

Page 2: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

We are learning the relation between human communication and source code to take communication to the next level.

Page 3: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Agenda

● What is Deckard? Our vision and products

● Software team’s and developer’s perspectives

● Problems and solutions in coding

● Understanding source code

● Our work

● Demo!

Page 4: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Our vision

Step 1: Machines joining regular software teams to help them create better code faster

Step 2: First large-scale code transplants

Step 3: First machines writing their own code without humans

Page 5: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Our product

● Deckard is building a framework for making code-based interactions between humans and intelligent machines more relevant

○ We approach the problem from (at least) two angles:

■ Enriching the context of individual software developers and teams

■ Learning novel code representations to enable more effective communication between machines and humans

Page 6: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Team

Engineer

Brain

Helping single developer by enriching individual contexts and communication

Helping single developer and enriching their context

IDEs

Independent of IDE

Not only finding all relevant information on time, but also enabling a completely new interaction with software

Ensemble-based decisions using novel representations of problems, users and source code data

Teams and developers

Page 7: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Problems in communication using code

Page 8: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Problems in communication using code

Connecting humans with code by creating innovative code exploration

Enriching human-human interaction (real-time)

Learning better code representations

Researching code transplantation

Code context: Understanding developer’s preferences

Code understanding: Understanding current code in real-time

Code navigation: Understanding where to go next, what to do

Knowledge sharing: Sharing code intelligence

Page 9: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Software team’s perspective

● Small teams define and build products that people love

● Teams include more than engineers, and even engineers have diverse skill sets

● Team members share knowledge using a variety of channels

● Engineers learn from many sources of data

Page 10: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Software developer’s perspective

● Developers are overwhelmed by data in their current contexts -- they need assistants who do part of their job

● Developers lack the right data -- they should know better ways of solving their problems to avoid tweaking and patching

● Assistants should be able to provide highly relevant data in real-time

Page 11: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Our work

Page 12: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Step 1

Step 1: Machines joining regular software teams to help them create better code faster

Step 2: First large-scale code transplants

Step 3: First machines writing their own code without humans

Page 13: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Plan for Step 1

PROBLEM SOLUTION

Ineffective interaction between human members of software teams

Profiling developers and making information more relevant

Ineffective interaction between humans and code

Requires understanding code better

Augmenting “working memory” (navigation)

Better knowledge sharing (dd protocol)

Relevant information on time

Page 14: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Plan for Step 1

PROBLEM SOLUTION

Coding faster

Better real-time navigation (using summarization)

Sharing code knowledge more effectively (dd protocol)

Making code more re-usable (transplantation)

Understanding software development better (learning paths, code exploration modes, diversity of technology and skills)

Page 15: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Understanding source code

Page 16: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Ensemble

- Understanding source code requires more than regular text summarization

- Regular: sentence reduction, sentence combination, syntactic transformation, paraphrasing, generalisation, etc. [1]

- Source-code-related concepts: code folding, code execution flow, code re-usability, etc.

- Source code data: variable names, method names, logic, comments, git commits, types, etc.

- NLG: generating project metadata [2]

- We create an ensemble with source code-related features (novel representation of code)

1. A Neural Attention Model for Abstractive Sentence Summarization, Alexander M. Rush, Sumit Chopra, Jason Weston

2. Automatic Documentation Generation via Source Code Summarization of Method Context, Paul W. McBurney and Collin McMillan
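The "source code data" listed on this slide (names, comments, and so on) can be collected with standard tooling. The sketch below is only an illustration using Python's standard `ast` and `tokenize` modules; it is not Deckard's parser, and the function name `extract_code_data` and the sample snippet are assumptions made for this example.

```python
import ast
import io
import tokenize


def extract_code_data(source: str) -> dict:
    """Collect function names, assigned variables, docstrings, and comments."""
    tree = ast.parse(source)
    data = {"functions": [], "variables": set(), "docstrings": [], "comments": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            data["functions"].append(node.name)
            doc = ast.get_docstring(node)
            if doc:
                data["docstrings"].append(doc)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            # Names that are assigned to somewhere in the code
            data["variables"].add(node.id)
    # Comments are not part of the AST, so read them from the token stream.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            data["comments"].append(tok.string.lstrip("# ").strip())
    data["variables"] = sorted(data["variables"])
    return data


# Hypothetical sample input, used only for illustration
SAMPLE = '''
def area(width, height):
    """Return the area of a rectangle."""
    result = width * height  # simple product
    return result
'''

features = extract_code_data(SAMPLE)
print(features)
```

Signals like these could serve as inputs to an ensemble, next to features from text summarization models. Git commits and types would need additional sources that this sketch does not cover.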

Page 17: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

...

Data

- Who you are in the team

- What you do

- What you know about the codebase

- What is known about your problem on the web

- Who might be able to help you

...

Representations

Creating novel representations of source code:

- Diverse programming languages with different syntaxes

- We want not only to understand the current code, but also to create better programming languages

Understanding source code requires novel representations

Page 18: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - Introduction

- We experiment with SWUM (Software Word Usage Model) and NLG

- We model source code using call graphs

- We use both abstractive and extractive summarization for understanding source code

- Focus on abstractive methods -- we experiment with building source representation

- For user profiling: we have access to the programmer’s interaction with code, but also to their needs, settings, code styles, and search results

1. Autofolding for Source Code Summarization, Jaroslav Fowkes, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata and Charles Sutton

2. Automatic Documentation Generation via Source Code Summarization of Method Context, Paul W. McBurney and Collin McMillan

Page 19: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 1/6

- We use a tree-based TASSAL (using scoped topic model) for creating some of the source code summarization features

- We use NAMAS (attention-based summarization) for creating some of the code summarization features

- We test code execution tools like code2flow or pycallgraph for creating code flow features
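To make the idea of code flow features concrete, here is a minimal static call graph sketch. It uses only Python's standard `ast` module rather than code2flow or pycallgraph, and it is not the authors' pipeline; the function name `build_call_graph` and the sample source are assumptions made for this example.

```python
import ast
from collections import defaultdict


def build_call_graph(source: str) -> dict:
    """Map each top-level function to the sorted names it calls directly."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        graph[func.name]  # register the function even if it calls nothing
        for node in ast.walk(func):
            # Only plain calls such as foo(); method calls like obj.foo() are skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                graph[func.name].add(node.func.id)
    return {name: sorted(callees) for name, callees in graph.items()}


# Hypothetical sample input, used only for illustration
SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''

call_graph = build_call_graph(SAMPLE)
for caller, callees in call_graph.items():
    print(caller, "->", callees)
```

A static graph like this only sees calls written in the source. Tools that trace execution can also capture dynamic dispatch, at the cost of having to run the code.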

Page 20: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 2/6

- We use a tree-based TASSAL (using scoped topic model) for creating some of the source code summarization features

- We use NAMAS (attention-based summarization) for creating some of the code summarization features

- We test code execution tools like code2flow or pycallgraph for creating code flow features

Page 21: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 3/6

- We use a tree-based TASSAL (using scoped topic model) for creating some of the source code summarization features

- We use NAMAS (attention-based summarization) for creating some of the code summarization features

- We experiment with code execution tools like code2flow or pycallgraph for creating code flow features

Page 22: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 4/6

- We use our proprietary, language-independent, file tree-based parser to create:

  - Call graph feature

  - Code flow-related features

  - Code meaning features

  - Complexity-related features

- We use multi-class classification for learning about specific files

- We use RAKE (rapid automatic keyword extraction)
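For readers unfamiliar with RAKE, the sketch below shows its core scoring idea: split text into candidate phrases at stop words and punctuation, score each word as degree divided by frequency, and sum word scores per phrase. It is a simplified illustration, not the authors' setup; the tiny stop-word list and the function names are assumptions made for this example.

```python
import re
from collections import defaultdict

# A tiny illustrative stop-word list; real RAKE setups use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in",
              "is", "are", "we", "use", "with", "on"}


def candidate_phrases(text: str) -> list:
    """Split text into word lists at punctuation and stop words."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text: str) -> list:
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself
    scores = {}
    for phrase in phrases:
        scores[" ".join(phrase)] = sum(degree[w] / freq[w] for w in phrase)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


TEXT = "We use rapid automatic keyword extraction for source code summarization."
keywords = rake(TEXT)
print(keywords)
```

Because longer phrases accumulate higher degree, RAKE tends to favor multi-word terms, which suits identifiers and comments that describe compound concepts.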

Page 23: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 5/6

- We also use our proprietary, language-independent, file tree-based parser to create:

  - Call graph feature

  - Code flow-related features

  - Code meaning features

  - Complexity-related features

- We use multi-class classification for learning about specific files

- We use RAKE (rapid automatic keyword extraction)

Page 24: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Features - 6/6

- We also use our proprietary, language-independent, file tree-based parser to create:

  - Call graph feature

  - Code flow-related features

  - Code meaning features

  - Complexity-related features

- We use multi-class classification for learning about specific files

- We use RAKE (rapid automatic keyword extraction)

Page 25: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

We are also researching novel approaches to dealing with source code

Page 26: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Summarization leads to transplantation

● Summarization is going to make everything clear, and clarity is going to make more code re-usable

○ Re-usability can lead to successful code transplantation attempts

● Making code transplantation easier is going to boost software development

○ We are researching how to transplant source code to increase the capabilities of virtual assistants

Page 27: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Summarization needs navigation

● When we show new (and more relevant) data to developers, they will be solving different problems (in different ways)

○ We need to give them new ways of traversing the code and sharing code information

● You can see the current navigation in the demo!

Page 28: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Understanding code requires learning paths

● All problems have follow-up problems

○ Example: searching for more specific terms like “collision detection” often indicates that you will be trying to create a computer game or simulation

● Deckard learns not only about the current code context, but also about the bigger picture related to the problem

○ We come up with numerous metrics measuring source code’s performance from novel perspectives

Page 29: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Understanding code requires assistance

- Why is coding machines (currently) “difficult” for humans?

- Making machines do what we imagine is tough, because we speak different languages

- Things are started but not finished, and then no one can use them

- There is lots of code and no one knows all of it: make code simpler, document it

- Many capabilities of programming languages are unknown; patching != solving

- There are many problems in software engineering that machines can solve

- Machines are already among us, but now they will be more proactive and have more serious responsibilities

Page 30: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Our work

Page 31: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Our work

- We want machines to work in software teams together with people, so we create proactive assistants

- We also want to transform coders into supercoders, so we re-invent source code navigation

- Finally, we want to make source code re-usable, so we work on summarization and code transplantation tools

Page 32: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

[Figure: mock-up of an autocomplete suggestion list (unrelated, random(), other, one, something, weeksago, duh) shown next to the ways developers look for code today]

Google search: time consuming, provides pages of results

Autocomplete: limited, too little documentation

IDE code search: keyword based, no relevancy

Search tickets/commits: messy search, few code references

Ask someone: efficient... but high cost

Page 33: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

33

Page 34: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Click on/highlight any part of your code...

...and get contextual insights dynamically & in real time

Click on any link and navigate through the code in both directions

Ask task-related questions & get code recommendations (from own or open code)

TEXT EDITOR / IDE DECKARD

Page 35: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Slack integration

Page 36: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

API

- deckardSummarise: deckard summarises your source code.

- deckardClarity: deckard recognises typical reusable code vs unique logic.

- deckardGraph: deckard turns your source code into a knowledge graph.

- We are working on our API!

Page 37: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

DCODE for sharing source code information

[Diagram labels: Team; IDEs; Code; Shares code; dcode:// code link; Reads code; Sees code in own IDE]

Use cases:

- Chats

- Tickets

- In-code hyperlinks
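The slide does not spell out what a dcode:// code link contains, so the sketch below assumes a purely hypothetical layout, `dcode://<repo>/<path>?line=<n>`, to show how a tool receiving such a link might decode it before opening the file in the reader's own IDE. The `CodeLink` type, the `parse_code_link` function, and the `line` parameter are inventions for this example and may not match the actual DCODE scheme.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit


@dataclass
class CodeLink:
    repo: str
    path: str
    line: Optional[int]


def parse_code_link(url: str) -> CodeLink:
    """Parse a link in the HYPOTHETICAL form dcode://<repo>/<path>?line=<n>."""
    parts = urlsplit(url)
    if parts.scheme != "dcode":
        raise ValueError(f"not a dcode link: {url!r}")
    query = parse_qs(parts.query)
    # The line number is optional in this assumed layout.
    line = int(query["line"][0]) if "line" in query else None
    return CodeLink(repo=parts.netloc, path=parts.path.lstrip("/"), line=line)


# Hypothetical link, used only for illustration
link = parse_code_link("dcode://my-project/src/app/main.py?line=42")
print(link)
```

Whatever the real format is, the appeal of a URL scheme is that the same link can be pasted into chats, tickets, or code comments and resolved locally by each reader's editor.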

Page 38: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

github.com/deckardai

DCODE: A URL scheme for sharing source code information

CodeSearch: A get-started tool for discovering code using graph representations

PuppyParachute: A semi-automated testing helper for Python

YaP: A modern shell language derived from Python

Page 39: Hire a Machine to Code - Michael Arthur Bucko & Aurélien Nicolas

Thank you!

Let’s revolutionise development