apaar shanker a review by ava: from data to …jarulraj/courses/8803-f18/...gt 8803 // fall 2018...
TRANSCRIPT
![Page 1: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/1.jpg)
Ava: From data to insights
through conversations
DATA ANALYTICS USING DEEP LEARNING
GT CS 8803 // FALL 2018 //
A review byApaar Shanker
![Page 2: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/2.jpg)
GT 8803 // Fall 2018
Paper under review
Ava: From Data to Insights
Through Conversation
Authors: Rogers Jeffrey Leo John1, Navneet Potti1, Jignesh M. Patel1
Computer Sciences Department, 1University of Wisconsin-Madison
Publication: CIDR ‘17
doi:http://pages.cs.wisc.edu/~jignesh/publ/Ava.pdf
2
![Page 3: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/3.jpg)
Thecurrent paradigm of data driven decision making
![Page 4: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/4.jpg)
GT 8803 // Fall 2018
Issues with the current model
1. Lost In translation
2. Long turnaround time
3. Correctness
4. Reproducibility
5. A cognitive overload due to surfeit of models and libraries
4
![Page 5: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/5.jpg)
GT 8803 // Fall 2018
Proposed Solution
Key Observations:
- Controlled natural language methods are now practically implemented as interfaces to software toolboxes
- The data science workflow can be templatized
5
We can use a chat-bot as a natural language UI to set up a data science pipeline by drawing on templates stored in a library.
![Page 6: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/6.jpg)
6
![Page 7: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/7.jpg)
7
Meta Task Meta Task
Meta Task Meta TaskTask
Task
Once a workflow has been finalized - only the pipeline(constituted of dotted blue boxes) needs to be preserved.
The workflow is a (often cyclic) graph. The actual pipeline is a subgraph of the workflow graph.
Typical Data Science Workflow
![Page 8: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/8.jpg)
GT 8803 // Fall 2018
Data Science Workflow can be Templatized
from sklearn import tree
model = DecisionTreeRegressor(criterion= ’mse’, splitter= ’best’, max_depth=None)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
8
There is a clean separation of specification (parameter values) and template, such that task can be composed by simply substituting parameters into a pre-defined code template.
![Page 9: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/9.jpg)
9
![Page 10: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/10.jpg)
10
Introducing AVA
![Page 11: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/11.jpg)
11
AVA in action
![Page 12: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/12.jpg)
12
Architecture
Rest API
Jpype
![Page 13: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/13.jpg)
13
![Page 14: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/14.jpg)
GT 8803 // Fall 2018
Results
14
A group of 16 students with some ML background (via coursework) and Python proficiency were asked to to do supervised learning on a Kaggle Dataset.
![Page 15: Apaar Shanker A review by Ava: From data to …jarulraj/courses/8803-f18/...GT 8803 // Fall 2018 Paper under review Ava: From Data to Insights Through Conversation Authors: Rogers](https://reader035.vdocument.in/reader035/viewer/2022070420/5f031ba77e708231d407931c/html5/thumbnails/15.jpg)
GT 8803 // Fall 2018
Issues and Enhancements
15
❖ Accuracy of the AVA models versus human models
❖ The addition of templates to the repository can be automated.
❖ Work on the knowledge-base based recommendation system?
❖ Handling unstructured data:➢ A customizable file-parser
❖ Handling larger than memory input data
❖ Uncertainty quantification in the output as a model guideline
❖ Where is the Code?