Post on 19-Dec-2015
Tweetool (0. 1 100 version) Final Report
Yilei Qian
Computer Science
University of Southern California
qianyilei.usc@gmail.com
A Twitter Recommendation System Based on Topic Modeling
Ideas
• Following too many accounts on Twitter
• Too much news every day
• Cannot find interesting and valuable news
• Don't know the names of the users worth following
• Need someone to recommend who to follow
• Need someone to recommend the hottest news
• Use topic modeling to re-rank all users
Traditional Method
Topic Modeling
• A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents.
• Widely used in natural language processing.
Reference Papers:
Steyvers, M. and Griffiths, T., “Probabilistic Topic Models,” Handbook of Latent Semantic Analysis
Blei, D. M., Ng, A. Y., and Jordan, M. I., “Latent Dirichlet Allocation,” Journal of Machine Learning Research, 2003
Label-Based LDA
Steps:
1. Build the LDA model
2. Train the model instance on the training documents
3. Run LDA on all the data using the trained model instance
Problems:
1. Punctuation marks, e.g. “ ” , . = { } ( ) …
2. Frequent words, e.g. I, you, …
3. Other noise
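The punctuation and frequent-word noise listed above is usually handled by a cleaning pass before the text reaches the topic model. The following is only a minimal sketch of such a pass, not the project's actual code: the class name `TweetCleaner` and the tiny stop-word list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class TweetCleaner {
    // Hypothetical, deliberately tiny stop-word list; a real list would be much longer.
    private static final Set<String> STOP_WORDS = new HashSet<String>(
            Arrays.asList("i", "you", "the", "a", "an", "and", "to", "of", "is", "it"));

    /** Lowercases the text, replaces punctuation with spaces, and drops stop words. */
    static List<String> clean(String tweet) {
        // Every character that is not a letter, digit, or whitespace becomes a space.
        String stripped = tweet.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}\\s]", " ");
        List<String> tokens = new ArrayList<String>();
        for (String token : stripped.trim().split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(clean("I love the new {Java} book, you know?"));
    }
}
```

Because punctuation is replaced by spaces, a contraction such as "don't" is split into two tokens. Whether that is acceptable depends on the training data.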
Result Generation
1. By Angle
Value =
2. By Distance
Value =
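The two scores are not spelled out here. A common reading of "by angle" is the cosine of the angle between two topic vectors, cos θ = (a · b) / (|a| |b|), and of "by distance" the Euclidean distance between them. The sketch below assumes that reading; the class and method names are hypothetical.

```java
class TopicSimilarity {
    /** Cosine of the angle between two topic vectors (assumed "by angle" score). */
    static double cosine(double[] a, double[] b) {
        checkSameLength(a, b);
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here for an all-zero vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Euclidean distance between two topic vectors (assumed "by distance" score). */
    static double euclidean(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] user = {0.7, 0.2, 0.1};
        double[] candidate = {0.6, 0.3, 0.1};
        System.out.println("cosine   = " + cosine(user, candidate));
        System.out.println("distance = " + euclidean(user, candidate));
    }
}
```

With these definitions a larger cosine means a closer match, while a smaller distance means a closer match, so the two scores sort in opposite directions when ranking.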
13-Dimension Topics
1. Art & Design
2. Book
3. Business
4. Charity
5. Entertainment
6. Family
7. Fashion
8. Food & Drink
9. Health
10. Music
11. News
12. Science & Technology
13. Sports
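One way to keep the 13 topics and a user's 13-dimensional weight vector aligned is an enum whose declaration order matches the list above. This is an illustrative sketch only: the enum `Topic`, its `dominant` method, and the assumption that a user is represented by one weight per topic in this order are not taken from the project.

```java
enum Topic {
    ART_AND_DESIGN("Art & Design"),
    BOOK("Book"),
    BUSINESS("Business"),
    CHARITY("Charity"),
    ENTERTAINMENT("Entertainment"),
    FAMILY("Family"),
    FASHION("Fashion"),
    FOOD_AND_DRINK("Food & Drink"),
    HEALTH("Health"),
    MUSIC("Music"),
    NEWS("News"),
    SCIENCE_AND_TECHNOLOGY("Science & Technology"),
    SPORTS("Sports");

    private final String label;

    Topic(String label) {
        this.label = label;
    }

    String label() {
        return label;
    }

    /** Returns the topic with the largest weight; weights[i] belongs to values()[i]. */
    static Topic dominant(double[] weights) {
        Topic[] topics = values();
        if (weights.length != topics.length) {
            throw new IllegalArgumentException("expected " + topics.length + " weights");
        }
        int best = 0;
        for (int i = 1; i < weights.length; i++) {
            if (weights[i] > weights[best]) {
                best = i;
            }
        }
        return topics[best];
    }
}

class TopicVectorDemo {
    public static void main(String[] args) {
        double[] weights = new double[Topic.values().length];
        weights[Topic.MUSIC.ordinal()] = 0.7;
        weights[Topic.SPORTS.ordinal()] = 0.3;
        System.out.println(Topic.dominant(weights).label());
    }
}
```

Using `ordinal()` as the vector index keeps the mapping in one place, at the cost of making the declaration order significant.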
Languages & Tools
• Web UI: HTML + AJAX (unfinished) + CSS (unfinished) + Twitter REST API
• Android UI: Java, Android 2.1 (unfinished)
• Server Side: Java 1.6, Servlet 2.0, Spring 3.0, Hibernate 3.3
• Twitter API: Twitter4j 2.2.1 (300 requests per hour)
• Server: Tomcat 7.08
• Database: MySQL 5.5
• Data Package: JSON
• Development Platform: Eclipse 3.4
• Total lines of code: 2000(+) + 2421 + 462 = 5000(+)
• Subversion:
• http://tweetool-yilei.googlecode.com/svn/trunk/tweetool-yilei-read-only
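The limit of 300 requests per hour noted above works out to one request every 3,600,000 / 300 = 12,000 ms if calls are spread evenly. The sketch below shows one simple way a crawler could pace itself under that assumption. The class `RequestPacer` is hypothetical, it does not call Twitter4j, and even pacing is a simplification of how the real limit is enforced.

```java
class RequestPacer {
    private final long minIntervalMillis;
    private long nextAllowedMillis = 0L;

    RequestPacer(int requestsPerHour) {
        if (requestsPerHour <= 0) {
            throw new IllegalArgumentException("requestsPerHour must be positive");
        }
        this.minIntervalMillis = 3600000L / requestsPerHour;
    }

    /**
     * Reserves the next request slot and returns how many milliseconds
     * the caller should wait, starting at nowMillis, before sending.
     */
    synchronized long reserve(long nowMillis) {
        long sendAt = Math.max(nowMillis, nextAllowedMillis);
        nextAllowedMillis = sendAt + minIntervalMillis;
        return sendAt - nowMillis;
    }

    public static void main(String[] args) {
        RequestPacer pacer = new RequestPacer(300);
        // Simulated timestamps, so the demo finishes without sleeping.
        System.out.println("wait " + pacer.reserve(0L) + " ms");
        System.out.println("wait " + pacer.reserve(0L) + " ms");
        System.out.println("wait " + pacer.reserve(30000L) + " ms");
    }
}
```

In a real crawler the caller would pass `System.currentTimeMillis()` and sleep for the returned duration before issuing the API call.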
Architecture
[Architecture diagram. Components shown: DB, Twitterfetch, LLDA, Tweetool, Hibernate DAO, Work Flow, Servlets, Mobile Device, HTML, ApplicationContext]
Distributed Crawler & Computing
Problems (endless T_T)
1. High noise in the topic model
• Few words, odd marks, abbreviations
2. Unfamiliar with the Twitter API; a lot of bugs
3. Transaction problems
4. The ugly UI
5. Poor performance
6. Not enough time; many functions are unfinished
7. The Tweetool system should be rebuilt!!!
Environment: 7,000+ users, 220,000+ tweets
Future Work
1. Try to finish it
2. Debug
3. Build a better training file
4. Add a feedback function
5. Better topic classification
Web UI (Design Version)
Android UI
[UI mockup. Main Menu screen: a title and four function buttons. News Menu screen: a title and a list of news items]