![Page 1: Sentiment Analysis and Visualization using UIMA and Solruima.apache.org/downloads/gscl2013/slides_5.pdf · Sentiment Analysis and Visualization using UIMA and ... Visualization using](https://reader035.vdocument.in/reader035/viewer/2022070609/5ad0ced27f8b9ad24f8e18fd/html5/thumbnails/1.jpg)
Sentiment Analysis and Visualization using UIMA and Solr
Carlos Rodríguez Penagos, David García Narbona, Guillem Massó Sanabre, Jens Grivolla, Joan Codina Filbà
Sentiment Analysis
- Social Media Monitoring, Reputation Management, Opinion Mining, ...
- “Who says what about what?” or “What do people say about my product/brand?”
Product Review Analysis
- Objective: analysing customer opinion from unstructured product reviews
- Approach:
  - detect Opinionated Units (Targets and Cues) → UIMA
  - data mining / visualization of target-cue relations → Solr, Cluto, etc.
Architecture Overview
Architecture Overview (detail)
- Review Database
- Linguistic annotation
  - POS, lemmas, NER, chunks, dependencies, etc.
- Target & Cue detection
- Opinionated Unit detection
  - T&C correlation via dependencies
- OU polarity assignment
- OU indexing
- Data visualization
OU detection
- combine statistical and rule-based approaches
  - reliably find known entities and opinion expressions
  - discover new entities and opinions
OU detection
- mark known Targets (e.g. brand / product names, etc.) and known Cues (e.g. polar words and expressions)
- detect new Targets and Cues using statistical models
- relate Targets and Cues through syntactic dependencies
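The first and third steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicons, the example sentence, and the hand-written dependency heads are invented, and the statistical detection of new Targets and Cues is left out entirely. It is not the UIMA pipeline described in the slides.

```python
# Toy sketch: the lexicons, tokens, and dependency heads are made-up illustrations.
KNOWN_TARGETS = {"battery", "screen"}                    # hypothetical target lexicon
KNOWN_CUES = {"great": "positive", "dim": "negative"}   # hypothetical polar words


def mark(tokens):
    """Label each token as 'target', 'cue', or None via lexicon lookup."""
    labels = []
    for tok in tokens:
        word = tok.lower()
        if word in KNOWN_TARGETS:
            labels.append("target")
        elif word in KNOWN_CUES:
            labels.append("cue")
        else:
            labels.append(None)
    return labels


def opinionated_units(tokens, heads):
    """Pair each cue with every target it is directly linked to in the parse.

    heads[i] is the index of token i's syntactic head (-1 for the root).
    """
    labels = mark(tokens)
    units = []
    for i, label in enumerate(labels):
        if label != "cue":
            continue
        for j, other in enumerate(labels):
            if other == "target" and (heads[i] == j or heads[j] == i):
                units.append((tokens[j], tokens[i], KNOWN_CUES[tokens[i].lower()]))
    return units


tokens = ["The", "battery", "is", "great", "but", "the", "screen", "is", "dim"]
# Hand-written heads: each adjective is a predicate, each noun attaches to it.
heads = [1, 3, 3, -1, 8, 6, 8, 8, 3]
units = opinionated_units(tokens, heads)
print(units)
```

Requiring a direct head/dependent link is the strictest possible rule; a real system would likely accept longer dependency paths as well.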
OU detection
(Figure: annotated example with "target" and "cue" labels.)
Visualization
- using Ajax-Solr
- all data preprocessed and indexed with Solr
- flexible interactive querying/filtering
- clustering using Carrot, Cluto, Solr-based kNN, etc.
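Interactive filtering of this kind is typically driven by faceted Solr queries. The sketch below builds such a request URL with standard Solr parameters (`q`, `fq`, `rows`, `wt`, `facet`, `facet.field`). The core name `reviews` and the field names `target` and `polarity` are assumptions for illustration, not the schema used in the project.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_solr_query(base_url, target=None, polarity=None):
    """Return a Solr select URL that filters OUs and facets on two fields."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),                  # only facet counts are requested
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", "target"),      # hypothetical field name
        ("facet.field", "polarity"),    # hypothetical field name
    ]
    # Filter queries narrow the result set without affecting scoring.
    # Values are inserted as-is; quotes inside them would need escaping.
    if target is not None:
        params.append(("fq", 'target:"%s"' % target))
    if polarity is not None:
        params.append(("fq", 'polarity:"%s"' % polarity))
    return base_url + "/select?" + urlencode(params)


# Hypothetical local Solr core named "reviews".
url = build_solr_query("http://localhost:8983/solr/reviews", target="battery")
print(url)
print(parse_qs(urlsplit(url).query)["facet.field"])
```

Each click in the interface would add or remove one `fq` clause, so the facet counts always reflect the current selection.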
Visualization
http://webmining.barcelonamedia.org/sm_yahoo/
Some results
- participated in SemEval (assigning polarity to tweets) with good results:
  - 5th out of 23 submissions
  - 0.86 avg. F1 measure
- customer review corpus (manually annotated at Barcelona Media, BM):
  - 88.5% correctly identified OUs
  - 70% correct polarity
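For readers unfamiliar with the metric, an averaged F1 score can be computed as below. The per-class counts are invented for illustration and are unrelated to the results reported above, and the unweighted mean over classes is an assumption, since the slides do not say how the average was taken.

```python
def f1(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f1(counts):
    """Unweighted mean of the per-class F1 scores (macro average)."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


# Invented (tp, fp, fn) counts per polarity class, for illustration only.
counts = {"positive": (80, 20, 20), "negative": (60, 20, 40)}
score = average_f1(counts)
print(round(score, 3))
```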
UIMA: challenges
- combining components from different sources (and languages: Java, C++, Python)
- unified Type System
- non-programmers need to create pipelines and AEs
UIMA components
- OpenNLP (Apache)
- JNET (JulieLabs)
- Zanzibar (Tor Vergata University)
- Lemmatizer (BM)
- DeSR (University of Pisa, wrapper by BM)
- DependencyTreeWalker (BM)
- Weka Wrapper (based on MAWUI by Mayo Clinic)
- UIMA Collection Tools (BM)
OpenNLP
- no code changes
- already TS independent
- just add XML descriptor + resource (model)
JNET
- major code changes
- made TS independent
- fixed bugs related to rich feature vectors
- would be nice to merge upstream
Zanzibar
- used for NP detection
- major code changes / bug fixes
- upstream?
  - seems mostly abandoned (2011)
  - probably move over to RUTA
Lemmatizer
- uses ConceptMapper to generate all possible lemmas
- custom module to filter candidates by POS tag
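The generate-then-filter idea can be illustrated with a small Python sketch. A plain dictionary stands in for the ConceptMapper lookup here, and its entries, the POS labels, and the fallback rule are all assumptions made for the example.

```python
# Hypothetical mini-dictionary: surface form -> list of (lemma, POS) candidates.
LEMMA_DICT = {
    "saw": [("see", "VERB"), ("saw", "NOUN")],
    "leaves": [("leave", "VERB"), ("leaf", "NOUN")],
}


def candidate_lemmas(form):
    """Step 1: return every possible lemma for a surface form."""
    # Unknown forms fall back to the lowercased form with no POS restriction.
    return LEMMA_DICT.get(form.lower(), [(form.lower(), None)])


def lemmatize(form, pos):
    """Step 2: keep the candidate whose POS matches the tagger's decision."""
    candidates = candidate_lemmas(form)
    matching = [lemma for lemma, cand_pos in candidates if cand_pos == pos]
    if matching:
        return matching[0]
    # No candidate agrees with the tag: fall back to the first candidate.
    return candidates[0][0]


print(lemmatize("saw", "VERB"))      # verb reading
print(lemmatize("leaves", "NOUN"))   # noun reading
```

Keeping the two steps separate means the dictionary lookup does not need to know anything about the tagger or its tag set.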
DeSR
- wrapper for the DeSR parser (https://sites.google.com/site/desrparser/)
- developed using UIMA-CPP
- developed at BM
- available on GitHub (https://github.com/BarcelonaMedia-ViL/desr-uima)
DependencyTreeWalker
- developed at BM
- uses Pythonnator
- enables lookups in the dependency graph
- used to validate Target-Cue relations
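One kind of lookup such a component might perform is measuring how far apart two tokens are in the dependency tree. The sketch below is an assumption about what a validation rule could look like, not the actual DependencyTreeWalker logic; the sentence, the heads, and the `max_edges` threshold are invented, and the heads are assumed to form a well-formed tree without cycles.

```python
def path_to_root(heads, i):
    """Return the token indices from i up to the root, inclusive."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def tree_distance(heads, a, b):
    """Number of dependency edges on the path between tokens a and b."""
    depth_from_b = {node: d for d, node in enumerate(path_to_root(heads, b))}
    for d, node in enumerate(path_to_root(heads, a)):
        if node in depth_from_b:
            # First shared ancestor: add the two upward path lengths.
            return d + depth_from_b[node]
    raise ValueError("tokens are not part of the same tree")


def is_valid_relation(heads, target, cue, max_edges=2):
    """Accept a Target-Cue pair only if the two are close in the tree."""
    return tree_distance(heads, target, cue) <= max_edges


tokens = ["The", "battery", "life", "is", "really", "great"]
# Hand-written heads: "great" is the root, "life" is its subject.
heads = [2, 2, 5, 5, 5, -1]
print(tree_distance(heads, 1, 5), is_valid_relation(heads, 1, 5))
```

A distance threshold is the simplest possible criterion; rules based on the dependency labels along the path would be a natural refinement.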
Weka Wrapper
- based on MAWUI
- many changes
  - adapted to newer UIMA versions
  - bug fixes, ...
- upstream not updated since 2008
- our own changes not published so far
Configurable Annotator
- taken from LuCAS (Apache UIMA)
- preprocessing / extraction as a separate module (without Lucene dependency)
- used to prepare annotations for WEKA and Solr
UIMA Collection Tools
- mostly based on example CRs and CCs from UIMA
- use MySQL (or Solr) instead of files
  - CR: plain text and XMI
  - CC: flat DB row representation or XMI
  - annotation viewer: works with XMI from DB
- developed at BM
- published on GitHub
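A flat row representation of annotations might look like the following sketch. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere, and the table and column names are hypothetical; the slides do not describe the schema the tools actually write.

```python
import sqlite3


def store_annotations(conn, doc_id, text, annotations):
    """Write one document and its annotations as flat rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS document (doc_id TEXT PRIMARY KEY, text TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotation ("
        "doc_id TEXT, begin_offset INTEGER, end_offset INTEGER, "
        "type TEXT, value TEXT)"
    )
    conn.execute("INSERT INTO document VALUES (?, ?)", (doc_id, text))
    conn.executemany(
        "INSERT INTO annotation VALUES (?, ?, ?, ?, ?)",
        [(doc_id, b, e, t, v) for b, e, t, v in annotations],
    )
    conn.commit()


def covered_texts(conn, doc_id, ann_type):
    """Read back the text spans covered by annotations of one type."""
    text = conn.execute(
        "SELECT text FROM document WHERE doc_id = ?", (doc_id,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT begin_offset, end_offset FROM annotation "
        "WHERE doc_id = ? AND type = ? ORDER BY begin_offset",
        (doc_id, ann_type),
    )
    return [text[b:e] for b, e in rows]


conn = sqlite3.connect(":memory:")
store_annotations(
    conn,
    "review-1",
    "The battery is great",
    [(4, 11, "Target", None), (15, 20, "Cue", "positive")],
)
print(covered_texts(conn, "review-1", "Target"))
```

Rows like these are easy to query with plain SQL, whereas XMI keeps the full annotation structure at the cost of needing UIMA to read it back.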
What we do well
- separation of code and configuration
- type system independence of code
- managing code and components with git and maven
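The first two points can be illustrated with a toy annotator that never hard-codes a type name. This is a conceptual sketch in plain Python rather than UIMA code: the class, the configuration keys, and the `org.example` type names are all hypothetical.

```python
class LexiconAnnotator:
    """Toy annotator whose input and output type names come from configuration."""

    def __init__(self, config):
        self.input_type = config["input_type"]
        self.output_type = config["output_type"]
        self.lexicon = {word.lower() for word in config["lexicon"]}

    def process(self, text, annotations):
        """Return a new annotation for every input annotation found in the lexicon."""
        found = []
        for ann in annotations:
            if ann["type"] != self.input_type:
                continue
            if text[ann["begin"]:ann["end"]].lower() in self.lexicon:
                found.append(
                    {"type": self.output_type, "begin": ann["begin"], "end": ann["end"]}
                )
        return found


text = "Great battery"
# Two hypothetical type systems that name their token and cue types differently.
config_a = {"input_type": "org.example.a.Token",
            "output_type": "org.example.a.Cue", "lexicon": ["great"]}
config_b = {"input_type": "org.example.b.Word",
            "output_type": "org.example.b.Polar", "lexicon": ["great"]}
tokens_a = [{"type": "org.example.a.Token", "begin": 0, "end": 5},
            {"type": "org.example.a.Token", "begin": 6, "end": 13}]
tokens_b = [dict(tok, type="org.example.b.Word") for tok in tokens_a]

result_a = LexiconAnnotator(config_a).process(text, tokens_a)
result_b = LexiconAnnotator(config_b).process(text, tokens_b)
print(result_a, result_b)
```

The same class serves both type systems unchanged; only the configuration differs, which is what makes components from different sources combinable.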
What we need to do/learn
- better resource handling (maven?)
- avoid redundancies between code and descriptors (uimaFIT?)
- automate creation of new components (e.g. variants using other models)
- publish our changes
  - GitHub
  - upstream
- integrate a better rule engine (Ruta?)
- better separation of libraries, etc. for CPP or Python annotators
And Now for Something Completely Different
- New EU (FP7) project: EUMSSI
- “Event Understanding through Multimodal Social Stream Interpretation”
- ➯ using UIMA as an integration platform for multimodal analysis layers
- starts December 2013