Determining a digital profile from public social media information
DESCRIPTION
Presentation on master's thesis "Determining a digital profile from public social media information."
TRANSCRIPT
![Page 1: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/1.jpg)
DETERMINING A DIGITAL PROFILE FROM PUBLIC SOCIAL MEDIA INFORMATION
Department of Informatics, School of Informatics and Engineering
2013/14
KAROLINA STAMBLEWSKA
B00075232
![Page 2: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/2.jpg)
Outline
Motivation
Review of existing tools
Data harvesting
Data harvesting with Selenium 2.0
Resolving e-mail address
Demo
Results
![Page 3: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/3.jpg)
Motivation
Q2 2014 surveys conducted by Ipsos MRBI with 1000 respondents aged 15
![Page 4: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/4.jpg)
Motivation
The CPL “Employment Market Monitor” report indicates that 60% of employers use social media sources for pre-screening or for background digital footprinting of job applicants prior to employment (CPL, Q3, 2013)

| Options | Regularly | Sometimes | Never | Do not approve of this |
|---|---|---|---|---|
| Google | 13% | 26% | 53% | 8% |
| LinkedIn | 30% | 41% | 24% | 5% |
| Facebook | 9% | 22% | 59% | 10% |
| Google+ | 3% | 7% | 81% | 9% |
| Other | 10% | 4% | 77% | 9% |

![Page 5: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/5.jpg)
Motivation
[Chart labels: jobseekers lying or exaggerating; false claim to speak another language; inflating IT skills]
Global-Lingo.com survey of the UK jobseeker market in Q1 2014
63% of jobseekers admitted to lying on their CV!
![Page 6: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/6.jpg)
Motivation – “resume-less”
“Having an ability to showcase and validate a candidate’s work through a social graph (Twitter, About.me, Facebook, Slideshare, Google+, forums, etc.), search engine footprint (special URL references to projects, linkbacks, publications, etc.), network connections is much more powerful than just 1 – 2 pages and 3 prepped references. The prospective employer now has an ability to fully evaluate a candidate and understand if they are a fit or not based on actual work, not just 2 pages of crafty wording.”
#socialCV Mr. Vala Afshar (Chief Customer Officer @Enterasys)
![Page 7: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/7.jpg)
Motivation – GitHub
“Forget LinkedIn: Companies turn to GitHub to find tech talent” – CNET.COM
In the red-hot market for skilled software engineers, companies looking to make great hires are discovering that relying on traditional services that showcase candidates' work histories -- but not their actual work -- is a great way to miss out on the best available talent.
GitHub, a place where hiring managers and recruiters alike are increasingly turning to find not just the potential employees who look best on paper, but the ones that actively (and publicly) demonstrate their capabilities.
![Page 8: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/8.jpg)
Review of existing tools
![Page 9: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/9.jpg)
Data harvesting
Website Application Programming Interface (also called Web API) – gives the client an interface for querying the website provider's database via HTTP request messages. The client receives the output data as XML or JSON.
Web scraping – a software-based technique that transforms unstructured data on the web (typically HTML) into structured data that can be stored and analysed.
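The difference between the two harvesting routes can be illustrated with a small Python sketch. It is not taken from the thesis: the payloads, field names and the `ProfileScraper` class are hypothetical, and no network request is made. The same profile is read once from a JSON string (the Web API route) and once from raw HTML (the scraping route).

```python
import json
from html.parser import HTMLParser

# Hypothetical payloads; a real client would receive these over HTTP.
API_RESPONSE = '{"user": "jdoe", "followers": 42}'
HTML_PAGE = """
<html><body>
  <h1 class="user">jdoe</h1>
  <span class="followers">42 followers</span>
</body></html>
"""


class ProfileScraper(HTMLParser):
    """Collects the text of elements whose class attribute is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # class of the element currently being read
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.wanted:
            self.current = css_class

    def handle_data(self, data):
        if self.current and data.strip():
            self.fields[self.current] = data.strip()

    def handle_endtag(self, tag):
        self.current = None


def from_api(payload):
    """Web API route: the provider already returns structured data."""
    return json.loads(payload)


def from_html(page):
    """Scraping route: structure has to be recovered from the markup."""
    scraper = ProfileScraper({"user", "followers"})
    scraper.feed(page)
    followers = int(scraper.fields["followers"].split()[0])
    return {"user": scraper.fields["user"], "followers": followers}
```

Both functions return the same dictionary for these inputs, but the scraping route depends on the page layout and breaks when the markup changes.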
![Page 10: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/10.jpg)
Web harvesting - caveats
Caveats
Web 2.0 sites are heavily AJAX-driven and populate HTML dynamically, depending on the user's preferences and various conditions
Basic Python libraries do not capture all of the page source, as objects may be hidden or event-driven
Sites are usually secured with SSL/TLS
Solution
Selenium 2.0 provides native WebDrivers that imitate the functionality of Android, Firefox, Google Chrome, Internet Explorer, Safari and Opera, and even the HtmlUnit and PhantomJS headless browsers
perfect for dynamically populated elements
allows selecting elements via various HTML attributes, from tag name and id to XPath and even CSS selectors
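Selecting dynamically populated elements with Selenium can be sketched in Python as follows. This is a minimal sketch, not the thesis code: the helper names (`open_firefox`, `wait_for_elements`, `element_texts`) are hypothetical, and the strategy strings are the values that Selenium's `By` class defines. The Selenium import is deferred into `open_firefox` so that the polling helper can be read and exercised without a browser installed.

```python
import time

# Locator strategy names as defined by selenium.webdriver.common.by.By
# (By.ID == "id", By.TAG_NAME == "tag name", and so on).
BY_ID = "id"
BY_TAG_NAME = "tag name"
BY_XPATH = "xpath"
BY_CSS = "css selector"


def open_firefox():
    """Start a real browser session (needs Selenium and Firefox installed)."""
    from selenium import webdriver  # deferred import, see the note above
    return webdriver.Firefox()


def wait_for_elements(driver, by, value, timeout=10.0, poll=0.5):
    """Poll until at least one matching element exists or time runs out.

    AJAX-driven pages often insert elements after the initial load, so a
    single lookup right after navigation may come back empty.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)
        if found or time.monotonic() >= deadline:
            return found
        time.sleep(poll)


def element_texts(driver, by, value, **wait):
    """Return the visible text of every element matching the locator."""
    return [el.text for el in wait_for_elements(driver, by, value, **wait)]
```

With a real session this might be used as `driver = open_firefox()`, `driver.get(url)`, then `element_texts(driver, BY_XPATH, "//h1")`; the URL and the XPath expression are placeholders.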
![Page 11: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/11.jpg)
Web harvesting – Facebook Friends List example
![Page 12: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/12.jpg)
Web harvesting – Facebook Friends List example
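The code shown on these slides is not preserved in the transcript. As a hypothetical illustration only, a list that loads more entries while the user scrolls could be harvested by scrolling until the number of entries stops growing. The function name `harvest_lazy_list` and the CSS selector passed to it are assumptions; `execute_script` and `find_elements` are standard WebDriver methods.

```python
import time

# JavaScript that scrolls the page to the bottom, which typically
# triggers the next batch of entries on lazily loaded lists.
SCROLL_TO_BOTTOM = "window.scrollTo(0, document.body.scrollHeight);"


def harvest_lazy_list(driver, css_selector, pause=1.0, max_rounds=50):
    """Scroll until no new entries appear, then return their text."""
    seen = 0
    for _ in range(max_rounds):
        driver.execute_script(SCROLL_TO_BOTTOM)
        time.sleep(pause)  # give the page time to fetch the next batch
        count = len(driver.find_elements("css selector", css_selector))
        if count == seen:
            break  # nothing new was loaded, the list is complete
        seen = count
    items = driver.find_elements("css selector", css_selector)
    return [item.text for item in items]
```

The `max_rounds` cap keeps the loop from running indefinitely on pages that never stop loading content.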
![Page 13: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/13.jpg)
Resolving e-mail address

| # | Network | Method | Extra Details |
|---|---|---|---|
| 1 | Facebook | Direct | |
| 2 | Twitter | Gmail | |
| 3 | SlideShare.net | Direct | Advanced search, by user |
| 4 | Academia.edu | Gmail | |
| 5 | GitHub | Semi-Direct | Specific query over the local-part of the e-mail address |
| 6 | LinkedIn | Gmail | Caveats: 1) does not resolve the e-mail address until the user sends an invitation to the e-mail address owner; 2) caches previous search queries and suggests them in the next query round |

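The "Semi-Direct" GitHub row queries over the local-part of the address, that is, the part before the `@`. A minimal Python sketch of that step follows. The function names and the use of GitHub's public search URL are assumptions for illustration, not the prototype's actual query; the method mapping is copied from the table above.

```python
from urllib.parse import urlencode

# Resolution method per network, copied from the table above.
RESOLUTION_METHODS = {
    "Facebook": "Direct",
    "Twitter": "Gmail",
    "SlideShare.net": "Direct",
    "Academia.edu": "Gmail",
    "GitHub": "Semi-Direct",
    "LinkedIn": "Gmail",
}


def local_part(email):
    """Return the part of an e-mail address before the last '@'."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an e-mail address: %r" % email)
    return local


def github_user_search_url(email):
    """Build a user search URL from the local-part (assumed URL pattern)."""
    query = urlencode({"q": local_part(email), "type": "users"})
    return "https://github.com/search?" + query
```

For example, `local_part("jane.doe@example.com")` yields `jane.doe`, which is then used as the search term. A match on the local-part alone is only a candidate, so the result still has to be confirmed against other profile data.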
![Page 14: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/14.jpg)
Resolving e-mail address – Facebook
Search Engine
Find/Invite Friends
![Page 15: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/15.jpg)
Resolving e-mail address – Academia.edu, Twitter
![Page 16: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/16.jpg)
Resolving e-mail address – GitHub, SlideShare
![Page 17: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/17.jpg)
Resolving e-mail address – LinkedIn
SlideShare
![Page 18: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/18.jpg)
Resolving e-mail address – LinkedIn
SlideShare
![Page 19: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/19.jpg)
Resolving e-mail address – LinkedIn
SlideShare
![Page 20: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/20.jpg)
Demo
![Page 21: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/21.jpg)
Results
Overview of the performance test of three open source search engines vs. the implemented prototype (ScrapYA).
[Bar chart, vertical axis 0 to 9. Platforms: Gravatar, Twitter, Facebook, GitHub, StumbleUpon, Vimeo, YouTube, Picasa, Pinterest, Klout, Foursquare, Amazon, eBay, AOL Livestream, SoundCloud, Instagram, g+home, SlideShare, About.me, LinkedIn, Academia.edu. Series: People Smart, Pipl, Spokeo, ScrapYA]
![Page 22: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/22.jpg)
Results
Search test results refined to the social media platforms implemented by the prototype (ScrapYA).
[Bar chart, vertical axis 0% to 100%. Platforms: Twitter, Facebook, GitHub, SlideShare, Academia.edu. Series: ScrapYA, Spokeo, Pipl, People Smart]
![Page 23: Determining a digital profile from public social media information](https://reader033.vdocument.in/reader033/viewer/2022061211/547a560eb47959a4098b4a25/html5/thumbnails/23.jpg)
Karolina Stamblewska
B00075232
Determining a digital profile from public social media information