my journey to data science · data science is the extraction of actionable knowledge directly from...

A journey into Data Science in Libraries Luis Martinez-Uribe Data Scientist Library Fundación Juan March Photo with CC BY-NC-SA 2.0 licence taken from https://www.flickr.com/photos/bg2axk/ Second EDISON Conference - 16 March 2017

Page 1: My journey to Data Science · Data science is the extraction of actionable knowledge directly from data through a process of discovery, or hypothesis formulation and hypothesis testing

A journey into Data Science in Libraries

Luis Martinez-Uribe

Data Scientist

Library

Fundación Juan March

Photo with CC BY-NC-SA 2.0 licence taken from https://www.flickr.com/photos/bg2axk/

Second EDISON Conference - 16 March 2017

Outline

Data Science

My personal journey

Data Science at Fundación Juan March

Training

Looking ahead

What is Data Science?

Data science is the extraction of actionable knowledge directly from data through a process of discovery, or hypothesis formulation and hypothesis testing.

National Institute of Standards and Technology (NIST) Big Data Working Group (2015)

Data Science

Statistics

Computer science

Mathematics

Social science

Software engineering

Ethics

Artificial Intelligence

What is Data Science for?

A pathway to Data Science in Libraries

BSc Mathematics

Social Science Data Librarian

Research Data Management

Data Scientist

MSc Information Systems

PhD

Sociology

Big Data

Klavans, Richard and Kevin W. Boyack. (2006). “Quantitative Evaluation of Large Maps of Science.” Scientometrics 68 (3): 475-499.

Research Data Management

Tools

CURATION

CAPTURE AND STORAGE

ANALYSIS

VIZ AND BI

Examples of Data Science Activities

01 Knowledge portals

02 Our users and visitors

03 Classifiers

04 Search & recommend

05 Vizs

06 Analysis of social networks

Knowledge portals

Our users and visitors

Classification of our content

Keyword generation for events

Abstract

Format

Title

...

EVENTS WITH KEYWORDS (training set)

WORD PROCESSING (stopwords, expressions)

STATISTICAL INDICATORS (frequency, word length, position)

PREDICTIVE MODELS (machine learning)

CLASSIFIER (80% precision)
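
The keyword-generation steps on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Fundación Juan March implementation: the slide does not name a model or library, so a small logistic regression written with the Python standard library stands in for the "predictive models" step, the stopword list is a hypothetical English one, and all function names are invented for the example.

```python
import math
import re

# Hypothetical, very short stopword list (assumption; a real list is much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "to", "for", "with", "by"}


def tokenize(text):
    """Word processing: lowercase the text and drop stopwords."""
    words = re.findall(r"[a-záéíóúñü]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def indicators(word, tokens):
    """Statistical indicators for one candidate word."""
    frequency = tokens.count(word) / len(tokens)
    word_length = len(word) / 10.0
    position = tokens.index(word) / len(tokens)  # first occurrence
    return [1.0, frequency, word_length, position]  # leading 1.0 is the bias term


def build_examples(events):
    """Training set: one (indicators, label) pair per distinct word of each event."""
    examples = []
    for text, keywords in events:
        tokens = tokenize(text)
        for word in sorted(set(tokens)):
            label = 1 if word in keywords else 0
            examples.append((indicators(word, tokens), label))
    return examples


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train(examples, epochs=500, learning_rate=0.5):
    """Predictive model: logistic regression fitted by stochastic gradient descent."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            error = score(weights, features) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    return weights


def suggest_keywords(text, weights, top_n=3):
    """Classifier: rank the words of a new event by predicted keyword probability."""
    tokens = tokenize(text)
    scored = {w: score(weights, indicators(w, tokens)) for w in set(tokens)}
    return sorted(scored, key=lambda w: (-scored[w], w))[:top_n]


def precision(predicted, actual):
    """Share of the predicted keywords that are correct."""
    if not predicted:
        return 0.0
    return len(set(predicted) & set(actual)) / len(predicted)


if __name__ == "__main__":
    # Invented toy events with hand-assigned keywords (not data from the talk).
    events = [
        ("The history of jazz in Madrid", {"jazz"}),
        ("Painting and sculpture of the baroque", {"painting", "baroque"}),
    ]
    model = train(build_examples(events))
    print(suggest_keywords("A concert of jazz and flamenco", model, top_n=2))
```

Precision here is the share of suggested keywords that match the ones a person assigned, which is one plausible reading of the 80% figure on the slide.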

Search and recommendation systems

Search

Integrating data from 6,000 events, 500 exhibitions and art catalogues, and 5,000 library items

Recommendations

Using meaningful words in the title and keywords.
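
One way to read "meaningful words in the title and keywords" is as a set-overlap comparison between items. The sketch below assumes this reading: it uses Jaccard similarity, which the slide does not name, with a hypothetical stopword list and invented function names, so it illustrates the idea rather than the system actually in use.

```python
import re

# Hypothetical stopword list (assumption); words in it are not "meaningful".
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "to", "for", "with", "by"}


def meaningful_words(title, keywords):
    """Title words that are not stopwords, plus the item's keywords."""
    title_words = set(re.findall(r"\w+", title.lower())) - STOPWORDS
    return title_words | {k.lower() for k in keywords}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(item_id, catalogue, top_n=3):
    """Return the ids of the items that share the most meaningful words."""
    target = meaningful_words(*catalogue[item_id])
    scored = []
    for other_id, (title, keywords) in catalogue.items():
        if other_id == item_id:
            continue
        similarity = jaccard(target, meaningful_words(title, keywords))
        if similarity > 0:
            scored.append((similarity, other_id))
    # Highest similarity first; ties are broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other_id for _, other_id in scored[:top_n]]


if __name__ == "__main__":
    # Invented catalogue mixing events, exhibitions and library items.
    catalogue = {
        "event-1": ("Jazz in the 1920s", ["jazz", "music"]),
        "event-2": ("The history of jazz", ["jazz", "history"]),
        "exhibition-1": ("Modern painting in Spain", ["painting", "art"]),
        "book-1": ("Spanish painting and music", ["painting", "music"]),
    }
    print(recommend("event-1", catalogue))
```

Because every item type is reduced to the same bag of words, an event can recommend a library item or an exhibition as easily as another event.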

Interactive web graphs

Twitter networks and real time sentiment analysis

Training and education

Data Science methods

Big Data technologies

PhD in Social Sciences, department of sociology

Develop an analytical and visual framework for the social analysis of Big Cultural Data from libraries, archives and museums

Photo with CC BY-SA 2.0 licence taken from https://www.flickr.com/photos/seiho/

datamonster.co

• the is to ...learn all you can about your data…from where it was first created.

• Embrace the broader reality…all the information that is yet to be stored in technology.”

Looking ahead

Artificial intelligence

Looking ahead

"The best prophet of the future is the past." Lord Byron

Thanks

@luismart

es.linkedin.com/in/luismartinezuribe