IS6125 Database Analysis and Design
Lecture 2: The changing nature and role of data
Rob Gleasure
Today’s session
Change of time/place for next week
Data a few years ago
Data now
  The cloud
  Big data
  Business Intelligence
The case of Spotify
Data a few years ago
Image from http://www.hotcleaner.com/web_storage.html
The Cloud
Web is overtaking/has overtaken desktop
Mobile is replacing local
Utility-based computing is replacing once-off purchase
  Makes resources seem endless
  Lowers risk in terms of usage (pay as you go)
[Figure: demand and capacity over time, comparing a static data center with a data center in the cloud]
Slide Credits: Berkeley RAD Lab
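The demand and capacity comparison can be made concrete with a small calculation. The sketch below is illustrative only: the hourly demand figures, the fixed capacity of 100 units, and the function names are assumptions, not data from the lecture or the original figure.

```python
# Hypothetical hourly demand in arbitrary units (not from the lecture).
demand = [20, 35, 60, 120, 90, 40]


def static_cost(demand, capacity):
    """Fixed capacity: pay for every unit in every period, used or not."""
    paid = capacity * len(demand)
    idle = sum(max(capacity - d, 0) for d in demand)      # paid for but unused
    unserved = sum(max(d - capacity, 0) for d in demand)  # demand above capacity
    return paid, idle, unserved


def elastic_cost(demand):
    """Pay as you go: capacity follows demand, so nothing is idle or unserved."""
    return sum(demand), 0, 0


print("static :", static_cost(demand, capacity=100))
print("elastic:", elastic_cost(demand))
```

With these made-up numbers the static data center pays for idle capacity in most periods and still cannot serve the peak, which is the lower usage risk that pay as you go is meant to offer.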
The Cloud
The ‘Internet of Things’ was born in about 2009
More devices connected to the Web than people…
Image from http://computinged.com/edge/become-part-of-the-cloud-computing-revolution/
The Cloud
This has meaningful implications for data in terms of
  Capacity
  Measurement
  Integration
  Security
  Privacy
Big data
The idea is that the vast amounts of interaction data allow for systems that are nuanced and responsive in ways that were previously not possible
Also a realisation that, if it can be analysed, this data is a huge commodity, meaning new business models are possible
So when is data ‘big data’?
3 Vs of Big data
Volume
  Facebook generates 10 TB of new data daily, Twitter 7 TB
  A Boeing 737 generates 240 TB of flight data during a flight from one side of the US to the other
We can use all of this data to tell us something, if we know the right questions to ask
3 Vs of Big data
From http://www.slideshare.net/ibmcanada/big-dataturning-data-into-insights?qid=0b4c69bc-3db2-4e12-ae47-a362a25752eb&v=qf1&b=&from_search=3
[Figure: Traditional approach vs. big data approach]
Traditional approach: analyze small subsets of data (the analyzed information is a small part of all available information)
Big data approach: analyze all data (all available information is analyzed)
3 Vs of Big data
Velocity
  Clickstreams and asynchronous data transfer can capture what millions of users are doing right now
  Make a change, then watch the response
No guesswork is required up front as to what to gather; we can induce the interesting stuff as we see it
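A minimal sketch of "make a change, then watch the response", assuming a clickstream where each event records which version of a page the user saw and what they did. The event format, the version labels, and the `click_rate` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical clickstream: (page_version, action). Illustrative data only.
events = (
    [("old", "view")] * 4 + [("old", "click")] * 1
    + [("new", "view")] * 4 + [("new", "click")] * 2
)


def click_rate(events, version):
    """Clicks per view for one page version; 0.0 if that version has no views."""
    counts = Counter(action for v, action in events if v == version)
    views = counts["view"]
    return counts["click"] / views if views else 0.0


for version in ("old", "new"):
    print(version, click_rate(events, version))
```

In a live system the same comparison would run continuously over incoming events, which is what lets the response to a change be observed almost immediately.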
3 Vs of Big data
From http://www.slideshare.net/ibmcanada/big-dataturning-data-into-insights?qid=0b4c69bc-3db2-4e12-ae47-a362a25752eb&v=qf1&b=&from_search=3
[Figure: Traditional approach vs. big data approach]
Traditional approach: start with a hypothesis and test it against selected data (hypothesis/question → data → answer)
Big data approach: explore all data and identify correlations (data → exploration → correlation/insight)
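The exploratory approach can be sketched as computing a correlation for every pair of variables and looking at the strongest ones first, rather than testing one hypothesis chosen in advance. The column names and values below are invented for illustration, and `pearson` is a plain implementation of the Pearson correlation coefficient that assumes no column is constant.

```python
from itertools import combinations
from math import sqrt

# Hypothetical columns of interaction data (not from the lecture).
data = {
    "plays": [1, 2, 3, 4, 5],
    "skips": [5, 4, 3, 2, 1],
    "hour": [3, 1, 4, 1, 5],
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_correlations(data):
    """Return ((name_a, name_b), r) for every pair, strongest |r| first."""
    pairs = [((a, b), pearson(data[a], data[b])) for a, b in combinations(data, 2)]
    return sorted(pairs, key=lambda item: abs(item[1]), reverse=True)


for pair, r in rank_correlations(data):
    print(pair, round(r, 2))
```

The ranking only surfaces candidate relationships; deciding which of them mean anything still requires judgement about the domain.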
3 Vs of Big data
Variety
  Move from structured data to unstructured data, including image recognition, text mining, etc.
  Gathered from users, applications, systems, sensors
Increasingly comprehensive data view of our ecosystem
  The Internet of Things
The Internet of Things
From http://www.pcworld.com/article/2039413/new-intel-ceo-creates-mysterious-new-devices-division.html
The Internet of Things
RFID sensors, Bluetooth, microprocessors, and Wi-Fi are all becoming easier to embed in ‘dumb’ devices
Move to mobile also means more data streaming from us at all times, e.g. location, call activity, net use
The Internet of Things
Smart homes/smart cities
  Temperature, lighting, food stocks, energy, security
Smart cars
  Diagnostics, traffic suggestions, sensors, self-driving
Smart healthcare
  Worn and intravenous computing detects issues early and monitors care outcomes remotely
Smart factories, farms
  Machines coordinated efficiently, linked dynamically to consumption models
Big data
Success stories
Books
Barnes and Noble: Discovered that readers often quit nonfiction books less than halfway through. Introduced highly successful new series of short books on topical themes
Amazon: originally used a panel of expert reviewers for books. Data surplus allowed them to create increasingly predictive recommendations. Panel has since been disbanded and 1/3 of sales are now driven by the recommender system
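One simple way to build recommendations from purchase data is to count which items are bought together. The sketch below is not Amazon's actual system; the baskets, the item names, and the `recommend` function are hypothetical and only illustrate the general idea of letting the data replace an expert panel.

```python
from collections import Counter

# Hypothetical purchase baskets (illustrative only).
baskets = [
    {"book_a", "book_b"},
    {"book_a", "book_b", "book_c"},
    {"book_b", "book_d"},
    {"book_c", "book_d"},
]


def recommend(item, baskets, k=2):
    """Return up to k items most often bought together with `item`."""
    co_purchases = Counter()
    for basket in baskets:
        if item in basket:
            co_purchases.update(basket - {item})
    # Sort by count (highest first), then by name so ties are deterministic.
    ranked = sorted(co_purchases.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


print(recommend("book_a", baskets))
```

The more purchase data accumulates, the more reliable these co-purchase counts become, which is consistent with the point that a data surplus made the recommendations increasingly predictive.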
Big data and the Internet of Things
Success stories (continued)
Transport
  Flyontime.us: used historical weather and flight delay information to predict the likelihood of flights being delayed
  Farecast: looked at ticket prices for specific flights based on historical data, then advised users to buy or wait according to the predicted fare trajectory
  UPS: uses a range of traffic data to calculate the most time- and fuel-efficient routes according to a complex algorithm
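The buy-or-wait idea can be illustrated with a deliberately naive rule: wait if fares closer to departure have historically been lower than the current price. This is not Farecast's actual model; the fare table and the `advise` function are assumptions made up for the example.

```python
# Hypothetical average historical fares, keyed by days before departure.
historical_fares = {30: 320, 21: 290, 14: 270, 7: 310, 3: 380}


def advise(days_left, current_fare, historical_fares):
    """Return 'wait' if a lower fare was typical later on, otherwise 'buy'."""
    later_fares = [fare for days, fare in historical_fares.items() if days < days_left]
    if later_fares and min(later_fares) < current_fare:
        return "wait"
    return "buy"


print(advise(30, 320, historical_fares))
print(advise(14, 270, historical_fares))
```

A real service would need many flights, seasonality, and some estimate of uncertainty, but the structure is the same: historical trajectories turn a price into a recommendation.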
Big data and the Internet of Things
Famous success stories (continued)
Healthcare
  Modernizing Medicine EMA dermatology system
  https://www.youtube.com/watch?v=jMGaGtK9nzU
Big data and the Internet of Things
Famous success stories (continued)
Social media
  Google (data for information relevance)
  Twitter (cf. #RescuePH)
  Facebook (social data)
Issues with big data
Google Flu Trends
Life imitating data, imitating life?
No one is really average height
Your Xbox knows you like that Katy Perry song
Also, Target called to say your teenage daughter is pregnant.
Ice cream sales and shark attacks…
From http://xkcd.com/552/
Ice cream sales and shark attacks continued (correlation, not causation)
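The ice cream and shark attack example is usually explained by a shared cause such as hot weather. The sketch below uses invented numbers, constructed so that temperature drives both series; it assumes temperature is the confounder, which the slide itself does not state.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical observations; both series rise with temperature by construction.
temperature = [15, 15, 15, 25, 25, 25, 35, 35, 35]
ice_cream = [10, 12, 11, 20, 22, 21, 30, 32, 31]
sharks = [1, 1, 4, 4, 4, 7, 7, 7, 10]

# Correlation across all observations, ignoring temperature.
overall = pearson(ice_cream, sharks)

# Correlation within each temperature level, i.e. holding temperature fixed.
within = {}
for level in sorted(set(temperature)):
    idx = [i for i, t in enumerate(temperature) if t == level]
    within[level] = pearson([ice_cream[i] for i in idx], [sharks[i] for i in idx])

print(round(overall, 2), within)
```

Across all observations the two series are strongly correlated (about 0.86 here), yet within each temperature level the correlation is zero: once the common cause is held fixed, ice cream sales tell us nothing about shark attacks.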
Target’s family monitoring continued
Assignment 1
In groups, you are tasked with identifying and researching a business that uses data in an interesting and creative way.
The report should be approximately 2000 words and describe the key values offered by the business to its consumers, how this differentiates it from competitors, and how its use of data at different points in the creation, delivery, and support of products/services enables this differentiation.
You don’t need to go into deep technical detail concerning how data is handled, nor about the technologies used. However, you should discuss data-related processes at a high level, insofar as you understand them from the information you gathered.
The report is due on the 23rd October, at which time a soft-bound report should be handed in to Ann O’Riordan in room 3.75.
Assignment 1
The groups are as follows:
Group 1: Hennessy, John James; Gao, Yun; Kenny, Mark Paul
Group 2: O'Driscoll, Nicole; Flood, Lee; Yang, Siyu
Group 3: Duggan, Claire Bernadette; Nolan, Robert Cunningham; Power, Declan
Group 4: Huang, Junqi; Kenneally, Alan Kieran; Murray, Jack Joseph
Group 5: Lawton, Fiona Margaret; Hennessy, Darragh Ross; Chen, Qi
Group 6: Xu, Chenjun; Kilcoyne, Shane Anthony; O'Donovan, Mary-Kate
Group 7: O'Donovan, Paul Andrew; Guerin, Steven John; MolerRodriguez, Marta
Group 8: O'Riordan, Christina Eilish; Anso, Gabriel; Mc Carthy, Patricia
Group 9: O'Donovan, Eileen; Wang, Mengjian; Lowham, Joshua George
Group 10: Kerrisk, Edward; Meaney, Brendan; Qin, Xiaolu
Want to read more?
On Modernizing Medicine
  https://www.modmed.com/
On Spotify
  http://www.bigdata-startups.com/BigData-startup/big-data-enabled-spotify-change-music-industry/#!prettyPhoto
On the cloud and big data
  The Little Book of Cloud Computing, 2013 edition, Lars Nielsen