transforming big data into smart data: deriving value via harnessing volume, variety, and velocity...
DESCRIPTION
Keynote given at ICDE 2014, April 2014. Details at: http://ieee-icde2014.eecs.northwestern.edu/keynotes.html A video of a version of this talk is available here: http://youtu.be/8RhpFlfpJ-A (download to see many hidden slides). Two versions of this talk, targeted at Smart Energy and Personalized Digital Health domains/apps, are at: http://wiki.knoesis.org/index.php/Smart_Data A previous (older) version, replaced by this version: http://www.slideshare.net/apsheth/big-data-to-smart-data-keynote
TRANSCRIPT
Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity
using semantics and Semantic Web
Keynote at 30th IEEE International Conference on Data Engineering (ICDE) 2014
Amit Sheth
LexisNexis Ohio Eminent Scholar & Exec. Director,
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State, USA
2
Amit Sheth’s PhD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pramod Anantharam
Sujan Perera
Alan Smith
Maryam Panahiazar
Sarasi Lalithsena
Cory Henson
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)
Special Thanks
Pavan Kapanipathi
Shreyansh Bhatt
Acknowledgements: Kno.e.sis team, Funds - NSF, NIH, AFRL, Industry…
4
2011
How much data?
48(2013)
500(2013)
http://www.knowledgeinfusion.com/blog/2011/11/get-your-head-out-of-the-clouds-and-into-big-data/
5
Only 0.5% to 1% of the data is used for analysis.
http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode
http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
6
Variety – not just structure but modality: multimodal, multisensory
Structured
Unstructured
Semi structured
Audio
Video
Images
7
Velocity
Fast Data
Rapid Changes
Real-Time/Stream Analysis
Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment and online retail
9
• What if your data volume gets so large and varied you don't know how to deal with it?
• Do you store all your data?
• Do you analyze it all?
• What is coverage, skew, quality?
• How can you find out which data points are really important?
• How can you use it to your best advantage?
Questions typically asked on Big Data
http://www.sas.com/big-data/
10
http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/
Variety of Data Analytics Enablers
11
• Prediction of the spread of flu in real time during H1N1 2009
  – Google tested a mammoth 450 million different mathematical models to evaluate search terms, which yielded 45 important parameters
  – The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• FareCast: predict the direction of air fares over different routes [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• NY city manholes problem [ICML Discussion, 2012]
Illustrative Big Data Applications
12
The current focus is mainly on serving business intelligence and targeted analytics needs, not complex individual and collective human needs (e.g., empowering humans in health, fitness and well-being; better disaster coordination; personalized smart energy)
What is missing?
13
highly personalized/individualized/contextualized
Incorporate real-world complexity:
- multi-modal and multi-sensory nature of physical-world and human perception
Can More Data beat better algorithms? Can Big Data replace human judgment?
Many opportunities, many challenges, lessons to apply
15
• Not just data to information, not just analysis, but actionable information that delivers insight and supports better decision making right in the context of human activities
What is needed?
Data
Information
Actionable: An apple a day keeps the doctor away
16
What is needed? Taking inspiration from cognitive models
• Bottom up and top down cognitive processes:
  – Bottom up: find patterns, mine (ML, …)
  – Top down: Infusion of models and background knowledge (data + knowledge + reasoning)
Left (plans) / Right (perceives) Brain
Top (plans) / Bottom (perceives) Brain
http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
17
• Ambient processing as much as possible while enabling natural human involvement to guide the system
What is needed?
Smart Refrigerator: Low on Apples
Adapting the Plan: shopping for apples
18
Contextual
Information Smart Data
Makes Sense to a human
Is actionable – timely and better decisions/outcomes
20
My 2004-2005 formulation of SMART DATA - Semagix
Formulation of Smart Data strategy providing services for Search, Explore, Notify.
“Use of Ontologies and Data repositories to gain
relevant insights”
21
Smart Data (2013 retake)
Smart data makes sense out of Big data
It provides value by harnessing the challenges posed by the volume, velocity, variety and veracity of big data, in turn providing actionable information and improving decision making.
22
OF human, BY human FOR human
Smart data is focused on the actionable value achieved by human
involvement in data creation, processing and consumption phases
for improving the human experience.
Another perspective on Smart Data
23
OF human, BY human FOR human
Another perspective on Smart Data
24
Petabytes of Physical (sensory)-Cyber-Social data every day!
More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
‘OF human’ : Relevant Real-time Data Streams for Human Experience
25
OF human, BY human FOR human
Another perspective on Smart Data
Use of Prior Human-created Knowledge Models
26
‘BY human’: Involving Crowd Intelligence in data processing workflows
Crowdsourcing and Domain-expert guided Machine Learning Modeling
27
OF human, BY human FOR human
Another perspective on Smart Data
28
Detection of events, such as wheezing sound, indoor temperature, humidity,
dust, and CO level
Weather Application
Asthma Healthcare Application
Close the window at home during day to avoid CO in
gush, to avoid asthma attacks at night
‘FOR human’ : Improving Human Experience
Population Level
Personal
Public Health
Action in the Physical World
Luminosity
CO level
CO in gush during day time
29
Electricity usage over a day, device at work, power consumption, cost/kWh,
heat index, relative humidity, and public events from social stream
Weather Application
Power Monitoring Application
‘FOR human’ : Improving Human Experience
Population Level Observations
Personal Level Observations
Action in the Physical World
Washing and drying has resulted in significant cost
since it was done during peak load period. Consider
changing this time to night.
30
Everyone and everything has Big Data; it is Smart Data that matters!
31
• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Social Media Analysis: Crisis coordination
  – Using Twitris platform
• Smart Cities: Traffic management
I will use applications in 3 domains to demonstrate
43
• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Social Media Analysis: Crisis coordination
  – Using Twitris platform
• Smart Cities: Traffic management
Smart Data Applications
44
A Historical Perspective on Collecting Health Observations
Diseases treated only by external observations
First peek beyond just external observations
Information overload!
Doctors relied only on external observations
Stethoscope was the first instrument to go beyond just external
observations
Though the stethoscope has survived, it is only one among many observations
in modern medicine
http://en.wikipedia.org/wiki/Timeline_of_medicine_and_medical_technology
2600 BC ~1815 Today
Imhotep
Laennec’s stethoscope
Image Credit: British Museum
45
The Patient of the Future
MIT Technology Review, 2012
http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
46
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
Empowering Individuals (who are not Larry Smarr!) for their own health
kHealth: knowledge-enabled healthcare
Weight Scale
Heart Rate Monitor
Blood Pressure Monitor
47
Sensors
Android Device (w/ kHealth App)
Readmissions cost $17B/year: $50K/readmission; Total kHealth kit cost: < $500
kHealth Kit for the application for reducing ADHF readmission
ADHF – Acute Decompensated Heart Failure
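The cost comparison on this slide can be worked through as a back-of-the-envelope calculation. The sketch below uses only the figures quoted on the slide; the variable names are illustrative, and the kit cost is taken at its stated upper bound of $500.

```python
# Figures quoted on the slide.
annual_readmission_cost = 17_000_000_000  # $17B per year
cost_per_readmission = 50_000             # $50K per readmission
kit_cost_upper_bound = 500                # kit costs less than $500

# Implied number of readmissions per year.
readmissions_per_year = annual_readmission_cost // cost_per_readmission

# Number of kits that one avoided readmission would pay for (at the upper bound).
kits_per_avoided_readmission = cost_per_readmission // kit_cost_upper_bound

print(readmissions_per_year)         # 340000
print(kits_per_avoided_readmission)  # 100
```

Under these figures, a kit pays for itself if it prevents one readmission per hundred patients equipped.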
48
[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
25 million: people in the U.S. are diagnosed with asthma (7 million are children) [1]
300 million: people suffering from asthma worldwide [2]
$50 billion: spent on asthma alone in a year [2]
155,000: hospital admissions in 2006 [3]
593,000: emergency department visits in 2006 [3]
Asthma: Severity of the problem
Sensordrone (Carbon monoxide, temperature, humidity)
Node Sensor (exhaled Nitric Oxide)
49
Sensors
Android Device (w/ kHealth App)
Total cost: ~ $500
kHealth Kit for the application for Asthma management
*Along with two sensors in the kit, the application uses a variety of population level signals from the web:
Pollen level, Air Quality, Temperature & Humidity
51
Data Overload for Patients/health aficionados
Providing actionable information in a timely manner is crucial to avoid information overload or fatigue
Personal level Signals
Public level Signals
Population level Signals
52
Data Overload Spanning Physical-Cyber-Social Modalities
Increasingly, real-world events are:
(a) Continuous: Observations are fine grained over time
(b) Multimodal, multisensory: Observations span PCS modalities
54
What can we do to avoid an asthma episode?
Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.
Variety, Volume, Veracity, Velocity
Value: What risk factors influence asthma control? What is the contribution of each risk factor?
Semantics: Understanding relationships between health signals and asthma attacks for providing actionable information
WHY Big Data to Smart Data: Asthma example
kHealth: Health Signal Processing Architecture
Personal level Signals
Public level Signals
Population level Signals
Domain Knowledge
Risk Model
Events from Social Streams
Take Medication before going to work
Avoid going out in the evening due to high pollen levels
Contact doctor
Analysis
Personalized Actionable Information
Data Acquisition & aggregation
55
57
Asthma Domain Knowledge
Domain Knowledge
ICS= inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA= inhaled short-acting beta2-agonist ; *consider referral to specialist
Asthma Control and Actionable Information
58
Patient Health Score (diagnostic)
Risk assessment model
Semantic Perception
Personal level Signals
Public level Signals
Domain Knowledge
Population level Signals
GREEN -- Well controlled
YELLOW -- Not well controlled
RED -- Poorly controlled
How controlled is my asthma?
59
Patient Vulnerability Score (prognostic)
Risk assessment model
Semantic Perception
Personal level Signals
Public level Signals
Domain Knowledge
Population level Signals
Patient health Score
How vulnerable* is my control level today?
*considering changing environmental conditions and current control level
60
3.4 billion people will have smartphones or tablets by 2017 -- Research2Guidance
“Intelligence at the Edges” for Digital Health
http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html
m-health app market is predicted to reach $26 billion in 2017 -- Research2Guidance
63
Sensordrone – for monitoring environmental air quality
Wheezometer – for monitoring wheezing sounds
Can I reduce my asthma attacks at night?
What are the triggers? What is the wheezing level?
What is the propensity toward asthma?
What is the exposure level over a day?
Commute to Work
Asthma: Actionable Information for Asthma Patients
Luminosity
CO level
CO in gush during day time
Actionable Information
Personal level Signals
Public level Signals
Population level Signals
What is the air quality indoors?
64
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? – Yes
Observations Physical-Cyber-Social System Health Signal Extraction Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
RiskCategory
<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
.
.
.
Expert Knowledge
Background Knowledge
tweet reporting pollution level and asthma attacks
Acceleration readings from on-phone sensors
Sensor and personal observations
Signals from personal, personal spaces, and community spaces
Risk Category assigned by doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Health Signal Extraction to Understanding
Well Controlled – continue
Not Well Controlled – contact nurse
Poorly Controlled – contact doctor
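The step from qualitative health signals to the numeric vector shown on this slide can be sketched as a simple encoding. This is an illustration, not the kHealth implementation: the codes Medium=2, High=3 and Yes=1 agree with the slide's example vector <2, 1, 1, 3, 1>, while Low=1 and No=0, along with all function and variable names, are assumptions.

```python
# Hypothetical ordinal codes. Medium=2, High=3 and Yes=1 match the slide's
# example vector <2, 1, 1, 3, 1>; Low=1 and No=0 are assumptions.
LEVEL_CODES = {"Low": 1, "Medium": 2, "High": 3}
BINARY_CODES = {"No": 0, "Yes": 1}

# Feature order taken from the vector header on the slide.
FEATURE_ORDER = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]
LEVEL_FEATURES = {"PollenLevel", "Activity"}


def quantify(observations):
    """Turn qualitative health signals into an ordered numeric vector."""
    vector = []
    for name in FEATURE_ORDER:
        codes = LEVEL_CODES if name in LEVEL_FEATURES else BINARY_CODES
        vector.append(codes[observations[name]])
    return vector


observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
print(quantify(observations))  # [2, 1, 1, 3, 1]
```

A doctor-assigned RiskCategory would then be attached to each such vector as the label.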
70
RDF OWL
How are machines supposed to integrate and interpret sensor data?
Semantic Sensor Networks (SSN)
71
W3C Semantic Sensor Network Ontology
Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
73
SSN Ontology
Levels of Abstraction (from less useful to more useful)
0 Raw data [in TEXT], e.g., number: “150”
1 Annotated data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg
2 Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure
3 Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism
Intellego
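The first three levels of abstraction can be sketched with plain Python values standing in for RDF and OWL. This is only an illustration of the idea: the talk names "threshold" as the level-2 mechanism but gives no value, so the 140 mmHg cut-off, the dictionary layout and the function names are all assumptions.

```python
# Hypothetical threshold; the talk does not state one. 140 mmHg is assumed
# purely for illustration.
ELEVATED_SYSTOLIC_MMHG = 140


def annotate(raw_text):
    """Level 0 -> 1: attach a property label and unit to a raw reading."""
    return {"property": "systolic blood pressure", "value": int(raw_text), "unit": "mmHg"}


def interpret(annotated):
    """Level 1 -> 2: deductive interpretation by comparison with a threshold."""
    if (annotated["property"] == "systolic blood pressure"
            and annotated["value"] >= ELEVATED_SYSTOLIC_MMHG):
        return "elevated blood pressure"
    return "no abstraction"


raw = "150"                           # level 0: raw data
annotated = annotate(raw)             # level 1: annotated data
abstraction = interpret(annotated)    # level 2: interpreted data (deductive)
print(annotated)
print(abstraction)
```

Level 3 (abductive interpretation, e.g., a diagnosis) needs prior knowledge linking abstractions to their possible causes, which a threshold alone cannot supply.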
75
76
Making sense of sensor data with
77
People are good at making sense of sensory input
What can we learn from cognitive models of perception?
• The key ingredient is prior knowledge
78
* based on Neisser’s cognitive model of perception
Observe Property
Perceive Feature
1 Explanation
2 Discrimination
Perception Cycle*
Translating low-level signals into high-level knowledge
Focusing attention on those aspects of the environment that provide useful information
Prior Knowledge
79
To enable machine perception,
Semantic Web technology is used to integrate sensor data with prior knowledge on the Web
80
Prior knowledge on the Web
W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph
81
Prior knowledge on the Web
W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph
82
Observe Property
Perceive Feature
1 Explanation
Translating low-level signals into high-level knowledge
Explanation
Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building
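Explanation can be sketched as a set operation over prior knowledge: keep the features whose known properties cover everything observed. The feature and property names below appear elsewhere in this talk, but the links between them are assumed for illustration, and the function name is hypothetical; this is not the OWL formalization the talk refers to.

```python
# Hypothetical prior knowledge: the properties each feature (disorder) can
# account for. The names come from the talk; the links are assumed.
PRIOR_KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explain(observed):
    """Return the features that account for every observed property."""
    return {feature for feature, props in PRIOR_KNOWLEDGE.items() if observed <= props}


print(sorted(explain({"elevated blood pressure"})))
print(sorted(explain({"elevated blood pressure", "palpitations"})))
```

Each additional observation can only shrink the set of explanatory features, which is why choosing the next property to observe matters.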
85
Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features
Observe Property
Perceive Feature
Explanation
2 Discrimination
Focusing attention on those aspects of the environment that provide useful information
Discrimination
89
Discrimination
Discriminating Property: is neither expected nor not-applicable
DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty
Discriminating Property: elevated blood pressure, clammy skin, palpitations
Explanatory Feature: Hypertension, Hyperthyroidism, Pulmonary Edema
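The definition of a discriminating property can be sketched with sets. The sketch assumes one reading of the two terms: a property is "expected" when every current explanatory feature has it, and "not applicable" when none does. The links between features and properties are assumed for illustration, and the function name is hypothetical.

```python
# Hypothetical prior knowledge; names come from the slide, links are assumed.
PRIOR_KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def discriminating_properties(explanatory_features):
    """Properties held by some, but not all, of the explanatory features."""
    if not explanatory_features:
        return set()
    property_sets = [PRIOR_KNOWLEDGE[f] for f in explanatory_features]
    expected = set.intersection(*property_sets)   # held by every feature
    applicable = set.union(*property_sets)        # held by at least one feature
    # Neither expected nor not-applicable: applicable minus expected.
    return applicable - expected


features = {"Hypertension", "Hyperthyroidism", "Pulmonary Edema"}
print(sorted(discriminating_properties(features)))
```

Observing elevated blood pressure would not help here because every candidate shares it, whereas clammy skin or palpitations would narrow the candidates.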
90
Semantic scalability: Resource savings of abstracting sensor data
Orders of magnitude resource savings for generating and storing relevant abstractions vs. raw observations.
Relevant abstractions
Raw observations
92
How do we implement machine perception efficiently on a resource-constrained device?
Use of OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time
• Runs out of resources with prior knowledge >> 15 nodes
• Asymptotic complexity: O(n^3)
93
intelligence at the edge
Approach 1: Send all sensor observations to the cloud for processing
Approach 2: downscale semantic processing so that each device is capable of machine perception
Henson et al. 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices', ISWC 2012.
94
Efficient execution of machine perception
Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning
0101100011010011110010101100011011011010110001101001111001010110001101011000110100111
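A minimal sketch of the bit vector idea, using Python integers as bit vectors: each property is stored as a mask over the candidate features, so explanation becomes a chain of bitwise ANDs. This is not the encoding from the cited ISWC 2012 paper; the feature-property links, the bit layout and the function names are assumptions for illustration.

```python
FEATURES = ["Hypertension", "Hyperthyroidism", "Pulmonary Edema"]

# Bit i of a mask is set when FEATURES[i] can account for the property.
# The names come from the talk; the links are assumed for illustration.
PROPERTY_MASKS = {
    "elevated blood pressure": 0b111,
    "clammy skin": 0b010,
    "palpitations": 0b110,
}
ALL_FEATURES = (1 << len(FEATURES)) - 1


def explain(observed):
    """AND together the masks of the observed properties."""
    mask = ALL_FEATURES
    for prop in observed:
        mask &= PROPERTY_MASKS[prop]
    return mask


def discriminate(explanatory_mask):
    """Properties held by some, but not all, of the explanatory features."""
    result = []
    for prop, mask in PROPERTY_MASKS.items():
        overlap = mask & explanatory_mask
        if overlap != 0 and overlap != explanatory_mask:
            result.append(prop)
    return result


def decode(mask):
    """Translate a feature mask back into feature names."""
    return [name for i, name in enumerate(FEATURES) if (mask >> i) & 1]


candidates = explain(["elevated blood pressure"])
print(decode(candidates))
print(discriminate(candidates))
print(decode(explain(["elevated blood pressure", "palpitations"])))
```

Each observation costs one AND over a fixed-width word, which is the kind of operation that stays cheap on a phone-class device.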
95
O(n^3) < x < O(n^4) reduced to O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
96
Semantic Perception for smarter analytics: 3 ideas to take away
1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web
3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices
98
• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Social Media Analysis: Crisis coordination
  – Using Twitris platform
• Smart Cities: Traffic management
Smart Data Applications
99
Smart Data for Social Good
Mining human behavior to help societal and humanitarian development
• crisis response coordination, harassment, gender-based violence, …
100
20 million tweets with “sandy, hurricane” keywords between Oct 27th and Nov 1st
2nd most popular topic on Facebook during 2012
Social (Big) Data during Crisis- Example of Hurricane Sandy
• http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding
• http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html
• http://mashable.com/2012/10/31/hurricane-sandy-facebook/
103
Social Semantic
Web Application
Real time
Multi Faceted
Analysis
Insights of Important Events including disaster response
coordination
http://usatoday30.usatoday.com/news/politics/twitter-election-meter
http://twitris.knoesis.org/
104
Twitris’ Dimensions of Integrated Semantic Analysis
Sheth et al. Twitris- a System for Collective Social Intelligence, ESNAM-2014
113
What is Smart Data in the context of Disaster Management
ACTIONABLE: Timely delivery of right resources and information to the right people at right location!
Because everyone wants to Help, but DON’T KNOW HOW!
114
Really sparse Signal to Noise:
• 2M tweets during the first 48 hrs. of #Oklahoma-tornado-2013
  - 1.3% as the precise resource donation requests to help
  - 0.02% as the precise resource donation offers to help
• Anyone know how to get involved to help the tornado victims in Oklahoma??#tornado #oklahomacity (OFFER)
• I want to donate to the Oklahoma cause shoes clothes even food if I can (OFFER)
Disaster Response Coordination: Finding Actionable Nuggets for Responders to act
• Text REDCROSS to 909-99 to donate to those impacted by the Moore tornado! http://t.co/oQMljkicPs (REQUEST)
• Please donate to Oklahoma disaster relief efforts.: http://t.co/crRvLAaHtk (REQUEST)
For responders, the most important information is the scarcity and availability of resources
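The sparsity of the signal is easier to appreciate in absolute counts. The sketch below converts the percentages quoted on this slide into approximate tweet counts; it uses only those figures, and the variable names are illustrative.

```python
total_tweets = 2_000_000   # tweets in the first 48 hours
request_share = 1.3        # percent classified as precise donation requests
offer_share = 0.02         # percent classified as precise donation offers

requests = round(total_tweets * request_share / 100)
offers = round(total_tweets * offer_share / 100)

print(requests)            # 26000
print(offers)              # 400
print(requests // offers)  # 65
```

Roughly 400 actionable offers are buried in two million tweets, about one in every five thousand.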
Blog by our colleague Patrick Meier on this analysis: http://irevolution.net/2013/05/29/analyzing-tweets-tornado/
Join us for the Social Good!
http://twitris.knoesis.org
RT @OpOKRelief: Southgate Baptist Church on 4th Street in Moore has food, water, clothes, diapers, toys, and more. If you can't go, call 794
Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate $10 in storm relief. #moore #oklahoma #disasterrelief #donate
Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmO
CITIZEN SENSORS
RESPONSE TEAMS (including humanitarian
org. and ‘pseudo’ responders)
VICTIM SITE
Coordination of needs and offers Using Social Media
Does anyone know where to send a check to donate to the tornado victims?
Where do I go to help out for volunteer work around Moore? Anyone know?
Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs
Matched
Matched
Matched
Serving the need!
If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info
http://www.slideshare.net/knoesis/iccm-2013ignitetalkhemantpurohitunnairobi
115
Purohit et al. Emergency-relief coordination on social media: Automatically matching resource requests and offers, 2014. With Int’l collaborator QCRI
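The matching shown on this slide can be illustrated with a deliberately naive keyword matcher. This is a stand-in for illustration only and not the method of the cited paper: the categories, keyword sets and function names are all hypothetical, and the tweets are the ones quoted on the slide.

```python
import re

# Hypothetical resource categories, checked in order of priority.
CATEGORIES = [
    ("animals", {"animals", "dogs"}),
    ("volunteer", {"volunteer"}),
    ("money", {"donate", "check"}),
]


def categorize(tweet):
    """Return the first category whose keywords appear in the tweet, else None."""
    words = set(re.findall(r"[a-z]+", tweet.lower()))
    for category, keywords in CATEGORIES:
        if words & keywords:
            return category
    return None


def match(needs, offers):
    """Pair each need with the offers that fall in the same category."""
    by_category = {}
    for offer in offers:
        category = categorize(offer)
        if category is not None:
            by_category.setdefault(category, []).append(offer)
    return [(need, by_category.get(categorize(need), [])) for need in needs]


needs = [
    "Does anyone know where to send a check to donate to the tornado victims?",
    "Where do I go to help out for volunteer work around Moore? Anyone know?",
    "Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs",
]
offers = [
    'Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate $10 in storm relief.',
    "If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info",
    "Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmO",
]

for need, matched in match(needs, offers):
    print(need, "->", matched)
```

Keyword rules like these break down quickly on real streams, which is part of why the slide frames matching as a problem of finding sparse, precise signals.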
126
Continuous Semantics for Evolving Events to Extract Smart Data
127
Heliopolis is a suburb of Cairo.
Dynamic Model Creation
Continuous Semantics
130
• Healthcare: ADHF, Asthma, GI
  – Using kHealth system
• Social Media Analysis: Crisis coordination
  – Using Twitris platform
• Smart Cities: Traffic management
Smart Data Applications
131
Traffic Management
To improve everyday life, which is entangled with our most common problem: being ‘stuck in traffic’
132
1 IBM Smarter Traffic
Severity of the Traffic Problem
133
Vehicular traffic data from San Francisco Bay Area aggregated from on-road sensors (numerical) and incident reports (textual)
http://511.org/
Per-minute updates of speed, volume, travel time, and occupancy, resulting in 178 million link status observations, 738 active events, and 146 scheduled events, with many unevenly sampled observations, collected over 3 months.
Variety, Volume, Veracity, Velocity
Value: Can we detect the onset of traffic congestion? Can we characterize traffic congestion based on events? Can we estimate traffic delays in a road network?
Semantics: Representing prior knowledge of traffic leads to a focused exploration of this massive dataset
Big Data to Smart Data: Traffic Management example
134
Duration: 36 months
Requested funding: 2.531.202 €
CityPulse Consortium
City of Aarhus
City of Brasov
Textual Streams for City Related Events
135
City Infrastructure
Tweets from a city
POS Tagging
Hybrid NER + Event term extraction
Geohashing
Temporal Estimation
Impact Assessment
Event Aggregation
OSM Locations
SCRIBE ontology
511.org hierarchy
City Event Extraction
City Event Extraction Solution Architecture
City Event Annotation
OSM – OpenStreetMap
NER – Named Entity Recognition
136
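Geohashing, one stage named in the extraction architecture, maps a latitude/longitude pair to a short string so that nearby locations tend to share a prefix. The sketch below is the standard public geohash encoding, not code from the pipeline described in the talk; the function name and the sample coordinates are illustrative.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
```

Truncating a geohash gives a coarser cell, which makes it a convenient key for aggregating events reported near the same place.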
City Event Annotation – CRF Annotation Examples
Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O http://t.co/w0eGEJjApY O
Tags used in our approach: B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, O
These are the annotations provided by a Conditional Random Field model trained on a tweet corpus to spot city-related events and locations
BIO – Beginning, Intermediate, and Other is a notation used in multi-phrase entity spotting
138
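Decoding BIO tags back into entity spans can be sketched as a single pass over the token/tag pairs. The input below is the annotated tweet from the slide; the function names are hypothetical, and the sketch only decodes tags that a trained model has already produced.

```python
TAGGED = ("Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION "
          "Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O "
          "http://t.co/w0eGEJjApY O")


def parse_pairs(tagged):
    """Split the alternating 'token tag token tag ...' text into pairs."""
    parts = tagged.split()
    return list(zip(parts[0::2], parts[1::2]))


def extract_spans(pairs):
    """Group B-/I- tagged tokens into (label, text) spans."""
    spans = []
    current = None  # (label, [tokens]) of the span being built
    for token, tag in pairs:
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(tokens)) for label, tokens in spans]


print(extract_spans(parse_pairs(TAGGED)))
# [('LOCATION', 'Half Moon'), ('LOCATION', 'Bay Brewing')]
```

With the tags as given, the venue name decodes into two separate location spans, because "Bay" carries a B-LOCATION tag and "Company" is tagged O.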
City Events from Sensor and Social Streams can be…
• Complementary
  • Additional information
  • e.g., slow traffic from sensor data and accident from textual data
• Corroborative
  • Additional confidence
  • e.g., accident event supporting an accident report from ground truth
• Timely
  • Additional insight
  • e.g., knowing poor visibility before formal report from ground truth
143
Events from Social Streams and City Department*
Corroborative Events
Complementary Events
Event Sources
City events extracted from tweets
511.org, Active events e.g., accidents, breakdowns
511.org, Scheduled events e.g., football game, parade
City event from Twitter providing complementary and corroborative evidence for fog reported by 511.org
*511.org
146
147
Actionable Information in City Management
Tweets from a City
Traffic Sensor Data
OSM Locations
SCRIBE ontology
511.org hierarchy
Web of Data
How can issues in a city be resolved? e.g., what should I do when there are fog conditions?
149
• Big Data is everywhere
  – at individual level and not just limited to corporations
  – with growing complexity: multimodal, Physical-Cyber-Social
• Analysis is not sufficient
• Bottom-up techniques are not sufficient; we need top-down processing and background knowledge
Take Away
150
Take Away
• Focus on Humans and improve human life and experience with SMART Data.
  – Data to Information to Contextually Relevant Abstractions
  – Actionable Information (Value from data) to assist and support humans in decision making.
• Focus on Value -- SMART Data
  – Big Data challenges without the intention of deriving Value are a “Journey without GOAL”.
153
thank you, and please visit us at
http://knoesis.org/vision
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Smart Data
Ohio Center of Excellence in Knowledge-enabled Computing
• Among top universities in the world in World Wide Web (cf: 5-yr impact, Microsoft Academic Search: shared 2nd place in Mar13)
• Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications
• Exceptional student success: internships and jobs at top salary (IBM Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups )
• 100 researchers including 15 World Class faculty (>3K citations/faculty) and 45+ PhD students- practically all funded
• $2M+/yr research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)
155
Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity
using semantics and the Semantic Web
Amit Sheth, Kno.e.sis, Wright State University