www.expertpointsolutions.com
Unlock your Big Data with Analytics and BI on Office 365
Brian Culver ● SharePoint Fest DC ● OFF103 ● April 8-10, 2015
About Brian Culver
SharePoint Solutions Architect for Expert Point Solutions
Based in Houston, TX
Author
ProveIT! Analytics
SharePoint 2010 Unleashed
Various White Papers
Speaker and Blogger
Session Agenda
What is Big Data?
Understanding Sentiment Analysis
Connecting Big Data and Business Intelligence
Create an Azure HDInsight Cluster
Load data into Blob Storage
Validate data via HDInsight
Hadoop and C#
Visualizing Results via Power View
Closing comments
What is Big Data?
Big Data is about personalization and knowledge: understanding our customers and relationships to the world.
• 27% of customers have seen personalization online
• 86% of those say personalization influenced what they purchased to some extent
• 31% want a more personalized experience
• 59% of customers who have experienced personalization believe it has a noticeable influence on purchasing
• 58% prefer product recommendations from previous purchases over other forms of personalization
What is Big Data?
Big Data is data that is “too” complex, large, and/or fast.
Big Data offers a new set of approaches for analyzing data sets that
were not previously accessible because they posed challenges across
one or more of the “3 V’s”:
Volume - too big - Terabytes (and more) of credit card
transactions, web usage data, system logs, etc.
Variety - too complex - Unstructured data such as social media,
customer reviews, call center records, etc.
Velocity - too fast - Sensor data, live web traffic, mobile phone usage,
GPS data, etc.
What is Big Data?
COMMON BIG DATA CUSTOMER SCENARIOS
• Web app optimization
• Smart meter monitoring
• Equipment monitoring
• Advertising analysis
• Life sciences research
• Fraud detection
• Healthcare outcomes
• Weather forecasting
• Natural resource exploration
• Social network analysis
• Churn analysis
• Traffic flow optimization
• IT infrastructure optimization
• Legal discovery
GAIN COMPETITIVE ADVANTAGE BY MOVING FIRST AND FAST IN YOUR INDUSTRY
How Pier 1 uses HDInsight to improve and customize
customer experience:
http://www.youtube.com/watch?v=fN8Cixcc5yg
What is Big Data?
How are customers using HDInsight?
Azure HDInsight + Machine Learning
What is Big Data?
Ambari - Cluster provisioning, management, and monitoring.
Avro (Microsoft .NET Library for Avro) - Data serialization for the
Microsoft .NET environment.
Hive - Structured Query Language (SQL)-like querying.
Mahout - Machine learning.
MapReduce and YARN - Distributed processing and resource
management.
Oozie - Workflow management.
Pig - Simpler scripting for MapReduce transformations.
Sqoop - Data import and export.
ZooKeeper - Coordination of processes in distributed systems.
What is Big Data?
Map + Reduce = Extract, Load + Transform
[Diagram: in the MAP stage, four Raw Data inputs each feed a Mapper, and each Mapper emits Data; in the REDUCE stage, a single Reducer combines the four Data streams into the Output]
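The Map + Reduce flow can be sketched in plain Python. This is an in-process illustration only, not Hadoop or HDInsight code; the sample reviews, the word-count task, and the names mapper and reducer are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def mapper(raw_chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of raw text."""
    return [(word.lower(), 1) for word in raw_chunk.split()]


def reducer(pairs):
    """Reduce step: sum the counts for each word across all mapper outputs."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


# Four chunks of raw data, one per mapper (hypothetical sample text).
raw_data = [
    "great service",
    "great pool",
    "bad pool",
    "great food",
]

# Each chunk is mapped independently; on a cluster this would run in parallel.
mapped = [mapper(chunk) for chunk in raw_data]

# A single reducer combines every mapper's output into the final result.
output = reducer(chain.from_iterable(mapped))
print(output)
```

Because each mapper sees only its own chunk, the map step can be spread across many machines; only the reducer needs the combined intermediate data.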
Understanding Sentiment Analysis
Sentiment Analysis is the process of
understanding the emotional content of text
Understanding Sentiment Analysis
For example:
Free Form Text
I had a fantastic time on holiday at your resort. The service was
excellent and awesome. My family really enjoyed themselves.
We look forward to next year. One thing though, the pool was
closed which sucked.
Hotel Feedback
Understanding Sentiment Analysis
Take a list of positive and negative words
Positive
Good
Great
Fantastic
Excellent
Friendly
Awesome
Enjoyed
Negative
Bad
Worse
Rubbish
Sucked
Awful
Terrible
Bogus
Understanding Sentiment Analysis
I had a fantastic time on holiday at your resort. The
service was excellent and awesome. My family
really enjoyed themselves. We look forward to next
year. One thing though, the pool was closed which
sucked.
Hotel Feedback
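The word-list approach can be sketched in a few lines of Python. This is a minimal sketch, assuming simple lowercase word matching and a net score of positive matches minus negative matches; the scoring rule and the names score_sentiment, POSITIVE, and NEGATIVE are illustrative assumptions, not taken from the demo.

```python
import re

# Word lists from the slide above.
POSITIVE = {"good", "great", "fantastic", "excellent", "friendly", "awesome", "enjoyed"}
NEGATIVE = {"bad", "worse", "rubbish", "sucked", "awful", "terrible", "bogus"}


def score_sentiment(text):
    """Return matched positive and negative words and a net score (assumed rule)."""
    # Lowercase the text and split it into words, dropping punctuation.
    words = re.findall(r"[a-z']+", text.lower())
    positives = [w for w in words if w in POSITIVE]
    negatives = [w for w in words if w in NEGATIVE]
    return {
        "positive": positives,
        "negative": negatives,
        "score": len(positives) - len(negatives),
    }


feedback = (
    "I had a fantastic time on holiday at your resort. The service was "
    "excellent and awesome. My family really enjoyed themselves. We look "
    "forward to next year. One thing though, the pool was closed which sucked."
)

result = score_sentiment(feedback)
print(result)
```

For the hotel feedback, this matches four positive words (fantastic, excellent, awesome, enjoyed) and one negative word (sucked), giving a net score of 3 under the assumed rule.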
Connecting Big Data & Business Intelligence
In the demo, we will cover the following:
• Create an Azure HDInsight Cluster
  • Create Storage
  • Create HDInsight Cluster
• Load data into Blob Storage
• Validate data via HDInsight
• Hadoop and C#
• Visualizing Results via Excel (Power Query, Power View, etc.)
Closing Comments
Big Data is about understanding what your customers are saying and
thinking.
Anything can be processed and understood, but analysis takes time.
Any device that creates data can produce valuable information.
Try different things and let the patterns emerge.
Constructive Feedback Is Appreciated
“Great information, but would like to have learned more about [Insert Topic]”
“Brian – Your presentation was …”
“Good Demos! Thanks!”
Thank you!
Brian Culver, MCM
Twitter: @spbrianculver
E-mail:
Blog: http://blog.expertpointsolutions.com/
Slides: http://www.slideshare.net/bculver
Resources
Twitter Sentiment Processing
http://tweetsentiment.azurewebsites.net/
http://azure.microsoft.com/en-us/documentation/articles/hdinsight-hbase-analyze-twitter-sentiment/