BigData Use-cases
Prepared by:
- Vishal Shukla
- Pranav Shukla
- Krishna Meet
Brevitaz Overview
● Founded in 2014
● Small team of technocrats delivering Big Data solutions
● Global client base in Europe and the Asia-Pacific region
● Expertise
  ○ Full-text search
  ○ Real-time analytics
  ○ Log analytics
  ○ BigData analytics
  ○ IoT based solutions
  ○ Machine learning
  ○ BigData warehousing
● Technologies
  ○ Spark, Hadoop, Kafka, Flume, Storm
  ○ Elasticsearch, Logstash, Kibana
  ○ MongoDB, Cassandra, HBase, Apache Titan
  ○ Impala, Spark SQL, Hawk
  ○ Java & Spring stack, Typesafe stack (Scala, Akka, Spray, Slick)
  ○ AngularJS
Agenda
➔ Big Data & Analytics
➔ Full-text search
➔ Log analytics
➔ Big Data Analytics
➔ Real-time Analytics
➔ IoT Analytics
➔ Machine Learning on BigData
➔ Big Data Warehousing
➔ Big Data is for Everyone
Analytics
Data is growing!
Full-text Search
Spot the right data quickly
“It’s all about being able to spot the right information at the right time”
◎ Relevance search in near real-time
  ○ Find results matching “iphone”. Please don’t show me iPhone chargers on the first page.
◎ Fuzzy search and search suggestions
  ○ Find results matching “iphne”
◎ Faceted search
  ○ Filters on Amazon after searching for a keyword
◎ Complex search with multiple criteria
  ○ Find me products matching “iphone” within the price range 30000 INR to 50000 INR and with color “Space grey”
◎ Geo-spatial search
  ○ Find restaurants within a 10 km radius of my current location. And yes, I want to see closer ones on top.
Full-text search - What is it?
What are you talking about? I am here for BigData!
Elasticsearch does all of this for data of massive volume, variety and velocity
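The searches listed for full-text search map fairly directly onto the Elasticsearch Query DSL. Below is a minimal sketch of the request bodies, written as Python dictionaries that could be sent to a `_search` endpoint. The field names (`name`, `price`, `color`, `location`) and the sample coordinates are assumptions made for illustration and are not from the slides; `color` is assumed to be a keyword field and `location` a geo_point field.

```python
import json


def fuzzy_query(text):
    """Match query that tolerates typos such as "iphne"."""
    return {"query": {"match": {"name": {"query": text, "fuzziness": "AUTO"}}}}


def filtered_query(text, min_price, max_price, color):
    """Full-text match combined with exact filters (price range and color)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": text}}],
                "filter": [
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                    {"term": {"color": color}},
                ],
            }
        }
    }


def nearby_query(lat, lon, radius="10km"):
    """Geo-distance filter, with the closest results sorted first."""
    point = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                "filter": [{"geo_distance": {"distance": radius, "location": point}}]
            }
        },
        "sort": [{"_geo_distance": {"location": point, "order": "asc", "unit": "km"}}],
    }


if __name__ == "__main__":
    # Hypothetical values: price range and color taken from the example search.
    print(json.dumps(filtered_query("iphone", 30000, 50000, "Space grey"), indent=2))
```

Filters in the `bool` query do not affect relevance scoring, which is why the price range and color sit under `filter` while the keyword sits under `must`.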
◎ Crawl third-party websites
◎ Aggregate and classify the data
◎ Develop a custom application on top of the classified data
Use-case - Information Aggregator
◎ Google’s “Did you mean?”
◎ Search suggestions as you type
◎ Text analytics
◎ McGraw-Hill - Transform textbooks into digital learning resources
◎ SoundCloud - Let users quickly find music that interests them
Other use-cases
Log Analytics
Collect, analyze and Improvise
“Transform your dumb logs into actionable insights”
● Use machine-generated logs to get operational insights
● Sensors, application servers, web servers or any IoT device logs
To interactively answer questions like...
◎ How many users signed up this week?
◎ How are users using your website / mobile app?
◎ How successful is our advertising campaign?
◎ Why is the database slow?
◎ Which website categories is my team spending the most time on?
◎ Which employees are most likely to resign next?
Log Analytics - What is it?
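As a small illustration of turning raw logs into an answer to "How many users signed up this week?", here is a minimal sketch in plain Python. The log line format (`timestamp level event user=...`) and the sample lines are assumptions for illustration only; a real deployment would more likely ship logs to a tool such as Logstash and aggregate them in Elasticsearch.

```python
from collections import Counter
from datetime import datetime

# Hypothetical application log lines: "<ISO timestamp> <level> <event> <details>"
SAMPLE_LOGS = [
    "2016-03-14T10:15:00 INFO signup user=alice",
    "2016-03-15T08:02:11 INFO login user=alice",
    "2016-03-16T19:40:05 INFO signup user=bob",
    "2016-03-22T09:00:00 INFO signup user=carol",
]


def signups_per_week(lines):
    """Count 'signup' events per (ISO year, ISO week number)."""
    counts = Counter()
    for line in lines:
        timestamp, _level, event, *_details = line.split()
        if event == "signup":
            iso = datetime.fromisoformat(timestamp).isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts


if __name__ == "__main__":
    for (year, week), total in sorted(signups_per_week(SAMPLE_LOGS).items()):
        print(f"{year} week {week}: {total} sign-ups")
```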
What’s the big deal!
Use-case - Network Logs Analysis
◎ High velocity
◎ High volume
◎ Collect, analyze and improvise
◎ Analyze click-stream data to provide personalized offers and user experience
◎ Interactive drill-down analysis
◎ Compliance reporting through interactive dashboards
◎ Real-time alerts on invalid login attempts
◎ Detect outages
◎ Multi-channel funnel reporting for your advertising campaigns to find out which channels contribute the most to conversions
Other use-cases
BigData Analytics
Make your data speak
“Combine all sources of data to uncover hidden patterns and unknown relations in your data”
● Take your transactional data from various sources
● Take operational and user behaviour log data
● Collect social data
● Combine data collected from various sources
To interactively answer questions like...
◎ What is the increase or decrease in sales over the years?
◎ How many unique customers were acquired this year?
◎ Which products are trending disproportionately this year?
Big Data Analytics - What is it?
Use-case - Supply chain management
◎ RFID labels can indicate which product is where at what time
◎ Get more accurate business insights
◎ Theft detection
◎ Social media sentiment analysis to get end-user feedback on launched products
◎ Identify market trends
◎ Predict employee attrition
◎ Customer churn analysis
◎ Influencer analysis
◎ Lead generation
◎ Proactive issue monitoring
◎ For insurance companies, identify potential customers by combining birth, marriage and health data
Other use-cases
Real-time Analytics
Analyse instantaneously as you collect data
“Lag of seconds can make a
fraudster and you
● Ingest streaming data, possibly at high velocity
● Analyse and react immediately
To solve problems like...
◎ Identify changing trends in real-time
◎ Detect fraud
◎ Analyse policy violations and react immediately
◎ Reduce downtimes
◎ Provide better and quicker business decisions
Real-time Analytics - What is it?
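To make "analyse and react immediately" concrete, here is a minimal sketch of a sliding-window check that raises an alert as soon as one user accumulates too many failed logins in a short period. The class name, the 60-second window and the threshold of 3 are assumptions chosen for illustration, and events are assumed to arrive in timestamp order. In production this kind of logic would typically run inside a stream processor such as Storm or Spark Streaming.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Tracks failed logins per user over a sliding time window."""

    def __init__(self, window_seconds=60, threshold=3):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # user -> timestamps of recent failures

    def record(self, user, timestamp):
        """Record a failed login; return True if an alert should fire now."""
        window = self._events[user]
        window.append(timestamp)
        # Drop failures that fell out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


if __name__ == "__main__":
    monitor = FailedLoginMonitor()
    # Hypothetical stream of (user, seconds since start) failed-login events.
    for user, ts in [("eve", 0), ("eve", 10), ("bob", 12), ("eve", 20)]:
        if monitor.record(user, ts):
            print(f"ALERT: repeated failed logins for {user} at t={ts}s")
```

Because each event is evaluated as it arrives, the alert fires on the third failure itself rather than after a batch job has run.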
Use-case - Enrich Customer Experience
◎ Get real-time feeds about customer location or products being browsed
◎ Combine with historical user behaviours
◎ Roll out offers in real-time
◎ Hospitality Industry
  ○ Bad weather reduces travel, which then reduces overnight lodging
  ○ Combine weather data with flight cancellations to identify stranded travellers
  ○ Offer hotel coupons based on nearby location.
Other use-cases
◎ Fraud detection
◎ Predict and enrich customer experience based on location and lifestyle
◎ Real-time process visibility across an enterprise
◎ Suggest optimal routes based on current traffic data
◎ Get player performance metrics in real-time to substitute players at the right time
Other use-cases
IoT Analytics
Let machines communicate
● Use sensors to detect low-level data
● Report the captured data to a server
● Analyse and get back to the user
To provide smart alerts and suggestions like
◎ Schedule maintenance of machines
◎ Your pulse rate is disproportionately increasing
◎ Medicines manufactured in a batch are not complying with standards
IoT Based Smart Solutions - What is it?
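The sense, report, analyse loop can be sketched on the server side as a check of each new sensor reading against a recent baseline. The example below flags a pulse reading that is well above the average of the last few readings. The class name, the baseline size and the 1.3 ratio are assumptions for illustration only, not values from the slides and not medical guidance.

```python
from collections import deque
from statistics import mean


class PulseMonitor:
    """Compares each new pulse reading with the average of recent readings."""

    def __init__(self, baseline_size=5, max_ratio=1.3):
        self.baseline = deque(maxlen=baseline_size)  # most recent readings
        self.max_ratio = max_ratio

    def check(self, bpm):
        """Return an alert message if bpm is far above the baseline, else None."""
        alert = None
        # Only judge a reading once the baseline is fully populated.
        if len(self.baseline) == self.baseline.maxlen:
            average = mean(self.baseline)
            if bpm > average * self.max_ratio:
                alert = f"Pulse {bpm} bpm is well above recent average {average:.0f} bpm"
        self.baseline.append(bpm)
        return alert


if __name__ == "__main__":
    monitor = PulseMonitor(baseline_size=3)
    for reading in [70, 72, 71, 75, 110]:  # hypothetical sensor readings
        message = monitor.check(reading)
        if message:
            print(message)
```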
Use-case
◎ Performance measurement & maintenance schedule
DIAGRAM
◎ In agriculture, sensors can detect crop health along with geo data, and based on that, alerts can be sent to farmers telling them where they need to focus
◎ In retail, smart-shelves can detect and send alerts on when to replenish
◎ Smart home can analyze the patterns of each family member and optimize energy usage
Other use-cases
Machine Learning
on BigData
Make the machines learn from data
What is machine learning?
◎ Machine learning is not programming a machine to do stuff
◎ Machine learning is making the machine learn and adapt based on the observed data
Where is machine learning used?
● Identify similarities between products or users
● Predict values from past data
● Classify items into categories, e.g. whether an email is spam or not spam
in order to ...
◎ Predict expected outcome
◎ Categorize large amounts of data
◎ Optimize algorithms or paths
◎ Find similarities
◎ Improve quality of predictions continuously
“Recommending the right products makes the difference between selling or not selling a product”
Use-case - Recommending Products
◎ Compare thousands of users/products with each other to find similar “clusters”
◎ Content-based filtering - Recommend products similar to what the customer has already bought
◎ Find customers similar to the current customer and recommend what they have bought
◎ Apply what are known as clustering algorithms in machine learning on Big Data
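The "find similar customers and recommend what they have bought" idea can be sketched in a few lines. Note that this example uses a simple set-similarity measure (Jaccard similarity) between purchase histories rather than a clustering algorithm, and it runs in memory on a handful of made-up customers. The customer names and products are hypothetical; at Big Data scale the same idea would be run on a distributed engine such as Spark.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases, top_n=3):
    """Recommend products bought by similar customers but not yet by `target`."""
    owned = purchases[target]
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        similarity = jaccard(owned, items)
        if similarity == 0:
            continue  # nothing in common, ignore this customer
        for product in items - owned:
            # Products bought by more similar customers score higher.
            scores[product] = scores.get(product, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:top_n]


# Hypothetical purchase histories.
PURCHASES = {
    "asha": {"iphone", "charger", "case"},
    "ravi": {"iphone", "charger", "earphones"},
    "meera": {"laptop", "mouse"},
    "john": {"iphone", "earphones", "screen guard"},
}

if __name__ == "__main__":
    print(recommend("asha", PURCHASES))
```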
Use-case - Optimise team combination in Sports
◎ Choose the best-performing team with a limited budget
◎ It was first applied in baseball; now many professional sports use these techniques
◎ Choose a team consisting of players who could win at least enough games to make it to the play-offs
◎ Use data analysis techniques to find undervalued players
Use-case - Sports
What they achieved?
◎ An average of 90 wins each season for less than $30M
◎ The same number of wins as another team on one-third of its budget
◎ 20 more wins than another team with a similar budget
Other use-cases
◎ Fraud detection in banking and other sectors
◎ Fine-grained customer segmentation for targeted products
◎ Predicting the next product failure and sending a replacement part in advance
◎ Predict best candidates
Big Data Warehousing
Catch all that you can, so you can analyze it later
Why modernize Data Warehouse with Big Data?
A traditional Enterprise Data Warehouse (EDW)
◎ Can store only structured data
◎ Has an extremely expensive license cost per TB of storage
◎ Is capacity-constrained by ETL and query workloads
Big data will help to...
◎ Store unstructured and semi-structured data
◎ Combine your structured data with other sources
◎ Run interactive SQL queries on big data
◎ Offload ETL workload from your EDW
◎ Offload less frequently used data from your EDW
◎ Save licensing costs
Use-case - Modernizing Data Warehouse
◎ Low-cost storage for years of data
◎ Data lake for structured, unstructured and semi-structured data
◎ Interactive queries on historic data
◎ Online archival with reporting
  ○ Make years of data available
◎ ETL off-loading
  ○ Spark jobs to reduce ETL job time from hours to minutes
◎ Batch reports off-loading
  ○ Reduce load on your warehouse by off-loading batch reports
◎ Big Data Discovery
  ○ Proactively find patterns guided by the system
Other use-cases
But we are just a startup!
“Start small. Then scale.”
Next steps
Try, evaluate and adopt in a risk-free manner
◎ Identify sources of your unused data
  ○ like server logs
  ○ social streams
◎ Collect and store on cloud to minimize initial investment
◎ Many cloud options like Amazon EC2, Databricks, Altiscale...
◎ Use open-source analytics engines like Elasticsearch, Kibana. They are free to use.
◎ Experience the success
◎ Automate using sensors or IoT devices to add more sources of useful data
Start small and then scale
◎ https://aws.amazon.com/public-data-sets/
◎ https://data.gov.in/
◎ https://open-data.europa.eu/en/data/
◎ https://www.data.gov/
◎ https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
Some open datasets to play with
Woo-ha! I am feeling empowered!
Thanks!
Any questions?
Contact Us
@pranavshukla81
http://in.linkedin.com/in/[email protected]
@vishal1shukla2
https://in.linkedin.com/in/[email protected]