(ATD 9) Microsoft Big Data Platform

Big Data: Microsoft Has an Elephant in the Race Too
Luka Lovošević, Antonio Faletar
Microsoft Hrvatska

Upload: luka-lovosevic

Post on 20-Jun-2015


Category:

Technology



DESCRIPTION

Microsoft Big Data Platform, Cloud, Azure, Hadoop, HDInsight, Hive, Pig, Mahout

TRANSCRIPT

Page 1: (ATD 9) Microsoft Big Data Platform

Big Data: Microsoft Has an Elephant in the Race Too
Luka Lovošević, Antonio Faletar
Microsoft Hrvatska

• MICROSOFT HRVATSKA

Page 2: (ATD 9) Microsoft Big Data Platform

Contents
• Introduction to Big Data
• Overview of the MS platform
• Hadoop
• Demo

Page 3: (ATD 9) Microsoft Big Data Platform

What is Big Data?

Page 4: (ATD 9) Microsoft Big Data Platform

MICROSOFT CONFIDENTIAL – INTERNAL ONLY

Page 5: (ATD 9) Microsoft Big Data Platform

What is Big Data?
Data that matters to you, but that you cannot process with traditional tools.

VOLUME (quantity)

VARIETY (structure)

VELOCITY (speed, real-time)

Page 6: (ATD 9) Microsoft Big Data Platform

Data sources

• Logs
• Text
• Smart homes
• Sensors
• Time and location
• RFID
• Telemetry
• Social networks

Page 7: (ATD 9) Microsoft Big Data Platform

Big Data algorithms

• Social network analysis
• Similar items (e.g. in a web shop)
• Real-time analysis
• Frequent itemsets
• Web advertising
• Analysis of related terms
• Recommender systems
• Clustering (grouping)
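
To make the "frequent itemsets" item concrete, here is a minimal sketch that counts how often pairs of items occur together in shopping baskets and keeps the pairs that reach a minimum support. The basket data and the function names `countPairs` and `frequentPairs` are invented for illustration; the slides do not show an implementation, and a real Big Data job would distribute this counting across a cluster.

```javascript
// Count how often each pair of items appears in the same basket.
function countPairs(baskets) {
    var counts = {};
    baskets.forEach(function (basket) {
        // Deduplicate and sort so that (a, b) and (b, a) share one key.
        var items = basket.filter(function (item, idx) {
            return basket.indexOf(item) === idx;
        }).sort();
        for (var i = 0; i < items.length; i++) {
            for (var j = i + 1; j < items.length; j++) {
                var key = items[i] + "|" + items[j];
                counts[key] = (counts[key] || 0) + 1;
            }
        }
    });
    return counts;
}

// Keep only the pairs seen in at least minSupport baskets.
function frequentPairs(baskets, minSupport) {
    var counts = countPairs(baskets);
    return Object.keys(counts).filter(function (key) {
        return counts[key] >= minSupport;
    }).sort();
}

// Hypothetical sample data.
var baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "eggs"]
];
console.log(frequentPairs(baskets, 2));
```

With this sample data, the pairs that involve eggs occur only once and are filtered out, while the three pairs among bread, butter and milk remain.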


Page 8: (ATD 9) Microsoft Big Data Platform

Microsoft Big Data platform

Page 9: (ATD 9) Microsoft Big Data Platform

Microsoft Big Data platform

Hadoop – HDInsight

(Windows or Azure)

SQL Server 2012 Parallel Data Warehouse

SQL Server StreamInsight

Self-service BI tools

Page 10: (ATD 9) Microsoft Big Data Platform

A bit more about Hadoop

Page 11: (ATD 9) Microsoft Big Data Platform

What is Hadoop?
A platform for processing large volumes of data

Apache, open source

Based on Google's GFS and MapReduce

Highly scalable and distributed

Commodity hardware

[Timeline figure: 2004, 2006, 2008, 2010, 2012, 2013; labels: Apache project, Yahoo!, Enterprise Hadoop]

Page 12: (ATD 9) Microsoft Big Data Platform

Hadoop architecture

Page 13: (ATD 9) Microsoft Big Data Platform

[Diagram: four boxes labelled Node, plus the labels Data and MapReduce]

Page 14: (ATD 9) Microsoft Big Data Platform

// MapReduce word count in JavaScript

// Map: split each input line into words and emit (word, 1) for every word.
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

// Reduce: sum all the counts emitted for one word.
var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

[Diagram: four boxes labelled Node, plus the labels Program and MapReduce]

Page 15: (ATD 9) Microsoft Big Data Platform

MapReduce example
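
The slide's own example content is not preserved in the transcript, so the following is only a sketch of how the word-count functions from Page 14 could be exercised on a single machine. The driver `runLocally`, its in-memory "shuffle" step and the sample input are assumptions made for illustration; on a real cluster, Hadoop supplies the `context` and `values` objects and distributes the work across nodes.

```javascript
// Mapper and reducer in the same shape as the word-count slide.
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

// Hypothetical single-process driver: run map, group by key, run reduce.
function runLocally(lines) {
    var groups = {}; // key -> list of emitted values (the "shuffle" step)
    var mapContext = {
        write: function (k, v) {
            if (!Object.prototype.hasOwnProperty.call(groups, k)) {
                groups[k] = [];
            }
            groups[k].push(v);
        }
    };
    lines.forEach(function (line, offset) {
        map(offset, line, mapContext);
    });

    var result = {};
    var reduceContext = {
        write: function (k, v) { result[k] = v; }
    };
    Object.keys(groups).sort().forEach(function (k) {
        var vals = groups[k];
        var pos = 0;
        // Minimal iterator exposing the hasNext()/next() calls the reducer uses.
        var iterator = {
            hasNext: function () { return pos < vals.length; },
            next: function () { return vals[pos++]; }
        };
        reduce(k, iterator, reduceContext);
    });
    return result;
}

var counts = runLocally(["Big Data big elephant", "data, data everywhere"]);
console.log(counts);
```

The grouping step in the middle is what the framework does between the two phases: every value emitted under the same key ends up in one reducer call.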

Page 16: (ATD 9) Microsoft Big Data Platform

Tools for successful Hadooping

Page 17: (ATD 9) Microsoft Big Data Platform

Pig

Processing and shaping data

ETL tool

MapReduce

Page 18: (ATD 9) Microsoft Big Data Platform

Hive

Structuring data

SQL syntax

ODBC, Excel …

MapReduce
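
As a rough illustration of why SQL syntax can sit on top of MapReduce, the sketch below expresses a simple GROUP BY count as map, shuffle and reduce phases. The `visits` table, its columns and the three phase functions are hypothetical, and this is a conceptual model only, not a description of how Hive actually plans or executes queries.

```javascript
// Hypothetical rows of a "visits" table; the column names are invented.
var visits = [
    { user: "ana", page: "/home" },
    { user: "ivan", page: "/shop" },
    { user: "ana", page: "/shop" },
    { user: "marko", page: "/home" },
    { user: "ivan", page: "/shop" }
];

// Conceptual query:
//   SELECT page, COUNT(*) AS hits FROM visits GROUP BY page

// Map phase: emit the GROUP BY column as the key and 1 as the value.
function mapPhase(rows) {
    return rows.map(function (row) {
        return [row.page, 1];
    });
}

// Shuffle phase: collect all values that share a key.
function shufflePhase(pairs) {
    var groups = {};
    pairs.forEach(function (pair) {
        if (!Object.prototype.hasOwnProperty.call(groups, pair[0])) {
            groups[pair[0]] = [];
        }
        groups[pair[0]].push(pair[1]);
    });
    return groups;
}

// Reduce phase: COUNT(*) becomes the sum of the emitted ones per key.
function reducePhase(groups) {
    return Object.keys(groups).sort().map(function (key) {
        var hits = groups[key].reduce(function (a, b) { return a + b; }, 0);
        return { page: key, hits: hits };
    });
}

var hitsPerPage = reducePhase(shufflePhase(mapPhase(visits)));
console.log(hitsPerPage);
```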

Page 19: (ATD 9) Microsoft Big Data Platform

Mahout
A library of ready-made algorithms

Machine learning (e.g. clustering, recommendation, …)

MapReduce
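
To show what a clustering algorithm of the kind named above does, here is a small one-dimensional k-means sketch. It does not use Mahout or its API; the function `kMeans`, the starting centroids and the sample points are all assumptions chosen for illustration, and everything runs in a single process rather than as MapReduce jobs.

```javascript
// One-dimensional k-means: assign each point to its nearest centroid, then
// move each centroid to the mean of its points, until assignments stop changing.
function kMeans(points, centroids, maxIterations) {
    var current = centroids.slice();
    var assignment = [];
    for (var iter = 0; iter < maxIterations; iter++) {
        // Assignment step: index of the nearest centroid for every point.
        var next = points.map(function (p) {
            var best = 0;
            for (var c = 1; c < current.length; c++) {
                if (Math.abs(p - current[c]) < Math.abs(p - current[best])) {
                    best = c;
                }
            }
            return best;
        });
        // Update step: each centroid becomes the mean of its members.
        current = current.map(function (old, c) {
            var members = points.filter(function (_, i) {
                return next[i] === c;
            });
            if (members.length === 0) {
                return old; // an empty cluster keeps its previous centroid
            }
            var total = members.reduce(function (a, b) { return a + b; }, 0);
            return total / members.length;
        });
        var unchanged = next.every(function (c, i) {
            return c === assignment[i];
        });
        assignment = next;
        if (unchanged) {
            break;
        }
    }
    return { centroids: current, assignment: assignment };
}

// Hypothetical sample: two obvious groups, deliberately poor starting centroids.
var clusters = kMeans([1, 2, 3, 10, 11, 12], [0, 5], 20);
console.log(clusters.centroids, clusters.assignment);
```

On this input the centroids settle at 2 and 11 after a few iterations, with the first three points in one cluster and the last three in the other.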

Page 20: (ATD 9) Microsoft Big Data Platform

HDInsight

Hadoop

• Programming in .NET
• Security, HA & management
• Virtualization support
• Integration with Microsoft BI tools
• The same experience on-premises and in the cloud

Hadoop for Windows Server
Hadoop for Windows Azure

Page 21: (ATD 9) Microsoft Big Data Platform

Demo

Windows Azure HDInsight

Page 22: (ATD 9) Microsoft Big Data Platform

Hadoop 2.0

Hortonworks Stinger initiative

Tez (interactive) vs. batch

Streaming (Storm project), etc.

Page 23: (ATD 9) Microsoft Big Data Platform

Conclusion
• Big Data is a trend
• Hadoop is the de facto standard
• Windows Azure HDInsight
• Open source

Page 24: (ATD 9) Microsoft Big Data Platform

Questions?

Page 25: (ATD 9) Microsoft Big Data Platform

Thank you!