hybris – cloud - bigdata v1.0 19/11/2014 yassine mejri

Post on 21-Dec-2015

HYBRIS – CLOUD - BIGDATA

V1.0 19/11/2014 Yassine MEJRI

Agenda

HYBRIS-CLOUD-BIGDATA

Cloud Windows Azure

Deploying Hybris on Windows Azure

Elasticsearch

Kibana

Use cases : Analytics, Machine learning.

Cloud Computing

CLOUD

A standardised IT capability (services, software or infrastructure) delivered via internet technologies in a pay-per-use, self-service way

A style of computing where massively scalable IT-related capabilities are provided “as a service” using internet technologies to multiple external customers

Cloud services are shared services, under virtualised management, accessible over the internet

History

CLOUD

1960 : John McCarthy's concept. "Computation may someday be organized as a public utility."

1999 : Salesforce.com. "Pioneered the concept of delivering enterprise applications via a simple website."

2000 : Microsoft, 2001 : IBM. "Expanded the SaaS concept through web services."

2005 : Amazon. "Launch of Amazon Web Services."

2007 : Google and IBM. "Start researching cloud computing."

2008 : Gartner Research. "Start using cloud computing in many organizations."

Cloud computing providers

CLOUD

http://www.cloudscreener.com/

WINDOWS AZURE

CLOUD

WINDOWS AZURE LAYERS

WINDOWS AZURE

Cloud service model

WINDOWS AZURE

Geo-location Datacenter

WINDOWS AZURE

Building and running apps

WINDOWS AZURE

Windows Azure Blob Storage

Architecture

WINDOWS AZURE BLOB STORAGE

Azure Blob storage is a service for storing large amounts of unstructured data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS. Common uses of Blob storage include:

Serving images or documents directly to a browser

Storing files for distributed access

Streaming video and audio

Performing secure backup and disaster recovery

Java API

WINDOWS AZURE BLOB STORAGE

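Connection string :

// Classes come from the Azure Storage SDK for Java.
import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.*;

public static final String storageConnectionString =
    "DefaultEndpointsProtocol=http;" +
    "AccountName=your_storage_account;" +
    "AccountKey=your_storage_account_key";

Create container :

// Parse the connection string and obtain a client for the Blob service.
CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();

// Get a reference to the "images" container and create it if it does not exist yet.
CloudBlobContainer container = blobClient.getContainerReference("images");
container.createIfNotExists();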

Java API

WINDOWS AZURE BLOB STORAGE

Change permissions :

// Make the container and its blobs publicly readable.
BlobContainerPermissions containerPermissions = new BlobContainerPermissions();
containerPermissions.setPublicAccess(BlobContainerPublicAccessType.CONTAINER);
container.uploadPermissions(containerPermissions);

Upload blob :

final String filePath = "C:\\myimages\\myimage.jpg";
CloudBlockBlob blob = container.getBlockBlobReference("myimage.jpg");
File source = new File(filePath);
// Close the input stream once the upload has finished.
try (FileInputStream in = new FileInputStream(source)) {
    blob.upload(in, source.length());
}

Download blob :

// Download every blob of the container into a local folder.
for (ListBlobItem blobItem : container.listBlobs()) {
    if (blobItem instanceof CloudBlob) {
        CloudBlob blob = (CloudBlob) blobItem;
        try (FileOutputStream out = new FileOutputStream("C:\\mydownloads\\" + blob.getName())) {
            blob.download(out);
        }
    }
}

Tables NoSQL

WINDOWS AZURE BLOB STORAGE

Queue

WINDOWS AZURE BLOB STORAGE

Windows Azure Management Console

CLOUD

Azure SDK : Powershell, Node.js, Java …

CLOUD

Windows Azure SDK (PowerShell) :

Import-AzurePublishSettingsFile -PublishSettingsFile "full path to downloaded file"

New-AzureAffinityGroup -Name pslab-group -Location "East US"

New-AzureQuickVM -ImageName $VMImage -Windows -Name $myVMName -ServiceName $myVMName -AdminUsername $myAdminName -Password $myAdminPwd -AffinityGroup pslab-group

Stop-AzureVM -Name $myVMName -ServiceName $myVMName

Start-AzureVM -Name $myVMName -ServiceName $myVMName

Restart-AzureVM -Name $myVMName -ServiceName $myVMName

Deploy Hybris

HYBRIS

Use Case : Deploying Hybris on Windows Azure

Architecture : auto-scalable horizontal and vertical

DEPLOY HYBRIS

[Architecture diagram]

Azure Blob Storage : medias, files, attachments, orders.pdf…

Azure SQL Server

VIP : Windows Azure Load Balancer (Failover, Round Robin, Performance)

Cloud Service F.O : nodes N1, N2 … Ni

Cloud Service B.O : nodes N1, N2 … Ni

Other elements : CDN, HTTP/HTTPS

Azure cloud Extension

HYBRIS-CLOUD

Windows Azure Blob provides a simple web services interface that can be used to store and retrieve any amount of data. You can configure a specific MediaFolder to store the binary data of a Media item directly in Windows Azure Blob.

To configure your folder to use Windows Azure Blob you need to have:

Windows Azure account

Properly created Access Keys

For more details read http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/.

Azure cloud Extension

HYBRIS-CLOUD

https://wiki.hybris.com/display/release5/Using+Windows+Azure+Blob+Media+Storage+Strategy

1. Import the extension : azurecloud

2. Configure blob storage in local.properties :

Global settings :

media.globalSettings.accountKey=

media.globalSettings.accountName=

media.globalSettings.connection=UseDevelopmentStorage\=True

media.globalSettings.endPointProtocol=http

media.globalSettings.local.cache=true

media.globalSettings.public.base.url=http://127.0.0.1:10000/devstoreaccount1

media.globalSettings.secured=true

media.globalSettings.storage.strategy=windowsAzureBlobStorageStrategy

media.globalSettings.url.strategy=windowsAzureBlobURLStrategy

Azure cloud Extension

HYBRIS-CLOUD

3. How to create new blob storage folder :

……..

media.folder.invoices.accountKey=

media.folder.invoices.accountName=

media.folder.invoices.connection=UseDevelopmentStorage\=True

media.folder.invoices.endPointProtocol=http

media.folder.invoices.local.cache=true

media.folder.invoices.public.base.url=http://127.0.0.1:10000/devstoreaccount1

media.folder.invoices.secured=true

media.folder.invoices.storage.strategy=windowsAzureBlobStorageStrategy

media.folder.invoices.url.strategy=windowsAzureBlobURLStrategy

……..
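The per-folder keys above follow the pattern media.folder.<folderName>.<setting>. The following is a minimal Java sketch of that naming rule, assuming you want to generate the keys for a new folder. The class MediaFolderProperties and its method forFolder are hypothetical helpers written for this illustration; they are not part of hybris.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only: not part of the hybris API.
final class MediaFolderProperties {
    private MediaFolderProperties() {}

    /** Prefixes each setting with "media.folder.<folderName>." and keeps insertion order. */
    static Map<String, String> forFolder(String folderName, Map<String, String> settings) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (Map.Entry<String, String> setting : settings.entrySet()) {
            properties.put("media.folder." + folderName + "." + setting.getKey(), setting.getValue());
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("secured", "true");
        settings.put("storage.strategy", "windowsAzureBlobStorageStrategy");
        settings.put("url.strategy", "windowsAzureBlobURLStrategy");
        // Print the entries in local.properties form (key=value).
        forFolder("invoices", settings).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The generated lines still have to be copied into local.properties; the sketch only shows how the folder name becomes part of each key.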

Azure cloud Extension

HYBRIS-CLOUD

4. Storing Media Files :

final MediaModel media = modelService.create(MediaModel.class);
media.setCatalogVersion(catalogVersionService.getCatalogVersion("productCatalog", "Staged"));

// Assign the folder configured for Windows Azure Blob storage.
final MediaFolderModel folder = mediaService.getFolder("invoices");
media.setFolder(folder);

// Persist the item through the ModelService.
modelService.save(media);

Secure media access

HYBRIS-CLOUD

You can enable secure media access for a specific Media folder by setting the following property to true in your local.properties file:

media.folder.<folderName>.secured=true

Only secure URLs are then rendered for the Media items stored in that folder, and access to these medias is filtered only by the SecureMediaFilter.

Managing Permissions :

Use the MediaPermissionService

Using hMC

You can grant or deny access to a Media item for a given principal by opening the Media item and going to the Security tab.

Using ImpEx

The ImpEx import script below grants access to the Media item with code 1017895.jpg for the editor principal:

INSERT_UPDATE media; code[unique=true]; catalogVersion(catalog(id),version)[unique=true]; permittedPrincipals(uid)
;1017895.jpg; clothescatalog:Staged;editor;

Azure cloud Extension

HYBRIS-CLOUD

http://hybrisazure.blob.core.windows.net/hybris/sys_master/root/h3e/hd7/8796157378590.jpg

Initialize or Update Hybris :

Keep in mind that even if the name of a custom container is myContainer, a prefix with the tenant ID is added automatically, so the final container name is sys-master-myContainer. The pattern is sys-<tenantID>-<containerName>.

To control cleaning of the Windows Azure storage on a fresh initialization, use the following global property:

media.globalSettings.windowsAzureBlobStorageStrategy.cleanOnInit={true or false}
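The container naming rule sys-<tenantID>-<containerName> can be sketched in a few lines of Java. This is only an illustration of the pattern stated on this slide: the class ContainerNames and its method resolve are hypothetical names, and the real extension may apply further normalisation that is not described here.

```java
// Hypothetical helper for illustration only: not part of the hybris API.
final class ContainerNames {
    private ContainerNames() {}

    /** Builds the effective container name following sys-<tenantID>-<containerName>. */
    static String resolve(String tenantId, String containerName) {
        if (tenantId == null || tenantId.isEmpty()
                || containerName == null || containerName.isEmpty()) {
            throw new IllegalArgumentException("tenantId and containerName are required");
        }
        return "sys-" + tenantId + "-" + containerName;
    }

    public static void main(String[] args) {
        // Prints sys-master-myContainer, the example given on the slide.
        System.out.println(resolve("master", "myContainer"));
    }
}
```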

Azure Cloud Service ?

DEPLOY HYBRIS

AzureRunMe

DEPLOY HYBRIS

Packaging and Deploy Hybris

DEPLOY HYBRIS

Windows Azure Services are described by two important artifacts:

Service Definition (*.csdef)

Service Configuration (*.cscfg)

Your code is zipped and packaged with definition (*.cspkg)

Encrypted(Zipped(Code + *.csdef)) == *.cspkg

Windows Azure consumes just (*.cspkg + *.cscfg)

Devops : Azure PowerShell cmdlets

DEPLOY HYBRIS

# Import the Azure module
$env:PSModulePath = $env:PSModulePath + ";" + "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell"
Import-Module Azure

# Connection
Import-AzurePublishSettingsFile $pubsettings
Select-AzureSubscription -SubscriptionName $selectedsubscription
Set-AzureSubscription -CurrentStorageAccount $storageAccountName -SubscriptionName $selectedsubscription

# Create a new deployment
$opstat = New-AzureDeployment -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -Label $deploymentLabel -ServiceName $serviceName

# Upgrade an existing deployment
$setdeployment = Set-AzureDeployment -Upgrade -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -Label $deploymentLabel -ServiceName $serviceName -Force

# Swap the staging and production deployments
Move-AzureDeployment -ServiceName $serviceName

AzureRunMe

HYBRIS-CLOUD

Demo : AzureRunMe and Windows Azure Emulator

Elasticsearch

ELASTICSEARCH

https://github.com/elasticsearch

1. Java

2. Apache Lucene

3. Plug and play

4. Document oriented

5. Scalable

6. Clustering

7. Lucene

8. Sharding and replication

9. REST / JSON client

10. Apache 2 license

SQL VS ES

ELASTICSEARCH

Architecture

ELASTICSEARCH

[Cluster diagram]

Cluster : four Elasticsearch instances, Node 1 (master), Node 2, Node 3 and Node 4

a and b = indices, split into shards (a0, a1, b0, b1, b2, b3, b4)

Each shard appears as a primary shard and as a replicated shard on the nodes (replication)

Document to index

Route
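The routing step in the diagram decides which primary shard receives a document. Elasticsearch does this by hashing a routing value (the document ID by default) and taking the result modulo the number of primary shards. The Java sketch below illustrates the idea only: ShardRouter is a hypothetical class, and String.hashCode() is a stand-in assumption for the hash function Elasticsearch actually uses, so the shard numbers it prints will not match a real cluster.

```java
// Hypothetical illustration of shard routing: not Elasticsearch code.
final class ShardRouter {
    private ShardRouter() {}

    /**
     * Picks a primary shard as hash(routingKey) mod primaryShards.
     * String.hashCode() stands in for the real hash function.
     */
    static int shardFor(String routingKey, int primaryShards) {
        if (primaryShards <= 0) {
            throw new IllegalArgumentException("primaryShards must be positive");
        }
        // floorMod keeps the result in [0, primaryShards) even for negative hashes.
        return Math.floorMod(routingKey.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        int primaryShards = 5; // e.g. index b with shards b0 to b4
        for (String id : new String[] {"1", "33", "1017895"}) {
            System.out.println("document " + id + " -> shard " + shardFor(id, primaryShards));
        }
    }
}
```

Because the shard depends on the number of primary shards, that number cannot be changed after index creation without reindexing, whereas the number of replicas can.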

Mapping fields types

ELASTICSEARCH

Core types : String, Integer, Long , Double, Boolean, Date, Binary ….

IP type : "address" : { "type" : "ip", "store" : "yes" }

{

"name" : "Tom PC",

"address" : "192.168.2.123"

}

Geo point type : "location" : { "type" : "geo_point" }

Attachment type : "my_attachment" : { "type" : "attachment" }

Token count type : The token_count field type stores and indexes the number of tokens (words) in the given field instead of storing and indexing the text provided to the field.

"address_count" : { "type" : "token_count", "store" : "yes" }

Mapping fields types

ELASTICSEARCH

Object types : JSON documents are hierarchical in nature, allowing them to define inner "objects" within the actual JSON.

"tweet" : {
  "properties" : {
    "person" : {
      "type" : "object",
      "properties" : {
        "name" : {
          "type" : "object",
          "properties" : {
            "first_name" : {"type" : "string"},
            "last_name" : {"type" : "string"}
          }
        },
        "sid" : {"type" : "string", "index" : "not_analyzed"}
      }
    },
    "message" : {"type" : "string"}
  }
}

Mapping fields types

ELASTICSEARCH

Nested Types : The nested type works like the object type except that an array of objects is flattened, while an array of nested objects allows each object to be queried independently.

Mapping :

{
  "type1" : {
    "properties" : {
      "users" : {
        "type" : "nested",
        "properties" : {
          "first" : {"type" : "string"},
          "last" : {"type" : "string"}
        }
      }
    }
  }
}
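The difference between a flattened object array and a nested array can be shown without a cluster. The Java sketch below is an in-memory illustration under stated assumptions: NestedVsObject and its methods are hypothetical, and the two sample users (John Smith, Alice White) are illustrative values that do not come from the slides. It checks the query "first = John AND last = White" both ways.

```java
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration: not Elasticsearch code.
final class NestedVsObject {
    private NestedVsObject() {}

    /** Object type: fields are flattened, so each condition may match a different user. */
    static boolean matchesFlattened(List<Map<String, String>> users, String first, String last) {
        boolean anyFirst = users.stream().anyMatch(u -> first.equals(u.get("first")));
        boolean anyLast = users.stream().anyMatch(u -> last.equals(u.get("last")));
        return anyFirst && anyLast;
    }

    /** Nested type: both conditions must hold inside the same user object. */
    static boolean matchesNested(List<Map<String, String>> users, String first, String last) {
        return users.stream()
                .anyMatch(u -> first.equals(u.get("first")) && last.equals(u.get("last")));
    }

    /** Illustrative sample data (an assumption, not from the source). */
    static List<Map<String, String>> sampleUsers() {
        return List.of(
                Map.of("first", "John", "last", "Smith"),
                Map.of("first", "Alice", "last", "White"));
    }

    public static void main(String[] args) {
        List<Map<String, String>> users = sampleUsers();
        System.out.println("flattened: " + matchesFlattened(users, "John", "White")); // true
        System.out.println("nested:    " + matchesNested(users, "John", "White"));    // false
    }
}
```

The flattened check reports a match although no single user is called John White, which is the false positive the nested type avoids.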

Mapping fields types

ELASTICSEARCH

Array types : JSON documents allow you to define an array (list) of fields or objects.

"Product" : [
  {
    "id" : 12,
    "title" : "iphone",
    "categories" : [1,3,5,7],
    "tag" : ["iphone4", "iphone5","iphone6"],
    "author" : [
      {
        "firstname" : "Francois",
        "lastname" : "francoisg",
        "id" : 18
      },
      {
        "firstname" : "Gregory",
        "lastname" : "gregquat",
        "id" : "2"
      }
    ]
  }
]

Relational vs denormalized

ELASTICSEARCH

Elasticsearch : CRUD

ELASTICSEARCH

Insert data :

$ cat data.json
{ "index" : { "_index" : "requests" , "_type" : "request" , "_id" : 33 } }
{ "client" : "client1" , "country" : "FR" , "id" : 1, "ip" : "100.1.1.3", "password" : "test" , "sensor" : "test" , "session" : "EFRFR34344" , "success" : "OK" ,"timestamp" : "1414183085848", "username" : "test" }

# data.json is in bulk format (action line, then document line), so it is sent to the _bulk endpoint
$ curl -XPOST http://localhost:9200/_bulk --data-binary @data.json

Update :

$ curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
  "doc" : {
    "name" : "new_name"
  }
}'

Delete :

$ curl -XDELETE 'http://localhost:9200/twitter/tweet/1'

Query DSL

ELASTICSEARCH

$ curl -XGET 'http://localhost:9200/_search?q=<YOUR_QUERY>'

Type        Example

Terms       Apple iphone

Phrases     "apple iphone"

Proximity   "apple iphone"~5

Fuzzy       Apple~5

Wildcards   App*

Boosting    Apple^10 safari

Range       [2014/05/01 TO 2014/05/30]

Boolean     apple AND NOT iphone

Fields      Title:iphone^5
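Query strings like the ones in the table contain spaces and characters such as : and ^, so they need URL encoding before they are placed in the q parameter of a search URL. A minimal Java sketch, assuming a node on http://localhost:9200 as in the curl examples; the class QueryStringSearch and its method searchUrl are hypothetical helpers.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class QueryStringSearch {
    private QueryStringSearch() {}

    /** Builds a URI search URL with the query string URL-encoded into the q parameter. */
    static String searchUrl(String baseUrl, String queryString) {
        return baseUrl + "/_search?q=" + URLEncoder.encode(queryString, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Spaces become '+', ':' becomes %3A and '^' becomes %5E.
        System.out.println(searchUrl("http://localhost:9200", "apple AND NOT iphone"));
        System.out.println(searchUrl("http://localhost:9200", "title:iphone^5"));
    }
}
```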

Query DSL

ELASTICSEARCH

MULTI SEARCH API

ELASTICSEARCH

SearchRequestBuilder requestOne = node.client()
    .prepareSearch().setQuery(QueryBuilders.matchQuery("name", "test1")).setSize(1);

SearchRequestBuilder requestTwo = node.client()
    .prepareSearch().setQuery(QueryBuilders.matchQuery("name", "test2")).setSize(1);

// Send both searches in a single round trip.
MultiSearchResponse sr = node.client().prepareMultiSearch()
    .add(requestOne)
    .add(requestTwo)
    .execute().actionGet();

// You will get all individual responses from MultiSearchResponse#getResponses()
long nbHits = 0;
for (MultiSearchResponse.Item item : sr.getResponses()) {
    SearchResponse response = item.getResponse();
    nbHits += response.getHits().getTotalHits();
}

Bulk API

ELASTICSEARCH

The bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase the indexing speed.

Example

$ cat requests

{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }

{ "field1" : "value1" }

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests; echo

{"took":7,"items":[{"create":{"_index":"test","_type":"type1","_id":"1","_version":1}}]}
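The request file above is newline-delimited JSON: one action line followed by one document line, each terminated by a newline. A minimal Java sketch of building such a body, assuming the documents are already valid single-line JSON and that index, type and ID need no escaping; BulkBody and its methods are hypothetical helpers, not part of the Elasticsearch client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only: no JSON escaping is performed.
final class BulkBody {
    private BulkBody() {}

    /** Builds the action line of an index operation. */
    static String indexAction(String index, String type, String id) {
        return "{ \"index\" : { \"_index\" : \"" + index + "\", \"_type\" : \"" + type
                + "\", \"_id\" : \"" + id + "\" } }";
    }

    /** Builds a bulk body: for each document, an action line and a source line, newline-terminated. */
    static String build(String index, String type, Map<String, String> sourcesById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : sourcesById.entrySet()) {
            body.append(indexAction(index, type, entry.getKey())).append('\n');
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> sources = new LinkedHashMap<>();
        sources.put("1", "{ \"field1\" : \"value1\" }");
        sources.put("2", "{ \"field1\" : \"value2\" }");
        System.out.print(build("test", "type1", sources));
    }
}
```

The output can be saved to a file and posted with the curl command shown above, which uses --data-binary so that the newlines are preserved.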

Aggregations

ELASTICSEARCH

The following snippet captures the basic structure of aggregations:

"aggregations" : {

"<aggregation_name>" : {

"<aggregation_type>" : {

<aggregation_body>

}

[,"aggregations" : { [<sub_aggregation>]+ } ]?

}

[,"<aggregation_name_2>" : { ... } ]*

}
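To make the <aggregation_type> slot more concrete, the Java sketch below computes in memory what two common types return: a terms aggregation (one bucket per distinct value with its document count) and a range aggregation (count of values in a bucket whose lower bound is included and upper bound excluded). AggregationSketch is a hypothetical class and the sample values are illustrative; Elasticsearch computes these on the cluster, not in client code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration: not Elasticsearch code.
final class AggregationSketch {
    private AggregationSketch() {}

    /** Terms aggregation: document count per distinct value, sorted by term. */
    static Map<String, Long> terms(List<String> values) {
        return values.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    /** Range aggregation bucket: counts values with from <= value < to. */
    static long rangeCount(List<Double> values, double from, double to) {
        return values.stream().filter(v -> v >= from && v < to).count();
    }

    public static void main(String[] args) {
        List<String> brands = List.of("diesel", "levis", "diesel");
        List<Double> prices = List.of(4.0, 5.0, 9.99, 10.0);
        System.out.println("terms on brand: " + terms(brands));
        System.out.println("range [5, 10): " + rangeCount(prices, 5, 10));
    }
}
```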

Aggregations : min, max, avg

ELASTICSEARCH

Aggregations : The terms aggregation

ELASTICSEARCH

Aggregations : The range aggregation

ELASTICSEARCH

Aggregations : Histogram aggregation

ELASTICSEARCH

Facet

ELASTICSEARCH

Use case : Faceting using Elasticsearch Aggregations

ELASTICSEARCH

Facets Component - Facets help users narrow down or filter a search result. A facet is built from the search context.

Sort Order - The sort order defines the order in which the results are listed on the page. For instance, a user may sort from lowest to highest price or by product rating.

Pagination of Results - The pagination component lets a user navigate back and forth through the search results. It also determines the number of records that the ES query should return.

Search Result - Restricted to the number of records that should be displayed on the landing page. This can be made configurable based on your application needs.
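The pagination component maps directly onto the from and size fields of a search request. A minimal Java sketch, assuming 1-based page numbers; Pagination and its methods are hypothetical helpers.

```java
// Hypothetical helper for illustration only.
final class Pagination {
    private Pagination() {}

    /** Offset of the first hit for a 1-based page number. */
    static int fromOffset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize;
    }

    /** The "from"/"size" fragment to place in the search request body. */
    static String requestFragment(int page, int pageSize) {
        return "\"from\" : " + fromOffset(page, pageSize) + ", \"size\" : " + pageSize;
    }

    public static void main(String[] args) {
        System.out.println(requestFragment(1, 5)); // "from" : 0, "size" : 5
        System.out.println(requestFragment(3, 5)); // "from" : 10, "size" : 5
    }
}
```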

Use case : Faceting using Elasticsearch Aggregations

ELASTICSEARCH

# Get search results and facets for men's category, filter by facet selection Brand = "diesel"

$ curl -XGET 'http://localhost:9200/products/_search?pretty=true' -d '{

"from" : 0, "size" : 5,

"query": {"filtered": {"filter": {"term": {"Brand": "diesel"}}, "query": { "term" : { "categories" : "men" }}}},

"aggs" : { "offerprice" : { "range" : { "field" : "offerprice", "keyed" : true, "ranges" : [ { "to" : 5 }, { "from" : 5, "to" : 10}, {

"from" : 10, "to" : 20}, { "from" : 20, "to" : 30} ] }

},

"size" : {"terms" : {"field" : "size","order": { "_count" : "asc" }}},

"Deals" : {"terms" : {"field" : "offers","order": { "_count" : "asc" }}},

"Brand" : {"terms" : {"field" : "Brand","order": { "_count" : "desc" }}}

},

"sort" : [{"offerprice" : {"order" : "asc", "mode" : "avg", "ignore_unmapped":true, "missing":"_last"}},"_score"]

}'

Use case : Faceting using Elasticsearch Aggregations

ELASTICSEARCH

# Get search results and facets for men's category, filter by facet selection Brand = "diesel" and Size "small"

curl -XGET 'http://localhost:9200/products/_search?pretty=true' -d '{

"from" : 0, "size" : 5,

"query": {"filtered": {"filter": {

"and": [{"term": {"Brand":"diesel" }},{"term": {"size":"small"}}]}, "query": { "term" : { "categories" : "men" }}}},

"aggs" : {

"offerprice" : {

"range" : {"field" : "offerprice","keyed" : true,"ranges" : [{ "to" : 5 },{ "from" : 5, "to" : 10},{ "from" : 10, "to" : 20},

{ "from" : 20, "to" : 30}]}

},

"size" : {"terms" : {"field" : "size","order": { "_count" : "asc" }}},

"Deals" : {"terms" : {"field" : "offers","order": { "_count" : "asc" }}},

"Brand" : {"terms" : {"field" : "Brand","order": { "_count" : "desc" }}}

},

"sort" : [{"offerprice" : {"order" : "asc", "mode" : "avg", "ignore_unmapped":true, "missing":"_last"}},"_score"]

}'

ES Client

ELASTICSEARCH

Python : https://github.com/elasticsearch/elasticsearch-dsl-py

Ruby : https://github.com/printercu/elastics-rb

Javascript : https://github.com/fullscale/elastic.js

Scala : https://github.com/sksamuel/elastic4s

Clojure : https://github.com/clojurewerkz/elastisch

Nodejs : https://github.com/phillro/node-elasticsearch-client

Spring-data-elasticsearch : https://github.com/spring-projects/spring-data-elasticsearch

Introduction

KIBANA

https://github.com/elasticsearch/kibana

Angular JS

Responsive Design with Bootstrap

Nodejs Platform

Open-source

Dashboard KIBANA

KIBANA

Queries and filters KIBANA:

KIBANA

Query :

Filtering :

Panels : bettermap

KIBANA

Bettermap panel : Requires the field that contains the coordinates, in GeoJSON format. GeoJSON is [longitude, latitude] in an array. This is different from most implementations, which use latitude, longitude.
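Because the coordinate order is easy to get wrong, a small conversion helper can make it explicit. A minimal Java sketch; GeoJsonPoint is a hypothetical class and the Paris coordinates are illustrative values.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class GeoJsonPoint {
    private GeoJsonPoint() {}

    /** Converts a (latitude, longitude) pair into GeoJSON order: [longitude, latitude]. */
    static double[] toGeoJson(double latitude, double longitude) {
        return new double[] { longitude, latitude };
    }

    public static void main(String[] args) {
        // Paris: latitude 48.8566, longitude 2.3522 (illustrative values).
        System.out.println(Arrays.toString(toGeoJson(48.8566, 2.3522)));
    }
}
```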

Term panel

KIBANA

A table, bar chart or pie chart based on the results of an Elasticsearch terms facet.

Histogram panel

KIBANA

The histogram panel allows for the display of time charts. It includes several modes and transformations to display event counts, mean, min, max and total of numeric fields, and derivatives of counter fields.

Map

KIBANA

The map panel translates two-letter country or state codes into shaded regions on a map. Currently available maps are world, usa and europe.

Table

KIBANA

The table panel contains a sortable, pageable view of documents. It can be arranged into defined columns and offers several interactions, such as performing ad hoc terms aggregations.

Text

KIBANA

The text panel is used for displaying static text formatted as markdown, sanitized HTML or plain text.

Trends

KIBANA

A stock-ticker style representation of how queries are moving over time. For example, if the time is 1:10pm, your time picker was set to "Last 10m", and the "Time Ago" parameter was set to "1h", the panel would show how much the query results have changed since 12:00-12:10pm.
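The comparison window in this example is the time-picker window shifted back by the "Time Ago" value. A minimal Java sketch of that arithmetic; TrendsWindow is a hypothetical class and is not Kibana code.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical illustration of the window arithmetic: not Kibana code.
final class TrendsWindow {
    private TrendsWindow() {}

    /** Returns {start, end} of the time-picker window shifted back by timeAgo. */
    static LocalTime[] comparisonWindow(LocalTime now, Duration picker, Duration timeAgo) {
        LocalTime end = now.minus(timeAgo);
        LocalTime start = end.minus(picker);
        return new LocalTime[] { start, end };
    }

    public static void main(String[] args) {
        // The example from the text: 1:10pm, "Last 10m", "Time Ago" of 1h.
        LocalTime[] window = comparisonWindow(
                LocalTime.of(13, 10), Duration.ofMinutes(10), Duration.ofHours(1));
        System.out.println(window[0] + " - " + window[1]); // 12:00 - 12:10
    }
}
```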

Analytics Dashboard

USE CASE 1 : MARKETING

DEMO

Example : Recommender Engine

MAKE SENSE OF YOUR DATA

[Architecture diagram]

IN : {HTTP requests}

Hybris Cluster : Node 0 … Node N

Requests Collector : {insert data}

Data store : Elasticsearch, MongoDB or Apache Solr

{Data analysis} : Apache Mahout, Apache Spark MLlib or H2O…

OUT : {Recommendations}

Example : Amazon

MAKE SENSE OF YOUR DATA

Example : Zalando

MAKE SENSE OF YOUR DATA

Example : NetFlix

MAKE SENSE OF YOUR DATA

Q&A
