hybris – cloud - bigdata v1.0 19/11/2014 yassine mejri
Post on 21-Dec-2015
HYBRIS – CLOUD - BIGDATA
V1.0 19/11/2014 Yassine MEJRI
Agenda
HYBRIS-CLOUD-BIGDATA
Cloud Windows Azure
Deploying Hybris on Windows Azure
Elasticsearch
Kibana
Use cases : Analytics, Machine learning.
Cloud Computing
CLOUD
A standardised IT capability (services, software or infrastructure) delivered via internet technologies in a pay-per-use, self-service way
A style of computing where massively scalable IT-related capabilities are provided “as a service” using internet technologies to multiple external customers
Cloud services are shared services, under virtualised management, accessible over the internet
History
CLOUD
1960 : John McCarthy’s concept. “Computation may someday be organized as a public utility.”
1999 : Salesforce.com. “Pioneered the concept of delivering enterprise applications via a simple website”
2000 : Microsoft. “Expanded the SaaS concept through web services”
2001 : IBM
2005 : Amazon. “Launch of Amazon Web Services”
2007 : Google and IBM. “Start researching cloud computing”
2008 : Gartner Research. “Start using cloud computing in many organizations”
Cloud computing providers
CLOUD
http://www.cloudscreener.com/
WINDOWS AZURE
CLOUD
WINDOWS AZURE LAYERS
WINDOWS AZURE
Cloud service model
WINDOWS AZURE
Geo-location Datacenter
WINDOWS AZURE
Building and running apps
WINDOWS AZURE
Windows Azure Blob Storage
Architecture
WINDOWS AZURE BLOB STORAGE
Azure Blob storage is a service for storing large amounts of unstructured data, such as text or binary data, that can
be accessed from anywhere in the world via HTTP or HTTPS. Common uses of Blob storage include:
Serving images or documents directly to a browser
Storing files for distributed access
Streaming video and audio
Performing secure backup and disaster recovery
Java API
WINDOWS AZURE BLOB STORAGE
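Connection string :
// Assumes the legacy Azure Storage SDK for Java (com.microsoft.azure.storage packages)
import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.*;
public static final String storageConnectionString = "DefaultEndpointsProtocol=http;" +
    "AccountName=your_storage_account;" + "AccountKey=your_storage_account_key";
Create container :
// Parse the connection string, then create the "images" container if it does not exist yet
CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
CloudBlobContainer container = blobClient.getContainerReference("images");
container.createIfNotExists();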
Java API
WINDOWS AZURE BLOB STORAGE
Change permissions :
BlobContainerPermissions containerPermissions = new BlobContainerPermissions();
containerPermissions.setPublicAccess(BlobContainerPublicAccessType.CONTAINER);
container.uploadPermissions(containerPermissions);
Upload blob :
final String filePath = "C:\\myimages\\myimage.jpg";
CloudBlockBlob blob = container.getBlockBlobReference("myimage.jpg");
File source = new File(filePath);
blob.upload(new FileInputStream(source), source.length());
Download blob :
for (ListBlobItem blobItem : container.listBlobs()) {
    if (blobItem instanceof CloudBlob) {
        CloudBlob blob = (CloudBlob) blobItem;
        blob.download(new FileOutputStream("C:\\mydownloads\\" + blob.getName()));
    }
}
Tables NoSQL
WINDOWS AZURE BLOB STORAGE
Queue
WINDOWS AZURE BLOB STORAGE
Windows Azure Management Console
CLOUD
Azure SDK : Powershell, Node.js, Java …
CLOUD
Windows Azure SDK (PowerShell cmdlets) :
Import-AzurePublishSettingsFile -PublishSettingsFile "full path to downloaded file"
New-AzureAffinityGroup -Name pslab-group -Location "East US"
New-AzureQuickVM -ImageName $VMImage -Windows -Name $myVMName -ServiceName $myVMName `
    -AdminUsername $myAdminName -Password $myAdminPwd `
    -AffinityGroup pslab-group
Stop-AzureVM -Name $myVMName -ServiceName $myVMName
Start-AzureVM -Name $myVMName -ServiceName $myVMName
Restart-AzureVM -Name $myVMName -ServiceName $myVMName
Deploy Hybris
HYBRIS
Use Case : Deploying Hybris on Windows Azure
Architecture : horizontally and vertically auto-scalable
DEPLOY HYBRIS
[Diagram: deployment architecture]
CDN, HTTP/HTTPS
VIP : Windows Azure Load Balancer (Failover, Round Robin, Performance)
Cloud Service F.O : nodes N1, N2, … Ni
Cloud Service B.O : nodes N1, N2, … Ni
Azure SQL Server
Azure Blob Storage : media, files, attachments, orders.pdf…
Azure cloud Extension
HYBRIS-CLOUD
Windows Azure Blob provides a simple web services interface that can be used to store and retrieve any
amount of data. You can configure a specific MediaFolder to store binary data of a Media item directly in Windows
Azure Blob.
To configure your folder to use Windows Azure Blob you need to have:
Windows Azure account
Properly created Access Keys
For more details read http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/.
Azure cloud Extension
HYBRIS-CLOUD
https://wiki.hybris.com/display/release5/Using+Windows+Azure+Blob+Media+Storage+Strategy
1. Import the extension : azurecloud
2. Configure blob storage in local.properties :
Global settings :
media.globalSettings.accountKey=
media.globalSettings.accountName=
media.globalSettings.connection=UseDevelopmentStorage\=True
media.globalSettings.endPointProtocol=http
media.globalSettings.local.cache=true
media.globalSettings.public.base.url=http://127.0.0.1:10000/devstoreaccount1
media.globalSettings.secured=true
media.globalSettings.storage.strategy=windowsAzureBlobStorageStrategy
media.globalSettings.url.strategy=windowsAzureBlobURLStrategy
Azure cloud Extension
HYBRIS-CLOUD
3. Create a new blob storage folder :
……..
media.folder.invoices.accountKey=
media.folder.invoices.accountName=
media.folder.invoices.connection=UseDevelopmentStorage\=True
media.folder.invoices.endPointProtocol=http
media.folder.invoices.local.cache=true
media.folder.invoices.public.base.url=http://127.0.0.1:10000/devstoreaccount1
media.folder.invoices.secured=true
media.folder.invoices.storage.strategy=windowsAzureBlobStorageStrategy
media.folder.invoices.url.strategy=windowsAzureBlobURLStrategy
……..
Azure cloud Extension
HYBRIS-CLOUD
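4. Storing Media Files :
// Create the media item in the Staged version of productCatalog
final MediaModel media = modelService.create(MediaModel.class);
media.setCatalogVersion(catalogVersionService.getCatalogVersion("productCatalog", "Staged"));
// Assign it to the "invoices" folder, which is configured for Azure Blob storage
final MediaFolderModel folder = mediaService.getFolder("invoices");
media.setFolder(folder);
// Persist the item through the ModelService
modelService.save(media);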
Secure media access
HYBRIS-CLOUD
You can enable secure media access for a specific Media folder by setting the following property to true in your
local.properties file: media.folder.<folderName>.secured=true
This means that only secure URLs are rendered for the Media items stored in that folder, and that access to these
media is filtered exclusively by the SecureMediaFilter.
Managing Permissions :
Use the MediaPermissionService
Using hMC
You can grant or deny access to a Media item for a given principal by opening the Media item and going
to the Security tab.
Using ImpEx
Below is an example ImpEx import script that grants the editor principal access to the Media item with
code 1017895.jpg:
INSERT_UPDATE media; code[unique=true]; catalogVersion(catalog(id),version)[unique=true]; permittedPrincipals(uid)
;1017895.jpg; clothescatalog:Staged; editor
Azure cloud Extension
HYBRIS-CLOUD
http://hybrisazure.blob.core.windows.net/hybris/sys_master/root/h3e/hd7/8796157378590.jpg
Initialize or Update Hybris :
Keep in mind that the tenant ID is automatically added as a prefix to a custom container name: a container named
myContainer ends up as sys-master-myContainer. The pattern is sys-<tenantID>-<containerName>.
To control whether Windows Azure storage is cleaned on a fresh initialization, use the following global property:
media.globalSettings.windowsAzureBlobStorageStrategy.cleanOnInit={true or false}
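The naming pattern sys-<tenantID>-<containerName> can be illustrated with a small Java helper. This is only a sketch of the string pattern stated on the slide: the class and method names are hypothetical, it is not part of the hybris azurecloud extension, and any further normalisation the extension may apply is not modelled.

```java
final class AzureContainerNames {
    private AzureContainerNames() {}

    /** Builds "sys-<tenantID>-<containerName>", the pattern given on the slide. */
    static String effectiveContainerName(String tenantId, String containerName) {
        if (tenantId == null || tenantId.isEmpty()
                || containerName == null || containerName.isEmpty()) {
            throw new IllegalArgumentException("tenantId and containerName must be non-empty");
        }
        return "sys-" + tenantId + "-" + containerName;
    }

    public static void main(String[] args) {
        // Example from the slide: custom container "myContainer" on the master tenant
        System.out.println(effectiveContainerName("master", "myContainer"));
    }
}
```

Knowing the effective name is mostly useful when looking for the container in the Azure management console or a storage explorer.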
Azure Cloud Service ?
DEPLOY HYBRIS
[Diagram: deployment architecture]
CDN, HTTP/HTTPS
VIP : Windows Azure Load Balancer (Failover, Round Robin, Performance)
Cloud Service F.O : nodes N1, N2, … Ni
Cloud Service B.O : nodes N1, N2, … Ni
Azure SQL Server
Azure Blob Storage : media, files, attachments, orders.pdf…
AzureRunMe
DEPLOY HYBRIS
Packaging and Deploy Hybris
DEPLOY HYBRIS
Windows Azure Services are described by two important artifacts:
Service Definition (*.csdef)
Service Configuration (*.cscfg)
Your code is zipped and packaged with definition (*.cspkg)
Encrypted(Zipped(Code + *.csdef)) == *.cspkg
Windows Azure consumes just (*.cspkg + *.cscfg)
Devops : Azure PowerShell cmdlets
DEPLOY HYBRIS
# Import the Azure module
$env:PSModulePath = $env:PSModulePath + ";" + "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell"
Import-Module Azure
# Connection
Import-AzurePublishSettingsFile $pubsettings
Select-AzureSubscription -SubscriptionName $selectedsubscription
Set-AzureSubscription -CurrentStorageAccount $storageAccountName -SubscriptionName $selectedsubscription
# Create a new deployment
$opstat = New-AzureDeployment -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -label $deploymentLabel -ServiceName $serviceName
# Upgrade an existing deployment
$setdeployment = Set-AzureDeployment -Upgrade -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -label $deploymentLabel -ServiceName $serviceName -Force
# Swap the staging and production deployments
Move-AzureDeployment -ServiceName $serviceName
AzureRunMe
HYBRIS-CLOUD
Demo : AzureRunMe and Windows Azure Emulator
Elasticsearch
ELASTICSEARCH
https://github.com/elasticsearch
1. Java
2. Apache Lucene
3. Plug and play
4. Document oriented
5. Scalable
6. Clustering
7. Lucene
8. Sharding and replication
9. REST / JSON client
10. Apache 2 license
SQL VS ES
ELASTICSEARCH
Architecture
ELASTICSEARCH
[Diagram: Elasticsearch cluster]
Cluster of four Elasticsearch instances : Node 1 (Master), Node 2, Node 3, Node 4
a and b = indices; their shards (a0, a1, b0, b1, b2, b3, b4) are distributed across the nodes
Legend : primary shard, replica shard
Flow : document to index, routing, replication
Mapping fields types
ELASTICSEARCH
Core types : String, Integer, Long , Double, Boolean, Date, Binary ….
IP type : "address" : { "type" : "ip", "store" : "yes" }
{
"name" : "Tom PC",
"address" : "192.168.2.123"
}
Geo point type : "location" : { "type" : "geo_point"}
Attachment type : "my_attachment" : { "type" : "attachment" }
Token count type : The token_count field type allows us to store index information about how many words the given
field has instead of storing and indexing the text provided to the field.
"address_count" : { "type" : "token_count", "store" : "yes" }
Mapping fields types
ELASTICSEARCH
Object types : JSON documents are hierarchical in nature, allowing them to define inner "objects" within the actual
JSON.
"tweet" : {
  "properties" : {
    "person" : {
      "type" : "object",
      "properties" : {
        "name" : {
          "type" : "object",
          "properties" : {
            "first_name" : {"type" : "string"},
            "last_name" : {"type" : "string"}
          }
        },
        "sid" : {"type" : "string", "index" : "not_analyzed"}
      }
    },
    "message" : {"type" : "string"}
  }
}
Mapping fields types
ELASTICSEARCH
Nested Types : The nested type works like the object type, except that an array of objects is flattened, while an array of
nested objects allows each object to be queried independently.
Mapping :
{
  "type1" : {
    "properties" : {
      "users" : {
        "type" : "nested",
        "properties": {
          "first" : {"type": "string" },
          "last" : {"type": "string" }
        }
      }
    }
  }
}
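The difference between a flattened array of objects and nested objects can be sketched in plain Java, without any Elasticsearch client. The sample users (John Smith, Alice White) and all class and method names below are illustrative assumptions; the code only mimics the matching semantics, not how Elasticsearch stores documents.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedVsObject {
    /** One element of the "users" array from the mapping. */
    static final class User {
        final String first;
        final String last;
        User(String first, String last) { this.first = first; this.last = last; }
    }

    /** Object-type semantics: the array is flattened into one value list per field, so pairing is lost. */
    static boolean matchesFlattened(List<User> users, String first, String last) {
        List<String> firsts = new ArrayList<>();
        List<String> lasts = new ArrayList<>();
        for (User u : users) {
            firsts.add(u.first);
            lasts.add(u.last);
        }
        return firsts.contains(first) && lasts.contains(last);
    }

    /** Nested-type semantics: both conditions must hold on the same array element. */
    static boolean matchesNested(List<User> users, String first, String last) {
        for (User u : users) {
            if (u.first.equals(first) && u.last.equals(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<User> users = new ArrayList<>();
        users.add(new User("John", "Smith"));
        users.add(new User("Alice", "White"));
        // "Alice Smith" does not exist, yet the flattened view still matches
        System.out.println(matchesFlattened(users, "Alice", "Smith")); // true
        System.out.println(matchesNested(users, "Alice", "Smith"));    // false
    }
}
```

The false positive in the flattened case is the reason to choose the nested type when the fields of each array element must be matched together.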
Mapping fields types
ELASTICSEARCH
Array types : JSON documents can define an array (list) of fields or objects.
"Product" : [
  {
    "id" : 12,
    "title" : "iphone",
    "categories" : [1,3,5,7],
    "tag" : ["iphone4", "iphone5","iphone6"],
    "author" : [
      {
        "firstname" : "Francois",
        "lastname": "francoisg",
        "id" : 18
      },
      {
        "firstname" : "Gregory",
        "lastname" : "gregquat",
        "id" : 2
      }
    ]
  }
]
Relational vs denormalized
ELASTICSEARCH
Elasticsearch : CRUD
ELASTICSEARCH
Insert Data:
$ cat data.json
{ "index" : { "_index" : "requests" , "_type" : "request" , "_id" : 33 } }
{ "client" : "client1" , "country" : "FR" , "id" : 1, "ip" : "100.1.1.3", "password" : "test" , "sensor" : "test" , "session" : "EFRFR34344" , "success" : "OK" ,"timestamp" : "1414183085848", "username" : "test" }
$ curl -XPOST http://localhost:9200/_bulk --data-binary @data.json
Update :
$ curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
  "doc" : {
    "name" : "new_name"
  }
}'
Delete :
$ curl -XDELETE 'http://localhost:9200/twitter/tweet/1'
Query DSL
ELASTICSEARCH
$ curl -XPOST 'http://localhost:9200/_search?q=<YOUR_QUERY>'
Type        Example
Terms       Apple iphone
Phrases     "apple iphone"
Proximity   "apple iphone"~5
Fuzzy       Apple~5
Wildcards   App*
Boosting    Apple^10 safari
Range       [2014/05/01 TO 2014/05/30]
Boolean     apple AND NOT iphone
Fields      Title:iphone^5
Query DSL
ELASTICSEARCH
MULTI SEARCH API
ELASTICSEARCH
SearchRequestBuilder requestOne = node.client()
    .prepareSearch().setQuery(QueryBuilders.matchQuery("name", "test1")).setSize(1);
SearchRequestBuilder requestTwo = node.client()
    .prepareSearch().setQuery(QueryBuilders.matchQuery("name", "test2")).setSize(1);
MultiSearchResponse sr = node.client().prepareMultiSearch()
    .add(requestOne)
    .add(requestTwo)
    .execute().actionGet();
// You will get all individual responses from MultiSearchResponse#getResponses()
long nbHits = 0;
for (MultiSearchResponse.Item item : sr.getResponses()) {
    SearchResponse response = item.getResponse();
    nbHits += response.getHits().getTotalHits();
}
Bulk API
ELASTICSEARCH
The bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase
the indexing speed.
Example
$ cat requests
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests; echo
{"took":7,"items":[{"create":{"_index":"test","_type":"type1","_id":"1","_version":1}}]}
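The newline-delimited body expected by _bulk (one action line, then one source line, each terminated by a newline) can be assembled with a few lines of Java. This is a minimal sketch that uses only the JDK: the class and method names are hypothetical, only string fields are handled, and the escaping covers just quotes and backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BulkBody {
    private BulkBody() {}

    /** Minimal JSON string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Returns the action line and the source line for one "index" operation. */
    static String indexOperation(String index, String type, String id, Map<String, String> source) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"index\":{\"_index\":").append(quote(index))
          .append(",\"_type\":").append(quote(type))
          .append(",\"_id\":").append(quote(id)).append("}}\n");
        sb.append('{');
        boolean first = true;
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        sb.append("}\n"); // every line of a bulk body, including the last, ends with a newline
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("field1", "value1");
        // Same operation as the "requests" file in the example above
        System.out.print(indexOperation("test", "type1", "1", source));
    }
}
```

Several operations can be concatenated into one body and posted in a single call, which is where the indexing speed-up comes from.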
Aggregations
ELASTICSEARCH
The following snippet captures the basic structure of aggregations:
"aggregations" : {
    "<aggregation_name>" : {
        "<aggregation_type>" : {
            <aggregation_body>
        }
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
    [,"<aggregation_name_2>" : { ... } ]*
}
Aggregations : min, max, avg
ELASTICSEARCH
Aggregations : The terms aggregation
ELASTICSEARCH
Aggregations : The range aggregation
ELASTICSEARCH
Aggregations : Histogram aggregation
ELASTICSEARCH
Facet
ELASTICSEARCH
Use case : Faceting using Elasticsearch Aggregations
ELASTICSEARCH
Facets Component - Facets help users narrow down or filter a search result; each facet is built from the search
context.
Sort Order - The sort order defines the order in which results are listed on the page; for instance, a user may sort
from lowest to highest price or by product rating.
Pagination of Results - The pagination component lets a user navigate back and forth through the search results; it
also determines the number of records that the ES query should return.
Search Result - Restricted to the number of records to display on the landing page, which would typically be
configurable based on your application needs.
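The pagination component maps naturally onto the from and size parameters of an Elasticsearch search request. The helper below is a sketch under the assumption of 1-based page numbers; the class and method names are hypothetical.

```java
final class Pagination {
    private Pagination() {}

    /** Offset of the first hit for a 1-based page number, i.e. the value of "from". */
    static int fromOffset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize;
    }

    /** Number of pages needed to show totalHits results, rounding up. */
    static int totalPages(long totalHits, int pageSize) {
        if (totalHits < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and pageSize >= 1");
        }
        return (int) ((totalHits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 5));  // 0
        System.out.println(fromOffset(3, 5));  // 10
        System.out.println(totalPages(42, 5)); // 9
    }
}
```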
Use case : Faceting using Elasticsearch Aggregations
ELASTICSEARCH
# Get search results and facets for men's category, filter by facet selection Brand = "diesel"
$ curl -XGET 'http://localhost:9200/products/_search?pretty=true' -d '{
"from" : 0, "size" : 5,
"query": {"filtered": {"filter": {"term": {"Brand": "diesel"}}, "query": { "term" : { "categories" : "men" }}}},
"aggs" : { "offerprice" : { "range" : { "field" : "offerprice", "keyed" : true, "ranges" : [ { "to" : 5 }, { "from" : 5, "to" : 10}, {
"from" : 10, "to" : 20}, { "from" : 20, "to" : 30} ] }
},
"size" : {"terms" : {"field" : "size","order": { "_count" : "asc" }}},
"Deals" : {"terms" : {"field" : "offers","order": { "_count" : "asc" }}},
"Brand" : {"terms" : {"field" : "Brand","order": { "_count" : "desc" }}}
},
"sort" : [{"offerprice" : {"order" : "asc", "mode" : "avg", "ignore_unmapped":true, "missing":"_last"}},"_score"]
}'
Use case : Faceting using Elasticsearch Aggregations
ELASTICSEARCH
# Get search results and facets for men's category, filter by facet selection Brand = "diesel" and Size "small"
curl -XGET 'http://localhost:9200/products/_search?pretty=true' -d '{
"from" : 0, "size" : 5,
"query": {"filtered": {"filter": {
"and": [{"term": {"Brand":"diesel" }},{"term": {"size":"small"}}]}, "query": { "term" : { "categories" : "men" }}}},
"aggs" : {
"offerprice" : {
"range" : {"field" : "offerprice","keyed" : true,"ranges" : [{ "to" : 5 },{ "from" : 5, "to" : 10},{ "from" : 10, "to" : 20},
{ "from" : 20, "to" : 30}]}
},
"size" : {"terms" : {"field" : "size","order": { "_count" : "asc" }}},
"Deals" : {"terms" : {"field" : "offers","order": { "_count" : "asc" }}},
"Brand" : {"terms" : {"field" : "Brand","order": { "_count" : "desc" }}}
},
"sort" : [{"offerprice" : {"order" : "asc", "mode" : "avg", "ignore_unmapped":true, "missing":"_last"}},"_score"]
}'
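To make the offerprice range facet concrete, the sketch below reproduces the bucketing in memory: each range includes its from value and excludes its to value, as the Elasticsearch range aggregation does. The sample prices, the class name and the bucket key format are illustrative assumptions; in a real deployment the counts come back from Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RangeBuckets {
    private RangeBuckets() {}

    // {from, to} pairs matching the ranges of the query above; -infinity stands for "no lower bound"
    static final double[][] RANGES = {
        {Double.NEGATIVE_INFINITY, 5}, {5, 10}, {10, 20}, {20, 30}
    };

    /** Hypothetical bucket label, e.g. "*-5.0" or "5.0-10.0". */
    static String key(double from, double to) {
        String lower = Double.isInfinite(from) ? "*" : String.valueOf(from);
        return lower + "-" + to;
    }

    /** Counts prices per range: from is inclusive, to is exclusive. */
    static Map<String, Integer> count(double[] prices) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (double[] r : RANGES) {
            counts.put(key(r[0], r[1]), 0);
        }
        for (double p : prices) {
            for (double[] r : RANGES) {
                if (p >= r[0] && p < r[1]) {
                    counts.merge(key(r[0], r[1]), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // 30.0 and 45.0 fall outside every range and are not counted
        System.out.println(count(new double[] {3.5, 5.0, 9.99, 12.0, 30.0, 45.0}));
    }
}
```

Prices outside every range are simply not counted, so an extra open-ended range is needed if such products must still appear in the facet.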
ES Client
ELASTICSEARCH
Python : https://github.com/elasticsearch/elasticsearch-dsl-py
Ruby : https://github.com/printercu/elastics-rb
Javascript : https://github.com/fullscale/elastic.js
Scala : https://github.com/sksamuel/elastic4s
Clojure : https://github.com/clojurewerkz/elastisch
Nodejs : https://github.com/phillro/node-elasticsearch-client
Spring-data-elasticsearch : https://github.com/spring-projects/spring-data-elasticsearch
Introduction
KIBANA
https://github.com/elasticsearch/kibana
Angular JS
Responsive Design with Bootstrap
Nodejs Platform
Open-source
Dashboard KIBANA
KIBANA
Queries and filters KIBANA:
KIBANA
Query :
Filtering :
Panels : bettermap
KIBANA
Bettermap panel : requires a field that contains the coordinates in GeoJSON format. GeoJSON is [longitude, latitude] in an
array. This is different from most implementations, which use latitude, longitude.
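Because the coordinate order is easy to get wrong, a tiny conversion helper can be useful when preparing documents for the bettermap panel. The sketch below is an assumption about how one might do this in Java; the class name, method name and sample coordinates are illustrative.

```java
final class GeoJsonPoint {
    private GeoJsonPoint() {}

    /** Converts (latitude, longitude) into the GeoJSON order [longitude, latitude]. */
    static double[] toGeoJson(double latitude, double longitude) {
        if (latitude < -90 || latitude > 90) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        if (longitude < -180 || longitude > 180) {
            throw new IllegalArgumentException("longitude out of range: " + longitude);
        }
        return new double[] {longitude, latitude};
    }

    public static void main(String[] args) {
        // Sample point given as latitude 48.8566, longitude 2.3522
        double[] point = toGeoJson(48.8566, 2.3522);
        System.out.println("[" + point[0] + ", " + point[1] + "]");
    }
}
```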
Term panel
KIBANA
A table, bar chart or pie chart based on the results of an Elasticsearch terms facet.
Histogram panel
KIBANA
The histogram panel allows for the display of time charts. It includes several modes and transformations to display event
counts, mean, min, max and total of numeric fields, and derivatives of counter fields.
Map
KIBANA
The map panel translates 2 letter country or state codes into shaded regions on a map. Currently available maps are
world, usa and europe.
Table
KIBANA
The table panel contains a sortable, pageable view of documents that can be arranged into defined columns. It
offers several interactions, such as performing ad hoc terms aggregations.
Text
KIBANA
The text panel is used for displaying static text formatted as markdown, sanitized html or as plain text.
Trends
KIBANA
A stock-ticker style representation of how queries are moving over time. For example, if the time is 1:10pm, your time
picker was set to "Last 10m", and the "Time Ago" parameter was set to "1h", the panel would show how much the
query results have changed since 12:00-12:10pm
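The window arithmetic in this example can be checked with java.time. The sketch below only reproduces the time calculation from the description; the class and method names are hypothetical and nothing here reflects Kibana's internal implementation.

```java
import java.time.Duration;
import java.time.LocalTime;

final class TrendWindows {
    private TrendWindows() {}

    /** [start, end] of the window currently selected by the time picker. */
    static LocalTime[] currentWindow(LocalTime now, Duration span) {
        return new LocalTime[] {now.minus(span), now};
    }

    /** The same window shifted back by the "Time Ago" offset. */
    static LocalTime[] comparisonWindow(LocalTime now, Duration span, Duration timeAgo) {
        LocalTime end = now.minus(timeAgo);
        return new LocalTime[] {end.minus(span), end};
    }

    public static void main(String[] args) {
        LocalTime now = LocalTime.of(13, 10);      // 1:10pm
        Duration span = Duration.ofMinutes(10);    // time picker "Last 10m"
        Duration timeAgo = Duration.ofHours(1);    // "Time Ago" = 1h
        LocalTime[] current = currentWindow(now, span);
        LocalTime[] previous = comparisonWindow(now, span, timeAgo);
        System.out.println(current[0] + " - " + current[1]);   // 13:00 - 13:10
        System.out.println(previous[0] + " - " + previous[1]); // 12:00 - 12:10
    }
}
```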
Analytics Dashboard
USE CASE 1 : MARKETING
DEMO
Example : Recommender Engine
MAKE SENSE OF YOUR DATA
[Diagram: recommender engine pipeline]
IN : {HTTP requests}
Hybris cluster : Node 0 … Node N
Requests Collector : {insert data}
Storage : ElasticSearch, MongoDB or Apache Solr
{Data analysis} : Apache Mahout, Apache Spark MLlib or H2O…
OUT : {recommendations}
Example : Amazon
MAKE SENSE OF YOUR DATA
Example : Zalando
MAKE SENSE OF YOUR DATA
Example : NetFlix
MAKE SENSE OF YOUR DATA
Q&A