[Rakuten TechConf2014] [C-5] Ichiba Architecture on Exalogic
Post on 03-Jul-2015
ICHIBA Architecture on Exalogic
Oct/25/2014
Euncheol Kweon / Watanabe Ryu
EC Core Platform Group ICE Project Team.
1
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
2
About This Presentation
1. Exalogic Architecture
2. Exalogic Operation On OZ Manager
- Euncheol Kweon
- Watanabe Ryu
3
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
4
No.1 EC Site in Japan
5
No.1 EC Site in Japan
6
Biz App
Biz API
Biz API
Biz App
Biz API
Biz App
Biz App
7
Physical Servers
Database
Middleware
Business Application
Network
Service Components
Operation
8
Service A
Service B
Service C
Service D
Service E
Service F
Service G
Many Servers for Service
Many Engineers for Operation
9
Heavy
Low
Heavy
High
High
Heavy
Low
Many Engineers for Operation
Operational Efficiency
10
Service Peak Time
[Chart: System Load (0 to 6) plotted against Time (00:00 to 23:00) for A Service and B Service]
11
Heavy
Low
Middle
If we can use servers more effectively
High
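The point of the two slides above (services peak at different times, so shared servers can be used more effectively) can be sketched with a small calculation. The hourly load figures below are invented for illustration and are not taken from the chart.

```python
# Hypothetical load per time slot, expressed as "servers needed".
# The two services peak at different times, as in the Service Peak Time chart.
service_a = [1, 1, 1, 1, 2, 4, 6, 4, 2, 1, 1, 1]
service_b = [1, 1, 2, 1, 1, 1, 1, 2, 4, 6, 4, 2]


def dedicated_capacity(*loads):
    """Servers needed when every service is sized for its own peak."""
    return sum(max(load) for load in loads)


def shared_capacity(*loads):
    """Servers needed when all services share one pool sized for the combined peak."""
    return max(sum(slot) for slot in zip(*loads))


dedicated = dedicated_capacity(service_a, service_b)
shared = shared_capacity(service_a, service_b)
print("dedicated:", dedicated, "shared:", shared)
```

With these made-up numbers the shared pool needs fewer servers than the sum of the dedicated pools, because the peaks do not coincide.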
12
If it can be operated by 4 people.
13
14
What is Exalogic?
Oracle Engineered System
Specialized Product for Middleware
- WebLogic
- Coherence
- Tuxedo
Total Solution
15
Exalogic Elastic Cloud
Software
Exabus
Oracle Traffic Director
Storage Management
Software
Operating System
Exalogic Control
VM Server
Hardware
Compute Nodes
- Max 30 Nodes
- Xeon 2.7, 8 Cores * 2
- 256 GB Memory
- 200 GB SSD
Integrated Storage
- ZFS Storage Appliance
- 60 TB Storage
InfiniBand Switches
- 40 Gb/s throughput
16
Migration
Operation Cost Down
Non-Stop Operation
Our Plan for Exalogic
Japan ICHIBA
Over 50 Services
About 400 Servers
17
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
18
Exalogic
Optional
Middleware Structure
WebLogic
Coherence
OTD
Jennifer
19
OTD (Oracle Traffic Director)
OTD
Application A:XXX1
Instance 01 : XX.XX.0.1:0000
Instance 02 : XX.XX.0.2:0000
Instance 03 : XX.XX.0.3:0000
Instance 04 : XX.XX.0.4:0000
Application B:XXX2
…
Application C:XXX3
…
Application D:XXX4
…
Application B
Application C
Application D
Application A
Instance 01
Instance 02
Instance 03
Instance 04
20
WebLogic (Application Server)
Business Applications
WebLogic Domain
- Instance 01
- Instance 02
- Instance 03
- Instance 04
- Domain directory: bin, config, init-info, lib, pending, security, servers, startWebLogic.sh, stopWebLogic.sh, tmp
Fusion Middleware Installation
- 11g
- 12c
Java
- JRockit
- HotSpot
Exalogic Optimized
21
Coherence (Data Grid Layer)
Coherence Cluster (having DAO)
Database
Business Logic
Session Object
Fusion Middleware Installation
- 12c
Java
- JRockit
- HotSpot
22
Jennifer (Real Time Monitoring)
Application On Middleware
Instance 01
Instance 02
Instance 03
Instance 04
Jennifer Server
Process Time
Method Profiling
System Status
IO Status
VM Status
23
System Design On Exalogic
24
Nonstop
Automated
Standardized
Design Keywords
25
Shared Resources
Java
Middleware
Coherence Cluster
WebLogic Domain
Configuration
Operation Tools
Application Logs
mount
mount
[ZFS]
26
- bin
- config
- lib
- startWebLogic.sh
- stopWebLogic.sh
- servers
- instance 01/*
- instance 02/*
- instance 03/*
- log
- instance 01/*
- instance 02/*
- instance 03/*
Separated Parameter for Private Resource
${weblogic.Name}
${tangosol.coherence.member}
ex) log, data
Separated Private Resources
Application On Middleware
Instance 01
Instance 02
Instance 03
WebLogic Domain
Shared
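A minimal sketch of how one shared setting can yield a private path per instance, assuming the `${weblogic.Name}` placeholder is simply replaced with the instance name at startup. The template string and instance names are hypothetical.

```python
def expand(template, properties):
    """Replace ${name} placeholders with values from a properties dict."""
    result = template
    for name, value in properties.items():
        result = result.replace("${" + name + "}", value)
    return result


# Hypothetical shared setting: one line kept in the shared domain configuration.
LOG_PATH_TEMPLATE = "log/${weblogic.Name}/application.log"

# Each instance resolves the same template to its own private log file.
for instance in ["instance01", "instance02", "instance03"]:
    print(expand(LOG_PATH_TEMPLATE, {"weblogic.Name": instance}))
```

The same idea would apply to `${tangosol.coherence.member}` for Coherence members.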
27
Rule Based System Configuration
Machines
IP Zone
Exalogic
WebLogic
Coherence
OTD
Jennifer
Two Parameters for Construction
Ports
Directories
App Name
App Number
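The rule-based idea above, where ports and directories follow from just App Name and App Number, might look like the sketch below. The base paths, port ranges, and formulas are invented for illustration; the slides do not state the actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppLayout:
    domain_dir: str
    log_dir: str
    admin_port: int
    managed_port: int


def build_layout(app_name: str, app_number: int, side: str) -> AppLayout:
    """Derive directories and ports from the two construction parameters.

    All paths and port formulas here are hypothetical examples of a rule.
    """
    if side not in ("A", "B"):
        raise ValueError("side must be 'A' or 'B'")
    offset = 0 if side == "A" else 1
    base_port = 10000 + app_number * 10 + offset * 5
    return AppLayout(
        domain_dir=f"/shared/domains/{app_name}{side}",
        log_dir=f"/shared/logs/{app_name}{side}",
        admin_port=base_port,
        managed_port=base_port + 1,
    )


print(build_layout("rpage", 12, "A"))
print(build_layout("rpage", 12, "B"))
```

Because every value is computed, two applications with different numbers cannot collide on ports, and no per-application design document is needed.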
28
Migrating ManagedServer
WebLogic Domain
WebLogic is Running on Floating IP
Instance 02 (10.10.1.2)
Instance 01 (10.10.1.1)
Instance 03 (10.10.1.3)
Instance 04 (10.10.1.4)
Failed
Instance 01 (10.10.1.1)
Instance 03 (10.10.1.3)
Instance 04 (10.10.1.4)
Instance 02 (10.10.1.2)
Shutdown -> Moving IP -> Startup
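The three migration steps can be modelled as a small simulation. This is only a sketch of the sequence, assuming each instance keeps its floating IP when it moves; the host names `node1` and `node2` are hypothetical, and nothing here calls WebLogic.

```python
def migrate(instances, hosts, failed_host, target_host):
    """Move every instance on failed_host to target_host, keeping its IP.

    instances: dict of instance name -> {"ip": str, "host": str, "state": str}
    Returns the list of (instance, action) steps that were performed.
    """
    if target_host not in hosts or target_host == failed_host:
        raise ValueError("target host must be a different, known host")
    steps = []
    for name, inst in instances.items():
        if inst["host"] != failed_host:
            continue
        inst["state"] = "SHUTDOWN"            # 1. Shutdown
        steps.append((name, "shutdown"))
        inst["host"] = target_host            # 2. Moving IP: the address follows the instance
        steps.append((name, f"move {inst['ip']} to {target_host}"))
        inst["state"] = "RUNNING"             # 3. Startup
        steps.append((name, "startup"))
    return steps


hosts = ["node1", "node2"]
instances = {
    "Instance 01": {"ip": "10.10.1.1", "host": "node1", "state": "RUNNING"},
    "Instance 02": {"ip": "10.10.1.2", "host": "node2", "state": "RUNNING"},
}
steps = migrate(instances, hosts, failed_host="node1", target_host="node2")
for step in steps:
    print(step)
```

Clients keep using 10.10.1.1, so nothing upstream of the instance has to be reconfigured after the move.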
29
Standardized Startup Scripts
- server_startup
  - startServer.sh
  - scripts
    - addon.sh
    - automation.sh
    - options.sh
    - properties.sh
  - options.conf
Jennifer Ready
Remote Control
Customizing
Process Locking
Log & Backup
JMX Ready
- Coherence Server / CUI Client
- WebLogic Admin & Managed Server
30
WebLogic Domain
nodemanager.properties
startScriptName=startServer.sh
….
….
….
NodeManager
NodeManager and Startup Scripts
- server_startup
  - startServer.sh
  - scripts
    - addon.sh
    - automation.sh
    - options.sh
    - properties.sh
  - options.conf
ManagedServer
ManagedServer
AdminServer
31
Non-Stop Operation
32
Listener
Listen-Port : 8001
Main Feature for Nonstop Operation
OTD : rakuten.co.jp
Origin Server Pool
10.10.1.1:7003
10.10.1.2:7003
Origin Server Pool
10.10.2.1:7003
10.10.2.2:7003
33
10.10.1.1 : 7003
10.10.1.2 : 7003
OTD Configuration for Application
Listener
OTD : rakuten.co.jp
default-route : 8001
test-route : 9001
Origin Server Pool
10.10.2.1 : 7003
10.10.2.2 : 7003
34
A Domain
ManagedServer
Dual Domains for Application
B Domain
ManagedServer
8001 9001
Origin Server Pool B
10.10.1.1 : 7003
10.10.1.2 : 7003
Origin Server Pool A
10.10.2.1 : 7003
10.10.2.2 : 7003
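The dual-domain layout can be written down as plain data: two listeners, two routes, two origin server pools. Which pool each route points to at a given moment is an assumption here; the addresses and ports are the ones shown on the slides.

```python
# Origin server pools as listed on the slide.
POOLS = {
    "A": ["10.10.2.1:7003", "10.10.2.2:7003"],
    "B": ["10.10.1.1:7003", "10.10.1.2:7003"],
}

# Listener port -> route name, as on the "OTD Configuration for Application" slide.
LISTENERS = {8001: "default-route", 9001: "test-route"}

# Route -> pool. Assumed starting state: A serves customers, B is the standby.
ROUTES = {"default-route": "A", "test-route": "B"}


def origin_servers(listen_port):
    """Return the origin servers that currently receive traffic for a listener port."""
    route = LISTENERS[listen_port]
    return POOLS[ROUTES[route]]


print(origin_servers(8001))  # service traffic
print(origin_servers(9001))  # test traffic
```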
35
OTD
1.0
1.1
OTD
1.0
1.1
Basic Concept
- Next Version Release
- Testing via Proxy
- Switching
- Current Service
- 20 Seconds
default-route
test-route
OTD
1.0
0.9
- Current Service
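Switching amounts to exchanging what the two routes point at. A minimal sketch, assuming the route table is a simple mapping from route name to application version (the real change is made in OTD configuration, which is not shown here):

```python
def switch(routes):
    """Return a new route table with default-route and test-route exchanged."""
    return {
        "default-route": routes["test-route"],
        "test-route": routes["default-route"],
    }


# Before switching: 1.0 is in service, 1.1 is released and tested via the proxy.
before = {"default-route": "1.0", "test-route": "1.1"}
after = switch(before)
print(after)
```

Applying `switch` a second time restores the previous state, which is why the same operation can also serve as a rollback.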
36
6 Hours
20 Secs
Operating Time for Service Release
37
Sessionful Application
A Domain
Coherence*Web
B Domain
Coherence*Web
Getting Session Data
from Coherence Cluster
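Why a sessionful application survives a switch between A Domain and B Domain can be illustrated with a toy model in which the session lives outside both domains. `SessionStore` stands in for the Coherence cluster; it is a plain dictionary, not the Coherence*Web API.

```python
class SessionStore:
    """Stand-in for the Coherence cluster: session data kept outside the domains."""

    def __init__(self):
        self._sessions = {}

    def get(self, session_id):
        return self._sessions.setdefault(session_id, {})


class Domain:
    """Stand-in for a WebLogic domain that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, item=None):
        session = self.store.get(session_id)
        if item is not None:
            session.setdefault("cart", []).append(item)
        return {"served_by": self.name, "cart": list(session.get("cart", []))}


store = SessionStore()
domain_a = Domain("A", store)
domain_b = Domain("B", store)

domain_a.handle("user-1", item="book")      # request served before switching
response = domain_b.handle("user-1")        # same user, served after switching
print(response)
```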
38
Application Package
Construction
WebLogic Domains
Jennifer Servers
1~2 Weeks
On Legacy Environment
A B
39
WebLogic Template
Automated
Tool
Jennifer Template
Application Name
Application Number
END
Service IN
Deploy
Construction
40
Construction
5 Minutes
On Exalogic
Application Package
WebLogic Domains
Jennifer Servers A B
41
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
42
Release
Switching
Testing
Operation
43
Release
Switching
Testing
Operation
44
root/domain{A,B}
- app/foo-1.0.1.war
- configuration/foo.properties
- library/lib-0.2.2.jar
/root/domain{A,B}
- app/foo-1.0.0.war
- configuration/foo.properties
- library/lib-0.2.1.jar
Service Domain
Release Material
Release Material having Same Path
maven-assembly-plugin
45
Recipe for Release
Release Material
Release History Directory
Release Material
Standby Domain
Updating New or Modified Resource
Latest Release
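"Updating New or Modified Resource" implies comparing the release material with the standby domain, which works because both use the same relative paths. A minimal sketch of that comparison using content hashes; the comparison method is an assumption, and the demo files are created in a temporary directory.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Content hash used to decide whether two files differ."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(release_root, domain_root):
    """Return relative paths that are new or modified in release_root."""
    release_root, domain_root = Path(release_root), Path(domain_root)
    changed = []
    for src in sorted(release_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(release_root)
        dst = domain_root / rel
        if not dst.exists() or file_digest(src) != file_digest(dst):
            changed.append(rel.as_posix())
    return changed


def write(root, rel, text):
    path = Path(root, rel)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)


with tempfile.TemporaryDirectory() as tmp:
    release, domain = Path(tmp, "release"), Path(tmp, "domain")
    write(domain, "app/foo-1.0.0.war", "old app")
    write(domain, "configuration/foo.properties", "key=value")
    write(domain, "library/lib-0.2.1.jar", "old lib")
    write(release, "app/foo-1.0.1.war", "new app")
    write(release, "configuration/foo.properties", "key=value")
    write(release, "library/lib-0.2.2.jar", "new lib")
    result = changed_files(release, domain)

print(result)
```

Only the new archive and library are reported; the unchanged properties file is skipped.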
46
Release Work Flow
New Deployment
Release Files
Undeployment
Restarting WebLogic
47
Switching
Operation
Release
Testing
48
OTD Standby Domain
Testing Application via Proxy Server
intranet
Proxy
Service Domain
49
Problem Solving by Repetitive Testing
Updating Application
Testing
Bug Fix
50
Operation
Release
Switching
Testing
51
Standby Domain
Real Time Service Switching
Service Domain
52
Release History Directory
root.20141025
Creating Operation Directory
Release Material
Operation Directory
${applicationName}/operation/root
renaming
link
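The "renaming" and "link" steps can be sketched with a rename into a dated history directory followed by a symbolic link at `operation/root`. The `history` directory name and the temporary-link swap are assumptions; the sketch relies on POSIX symlink behaviour.

```python
import os
import tempfile
from pathlib import Path


def publish(app_dir, release_dir, date_stamp):
    """Rename the release material into the history and link operation/root to it."""
    app_dir = Path(app_dir)
    history = app_dir / "history"        # hypothetical name for the release history directory
    operation = app_dir / "operation"
    history.mkdir(parents=True, exist_ok=True)
    operation.mkdir(parents=True, exist_ok=True)

    target = history / f"root.{date_stamp}"
    os.rename(release_dir, target)       # renaming

    link = operation / "root"
    tmp_link = operation / "root.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(target, tmp_link)         # link
    os.replace(tmp_link, link)           # swap the new link into place in one step
    return link


with tempfile.TemporaryDirectory() as tmp:
    app = Path(tmp, "foo")
    release = Path(tmp, "root")
    (release / "app").mkdir(parents=True)
    (release / "app" / "foo-1.0.1.war").write_text("demo")
    link = publish(app, release, "20141025")
    resolved_name = link.resolve().name
    war_visible = (link / "app" / "foo-1.0.1.war").exists()

print(resolved_name, war_visible)
```

Because the operation directory is only a link, pointing it at another dated directory changes what is in operation without copying files.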
53
Release
Testing
Switching
Operation
54
Service Domain
Standby Domain
Rollback to Previous Version
Release History Directory
Release Directory of Previous Version
If standby side is updated...
Release
Switching
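The rollback decision on this slide can be expressed as a small planning function. It assumes release history directories are named `root.YYYYMMDD` (as in `root.20141025`), so that sorting by name gives chronological order; the step names are illustrative.

```python
def previous_release(history_names, current):
    """Return the release directory that precedes `current`, or None if there is none."""
    ordered = sorted(name for name in history_names if name.startswith("root."))
    index = ordered.index(current)
    return ordered[index - 1] if index > 0 else None


def rollback_plan(standby_release, wanted_release):
    """List the steps needed to put `wanted_release` back into service."""
    if standby_release == wanted_release:
        # The standby side still holds the previous version: switching is enough.
        return ["switching"]
    # The standby side was already updated: release the previous version first.
    return [f"release {wanted_release}", "switching"]


history = ["root.20141025", "root.20141001", "root.20140915"]
wanted = previous_release(history, "root.20141025")
print(rollback_plan("root.20141001", wanted))
print(rollback_plan("root.20141101", wanted))
```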
55
Release
Modifying Configuration
Normal Operation
Operation Directory
${applicationName}/operation/root
Service Domain
Standby Domain
56
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
57
WLST Operation Tools
Start / Shutdown
Deployment
Moving Instance
Undeployment
Getting Information
Modifying Configuration
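As an illustration of what a WLST operation tool for "Undeployment" and "Deployment" might contain, the sketch below generates the text of a WLST (Jython) script. `connect`, `undeploy`, `deploy`, `disconnect`, and `exit` are standard WLST commands; the file names, URL, application name, and target are hypothetical, and the presenters' actual tools are not shown in the slides.

```python
def wlst_redeploy_script(admin_url, app_name, archive_path, targets):
    """Build the text of a WLST script that redeploys one application."""
    return "\n".join([
        # Credentials come from files assumed to be created beforehand with storeUserConfig().
        "connect(userConfigFile='user.secure', userKeyFile='key.secure', url=%r)" % admin_url,
        "undeploy(%r, targets=%r)" % (app_name, targets),
        "deploy(%r, %r, targets=%r)" % (app_name, archive_path, targets),
        "disconnect()",
        "exit()",
    ])


script = wlst_redeploy_script(
    admin_url="t3://admin.example:7001",     # hypothetical AdminServer address
    app_name="foo",
    archive_path="/root/domainA/app/foo-1.0.1.war",
    targets="fooA01",                        # hypothetical Managed Server name
)
print(script)
```

A generated script like this would typically be passed to `java weblogic.WLST`, so the same generator can serve every application by changing only its arguments.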
58
Legacy System
Operation Documents
By System
By Material
By Time
A Reason Of High Operating Cost
Release Document
Release Document
Release Document
59
Exalogic
Operation Documents
Construction
Release
Switching
Operation
Recipe
Operation Recipe for All Application
60
Recipe Execution by Trigger
Trigger
Sequential Execution
Automatic Execution
Recipe
Arguments
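A recipe that runs sequentially and automatically once triggered, with arguments supplied by the trigger, can be sketched as follows. The step functions only record what they would do, and the step order of the `release` recipe is an assumption.

```python
def run_recipe(steps, arguments):
    """Run steps in order; stop at the first failure and report progress."""
    log = []
    for step in steps:
        try:
            step(arguments)
        except Exception as exc:
            log.append((step.__name__, f"failed: {exc}"))
            return False, log
        log.append((step.__name__, "ok"))
    return True, log


# Placeholder steps: each one records an action instead of touching a server.
def undeploy(args):
    args["actions"].append(f"undeploy {args['app']}")


def copy_release_files(args):
    args["actions"].append(f"copy files for {args['app']} {args['version']}")


def restart_weblogic(args):
    args["actions"].append(f"restart domain {args['domain']}")


def deploy(args):
    args["actions"].append(f"deploy {args['app']} {args['version']}")


RECIPES = {"release": [undeploy, copy_release_files, restart_weblogic, deploy]}


def on_trigger(recipe_name, arguments):
    """Entry point called by the trigger system with the recipe name and its arguments."""
    return run_recipe(RECIPES[recipe_name], arguments)


arguments = {"app": "foo", "version": "1.0.1", "domain": "B", "actions": []}
ok, log = on_trigger("release", arguments)
print(ok, log)
```

One recipe per operation type, parameterised by arguments, replaces a separate release document per application.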
61
Trigger System
62
Introduction
Middleware Architecture
Life Cycle
Cost Effective Operation
Exalogic Operation On OZ Manager
63
What is OZ Manager ?
64
Application Management Tool
Built from scratch
65
Why build from scratch?
66
Because it's Rakuten Ichiba
67
What is Rakuten Ichiba like?
Lots of teams
Lots of applications
Lots of instances
Lots of releases
68
What is Rakuten Ichiba like?
Lots of teams
Lots of applications
Lots of instances
Lots of releases
“team reliant operations”
more than 30 teams
Diff on Rule
Diff on Judge
Diff on Procedure
69
What is Rakuten Ichiba like?
Lots of teams
Lots of applications
Lots of instances
Lots of releases
more than 150 applications
“scattered apps & tools”
Diff on Language
Diff on Architecture
Diff on Monitoring
70
What is Rakuten Ichiba like?
Lots of teams
Lots of applications
Lots of instances
Lots of releases
more than 3000 instances
“excessive alert mails”
10,000 mails sent during trouble
only 700 can be received
Important mails get buried
71
What is Rakuten Ichiba like?
Lots of teams
Lots of applications
Lots of instances
Lots of releases
more than 1000 releases in a year
“more trouble risks”
Required to attract
Bug occurs
Human error happens
72
What we needed
standardize basic operations
integrate and portalize management tools
collect log and pack alert mails
simplify and clarify for fast detection
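"Pack alert mails" can be illustrated by collapsing repeated alerts into one summary line per source, so a burst of identical alerts becomes a single message. The alert fields (`host`, `check`) and the summary format are hypothetical.

```python
from collections import Counter


def pack_alerts(alerts):
    """Collapse repeated alerts into one summary line per (host, check) pair."""
    counts = Counter((alert["host"], alert["check"]) for alert in alerts)
    return [
        f"{host} {check}: {count} alert(s)"
        for (host, check), count in sorted(counts.items())
    ]


# Hypothetical burst: the same check fires repeatedly during a trouble.
alerts = [{"host": "node1", "check": "cpu%"} for _ in range(3)]
alerts.append({"host": "node2", "check": "applog"})

digest = pack_alerts(alerts)
for line in digest:
    print(line)
```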
73
OZ Manager
74
OZ Manager
1. Brief Overview
2. Plug-in configuration
3. Data relation
4. Monitoring for Exalogic
5. Demonstration
6. Future plan for Exalogic
75
OZ Manager
1. Brief Overview
Service Applications
Monitoring Tools
…
Log Collect Tools
LogCollector
operator
76
2. Plug-in configuration
plug-in manager
Checker Log Collect Application other
• Get status
• Stop
• Start
• Update
• Get results
• Get status
• Stop
• Start
• Update
• Get logs
• Get status
• Stop
• Start
77
2. Plug-in configuration
plug-in manager
Checker / Log Collect / Application / other
Plug-in labels in the diagram: nagios, Exalogic, logcollector, pandora, nagios, weblogic, module, log chk, basic log, perf chk, perf log, qps log, otd origin chk, CPU % chk, logcollector admin, admin, Exalogic, service
78
Host Application
exalogic
Cluster
rpage rpageA rpageA01 node1
rpageA02
rpageB01 node2
rpageB02
rpageB
orderapi
rpageA rpage-inst nodeX
Controlgroup
Filter Checker
error applog inst-chk cpu%
3. Data relation
Link
jennifer
exalogic
Component & Control
79
3. Data relation
Authentication & Authorization
User
Ryu
Usergroup
admin
Menu
User
Cluster
exalogic
Mall
Kweon-san
EC Core
Cluster Application Host
rpage
orderapi
userX
80
4. Monitoring for Exalogic
Type / Target / Checker
Cluster
- Weblogic service A/B: Application log, Performance log, Alert log
- Coherence A/B: Application log, Stdout log
Application system
- Oracle EM: Admin server process, Client agent process
- Nodemanager: Process for each version
- OTD: Process for each OTD configuration
- weblogic instances: OTD origin online check
- coherence instances: Coherence instance check
Host
- Each node: Ping check, CPU usage, Load average, Filesystem usage, Burst process
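Two of the host checkers listed above (filesystem usage and load average) can be sketched with the Python standard library. The thresholds and the result format are hypothetical, and `os.getloadavg` is available on Unix-like systems only.

```python
import os
import shutil


def check_filesystem(path="/", limit_percent=90.0):
    """Report filesystem usage for `path` and whether it is under the limit."""
    usage = shutil.disk_usage(path)
    used_percent = usage.used / usage.total * 100
    return {
        "check": "filesystem usage",
        "value": round(used_percent, 1),
        "ok": used_percent < limit_percent,
    }


def check_load_average(limit_per_cpu=1.0):
    """Report the 1-minute load average normalised by CPU count (Unix only)."""
    load1, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    per_cpu = load1 / cpus
    return {
        "check": "load average",
        "value": round(per_cpu, 2),
        "ok": per_cpu < limit_per_cpu,
    }


results = [check_filesystem(), check_load_average()]
for result in results:
    print(result)
```

Each checker returns the same small structure, which makes it straightforward to collect results from every node in one place.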
81
5. Demonstration
Demo Video
Basic Usage
82
5. Demonstration
Demo Video
Maintenance Tool
83
5. Demonstration
Demo Video
Command line Interface
84
5. Demonstration
Demo Video
Exalogic Special Features
85
6. Future plan for Exalogic
AB service side switch
instance location list
Instance relocation
Exalogic-coherence plugin
86
Operation Zero
87