Evolution of database services


Page 1: Evolution of database services

Evolution of database services

Eva Dafonte Pérez, 2014 WLCG Collaboration Workshop

Page 2: Evolution of database services

Outline:

- CERN's databases overview
- Service evolution: HW, storage, service configuration, SW
- New database services
- Replication
- Database On Demand (DBoD) service
- Hadoop + Impala
- Future plans
- Summary

Page 3: Evolution of database services

CERN's databases:

- ~100 Oracle databases, most of them RAC
- Mostly NAS storage, plus some SAN with ASM
- ~500 TB of data files in total for production DBs
- Example of a critical production DB: the LHC logging database, ~170 TB, expected to grow by up to ~70 TB/year
- Also offered as DBaaS, as single instances: 120 MySQL community databases, 11 PostgreSQL databases, 10 Oracle 11g
- Additional test setups: Hadoop + Impala, CitusDB

Page 4: Evolution of database services

Our deployment model:

- DB clusters based on RAC
- Load balancing and the possibility to grow
- High availability: the cluster survives node failures
- Maintenance: rolling interventions
- Schema-based consolidation: many applications share the same RAC cluster, grouped per customer and/or functionality (a client-side sketch follows the diagram below)
- Example: CMS offline database cluster

[Diagram: two RAC instances, each with its own Clusterware and OS, attached to shared storage and connected via public, private, and storage networks]
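To illustrate what schema-based consolidation looks like from the client side, here is a minimal sketch assuming the cx_Oracle Python driver; the cluster address and service names are hypothetical, not CERN's actual configuration.

```python
# Minimal client-side sketch of schema-based consolidation on a shared
# RAC cluster, assuming the cx_Oracle driver. Host and service names are
# hypothetical. Each application gets its own Oracle service, which
# Clusterware registers on all instances, so connections are
# load-balanced and can fail over if a node dies.
import cx_Oracle

CLUSTER = "cms-rac.example.cern.ch"   # hypothetical cluster address

# One service per application/customer on the same cluster
dsn_offline = cx_Oracle.makedsn(CLUSTER, 1521, service_name="cms_offline")
dsn_dqm = cx_Oracle.makedsn(CLUSTER, 1521, service_name="cms_dqm")

conn = cx_Oracle.connect("offline_app", "secret", dsn_offline)
cur = conn.cursor()
cur.execute("SELECT instance_name FROM v$instance")  # which node served us
print("connected to RAC instance:", cur.fetchone()[0])
conn.close()
```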

Page 5: Evolution of database services

Service evolution: preparation for Run 2

- Changes have to fit the LHC schedule
- New HW installation in the BARN; decommissioning of some old HW
- Critical-power services move from their current location to the BARN
- Keep up with Oracle SW evolution
- Applications' evolution: more resources needed
- Integration with the Agile Infrastructure at CERN
- LS1: no stop for the computing or other DB services

[Timeline: new systems installation → HW migration → SW upgrade → stable services; old systems decommissioned]

Page 6: Evolution of database services

Hardware evolution:

- 100 production servers in the BARN: dual 8-core Xeon E5-2650, 128 GB/256 GB RAM, 3× 10 Gb network interfaces
- Specific network requirements: IP1, ATLAS pit, Technical Network, routed and non-routed networks
- New generation of storage from NetApp

Page 7: Evolution of database services

Storage evolution: from FAS3240 (scaling up) to FAS8060 (scaling out)

                     FAS3240                     FAS8060
NVRAM                1.0 GB                      8.0 GB
System memory        8 GB                        64 GB
CPU                  1× 64-bit 4-core 2.33 GHz   2× 64-bit 8-core 2.10 GHz
SSD layer (maximum)  512 GB                      8 TB
Aggregate size       180 TB                      400 TB
OS controller        Data ONTAP 7-mode           Data ONTAP C-mode

Page 8: Evolution of database services

Storage consolidation:

- Before: 56 controllers (FAS3000) and 2300 disks (1400 TB of storage)
- After: 14 controllers (FAS6220) and 960 disks (1660 TB of storage)
- Four times fewer controllers and roughly 2.4 times fewer disks deliver more capacity

Page 9: Evolution of database services

Storage setup

[Diagram: storage setup for DB cluster A and DB cluster B]

Page 10: Evolution of database services

Storage setup:

- Easy management, more capacity
- Transparent volume moves
- Caching: flash cache and flash pool
- Performance improvement: roughly 2-3 times better overall performance
- Difficulties finding slots for interventions

Page 11: Evolution of database services

Service configuration with Puppet:

- Following CERN IT's strategy, the IT-DB group adopted Puppet
- A good occasion to re-think how the services are configured and managed
- Relies on the same Syscontrol-LDAP* data source as the Quattor-managed services
- Custom modules developed for: private storage and network configuration, database installation, backups configuration
- Removed ssh keys and service accounts in favour of Kerberos + sudo, improving traceability and manageability
- RHEL 5 → RHEL 6

* Syscontrol-LDAP for IT-DB: stores the configuration for IT-DB services

Page 12: Evolution of database services

New Oracle releases:

- Production was running on 11.2.0.3 (Oracle 11g)
- 11.2.0.4 is the terminal patch set: additional support fees apply from January 2016, and extended support ends in January 2018
- 12.1.0.1 (Oracle 12c) is the first release; the next patch set, 12.1.0.2, is coming in Q3 2014
- Educated guess: users of 12.1.0.1 will have to upgrade to 12.1.0.2 or higher by 2016
- No current Oracle version fits well the entire Run 2

Page 13: Evolution of database services

Oracle upgrade: move IT-DB services to Oracle 12c gradually

- The majority of DB services upgraded to 11.2.0.4
- A few candidate services upgraded to 12.1.0.1: ATLARC, LHCBR, PDBR, LEMON, CSDB, CSR, COMPR, TIMRAC (compatibility kept to 11.2.0.3)
- 12c Oracle Clusterware deployed everywhere; it does not conflict with the 11g version of the RDBMS
- Newer 12c releases are being/will be tested

Upgrade timeline:

- September 2013: create new development DB on 12.1.0.1 (devdb12)
- October 2013: upgrade the devdb11 DB to 11.2.0.4
- November 2013: move integration DBs to new HW
- December 2013 and January 2014: upgrade test and integration DBs
- Q1 2014: test restores and upgrades of production DBs on new HW
- Q1 and Q2 2014: move to new HW and/or upgrade of production DBs

Page 14: Evolution of database services

New database services:

- QPSR: Quench Protection System
  - Will store ~150K rows/second (64 GB per redo log); 1M rows/second achieved during catch-up tests
  - Data must be kept for a few days (~50 TB); a rough sizing check follows below
  - Doubtful whether the previous HW could have handled that
- SCADAR: consolidated WinCC/PVSS archive repository
  - Will store ~50-60K rows/second (may increase in the future)
  - Data retention varies by application, from a few days to 5 years
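A back-of-the-envelope check of the QPSR figures; the ingest rate and total volume come from the slide, while the retention window and the derived per-row size are illustrative assumptions.

```python
# Back-of-the-envelope check of the QPSR numbers. The ~150K rows/s rate
# and ~50 TB volume are from the slide; the 4-day window ("a few days")
# and the derived bytes-per-row figure are assumptions for illustration.
rows_per_sec = 150_000
retention_days = 4                       # assumed
total_rows = rows_per_sec * 86_400 * retention_days
bytes_kept = 50e12                       # ~50 TB

print(f"rows retained: {total_rows / 1e9:.1f} billion")
print(f"implied bytes per row: {bytes_kept / total_rows:.0f}")
# ~51.8 billion rows and ~965 bytes/row, a figure that would include
# indexes and storage overhead, not just the raw payload
```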

Page 15: Evolution of database services

Replication:

- Plan to deploy Oracle GoldenGate at CERN and the Tier 1s, replacing Oracle Streams replication (Streams is being phased out)
- Some online-to-offline setups were already replaced by Oracle Active Data Guard
- Replication Technology Evolution Workshop at CERN in June: migration plan agreed with the experiments and Tier 1s
- Centralised GoldenGate configuration: GG software only at CERN, trail files only at CERN, no GG management at the Tier 1s

[Diagram: replication topology with a downstream cluster at CERN feeding remote-site replicas]

Page 16: Evolution of database services

Database On Demand (DBoD) users include:

- OpenStack
- PuppetDB (MySQL)
- LHCb-Dirac
- Atlassian databases
- LCG VOMS
- Geant4
- HammerCloud DBs
- Webcast
- QC LHC Splice
- FTS3
- Drupal
- CernVM
- VCS
- IAXO
- UNOSAT
- …

Page 17: Evolution of database services

DBoD evolution:

- PostgreSQL offered since September 2013
- Deprecated the virtualization solution based on RHEL + OVM
- HW servers and storage evolve as for the Oracle database services
- Migration to the CERN Agile Infrastructure
- Customized RPM packages for MySQL and PostgreSQL servers
- High-availability cluster solution based on Oracle Clusterware: 4-node clusters (3 active nodes + 1 spare)
- SW upgrades: MySQL currently migrating to 5.6; Oracle 11g migrating towards Oracle 12c multi-tenancy
- Tape backups

Page 18: Evolution of database services

Hadoop: using raw MapReduce for data processing requires:

- Abstract decomposition of the problem into Map and Reduce steps (a toy sketch follows this list)
- Expertise in efficient Java programming
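As a toy illustration of that decomposition, here is a per-key record count written as explicit Map and Reduce steps, with plain Python standing in for the Hadoop Java API.

```python
# Toy illustration of the Map/Reduce decomposition, with plain Python
# standing in for the Hadoop Java API. A real job would additionally
# need input splits, the shuffle/sort machinery, combiners and careful
# serialization to be efficient -- the expertise the slide refers to.
from itertools import groupby
from operator import itemgetter

def map_phase(records):
    """Emit a (key, 1) pair for every input record."""
    for record in records:
        yield record.split(",")[0], 1

def reduce_phase(pairs):
    """Group pairs by key (the framework's shuffle step) and sum them."""
    for key, group in groupby(sorted(pairs, key=itemgetter(0)),
                              key=itemgetter(0)):
        yield key, sum(count for _, count in group)

rows = ["muon,1.2", "electron,0.4", "muon,3.3"]
print(dict(reduce_phase(map_phase(rows))))  # {'electron': 1, 'muon': 2}
```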

Impala, an SQL engine on top of HDFS, is an alternative to hand-written MapReduce:

- Data cache available
- Easy access: a JDBC driver is provided

Performance benchmarks on synthetic data (early results), using a simple CSV table mapping as the test case:

- Full scan of 36 million rows (small sample): 5.4M rows/sec, and 10× faster when cached
- Full scan of 3.6 billion rows (100× more): 30M rows/sec, with IO at ~3.7 GB/s against ~4 GB/s of storage throughput; here the cache does not help, as the data set is too large for it
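The slide mentions access via the provided JDBC driver; for reference, the same kind of full-scan query can also be issued from Python through the impyla client, as sketched below. The host and table name are hypothetical.

```python
# Running an Impala query from Python with the impyla DB-API client
# (the slide mentions the JDBC driver; this is an equivalent route).
# Host and table name are hypothetical; 21050 is Impala's usual
# HiveServer2-protocol port.
from impala.dbapi import connect

conn = connect(host="impala.example.cern.ch", port=21050)
cur = conn.cursor()
# Impala scans the CSV-mapped table directly on HDFS; repeated scans of
# a small enough table can be served from the data cache.
cur.execute("SELECT COUNT(*) FROM csv_benchmark_table")
print(cur.fetchone()[0])
conn.close()
```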

Page 19: Evolution of database services

Future plans:

- New HW installations: a second cluster in the BARN, plus Wigner (for disaster recovery)
- SW upgrades: first Oracle 12c patch set, 12.1.0.2 (Q3 2014)
- More consolidation: run different DB services on the same machine
- Study the use of Oracle GoldenGate for near-zero-downtime upgrades
- Quattor decommissioning
- DBoD: high-density consolidation, cloning and replication, virtualization as OpenStack evolves
- Hadoop + Impala: columnar storage (Parquet), importing data from Oracle into Hadoop, tests with production data (WLCG dashboards, ACC logging, …), analysis of different engines (Shark)

Page 20: Evolution of database services

Summary:

- HW, storage, SW and configuration evolution for the DB services during the last year
- A complex project, with many people involved at various levels; the experience gained will be very useful for the new installations
- Careful planning is critical; validation is key to successful change
- The new systems give more capacity and stability for Run 2
- New services provided, and more coming
- We keep looking at new technologies

Page 21: Evolution of database services


Q&A


Page 23: Evolution of database services

7-mode vs C-mode

[Diagram: in 7-mode, a private network and client access; in C-mode, additionally a cluster interconnect and a cluster management network]

Page 24: Evolution of database services

Flash cache and flash pool:

- Flash cache: helps increase random IOPS on disks; has a warm-up effect, and controller operations (takeover/giveback) invalidate the cache
- Flash pool: based on SSD drives; can also cache random overwrites; a heat map decides what stays in the SSD cache and for how long (a toy model of this policy follows the diagram)

[Diagram: in flash cache, reads are inserted into SSD and aged by the eviction scanner; in flash pool, both reads and overwrites are inserted into SSD while writes go to disk, and the eviction scanner runs every 60 seconds once SSD consumption exceeds 75%, demoting blocks from hot through warm, neutral and cold to eviction]
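A minimal sketch of the heat-map policy described above, under the simplifying assumption that each scan cools every untouched block by one step and evicts the fully cold ones; this illustrates the policy only and is not NetApp's implementation.

```python
# Simplified model of the flash pool heat map: the eviction scanner runs
# periodically (every 60 s once SSD consumption exceeds 75%), re-heats
# blocks that were accessed, cools the rest by one step, and evicts the
# ones that were already cold. Illustration only, not NetApp's code.
DEMOTION = {"hot": "warm", "warm": "neutral", "neutral": "cold"}

def eviction_scan(cache, touched):
    """cache: block -> temperature; touched: blocks accessed since last scan."""
    for block in list(cache):
        if block in touched:
            cache[block] = "hot"            # access re-heats the block
        elif cache[block] == "cold":
            del cache[block]                # cold blocks are evicted
        else:
            cache[block] = DEMOTION[cache[block]]

cache = {"b1": "hot", "b2": "neutral", "b3": "cold"}
eviction_scan(cache, touched={"b2"})
print(cache)  # {'b1': 'warm', 'b2': 'hot'} -- b3 was evicted
```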

Page 25: Evolution of database services

Hadoop: "Sequential data access with Oracle and Hadoop: a performance comparison", Zbigniew Baranowski, CHEP 2013

- Test case: counting exotic muons in the collection
- Oracle performs well for small clusters, but its ability to scale is limited by the shared storage
- Hadoop scales very well; however, writing efficient MapReduce code is not trivial

Page 26: Evolution of database services

Hadoop + Impala test cluster: 4 nodes, each with an Intel Xeon L5520 @ 2.27 GHz (quad-core), 24 GB RAM, and >40 TB of storage (sum of the single HDDs' capacities)

- Pros: easy to set up (works "out of the box"), acceptable performance, promising scalability
- Cons: no indexes; SQL is not all there is to an RDBMS!