scalable distributed database system: sd-sql server

31
1 Scalable Distributed Database System: SD-SQL Server Soror SAHRI Ceria, Paris-Dauphine University [email protected] Journées Académiques Microsoft Paris, April 2006

Upload: korene

Post on 14-Jan-2016

32 views

Category:

Documents


0 download

DESCRIPTION

Scalable Distributed Database System: SD-SQL Server. Soror SAHRI Ceria, Paris-Dauphine University [email protected] Journées Académiques Microsoft Paris, April 2006. Overview. Objective SD-SQL Server Architecture Application Interface Implementation Performance Analysis - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Scalable Distributed Database  System: SD-SQL Server

1

Scalable Distributed Database System: SD-SQL Server

Soror SAHRICeria, Paris-Dauphine University

[email protected]

Journées Académiques MicrosoftParis, April 2006

Page 2: Scalable Distributed Database  System: SD-SQL Server

2

Overview

Objective SD-SQL Server Architecture Application Interface Implementation Performance Analysis Conclusion

Page 3: Scalable Distributed Database  System: SD-SQL Server

3

Objective

Most of DBSs have distributed/parallel versions SQL Server, Oracle, DB2…

DBSs do not provide dynamic partitioning of relational tables All require manual (distributed) repartitioning when tables scale-up

Page 4: Scalable Distributed Database  System: SD-SQL Server

4

Objective

http://ceria.dauphine.fr/CERIA-publications.html Research Report, December 2005

[Oracle Database 10g]

Page 5: Scalable Distributed Database  System: SD-SQL Server

5

Objective

Scalable Distributed Partitioning of Relational Tables

Scalable Distributed Database System (SD-DBS)

Litwin, Rich & Schwartz “Architecture for a scalable Distributed DBSs”

2nd Intl. Workshop on Cooperative Internet Computing, 2002

Page 6: Scalable Distributed Database  System: SD-SQL Server

6

SD-SQL Server

SD-SQL Server is the first prototype SD-DBSGeneralizes SQL Server

Shared Nothing Architecture Embeds SQL Server 2000 (at present) Up to 250 nodes (at present)

Page 7: Scalable Distributed Database  System: SD-SQL Server

7

SD-SQL Server Architecture

LinkedSQL

Servers

D1 D2 Di Di+1

S S PC D1_T

_D1_T _D1_T

User/Application

T

sd_create_table

User/Application

sd_insert

_D1_TSplit

NDBs

SD-SQLserver

SD-SQLserver

SD-SQLclient

SD-SQL Server

Managers

SD-SQLpeer

Page 8: Scalable Distributed Database  System: SD-SQL Server

8

Nodes, NDBs and SDBs

DB1 SDB

Node1 Node2 Node3 Nodei

DB1

……

DB1 DB1

DB2 SDB

DB2 DB2

MDB

Page 9: Scalable Distributed Database  System: SD-SQL Server

9

NDB Types

An SD-SQL Server NDB can be: Client NDB

Carries only images Application interfaces

Server NDB Carries only the segments

Peer NDB Both functions

Primary NDB First created for an SDB

Page 10: Scalable Distributed Database  System: SD-SQL Server

10

Scalable Tables

A scalable (distributed) table is a collection of segments Segments are SQL tables

A scalable table has, initially, only one primary segment At some server or peer NDB

The number of segments in a scalable table is variable Overflowing segments split

Page 11: Scalable Distributed Database  System: SD-SQL Server

11

Splitting

A split occurs when an insert overflows the segment capacity

Splits produce other segments for a scalable table. Each is located at a different NDB

Within the SDB

If there is not enough NDBs, splits dynamically append new ones

Page 12: Scalable Distributed Database  System: SD-SQL Server

12

SDB DB1SDB DB1SDB DB1

Splitting

Scalable Table T

sd_insert

N1 N2 N4N3

NDBDB1

NDBDB1

NDBDB1

sd_insert

NDBDB1

Ni

sd_create_node

sd_insert

N3

NDBDB1

sd_create_node_database

NDBDB1

…….

sd_create_node_database

Page 13: Scalable Distributed Database  System: SD-SQL Server

13

Splitting

S

b+1

S S1

pb+1-p

p=INT(b/2)

C( S)= { c: c < h = c (b+1-p)}

C( S1)={c: c > = c (b+1-p)}

Check Constraint?

b

SELECT TOP Pi * INTO Ni.Si FROM S ORDER BY C ASCSELECT TOP Pi * WITH TIES INTO Ni.S1 FROM S ORDER BY C ASC

Page 14: Scalable Distributed Database  System: SD-SQL Server

14

Images

Images hide the scalable table segments An image is a distributed updateable partitioned

view of a scalable table Union-all view with check constraints

An image presents the scalable table partitioning It do not address any new segments resulted from a

split

Page 15: Scalable Distributed Database  System: SD-SQL Server

15

Images

Primary image Resides at the creation node Has the name of the scalable table

Secondary images Reside at other client or peer NDBs of the SDB Have a specific name, other than that of the

table To avoid name conflict

Scalable Views Carry images at several levels

Page 16: Scalable Distributed Database  System: SD-SQL Server

16

Images

DB1 SDB

N1.DB1 N2.DB1 N3.DB1

T Scalable Table

PrimaryImage

CREATE VIEW T AS SELECT * FROM N1.DB1._Ni_TCREATE VIEW T AS SELECT * FROM N1.DB1._Ni_T UNION ALL SELECT * FROM N2.DB1._Ni_T UNION ALL SELECT * FROM N3.DB1._Ni_T

Page 17: Scalable Distributed Database  System: SD-SQL Server

17

Application Interface

Applications manipulate SD-SQL Server through its command interface.

SD-SQL Server commands start with ‘sd_’ To distinguish from SQL Server commands for static

tables.

INSERT sd_insert

CREATE TABLE sd_create_table

Page 18: Scalable Distributed Database  System: SD-SQL Server

18

Application Interface:The SkyServer Benchmark

Use of the SkyServer DB as benchmark http://research.microsoft.com/~gray/SDSS Use of the PhotoObj table as a scalable table. PhotoObj has 158,426 tuples (about 260 MB)

Page 19: Scalable Distributed Database  System: SD-SQL Server

19

Application Interface: Node Management

Node Creation sd_create_node ‘Dell1’ /* Server by default */ sd_create_node ‘Ceria’, ‘client’

Node Alteration sd_alter_node ‘Ceria’, ‘ADD server’ /* Becomes peer*/

Node Removal sd_drop_node ‘Ceria’

Page 20: Scalable Distributed Database  System: SD-SQL Server

20

Application Interface: NDBs &SDBs Management

SDB Creation sd_create_scalable_database

‘SkyServer’, ‘Dell1’, ‘Server’, 2/* Creates the primary SkyServer NDB as well at Dell1*/

NDB Creation sd_create_node_database

‘SkyServer’, ‘Ceria’, ‘Client’ SDB Removal

sd_drop_scalable_database ‘SkyServer’

Page 21: Scalable Distributed Database  System: SD-SQL Server

21

Application Interface: Scalable Tables Management

Scalable Table Creation sd_create_table ‘PhotoObj (objid BIGINT PRIMARY

KEY…)’, 10000 Scalable Table Alteration

sd_alter_table ‘PhotoObj ADD t INT’, 1000 sd_create_index ‘run_index ON Photoobj (run)’ sd_drop_index ‘PhotoObj.run_index’

Scalable Table Removal sd_drop_table ‘PhotoObj’

Page 22: Scalable Distributed Database  System: SD-SQL Server

22

Application Interface: Scalable Tables Management

Insert sd_insert ‘INTO PhotoObj SELECT * FROM

Ceria5.Skyserver-S.PhotoObj’ Search

sd_select ‘* FROM PhotoObj’ sd_select ‘TOP 5000 * INTO PhotoObj1 FROM

PhotoObj’, 500

Page 23: Scalable Distributed Database  System: SD-SQL Server

23

Application Interface

See “Architecture and Interface of Scalable Distributed Database System SD-SQL Server”, IASTED-DBA 2006

Page 24: Scalable Distributed Database  System: SD-SQL Server

24

Implementation

Commands are distributed stored procedures Processed as distributed transactions at Repeatable

Read isolation level See

“ Prototyping a Scalable Distributed Database System SD-SQL Server”, WDAS 2006

Page 25: Scalable Distributed Database  System: SD-SQL Server

25

Performance Analysis (1)

0

500

1000

1500

2000

39500 79000 158000

Segment Capacity (tuples)

Exe

cuti

on

Tim

e (m

s)

SD-SQL Server Peer With IA SD-SQL Server Peer

SQL Server Peer SD-SQL Server Client With IA

SD-SQL Server Client SQL Server Client

Query (Q1) execution time

(Q1) sd_select ‘COUNT (*) FROM PhotoObj’

Page 26: Scalable Distributed Database  System: SD-SQL Server

26

Performance Analysis (2)

9862.4

11578.6610898

13036.5 14071.66

11712

0

2000

4000

6000

8000

10000

12000

14000

16000

2 3 4

Number of Segments

Execu

tio

n T

ime (

ms)

IA IC

Query (Q2) with image checking only (IC) and with image adjustment (IA)

(Q2) sd_select ‘TOP 10000 x.objid FROM photoobj x, photoobj y WHERE x.obj=y.obj AND x.objid>y.objid’

Page 27: Scalable Distributed Database  System: SD-SQL Server

27

Performance Analysis (3)

sd_select ‘COUNT (*) FROM PhotoObj’

326250

220156

93

343256

226164

106

436

356

93

203283

16 76123

203 220

0100200300400500

1 2 3 4 5

Number of Segments

Exe

cutio

n T

ime

(ms)

SQL Server-Distr SD-SQL Server

SQL Server-Centr. SD-SQL Server LSV

Comparison between SD-SQL Server and SQL Server

Page 28: Scalable Distributed Database  System: SD-SQL Server

28

Conclusion

Scalable distributed partitioning of relational tables is now reality with SD-SQL Server No manual repartitioning

Unlike in any other DBS we know about

Performance analysis proves Efficiency of our design Immediate utility of SD-SQL Server

Witold Litwin
Acknowledgments svp comem chez Riad + Jim Gray pour SkyServerCurrent et future work ? N'oubliez pas EGov svp, vous etes payee par ce projet.
Page 29: Scalable Distributed Database  System: SD-SQL Server

29

Future Works

Management of replication, data failures.. Non-availability of a server….

Management of the data merging

Application on other DBMSs Oracle, DB2….

Witold Litwin
Acknowledgments svp comem chez Riad + Jim Gray pour SkyServerCurrent et future work ? N'oubliez pas EGov svp, vous etes payee par ce projet.
Page 30: Scalable Distributed Database  System: SD-SQL Server

30

Work partly supported by

CEE Project EGov MS Research CEE Project ICONS IBM Almaden Res. Cntr., CA HP Laboratories, CA

Page 31: Scalable Distributed Database  System: SD-SQL Server

31

Thank You

Questions?

Soror SAHRICeria, Paris-Dauphine University

[email protected]

Journées Académiques MicrosoftParis, April 2006