Scaling with SQL Server and SQL Azure Federations

Building Scalable SQL Azure Applications Using NoSQL Paradigms
Michael Rys, Principal Program Manager, SQL Server Engine Team (@SQLServerMike)

Upload: michael-rys

Post on 20-Jan-2015


DESCRIPTION

Slides for my presentation at the Seattle Hadoop/NoSQL Meetup (http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/events/40509972/). These slides are based on this earlier presentation: http://www.slideshare.net/MichaelRys/scaling-with-sql-server-and-sql-azure-federations.

TRANSCRIPT

Page 1: Scaling with SQL Server and SQL Azure Federations

Building Scalable SQL Azure Applications Using NoSQL Paradigms

Michael Rys, Principal Program Manager, SQL Server Engine Team (@SQLServerMike)

Page 2: Scaling with SQL Server and SQL Azure Federations

Agenda

• Scale-out application pattern for SQL Server and SQL Azure:
  – MySpace.com
  – MSN Casual Games
• Applications do all the work
• Databases provide platform support:
  – Reliable messaging infrastructure
  – Sharding in the data platform
• Demo
• SQL Azure Federations
• Roadmap
• Discussion

Page 3: Scaling with SQL Server and SQL Azure Federations

MySpace: the Business Problem

• 223M users
• 4.4 million users concurrently
• 900 terabytes of data
• Horizontally partitioned by user ID
• 450 SQL Servers
• Required (eventual) data consistency across databases
  – E.g. show your updated state in your friends’ profile pages

Page 4: Scaling with SQL Server and SQL Azure Federations

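MySpace’s Data Consistency Problem

[Diagram: a web tier above a data tier of six databases partitioned by user ID range (1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, 5001-6000). The user with userId=1024 changes their status and their own DB gets updated. Open question: how to reflect the change in my friends’ DBs in a reliable and scalable way?]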

Page 5: Scaling with SQL Server and SQL Azure Federations

MySpace’s Solution

• Propagate data changes from one DB to other DBs using a reliable, async message service (Service Broker)
  – Managing routes from each DB to every other DB would be too complex
  – Global transactions would hinder scale and availability
• Also used to:
  – Clean up state (e.g. on account close)
  – Deploy business logic (stored procedures)

Page 6: Scaling with SQL Server and SQL Azure Federations

MySpace’s Service Dispatcher

• Coordination point between all SQL Servers
  – Centralizes route management
  – Avoids routes explosion
• Load-balanced across 30 SQL Servers
  – Messages are sent randomly to these
• Enables multicast/broadcast functionality
  – Supports destination lists and wildcards, e.g. [DB1, DB3, DB4], DB%
• 18,000 ~2k msgs/sec per dispatcher SQL Server
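The destination lists and wildcards mentioned above could be resolved along the following lines. This is only a sketch of the idea, not MySpace's dispatcher: the database names, the dispatcher names and the two helper functions are hypothetical, and the `%` wildcard is assumed to behave like SQL `LIKE`.

```python
import random
from fnmatch import fnmatchcase

# Hypothetical database and dispatcher names, for illustration only.
ALL_DATABASES = ["DB1", "DB2", "DB3", "DB4", "DB10"]
DISPATCHERS = [f"dispatcher{i:02d}" for i in range(1, 31)]  # 30 dispatchers


def expand_destinations(spec, databases):
    """Expand a list of names and/or patterns into concrete database names.

    Assumption: "%" matches any run of characters, as in SQL LIKE.
    """
    targets = []
    for item in spec:
        pattern = item.replace("%", "*")  # translate LIKE-style % to a glob
        for db in databases:
            if fnmatchcase(db, pattern) and db not in targets:
                targets.append(db)
    return targets


def pick_dispatcher(rng=random):
    """Pick one dispatcher at random, as the slide describes for sending."""
    return rng.choice(DISPATCHERS)
```

With this sketch, `expand_destinations(["DB1", "DB3", "DB4"], ALL_DATABASES)` yields a multicast list, while `expand_destinations(["DB%"], ALL_DATABASES)` yields a broadcast to every database. Each sender only needs a route to the dispatchers, which is what avoids the route explosion.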

Page 7: Scaling with SQL Server and SQL Azure Federations

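MySpace Architecture

[Diagram: web tier, data tier with six databases partitioned by user ID range (1-1000 through 5001-6000), and the Service Dispatcher. The user with userId=1024 changes their status; their own DB gets updated in TX1. Service Broker carries the change through the Service Dispatcher to other databases, where it is applied in separate transactions TX2 through TX5.]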

Page 8: Scaling with SQL Server and SQL Azure Federations

Many other customers using similar patterns

Online electronic stores (cannot give names)

Travel reservation systems (e.g. Choice International)

Page 9: Scaling with SQL Server and SQL Azure Federations
Page 10: Scaling with SQL Server and SQL Azure Federations

MSN Casual Games Architecture: Goals

• Provide an elastically scalable, highly available, agile online platform for an integrated, social gaming experience
  – Developed v1 to v3 with 7 devs in 3 months!
• Support multiple gaming platforms:
  – Windows Live Messenger Games
  – MSN Games
  – Bing Games
• Integrate into social environments
  – Favorite games, friends’ high scores, compete against friends, etc.
  – Social networks: MSN, Facebook, etc.

Page 11: Scaling with SQL Server and SQL Azure Federations

MSN Casual Games Architecture: Overview

[Diagram: the MSN Games, WLM Games and Bing Games web portals and the GameBar host call into the Azure data centers. Components shown there: Auth, STS, Front Door Router, Services, Data Backend Services, Management Services and Ops, with connections to social networks.]

Page 12: Scaling with SQL Server and SQL Azure Federations

Why SQL Azure?

• Faster than Table Storage
• Very low learning curve
  – Prototyped and written in less than 4 weeks by 1 dev
• Testability: easy to prepopulate with millions of records
• Compatible with SQL Server

Page 13: Scaling with SQL Server and SQL Azure Federations

How does it scale?

• ~2 million users at launch
• ~86 million service requests/day
• 135 Windows Azure Data Services hosting VMs
• ca. 18K connections in connection pools; this could grow with traffic
• ca. 1,200 SQL Azure requests/second spread across all partitions during peak load
• ~90% reads vs. 10% writes (this varies per storage type)
• ~200 bytes of storage per user
• ~20% of database storage is currently used, but expect this to grow

Page 14: Scaling with SQL Server and SQL Azure Federations

Partitioning Strategies

• Built to scale: functional and data partitioning
• 398 databases (10 GB each)
• 100 social user information + 298 leaderboard databases
• Social user information partitioned by UserId
• Leaderboard partitioned by GameId
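To make the two levels of partitioning concrete, here is a minimal sketch. The partition counts come from the slide; the database naming scheme and the modulo placement rule are assumptions, since the slides do not say how a UserId or GameId is mapped to one of the databases.

```python
# Partition counts are taken from the slide; the naming scheme and the
# modulo placement rule are assumptions made for this sketch.
PARTITIONS = {
    "user": 100,         # social user information, keyed by UserId
    "leaderboard": 298,  # leaderboards, keyed by GameId
}


def database_for(store, key):
    """Return the (hypothetical) database name holding `key` in `store`."""
    count = PARTITIONS[store]  # functional partitioning: pick the store
    index = key % count        # data partitioning: pick the shard in that store
    return f"DB{store.capitalize()}_{index:03d}"
```

Functional partitioning decides which group of databases a request goes to; data partitioning then picks one database inside that group based on the key.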

Page 15: Scaling with SQL Server and SQL Azure Federations

[Diagram: Data Backend Services. The Front Door Router and STS sit in front of 250 service instances (Social Services, Gamer Services, Game Ingestion, Game Catalog). DBUser is partitioned over 100 SQL Azure DBs and DBLeaderboard over 298 SQL Azure DBs. Operations shown: find friends’ profiles, get my profile, publish feed, read feed; last played, favorites, game preferences, social leaderboards; get friends’ high scores; write user-specific game infos; disable/enable games from accessing services; game binaries and game metadata.]

Page 16: Scaling with SQL Server and SQL Azure Federations

Common Data Services Features

• Fanout: Parallel calls to multiple database partitions

• Quorum: Able to tolerate a percentage of request failures during Fanout

• Retry: Retry on database request errors
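The three features can be combined as in the sketch below. It is an illustration under assumptions rather than the actual data services code: the function names, the default quorum of 80% and the retry count of 3 are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def with_retry(call, attempts=3):
    """Run `call()`; retry on any exception, up to `attempts` tries."""
    last_error = None
    for _ in range(attempts):
        try:
            return call()
        except Exception as error:  # a real client would catch only transient errors
            last_error = error
    raise last_error


def fanout(partitions, query, quorum=0.8, attempts=3):
    """Run `query(partition)` on all partitions in parallel.

    Returns the successful results. Raises RuntimeError when fewer than
    `quorum` (a fraction between 0 and 1) of the partitions answered.
    """
    def run(partition):
        try:
            return True, with_retry(lambda: query(partition), attempts)
        except Exception:
            return False, None

    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        outcomes = list(pool.map(run, partitions))

    results = [value for ok, value in outcomes if ok]
    if len(results) < quorum * len(partitions):
        raise RuntimeError(f"quorum not met: {len(results)}/{len(partitions)}")
    return results
```

A caller that can live with partial data, such as a leaderboard page, would use a quorum below 1.0 and render whatever came back; a caller that needs every partition would set `quorum=1.0`.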

Page 17: Scaling with SQL Server and SQL Azure Federations

Operational Games Dashboard

[Screenshot: statistics per game]

Page 18: Scaling with SQL Server and SQL Azure Federations

Lessons Learned from both scenarios

• Require high availability
• Be able to scale out:
  – Functional and data partitioning architecture
• Provide scale-out processing:
  – Function shipping
  – Fanout and Map/Reduce processing
• Be able to deal with failures:
  – Quorum
  – Retries
  – Eventual consistency (similar to read-consistent snapshot isolation)
• Be able to quickly grow and change:
  – Elastic scale
  – Flexible, open schema
  – Multi-version schema support

Move better support for these patterns into the Data Platform!

Page 19: Scaling with SQL Server and SQL Azure Federations

What is NoSQL?

• It is different things to different people
• But for all, it is about app availability, scalability, agility and low cost!
• Cost: cheap to build, cheap to operate
• Processing paradigms
  – High availability (scalable replication, fast failover, DR/GeoDR, tunable latency)
  – Scale-out (sharding, Map-Reduce, elasticity)
  – Performance (workload-optimized stores, caching, co-located compute with partitioned state)
  – Tunable/eventual consistency
• Data model paradigms
  – Data first: flexible open schema
  – Low-impedance mismatch between programming and data model:
    • Wide tables: for “relational” data (HyperTable, Windows Azure Tables, HBase, SQL Server sparse columns)
    • Graph stores: for “relationship” graphs, “semantics”
    • Key/document stores: for “object” and “hierarchical” data (JSON: MongoDB, CouchBase, Riak etc.; XML: MarkLogic etc.)

NoSQL is all about operational and developer agility at low CapEx and OpEx!

Page 20: Scaling with SQL Server and SQL Azure Federations

Operational Agility

• You want:
  – Availability of service (scalability)
  – Global consistency
  – Network partition tolerance
• You can only get 2 of 3 (CAP Theorem)
• In Brave New World:
  – Online businesses need availability
  – It is distributed because it is big, thus network partitioning is unavoidable
  – Hence global consistency must be relaxed

Page 21: Scaling with SQL Server and SQL Azure Federations

Operational Agility

• Performance and elastic scale on demand
• Automate management lifecycle (or fail)
• Simple deployment lifecycle
• No DB or OS admin telling me what to do

Page 22: Scaling with SQL Server and SQL Azure Federations

Developer Agility

• Code first and revise quickly
• Application-model first (before database)
• Flexible open data models
• You don’t know exactly what you are looking for
• Lower pain of adoption and maintenance
• No DB or OS admin telling me what to do

Page 23: Scaling with SQL Server and SQL Azure Federations

Generic “Web 2.0” Scale-Out Architecture

[Diagram: three primary shards, each with two readable replicas. OLTP workloads mostly touch one or a low number of shards. Dynamic OLAP workloads run scale-out queries, often using eventually consistent scale-out frameworks like Hadoop.]

Page 24: Scaling with SQL Server and SQL Azure Federations

Introducing SQL Azure Federations

• Provides data partitioning/sharding at the data platform
• Enables applications to build elastic scale-out applications
• Provides non-blocking SPLIT/DROP for shards (MERGE to come later)
• Auto-connect to right shard based on sharding key value
• Provides SPLIT-resilient query mode

Page 25: Scaling with SQL Server and SQL Azure Federations

SQL Azure Federation Concepts

• Federation: represents the data being sharded
• Federation Root: database that logically houses federations, contains federation metadata
• Federation Key: value that determines the routing of a piece of data (defines a Federation Distribution)
• Federation Member (aka Shard): a physical container for a set of federated tables for a specific key range, and reference tables
• Federated Table: table that contains only atomic units for the member’s key range
• Reference Table: non-sharded table
• Atomic Unit (AU): all rows with the same federation key value: always together!

[Diagram: a sharded application connects through the connection gateway to an Azure DB with the federation root (federation directories, federation users, federation distributions, …). Federation “Orders_Fed” (federation key: CustomerID) has three members, PK [min, 100), PK [100, 488) and PK [488, max), which hold atomic units with PK=5, 25, 35, 105, 235, 365, 555, 2545 and 3565.]
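The routing from a federation key to a member can be pictured as a lookup in a sorted list of range boundaries. The sketch below uses the three ranges from the diagram; the member names and the `member_for` helper are hypothetical, and the real lookup happens inside the data platform, not in application code.

```python
from bisect import bisect_right

# Lower bounds of the members shown on the slide: [min, 100), [100, 488),
# [488, max). Member names are hypothetical.
LOW_BOUNDS = [float("-inf"), 100, 488]
MEMBERS = ["member_1", "member_2", "member_3"]


def member_for(federation_key):
    """Return the member whose half-open range contains the key."""
    return MEMBERS[bisect_right(LOW_BOUNDS, federation_key) - 1]
```

Because the ranges are half-open, a key equal to a boundary (100 or 488) belongs to the member that starts at that boundary.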

Page 26: Scaling with SQL Server and SQL Azure Federations

Demo: Map-Reduce scale-out over SQL Azure Federations

• Sharded GamesInfo table using SQL Azure Federations
• Uses a C# library that implements a Map/Reduce processor on top of SQL Azure Federations
• Mapper and Reducer are specified using SQL
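The demo code itself is not part of this transcript. As a rough illustration of a Map/Reduce processor whose mapper and reducer are both SQL, the following sketch uses in-memory SQLite databases as stand-ins for federation members. The GamesInfo columns, the `mapped` scratch table and all function names are assumptions, and the sketch is in Python rather than the C# used in the demo.

```python
import sqlite3

# Mapper runs on every shard; reducer runs once over the collected rows.
MAPPER_SQL = "SELECT game_id, COUNT(*) AS plays FROM GamesInfo GROUP BY game_id"
REDUCER_SQL = (
    "SELECT game_id, SUM(plays) AS plays FROM mapped "
    "GROUP BY game_id ORDER BY game_id"
)


def make_shard(rows):
    """Create an in-memory stand-in for one federation member."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE GamesInfo (user_id INTEGER, game_id TEXT)")
    conn.executemany("INSERT INTO GamesInfo VALUES (?, ?)", rows)
    return conn


def map_reduce(shards, mapper_sql, reducer_sql):
    """Run the mapper on each shard, then the reducer over all mapper output."""
    scratch = sqlite3.connect(":memory:")
    scratch.execute("CREATE TABLE mapped (game_id TEXT, plays INTEGER)")
    for shard in shards:
        scratch.executemany(
            "INSERT INTO mapped VALUES (?, ?)", shard.execute(mapper_sql).fetchall()
        )
    return scratch.execute(reducer_sql).fetchall()
```

The mapper does the heavy aggregation next to the data on each member, so only one small row per game and member travels to the reducer.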

Page 27: Scaling with SQL Server and SQL Azure Federations

SQL Azure: A Not Only SQL Data Platform

SQL Azure is adding data platform support for NoSQL paradigms:

• No CapEx, low OpEx (which should/will be even lower)
• High availability (each DB has two replicas)
• Sharding support with federations:
  – Data platform provides online SPLIT/DROP
  – Filtered connection to provide split-resilient programming model
• Flexible data models:
  – XML support
  – Sparse columns/column sets
• More to come in the future…
  – More scale and tunable HA (to support OLTP/OLAP model)
  – Taking Federations further (orthogonality, merge, fanout)
  – Integration with Hadoop eco-system
  – More data-first (data-driven column sets, JSON)

Page 28: Scaling with SQL Server and SQL Azure Federations

Related Resources

• Windows Gaming Experience Case Study: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000008310
• Related whitepapers:
  – CACM: Scalable SQL: http://cacm.acm.org/magazines/2011/6/108663-scalable-sql
  – NoSQL and the Windows Azure Platform: http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf
• SQL Federation blog: http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx
• Contact me: @SQLServerMike, http://sqlblog.com/blogs/michael_rys/default.aspx

Page 29: Scaling with SQL Server and SQL Azure Federations

Appendix

SQL Azure Federations Details

Page 30: Scaling with SQL Server and SQL Azure Federations

CREATE FEDERATION

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

CREATE FEDERATION sales (customer_id bigint RANGE)

[Diagram: the application connects through the gateway to the existing database.]

Page 31: Scaling with SQL Server and SQL Azure Federations

Federation with a Single Shard

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

CREATE FEDERATION sales (customer_id bigint RANGE)

Database root contains:
• Federation root = DB level object containing federation scheme
• Federation users
• Federation metadata incl. federation map

[Diagram: through the gateway, the existing database now hosts federation sales with a single federation member (Range: Min...Max).]

Page 32: Scaling with SQL Server and SQL Azure Federations

Introducing Two Connection Modes

• Filtered Connection
  – Guarantees that any queries or DML will produce the same results independent of changes to the physical layout of the federation members
  – Scoped to an “Atomic Unit”
• Unfiltered Connection
  – Scoped to a Federation Member
  – Management Connection

Page 33: Scaling with SQL Server and SQL Azure Federations

Create Schema on Member: Management Connection

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=0) WITH FILTERING=OFF, RESET;

CREATE TABLE …

[Diagram: through the gateway, the existing database hosts federation sales with one federation member (Range: Min...Max) containing the tables Customer (federated), Order (federated) and Product (non-federated).]

© 2011 Microsoft Corporation. Microsoft Materials - Confidential. All rights reserved. CITA # MSFT101120_A

Page 34: Scaling with SQL Server and SQL Azure Federations

DDL

CREATE TABLE customer (
    c_id bigint PRIMARY KEY,
    …
) FEDERATED ON (customer_id=c_id);

-- "order" is a reserved word in T-SQL, so the table name is bracketed
CREATE TABLE [order] (
    item_num int,
    customer_id bigint,
    date_sold datetime2,
    …,
    CONSTRAINT PK_Order PRIMARY KEY (item_num, customer_id, date_sold),
    CONSTRAINT FK_Cust FOREIGN KEY (customer_id) REFERENCES customer (c_id)
) FEDERATED ON (customer_id=customer_id);

-- No FEDERATED ON clause: product is a reference (non-federated) table
CREATE TABLE product (
    product_name varchar(100) NOT NULL,
    unit_price money,
    item_num int PRIMARY KEY,
    …
);

Page 35: Scaling with SQL Server and SQL Azure Federations

More Detail

• Supported data types for federation key: bigint, int, GUID, and varbinary(900)
  – Only range partitioning
• Federation key must be part of unique index
• Foreign key constraints only allowed between federated tables and from federated table to reference table
• Not all Azure programmability features supported
  – Sequence, timestamp
• Additional surface area restrictions
  – Indexed views, drop database (members)
• Schemas are allowed to diverge over time
  – Furthermore, in v1, schema updates to existing members must be done in each member (where the change is needed)
• USE FEDERATION “rewires a connection”
  – Connection is reestablished
  – All existing settings and context of the connection are lost (sp_reset_connection)
  – Must be in a batch by itself

Page 36: Scaling with SQL Server and SQL Azure Federations

Connect to Atomic Unit: Filtered

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=3) WITH FILTERING=ON, RESET;

After a USE FEDERATION with a specific key value and FILTERING=ON, SELECT returns only those records from federated tables that match that value. It still returns all records from non-federated tables. INSERTs and UPDATEs operating outside of the value will fail.

SELECT * FROM customer

SELECT * FROM [order]

SELECT * FROM product

[Diagram: through the gateway, the connection lands in the federation member (Range: Min...Max) of federation sales and sees only the rows for key 3 in the customer and order tables, plus the whole product table.]
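The behaviour of a filtered connection can be modelled in a few lines. This is a toy model under assumptions, not how SQL Azure implements the filter: the `FilteredSession` class and its methods are hypothetical, and tables are plain lists of dictionaries. It follows the rules stated in this deck: federated tables are filtered to one key value, reference tables are returned in full, writes outside the key value fail, and DML on reference tables is not available on a filtered connection in v1.

```python
class FilteredSession:
    """Toy model of a connection opened with FILTERING=ON for one key value."""

    def __init__(self, tables, federated_on, key_value):
        self.tables = tables              # table name -> list of row dicts
        self.federated_on = federated_on  # table name -> federation key column
        self.key_value = key_value

    def select(self, table):
        rows = self.tables[table]
        column = self.federated_on.get(table)
        if column is None:                # reference table: no filter applied
            return list(rows)
        return [row for row in rows if row[column] == self.key_value]

    def insert(self, table, row):
        column = self.federated_on.get(table)
        if column is None:
            raise PermissionError("no DML on reference tables when filtered")
        if row[column] != self.key_value:
            raise ValueError("row is outside the connection's atomic unit")
        self.tables[table].append(row)
```

Since the filter belongs to the session and not to the query text, the same `select("customer")` call returns different rows depending on which key value the session was opened with.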

Page 37: Scaling with SQL Server and SQL Azure Federations

More on Connection Filtering

• Most operations behave differently in filtered vs. unfiltered connections
• Connection filtering is a property of the session
  – Filter injected dynamically at runtime
  – Cannot inspect source code to determine how it behaves
    • E.g., running a stored proc written for filtered mode on an unfiltered connection could lead to unintended results
• There are several operations that will not work in a filtered connection in v1
  – DDL, DML on reference tables, …
• Fan-out, bulk operations not efficient in filtered mode
  – For now, filter=off is our best offer

Page 38: Scaling with SQL Server and SQL Azure Federations

Support Matrix

Operation                                     Filtered   Unfiltered   Named (unfiltered)
Dynamic SELECT                                P          P            P
DML* (federated tables)                       P          P            P
DML* (reference tables)                       X          P            P
DDL                                           X          P            P
Views (not indexed)                           P          P            P
UDF - activate                                P          P            P
Stored Proc - activate                        P          P            P
Trigger (all modes) - activate                P          P            P
CREATE/UPDATE Stats                           X          P            P
Bulk Ops (openrowset bulk, bcp, bulk insert)  X          P            P

P = supported, X = not supported

* not including SELECT & modules

^ autostats will work on all connections

System stored procs, intrinsics will be unaffected (run unfiltered)

Page 39: Scaling with SQL Server and SQL Azure Federations

Splitting a Member

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION ROOT WITH RESET

ALTER FEDERATION sales SPLIT AT (customer_id=50)

USE FEDERATION ROOT pops you out of a member and back into the database that hosts the federation.

[Diagram: through the gateway, the existing database hosts federation sales with one federation member (Range: Min...Max). Its customer, order and product tables hold sample rows with keys 3, 40 and 58.]

Page 40: Scaling with SQL Server and SQL Azure Federations

Two New Members

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION ROOT WITH RESET

ALTER FEDERATION sales SPLIT AT (customer_id=50)

[Diagram: after the split, federation sales has two federation members. The member with Range: Min...50 holds the rows with keys 3 and 40; the member with Range: 51...Max holds the rows with key 58. Each member has its own customer, order and product tables.]
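What SPLIT does to the federation map can be sketched as follows. This is a simplified model, not the online, non-blocking operation the platform performs: the `split_member` function and the dictionary layout are hypothetical, and the half-open range convention [low, high) is carried over from the concepts slide as an assumption.

```python
def split_member(members, boundary):
    """Split the member whose range contains `boundary` into two members.

    `members` is a list of dicts with keys "low", "high" (half-open range
    [low, high)) and "units" (atomic unit key -> rows). Returns a new list.
    """
    result = []
    for member in members:
        if member["low"] < boundary < member["high"]:
            left = {"low": member["low"], "high": boundary, "units": {}}
            right = {"low": boundary, "high": member["high"], "units": {}}
            for key, rows in member["units"].items():
                # An atomic unit moves as a whole; it is never divided.
                (left if key < boundary else right)["units"][key] = rows
            result.extend([left, right])
        else:
            result.append(member)
    return result
```

Applied to the single member from the slides with atomic units 3, 40 and 58 and a split at 50, units 3 and 40 end up in the lower member and unit 58 in the upper one.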

Page 41: Scaling with SQL Server and SQL Azure Federations

Two New Members

Connection: Server=az1cl321.db.windows.net; Database=MyDB; User=AppUser; Passwd=****;

USE FEDERATION sales (customer_id=40) WITH FILTERING=ON, RESET;

SELECT * FROM customer

SELECT * FROM [order]

[Diagram: through the gateway, the filtered connection for customer_id=40 lands in the member with Range: Min...50 and sees only the rows with key 40. The member with Range: 51...Max holds the rows with key 58.]