dbi316. building and maintaining applications with relational and non-relational data is hard...

48
Microsoft SQL Server Beyond Relational Landscape: Current and Future Michael Rys ([email protected], @SQLServerMike) Principal Program Manager, SQL Server Microsoft Corp. DBI316

Upload: meagan-patrick

Post on 19-Jan-2016

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Microsoft SQL Server Beyond Relational Landscape: Current and Future

Michael Rys ([email protected], @SQLServerMike)Principal Program Manager, SQL ServerMicrosoft Corp.

DBI316

Page 2: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Data

Building and Maintaining Applications with relational and non-relational data is hard

Complex integrationDuplicated functionalityCompensation for unavailable services

Pain Points

Goals

Reduce the cost of managing all dataSimplify the development of applications over all dataProvide management and programming services for all data

Page 3: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

My favorite Beyond Relational Application

Structured and unstructured Search

Related/”Semantic” Search

Page 4: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

What is the Beyond Relational Vision?

Efficient storage for all dataTables, XML, Spatial, Documents, Digital Media, Scientific Records, Factoids…

Rich Data Processing Capabilities for all applicationsData formats and content natively understood for rich application and user experienceConsistent Application Model and Data Constructs to ease application development, migration and long-term retention

Rich Capabilities and Services over all dataProvide rich services, e.g.,

Query and Reason over data and extracted semanticsSearch across structural impedance of different data formatsIntegrated backup/restore for all data

Page 5: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Story

StructuredData

Query

T-SQL

B-trees

ManageabilityAvailability

Files

Programmability

Page 6: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Story

StructuredData

Query

T-SQL

B-trees

ManageabilityAvailability

Files

Programmability

Unstructured Data

Search

Page 7: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Story

StructuredData

Query and Type Operations

T-SQL/Data Types

B-trees

ManageabilityAvailability

Files

Programmability

Unstructured Data

Search

Filestream

Win 32

Semi-structuredData/XML

XML, FTS, SpatialIndices

XQuerySpatial ops

Spatial, XML, HierarchyID

Page 8: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Story

StructuredData

Query and Type Operations

T-SQL/Data Types

B-trees

Manageability& Availability

Programmability

Unstructured Data

Search

Win 32

Semi-structuredData/XML

Semantic Platform

Efficient Storage for BR Data

Rich Query and Search Services over all Data

Rich Data ProgrammingCapabilities

Files

Filestream

XML, FTS, SpatialIndices

XQuerySpatial ops

Spatial, XML, HierarchyID

Page 9: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational Feature OverviewSQL Server

2005SQL Server 2008

FileTable (Win 32 I/O)Scale-up FileStreamScale-up SearchSearch functionalitySemantic Similarity

Reliable Messaging

Service Broker

Full Text IndexingRich unstructured Data & Services

Remote BLOB Store APIFilestreamIntegrated FTS

Spatial

Fully supported Geometry and Geography data types and Functions

XML Data Type XQueryXML Schema

XML UpgradesLarge UDTsSparse ColumnsWide Table/ColumnSetFiltered IndicesHierarchyID

Semistructured Data & Documents

SQL Server Denali

Reporting Services support

FullGlobe2D ExtensionsPervasive Spatial

over FileStream

R2

Multi-castEnqueue time

Poison-Message handling

with RCSI

Page 10: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Beyond Relational in SQL Server Denali

Address important customer requests for Capabilities and rich services for Rich Unstructured Data (RUDS)

Scale Up for storage and search to 100mio to 500mio documentsEasy use/access to Unstructured data from all applicationsRich insight into unstructured data to make better decisions

We deliver what you asked for to build Spatial-aware Applications

Advanced 2D SpatialMake Spatial pervasive across platform

Page 11: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Rich Unstructured Data & Services Ecosystem

Fulltext Search

Semantic Key Phrase

Extraction

Rich

Serv

ices

Database

Disk1 Disk2 Disk3

Multiple Containers

Sca

le-u

p

Solu

tion

s

Database Applications

Transactional Access

Blobs

DBFileStreDB

FileStreams

Integrated Backup/Replication/AlwaysOn

Integrated AdministrationIntegrated Administration?

Win32 Applications

SMB Share Files/Folder

s

FileStream API

Streaming Win32 AccessStreaming Win32 Access??

Customer Application

Azure lib

Centera lib

SQL FILESTREAM lib

SQL RBS API

Azure Centera SQL DB

Remote BLOB Storage

FileStreamsFileTable

Page 12: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

RBS Example Workflow

Application

RBS Client Library

BLOB Store Provider Library

BLOB Store SQL Server

ClaimID ClaimDate PhotoRef

4390 6/5/2007 <Binary(20)>1

2

3

1 Write BLOB(Photo)

Return Blob ID

Write Blob ID to PhotoRef field

2

3

Machine Boundary

RBS Services• Create• Fetch• GC• Delete

Page 13: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

demo

Managing documents integrated in SQL Server Denali

Page 14: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

FILETABLE Overview…

DB_Directory1(Database1)

DB_Directory2(Database2)

LogFiles (FileTable)

Documents(FileTable)

Media(FileTable)

MSSQLSERVER

\\my_machine\MSSQLSERVER\DBDirectory2\Documents

Filestream Share

Database Directories

FileTable Directories

FileTable Namespace Root Folder Hierarchy

Page 15: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

FILESTREAM Read Performance (Remote)

240 KB 480 KB 1 MB 2 MB 4 MB 8 MB0

100

200

300

400

500

600

700

800

900

Filestream Win32 (Filesystem) Access

Filestream T-SQL

Varbinary

Filesystem Win32 Access Gain (%)

Th

rou

gh

pu

t (M

bp

s)

Measured with SQL Server 2008

Page 16: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

FILESTREAM Write Performance (Remote)

240 KB 480 KB 1 MB 2 MB 4 MB 8 MB

-200

-100

0

100

200

300

400

500

600Insert

Filestream Win32 (Filesystem) Ac-cess

Filestream T-SQL

Varbinary

Filesystem Win32 Access Gain (%)

Th

rou

gh

pu

t (M

bp

s)

Measured with SQL Server 2008

Page 17: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Some tips…

Reading bigger buffers gives better performance Volumes hosting FILESTREAM/FILETABLE data should have 8.3 name generation and LastAccessTime disabledFILESTREAM/FILETABLE containers to reside on dedicated volumes Have one volume per FILESTREAM/FILETABLE container

enables space management at volume level“Magic” SMB buffer size = ~60KB

Another “good” value is 480KB

ROWGUID unique index for aligned partitioning for FILESTREAMAntiVirus programs should be configured not to delete infected files but to quarantine themIf using compressed volumes, use cluster size 4 KB

Page 18: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scale-up: FileStreamMultiple Containers

SQL Server 2008 R2Only one storage container/FILESTREAM filegroupLimits storage capacity scaling and I/O scaling

SQL Server DenaliSupport for multiple storage containers/filegroup.

DDL Changes to Create/Alter Database statementsAbility to set max_size for the containersDBCC Shrinkfile Emptyfile support

Scaling FlexibilityStorage scaling by adding additional storage drivesI/O scaling with multiple spindles

Page 19: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scale-up: Multiple containers

• Use of multiple spindles for achieving better I/O Scalability

Page 20: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scale-up: FileStream Perf/Scale

Improved performance of T-SQL and File I/O accessVarious enhancements to improve read/write throughput

5 fold increase in Read throughputLinear scaling with large number of concurrent threads

Page 21: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Always On support for FILESTREAM

SQL Server 2008 R2 - No support for data mirroringHeavily requested feature

SQL Server Denali – Full support for SQL Always On New Always On solution will mirror FILESTREAM data alsoCTP1 supports Async mirroring and Readable SecondariesSync and Multiple Secondaries support coming later for Denali

Page 22: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Unstructured Storage In SQL Server 2008 & DenaliFile Stores /

External Blob Stores (CAS)

SQL BLOBs Remote Blob API FILESTREAM FILETABLE

Streaming Performance

Depends on external store

Depends on external store

Win32 App CompatDepends on external store

Depends on external store

Link Level Consistency

Data Level Consistency

Integrated Query & Management

Non-local Windows File Servers

n/a

External Blob Stores n/a

Page 23: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Full Text Search Challenges

In SQL Server 2005 and before:Indexes stored outside SQL Server lead to manageability challengesMixed query performance suffers from having to pull over complete full-text result setIndex creation has scaling issues on big boxes

In SQL Server 2008 and before:Query performance has scaling issuesSome useful search capabilities are not available: Property Search, customizable NEAR among them

Innovation in Search: Semantic similarity

Page 24: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Full Text Search Improvements in SQL Server 2008

Full-Text Engine and Indexes fully integratedCatalog, index and stopword lists now inside the database

Better performance in many common scenariosMake mixed queries perform and scaleOptimizer has knowledge about FT index

Exposing the FT Index content and any given WB behavioursNew word breakers (WB)

Better supportability toolsFrom 23 to over 40 word breakers/locales

SELECT * FROM candidates WHERE CONTAINS(resume,’”SQL Server”’) AND ZipCode = ‘98052’

Page 25: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Semantic SimilarityInput: Text such as varchar, Office, PDF, HTML, email…Output: Rowset functions with standard SQL queries

Illustrating example:

Key Title Document

D1 Annual Budget …

D2 Corporate Earnings …

D3 Marketing Reports …

… … …

---------------------------------------------------------

----------------------------------------------------------------------

----------------------------------------------------------------------

-------------

Source Table

ID Keyword Colid … compDocid CompOc CompPid

K1 revenue 1 … 10,23,123 (1,4),(5,8),(1,34) 2,5,6,8,4,3

K2 growth 1 … 10,23,123 (1,5),(5,9),(1,34) 2,5,6,8,5,4

… … … … … …

Keyword Index (Full-Text)

Keyphrases KeyphraseDocumentsID DocID

T1 (revenue) D1 (Annual Budget)

T2 (growth) D2 (Corporate Earnings)

T3 (Windows) D3 (Marketing Reports)

… …

T1 (revenue) D7 (Finance Report)

… …

T3 (Windows) D11 (Azure Strategy)

T4 (Azure) D11 (Azure Strategy)

ID Keyword

T1 revenue

T2 growth

T3 Windows

T4 Azure

… …

DocumentSimilarityDocID MatchedDocID

D1 (Annual Budget) D2 (Corporate Earnings)

D1 (Annual Budget) D7 (Finance Report)

D3 (Marketing Reports) D11 (Azure Strategy)

… …

Full-Text and Semantic Processing

quarter, record, revenue…

2b

3

2 a1

+ Language Models

Page 26: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

demo

Semantic Extraction and RelationshipsFullText Search in SQL Server Denali

Page 27: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Full-Text Property Search

--needed once per database instance to load the office filtersexec sp_fulltext_service 'load_os_resources',1goexec sp_fulltext_service 'restart_all_fdhosts'go

-- use property searchCREATE SEARCH PROPERTY LIST p1;

ALTER SEARCH PROPERTY LIST [p1] ADD N'System.Author' WITH (PROPERTY_SET_GUID = 'f29f85e0-4ff9-1068-ab91-08002b27b3d9', PROPERTY_INT_ID = 4, PROPERTY_DESCRIPTION = N'System.Author');

ALTER FULLTEXT INDEX ON fttable SET SEARCH PROPERTY LIST = p1Go

-- Querying DOCUMENT HAVING “John Doe” AS AUTHOR ONLYselect * from fttable where contains(property(ftcol, 'System.Author'), '"John Doe"')

Page 28: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Full-Text Customizable Near

--OLD NEAR SYNTAXselect * from fttable where contains(*, 'test near Space')

--NEW NEAR USAGES:

--SPECIFY DISTANCEselect * from fttable where contains(*, 'near((test, Space), 5,false)')

--REDUCE DISTANCEselect * from fttable where contains(*, 'near((test, Space), 2,false)')

--ORDER OF WORDS IS SPECIFIED AS IMPORTANTselect * from fttable where contains(*, 'near((test, Space), 5,true)')

Page 29: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Indexing Performance

Populating an index of 20 million rows of 1K data on identical hardware (time in minutes)

2 min 1 min

Page 30: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scale-up: Full-Text Search

Queries over 350M documents database and random DMLs running in background. Beating SQL Server 2005 with a scale factor more than 2x and with avg 60x times better throughput

Page 31: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scale-up: Full-Text Search

Query avgExecTime (ms) under various number of connections (50 ~ 2000 users) for customer playback benchmark

Page 32: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

SQL Server Spatial Data Overview

Storage and retrieval of spatial data using standard SQL syntaxSpatial Data Types (geometry, geography)Spatial Methods (intersects, buffer, etc.)Spatial Indexes

Offers full set of Open Geospatial Consortium components (OGC/SQL MM, ISO 19125)Spatial Builder InterfaceSSMS VisualizationIntegration with Bing MapsIntegration with Reporting Services in SQL Server 2008 R2

Page 33: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Denali Spatial 2D Customer Values and Goals

Add to spatial feature set based on industry trend and customer asksFaster and more predictable Performance

SQL CAT Whitepaper what is new in Denali

Page 34: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

New Spatial Type Features

New relational methods added to geographyTo match geometry

Invalid objects are accepted for geographyMakeValid will convert them to valid objects

Bigger than a hemisphere objects for geographyFULLGLOBE spatial type is available“small” holes can be created

Increased precision from 27 to 48 bits for spatial computationsPersisted computed columns of a spatial typeSpatial Aggreators

Page 35: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

New Spatial Type Features

Circular arcs segmentsCircularStringsCompoundCurvesCurvePolygons

All methods support circular arcs

Circular arc support on a ellipsoidRed – CircularString with 4 pointsYellow – LineString with the same 4 points

Page 36: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Spatial Performance

New default Index tessellationMore stable performance

for windows of different sizefor data with different spatial density

For default values:Up to 2x faster for longer queries > 500 ms

More efficient primary filter Fewer rows returned

10ms slower for very fast queries < 50 msIncreased tessellate time which is constant

More query plans are supporting spatial index: Nearest NeighborNew Index Hint to select cells for query windowIndex compression is enabled for spatial indexes

Ca. 50% smaller indices, 0-15% slower queries

Optimized spatial query plan for STDistance and STIntersects like queriesFaster point data queriesOptimized STBuffer, lower memory footprint

Page 37: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Spatial Index Performance

New grid gives much stable performance for query windows of different sizeBetter grid coverage gives fewer high peaks

Page 38: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Spatial Nearest Neighbor

Main scenarioGive me the closest 5 Italian restaurants

Execution plan SQL Server 2008/2008 R2: table scanDenali: uses spatial index

Specific query pattern required

SELECT TOP(5) *FROM Restaurants rWHERE r.type = ‘Italian’ AND r.pos.STDistance(@me) IS NOT NULLORDER BY r.pos.STDistance(@me)

Page 39: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

demo

Spatial in SQL Server Denali

Page 40: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Nearest Neighbor Performance

NN query vs best current workaround (sort all points in 10km radius)

*Average time for NN query is ~236ms

Find the closest 50 business points (22 million in total)

Page 41: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Key Takeaways

Beyond Relational is targeting non-traditional database workloads that are growing rapidly in the enterprise.

Examples: Content and Collab, Spatial etc.

Beyond Relational is targeting key ISV asks in fast growing markets:

Examples: eDiscovery, Healthcare, GIS, Document management etc.

Beyond Relational is key strategy to enable you to build complex data applications that go beyond relational data!

Page 42: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Related Content

Denali Whitepapers and information:http://www.sqlserverlaunch.com/http://sqlcat.com/whitepapers/archive/2010/11/09/new-spatial-features-in-sql-server-code-named-denali-community-technology-preview-1.aspxSearch Demo: https://connect.microsoft.com/SQLServer/Downloads/DownloadDetails.aspx?DownloadID=35226

Find Me Later At…The SQL Server booth: Tuesday to Thursday in the afternoon On Twitter: @SQLServerMikeEmail: [email protected]: http://sqlblog.com/blogs/michael_rys

Page 43: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Database Platform (DAT) Resources

Try the new SQL Server Mission Critical BareMetal Hand’s on-Labs

Visit the updated website for SQL Server® Code Name “Denali” on www.microsoft.com/sqlserver and sign to be notified when the next CTP is availableFollow the @SQLServer Twitter account to watch for updates

Visit the SQL Server Product Demo Stations in the DBI Track section of the Expo/TLC Hall. Bring your questions, ideas and conversations!

• Microsoft® SQL Server® Security & Management • Microsoft® SQL Server® Optimization and Scalability• Microsoft® SQL Server® Programmability • Microsoft® SQL Server® Data Warehousing• Microsoft® SQL Server® Mission Critical • Microsoft® SQL Server® Data Integration

Page 44: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Resources

www.microsoft.com/teched

Sessions On-Demand & Community Microsoft Certification & Training Resources

Resources for IT Professionals Resources for Developers

www.microsoft.com/learning

http://microsoft.com/technet http://microsoft.com/msdn

Learning

http://northamerica.msteched.com

Connect. Share. Discuss.

Page 45: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Complete an evaluation on CommNet and enter to win!

Page 46: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

Scan the Tag to evaluate this session now on myTech•Ed Mobile

Page 47: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to

be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS

PRESENTATION.

Page 48: DBI316. Building and Maintaining Applications with relational and non-relational data is hard Complex integration Duplicated functionality Compensation