Dealing with Large Content Scenarios in SharePoint Server 2007
Architecture, Challenges, and Strategies
Abrar Chisti, Microsoft Corporation
Agenda
  Overview
  Manageability
  Planning
  Availability
  Case Study
  Takeaways
Content Database Growth
  Use as Document Repository
  Multiple versions of documents
  70-95% of size is File Stream
  Storage of large Multi Media files
  Lack of Governance/Site Quotas
  One Large Site Collection
  Lack of Planning
Is SharePoint the Right Solution?
SharePoint sites evolve organically.
  Database capacity planning is often overlooked
  Limited or no governance
  One or more large content database(s)
  Difficulty for IT to maintain
  IO throughput and latency are affected
Manageability
Plan for Manageability
  Limit Content Database Size to <= 100 GB
  If Content DB Size is > 100 GB:
    Use Differential/Incremental Backups
    SQL Server 2005/2008
    DPM 2007
    Test & Baseline IO Sub-System
    Set DB Auto-growth to Fixed Value
    Split Sites in Content DB into multiple Content DBs
Backup & Restore Options
How to Manage Content
  Split Content Database
    Move Site Collections between Databases
    Move Sites into Site Collections (Re-Parent)
      May need to promote sub sites to sites
      May need to move site collections between web applications
  Use OOB or 3rd Party Tools
    Stsadm -o export/import
    Stsadm -o backup/restore
    Stsadm -o mergecontentdbs
    Content Deployment API (Selective)
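Before moving anything, it can help to work out on paper which site collections should end up in which database. The sketch below is one possible way to do that, using a first-fit decreasing heuristic against the 100 GB guidance. The site URLs and sizes are hypothetical, and the function only produces a plan; the actual move would still be done with one of the tools listed above.

```python
from typing import Dict, List

LIMIT_GB = 100  # per-database ceiling taken from the deck's guidance


def plan_split(site_sizes_gb: Dict[str, float], limit_gb: float = LIMIT_GB) -> List[List[str]]:
    """Group site collections into content databases using first-fit decreasing."""
    databases: List[List[str]] = []  # site collection URLs per planned database
    used: List[float] = []           # GB already assigned to each planned database
    # Place the largest site collections first so they anchor their own databases.
    for url, size in sorted(site_sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        for i, total in enumerate(used):
            if total + size <= limit_gb:
                databases[i].append(url)
                used[i] += size
                break
        else:
            # No planned database has room (or the site alone exceeds the limit).
            databases.append([url])
            used.append(size)
    return databases


# Hypothetical sizes, for illustration only.
sizes = {"/sites/hr": 60, "/sites/eng": 45, "/sites/sales": 40,
         "/sites/legal": 30, "/sites/ops": 20}
plan = plan_split(sizes)
for n, members in enumerate(plan, start=1):
    print(f"ContentDB_{n}: {members} ({sum(sizes[m] for m in members)} GB)")
```

A site collection larger than the limit still gets a database of its own, which matches the "one large site collection" case called out earlier in the deck.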
How to Limit Storage
  Document Libraries
    Limit # of Versions
  Archive or Delete Old Sites
  Archive or Delete Unused Sites
  Impose Site Quotas
    Different types of quotas: Small/Med/Large
    Take into Consideration Recycle Bin
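A rough way to reason about tiered quotas and the Recycle Bin together is sketched below. The tier sizes are hypothetical (the slide only names Small/Med/Large), and the 50% second-stage Recycle Bin allowance is an assumption to be checked against the web application's actual settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaTier:
    name: str
    limit_gb: float


# Hypothetical tier sizes; the slide only names Small/Med/Large.
TIERS = [QuotaTier("Small", 1), QuotaTier("Medium", 5), QuotaTier("Large", 20)]

# Assumption: second-stage Recycle Bin allowance as a fraction of the site quota.
SECOND_STAGE_FRACTION = 0.5


def pick_tier(expected_gb: float) -> QuotaTier:
    """Return the smallest tier whose limit covers the expected content size."""
    for tier in TIERS:
        if expected_gb <= tier.limit_gb:
            return tier
    raise ValueError(f"{expected_gb} GB exceeds the largest tier")


def worst_case_gb(tier: QuotaTier, site_count: int) -> float:
    """Storage to budget if every site fills its quota and its second-stage bin."""
    return site_count * tier.limit_gb * (1 + SECOND_STAGE_FRACTION)


tier = pick_tier(3.2)
print(tier.name, worst_case_gb(tier, site_count=10))
```

The point of the second function is that deleted items can occupy space beyond the quota itself, so database sizing should budget for more than the sum of the quotas.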
Manage Lists for Performance
Upgrade Hardware/Software
  Ensure Latest SP/Patch
  Use Dedicated SQL Server
  Use 64 Bit Architectures and 64 Bit OS
  Use MS Hardware Recommendations
  Use SQL Server connection alias when you configure your farm
  Increase Bus Bandwidth
Take Advantage of SQL Server 2008 Capabilities
  Performance: Implement database backup compression.
  Availability: Implement log stream compression.
  Security: Implement Transparent Data Encryption (TDE).
  Resource management: Use SQL Server 2008 Resource Governor.
  Be Aware of DB Migration Considerations
Content Archival/Reduction
  Use Database Snapshots
  Use Records Repository Implementation
  Externalize (BLOB) storage
Database Snapshot
  Provides “snapshot” of Content DB at given instant
  Requires Same DB Server Instance
  Refers to the Original Database
  Uses “Copy on write” mechanism
  Need to create Separate Web App
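The copy-on-write idea can be illustrated with a toy model. This is only a sketch of the general mechanism, not SQL Server's implementation; the class names and the page granularity are made up for the example.

```python
class PagedDatabase:
    """Toy database made of numbered pages; not SQL Server's actual implementation."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, page_no, data):
        # Copy on write: preserve the old page in each snapshot that has not
        # already saved its own copy, then overwrite the live page.
        for snap in self.snapshots:
            snap.saved.setdefault(page_no, self.pages[page_no])
        self.pages[page_no] = data


class Snapshot:
    def __init__(self, source):
        self.source = source  # the snapshot keeps referring to the original
        self.saved = {}       # only pages changed since the snapshot was taken

    def read(self, page_no):
        # Unchanged pages are read straight from the source database.
        return self.saved.get(page_no, self.source.pages[page_no])


db = PagedDatabase(["v1-a", "v1-b", "v1-c"])
snap = db.create_snapshot()
db.write(1, "v2-b")
print(snap.read(1), db.pages[1], len(snap.saved))
```

Because unchanged pages are read from the source, a snapshot of this kind cannot outlive or be separated from the original database, which is consistent with the same-instance requirement on the slide.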
Records Repository
Remote/External Blob Storage
  Reduce Storage Costs
  External Blob Storage API
  Remote Blob Storage API
  SQL Server 2008 has support for RBS
  Can write BLOB directly using RBI
  http://blogs.msdn.com/sqlrbs/
External Blob Based Solution
  BLOB IO is moved to Web Front End
  Supports Compression and Encryption Capability
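The general shape of an externalized BLOB design can be sketched as follows: the database keeps only a metadata row with a pointer, while the payload is compressed and written to a separate store. This is a toy illustration with hypothetical class names; it does not use the External or Remote Blob Storage APIs, and encryption is left out for brevity.

```python
import hashlib
import zlib


class ExternalBlobStore:
    """Toy stand-in for a file share or other store outside the content database."""

    def __init__(self):
        self._blobs = {}

    def put(self, payload: bytes) -> str:
        blob_id = hashlib.sha256(payload).hexdigest()
        self._blobs[blob_id] = zlib.compress(payload)  # compress before storing
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return zlib.decompress(self._blobs[blob_id])

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())


class ContentDatabase:
    """Holds metadata rows only; each row points at a blob by ID."""

    def __init__(self, store: ExternalBlobStore):
        self.store = store
        self.rows = {}

    def save_document(self, name: str, payload: bytes) -> None:
        self.rows[name] = {"size": len(payload), "blob_id": self.store.put(payload)}

    def open_document(self, name: str) -> bytes:
        return self.store.get(self.rows[name]["blob_id"])


store = ExternalBlobStore()
cdb = ContentDatabase(store)
document = b"quarterly report " * 1000
cdb.save_document("report.doc", document)
print(cdb.rows["report.doc"]["size"], store.stored_bytes())
```

In this arrangement the component calling `save_document` does the blob reads and writes, which is the sense in which the slide says BLOB IO moves to the web front end.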
Planning
Plan for Software Boundaries
  Bottom Up Approach
  Plan for SQL Storage
  SharePoint Performance Recommendations
    # of Site Collections/Content DB: 50,000
    # of Site Collections/Web Application: 150,000
    100 Content DBs Per Web Application
  Use Multiple SQL Servers for Higher Scalability
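A bottom-up check against these boundaries might look like the sketch below. The three limits are the ones quoted on the slide; the 100 GB target comes from the manageability guidance earlier in the deck, and the workload figures are hypothetical.

```python
import math

# Limits quoted on the slide.
MAX_SITE_COLLECTIONS_PER_DB = 50_000
MAX_SITE_COLLECTIONS_PER_WEB_APP = 150_000
MAX_CONTENT_DBS_PER_WEB_APP = 100
TARGET_DB_SIZE_GB = 100  # manageability guidance from earlier in the deck


def plan_web_app(site_collections: int, avg_size_gb: float) -> dict:
    """Work bottom-up from site collection size to content database count."""
    per_db_by_size = max(1, math.floor(TARGET_DB_SIZE_GB / avg_size_gb))
    per_db = min(per_db_by_size, MAX_SITE_COLLECTIONS_PER_DB)
    databases = math.ceil(site_collections / per_db)
    return {
        "site_collections_per_db": per_db,
        "content_dbs": databases,
        "within_limits": (
            databases <= MAX_CONTENT_DBS_PER_WEB_APP
            and site_collections <= MAX_SITE_COLLECTIONS_PER_WEB_APP
        ),
    }


# Hypothetical workload: 2,000 site collections averaging 2.5 GB each.
print(plan_web_app(2_000, 2.5))
```

With these inputs the size target, not the 50,000 site collection limit, is the binding constraint: 40 site collections per database and 50 databases. When `within_limits` comes back false, the slide's last bullet applies: spread the load over more web applications or SQL Servers.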
Storage Architecture
  Use Appropriate Disk and SAN interface
    SCSI vs IDE vs SATA vs SAS
    Consideration: Hot Swap, Multiple IO, Speed, Capacity, Protocol
  Use Appropriate Disks and RAID Arrays
    Faster Disks/Arrays
    Separate Disks for TempDB, ContentDB, and Trans Logs
    Multiple Data Files for Large Content and Search DBs
    Distribute files across Disks
Content Database Allocation
  SharePoint Allocation of Content DBs
    Pre-Allocate Pool of DBs
    Round Robin Scheme between DBs
    Based on Delta between Max sites and Current sites
  Example
    Site Collection Per Database
    Create Database with 100 GB (using ALTER DATABASE command)
    Leverage Managed Paths
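The allocation rule described on this slide can be sketched as "pick the database with the largest gap between its maximum and current site count". The code below is a simplified reading of the slide, not SharePoint's own logic; the database names and limits are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContentDb:
    name: str
    max_sites: int
    current_sites: int = 0

    @property
    def delta(self) -> int:
        return self.max_sites - self.current_sites


def allocate(pool: List[ContentDb]) -> ContentDb:
    """Place a new site collection in the database with the largest delta."""
    target = max(pool, key=lambda db: db.delta)  # ties go to the earliest database
    if target.delta <= 0:
        raise RuntimeError("every content database is at its maximum site count")
    target.current_sites += 1
    return target


# Hypothetical pre-allocated pool with equal limits.
pool = [ContentDb("WSS_Content_01", 3), ContentDb("WSS_Content_02", 3),
        ContentDb("WSS_Content_03", 3)]
placements = [allocate(pool).name for _ in range(6)]
print(placements)
```

With equal limits the placements cycle through the pool, which gives the round-robin behavior the slide mentions. Setting `max_sites` to 1 on every database in the pool models the "site collection per database" example.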
Availability
Clustering
  SAN or Shared Disks
    Use Windows/SQL Clustering for HA
  Dedicated Disks or DAS
    Use SQL Server Mirroring
  Redundancy across Data Centers
    Log Shipping
    Synchronous Mirroring
    Asynchronous Mirroring
  SQL Server 2008 Log Compression
High Availability Farm
Monitoring
Monitoring
  Processor: % Processor Time: _Total. On the computer that is running SQL Server, this counter should be kept between 50 percent and 75 percent.
  System: Processor Queue Length: (N/A). Keep below 2 x # of core CPUs.
  Memory: Available Mbytes: (N/A). Monitor this counter to ensure that you maintain a level of at least 20 percent of the total physical RAM available.
  Memory: Pages/sec: (N/A). Monitor this counter to ensure that it remains below 100.
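These thresholds are easy to turn into an automated check. The sketch below is one way to do it, with two assumptions: the "2 x # of core CPUs" entry is read as an upper bound on queue length, and only the 75 percent ceiling on processor time is flagged, since running below 50 percent is not treated as a fault here. The sample values and dictionary keys are hypothetical.

```python
def check_counters(sample: dict, cores: int, total_ram_mb: int) -> list:
    """Return a warning string for each counter outside the slide's guidance."""
    warnings = []
    cpu = sample["processor_time_pct"]
    if cpu > 75:
        warnings.append(f"% Processor Time is {cpu}, above 75")
    if sample["processor_queue_length"] >= 2 * cores:
        warnings.append("Processor Queue Length is at or above 2 x cores")
    if sample["available_mbytes"] < 0.20 * total_ram_mb:
        warnings.append("Available MBytes is below 20% of physical RAM")
    if sample["pages_per_sec"] >= 100:
        warnings.append("Pages/sec is at or above 100")
    return warnings


# Hypothetical sample; in practice the values would come from Perfmon logs.
sample = {
    "processor_time_pct": 82,
    "processor_queue_length": 5,
    "available_mbytes": 2048,
    "pages_per_sec": 140,
}
for line in check_counters(sample, cores=8, total_ram_mb=16384):
    print(line)
```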
Disk Counters
  Logical Disk: Disk Transfers/sec
  Logical Disk: Disk Read Bytes/sec & Disk Write Bytes/sec
  Logical Disk: Average Disk sec/Read (Read Latency) / Avg Disk sec/Write
  Logical Disk: Average Disk Bytes/Read and Bytes/Write
  Physical Disk: % Disk Time
  Logical Disk: Current Disk Queue Length
  Logical Disk: Average Disk Reads/Sec and Logical Disk
Performance Monitoring
  Perfmon
    Analyze Logs using CodePlex tools
  Favorite Web Monitoring (3rd Party) solution
  System Center Operations Manager (SC-OM)
    SharePoint Monitoring Toolkit
    http://blogs.msdn.com/sharepoint/archive/2007/12/10/announcing-new-system-center-operations-manager-2007-packs-for-wss-3-0-and-moss-2007.aspx
Case Study
Large Automotive Loan Origination Application
Large Storage Scenario (Phase I)
  Ability to house 10.5 million content items (1+ TB).
  System input with "normal" input load, defined as 27,000 documents per day (1 day = 10 hours).
  Simulate user load to represent 200 users simultaneously accessing the system to:
    Use search to find elements of document metadata.
    View a document (scanned TIFF image).
    Update elements of document metadata.
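The Phase I figures imply an ingest rate and a fill time that are worth working out explicitly. The arithmetic below uses only the numbers stated above, taking "1+ TB" at its lower bound in decimal terabytes, so the average item size is a minimum.

```python
DOCS_PER_DAY = 27_000
HOURS_PER_DAY = 10
TARGET_ITEMS = 10_500_000
TARGET_BYTES = 10**12  # lower bound of "1+ TB", using decimal terabytes

docs_per_hour = DOCS_PER_DAY / HOURS_PER_DAY
docs_per_second = docs_per_hour / 3600
days_to_fill = TARGET_ITEMS / DOCS_PER_DAY
avg_item_kb = TARGET_BYTES / TARGET_ITEMS / 1000

print(f"{docs_per_hour:.0f} documents/hour")
print(f"{docs_per_second:.2f} documents/second")
print(f"{days_to_fill:.0f} working days to reach the target")
print(f"{avg_item_kb:.0f} KB average item size (at least)")
```

That works out to 2,700 documents per hour, or 0.75 per second, and roughly 389 ten-hour days of normal load to reach 10.5 million items.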
Phase II
  Ability to house 50 million content items (5+ TB).
    35 million TIFF images.
    15 million Microsoft Office documents.
  Determine the maximum number of users the solution could support.
  Users perform the following tasks:
    Use search to find elements of document content (full-text) and metadata.
    View a document (scanned TIFF image or Microsoft Office document).
Architectural Overview
  Logical Architecture: Phase I
Architectural Overview
  LUN/DB Matrix
Takeaways
  Optimize Performance
  Planning & Monitoring
  Plan for Scale
  Plan for Availability
  Plan for Manageability
References
  SQL Server Database Optimization
    http://technet.microsoft.com/en-us/library/cc263261.aspx
  Plan for Software Boundaries
    http://technet.microsoft.com/en-us/library/cc262787.aspx
  Move Site Collections to new Content Database
    http://technet.microsoft.com/en-us/library/cc825328.aspx
  Enable SharePoint 2010 to Use Remote BLOB Storage
    http://technet.microsoft.com/en-us/library/ee748641(office.14).aspx/
  Content Deployment API (PRIME)
    http://msdn.microsoft.com/en-us/library/cc264073.aspx
  Integration of SQL Server 2008 and SharePoint
    http://msdn.microsoft.com/en-us/library/cc264073.aspx
  Use Database Snapshots for Archiving Sites
    http://technet.microsoft.com/en-us/library/cc706872.aspx
  Configure Availability in SharePoint Farm
    http://technet.microsoft.com/en-us/library/dd207311.aspx
  Case Study for Large Content Scenario
    http://technet.microsoft.com/en-us/library/cc262067.aspx
  Scaling Storage Architecture
    http://www.knowledgelake.com/whitepaper/Scaling%20SharePoint%202007%20-%20Storage%20Architecture.pdf
Tools Availability
  SPUsed Space Info
  SPSiteInfo
  Content Deployment Wizard
  Migrate from other source systems.
  Other tools in CodePlex
  3rd Party
    Metalogix, Qwest, Tzunami, AvePoint, StoragePoint, Knowledge Lake