SharePoint Content Lifecycle Management
Presented by: Mary Leigh Mackie
Content Lifecycle Management
[Diagram: lifecycle stages Organization, Workflow, Creation, Repository, Versioning, Publishing, Archives, grouped into the phases Design, Produce, Consume]
[Diagram: SharePoint equivalents Sites, Workflow, Office Doc, Library, Version, Publishing Site, Record Center]
Agenda
Content Organization & Storage
Storage Optimization
Content Access
Archiving
Content Organization & Storage
Information Architecture
• Determine the business goals
• What will your site structure and taxonomy look like?
• Standardize branding with templates and master pages
Other considerations
• Accountability of published content using workflows or approvals
• Managing search scopes, security trimming, federation
• Isolate intranet content from extranets
• Testing for consistency and performance
• Training your site/content owners and end users
Source: Governance Resource Centre on Microsoft TechNet
http://technet.microsoft.com/en-us/library/cc262873.aspx#section2
Storage in a Content Repository: Increase in % of inactive data over time
[Chart: Data in SQL versus time in years (0 to 4), with series Active Data and Total Data]
Planning for SharePoint Storage
• Recycle bin
• Versioning
• Search and index information
• Growth
Good rule of thumb for initial planning is: 3.5 x file system
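As a rough illustration of the rule of thumb above, the sketch below multiplies an existing file system footprint by 3.5. The function name and the 200 GB input are made up for the example; only the 3.5 multiplier comes from the slide.

```python
def estimate_sharepoint_storage_gb(file_system_gb: float, multiplier: float = 3.5) -> float:
    """Return a first-pass SharePoint storage estimate in GB.

    The 3.5 multiplier is the slide's rule of thumb; it is meant to leave room
    for the recycle bin, versioning, search/index information, and growth.
    """
    if file_system_gb < 0:
        raise ValueError("file_system_gb must be non-negative")
    return file_system_gb * multiplier


# Hypothetical input: 200 GB of content currently on file shares
print(estimate_sharepoint_storage_gb(200))  # 700.0
```

Treat the result as a starting point for planning, not a measured requirement.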
Basic Storage Management Methods
• Set site quotas and alerts!
  – 10 GB quota, 8 GB alert is my favorite
• Monitor growth trends
  – Sites: slow over time or large jump in size?
  – Overall content DB size
• Split Content DBs if they get “too big”
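The quota and trend checks above can be sketched in a few lines. This is only an illustration: the function names and the 2 GB jump threshold are assumptions, and the 10 GB / 8 GB defaults are the presenter's stated preference, not SharePoint defaults.

```python
def quota_status(site_size_gb: float, quota_gb: float = 10.0, alert_gb: float = 8.0) -> str:
    """Classify a site against its quota and alert thresholds."""
    if site_size_gb >= quota_gb:
        return "over quota"
    if site_size_gb >= alert_gb:
        return "alert"
    return "ok"


def has_large_jump(samples_gb: list, jump_gb: float = 2.0) -> bool:
    """Return True if any two consecutive size samples differ by jump_gb or more.

    jump_gb is a hypothetical threshold chosen for this example.
    """
    return any(later - earlier >= jump_gb
               for earlier, later in zip(samples_gb, samples_gb[1:]))


print(quota_status(8.5))                 # alert
print(has_large_jump([1.0, 1.5, 4.0]))   # True
```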
How SharePoint “chooses” a Content DB for a site
• Highest remaining allotment rule
  – Content DB 1: 100 sites max
  – Content DB 2: 100 sites max

  – Content DB 1: 100 sites max
  – Content DB 2: 200 sites max
SharePoint Site Content DB selection process: http://blog.jesskim.com/kb/293
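A minimal sketch of the highest remaining allotment rule, assuming "remaining allotment" means maximum sites minus current sites. The `ContentDb` class and the tie-breaking behavior (first database listed wins) are assumptions for illustration, not SharePoint's documented implementation.

```python
from dataclasses import dataclass


@dataclass
class ContentDb:
    name: str
    max_sites: int
    current_sites: int = 0

    @property
    def remaining(self) -> int:
        # Remaining allotment = configured maximum minus sites already hosted
        return self.max_sites - self.current_sites


def choose_content_db(dbs: list) -> ContentDb:
    """Pick the database with the most remaining site capacity."""
    candidates = [db for db in dbs if db.remaining > 0]
    if not candidates:
        raise RuntimeError("no content database has remaining capacity")
    # max() keeps the first database on ties (an assumption of this sketch)
    return max(candidates, key=lambda db: db.remaining)


# Second scenario from the slide: 100 max versus 200 max, both empty
dbs = [ContentDb("Content DB 1", 100), ContentDb("Content DB 2", 200)]
print(choose_content_db(dbs).name)  # Content DB 2
```

Under this rule, raising one database's maximum steers new sites toward it until the remaining allotments even out.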
Optimal Content DB Sizing
• Backup & Recovery operations (<50-100 GB)
• Performance (<500 GB… nervous at 300 GB)
  – # of objects
  – size of objects
  – Hardware (servers and storage)
• Storage Cost (as small as possible!)
So what is too big?
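One way to turn the sizing guidance above into a quick check is sketched below. The thresholds (50, 100, 300, 500 GB) are taken from the slide; the wording of each label is this example's interpretation, not an official limit.

```python
def assess_content_db_size(size_gb: float) -> str:
    """Map a content database size to the slide's informal comfort bands."""
    if size_gb < 50:
        return "comfortable for backup and recovery"
    if size_gb <= 100:
        return "at the upper end for backup and recovery"
    if size_gb < 300:
        return "backup and recovery likely slow; performance usually acceptable"
    if size_gb < 500:
        return "nervous zone for performance; consider splitting"
    return "too big; split the content database"


for size in (40, 80, 250, 350, 600):
    print(size, "GB:", assess_content_db_size(size))
```

The number and size of objects and the hardware still matter, so the bands are only a first filter.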
BLOBs-- What’s the Issue?
• BLOBs = Binary Large Objects
• SharePoint Content = BLOB + Metadata
• Content DB = database of … BLOBs + Metadata
• SQL DB storage needs high IOPS (input/output operations per second) and low latency
• High IOPS + low latency storage = $$$$
• BLOBs do not participate in query operations, so no real reason to have BLOBs in a DB
• DB full of BLOBs = wasted $$$
Default SharePoint Storage
[Diagram: SharePoint WFE, SharePoint Object Model, and SQL Server with the Content DB and Config DB; BLOBs & Metadata are both stored in SQL Server]
Database Size Implications
BLOBs increase DB size, creating issues with:
• Backup & Recovery operations
• Performance
• Storage Costs
Issues with BLOBs Get Much Worse over Time…
Increase in % of inactive data over time
Inactive sites, documents, lists, and libraries take up SQL storage, hindering performance
[Chart: Data in SQL versus time in years (0 to 4), with series Active Data and Total Data]
Storage Optimization
SharePoint Storage Optimization Methods
• Move the BLOBs out of the database
• Archive content
Planning for Data Use & Growth
What does SharePoint 2010 offer OOTB?
• No native archiving tools
• EBS extended to include RBS
  – Available only in SQL Server 2008 SP2
  – Only accessible via API
• BCS (BDC in 2007) extended to allow for easier connectivity with legacy data systems
Storage Optimization: Extending BLOBs out of the database
Available APIs for Extending
SQL Remote BLOB Service (RBS)
SharePoint External BLOB Service (EBS)
EBS/RBS Overview
BLOB services to change BLOB storage locations
• EBS = External BLOB Service
  – SharePoint 2007 SP1+ API
• RBS = Remote BLOB Service
  – SQL Server 2008 R2 Feature Pack API, with SharePoint 2010 support
• Both are interface specifications
  – Need a provider to actually work
• Cannot have both providers
EBS
• EBS provider can take ownership of the BLOB
• Provider gives SharePoint a token or a stub so SharePoint knows how to retrieve the object (context)
• Transparent to the end-user
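The token or stub idea above can be illustrated with a toy model. Nothing here is the real EBS interface: `InMemoryBlobStore`, `save_document`, and `open_document` are hypothetical names, and a Python dict stands in for the content database.

```python
import uuid


class InMemoryBlobStore:
    """Toy external store: owns the BLOB bytes and hands back an opaque token."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        token = uuid.uuid4().hex
        self._blobs[token] = data
        return token

    def get(self, token: str) -> bytes:
        return self._blobs[token]


# Stand-in for the content database: metadata plus the token, no BLOB bytes
content_db = {}


def save_document(store: InMemoryBlobStore, name: str, data: bytes, author: str) -> None:
    content_db[name] = {"author": author, "size": len(data), "blob_token": store.put(data)}


def open_document(store: InMemoryBlobStore, name: str) -> bytes:
    # The caller asks for a document by name; the token lookup stays hidden
    return store.get(content_db[name]["blob_token"])


store = InMemoryBlobStore()
save_document(store, "plan.docx", b"quarterly plan", "mlmackie")
print(open_document(store, "plan.docx"))  # b'quarterly plan'
```

The user-facing call never mentions the token, which is the sense in which the redirection is transparent to the end-user.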
[Diagram: SharePoint WFE and SharePoint Object Model; the EBS Provider sends the BLOB to a separate BLOB Store, while Metadata goes to SQL Server (Content DB, Config DB)]
EBS
• Implemented by SharePoint
• Only 1 EBS Provider per SharePoint farm
• Orphaned BLOBs: no direct method to compare BLOB store and Content DB
• Compliance: what if I don’t want to allow SharePoint to delete the object?
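To make the orphaned BLOB problem concrete, the sketch below assumes you could somehow export two lists of tokens, one from the BLOB store and one from the content database. The slide's point is that EBS offers no direct way to do this, so both inputs and both function names are hypothetical.

```python
def find_orphaned_blobs(store_tokens, db_tokens) -> set:
    """Tokens present in the BLOB store but no longer referenced by the content DB."""
    return set(store_tokens) - set(db_tokens)


def find_missing_blobs(store_tokens, db_tokens) -> set:
    """Tokens the content DB references that the BLOB store no longer holds."""
    return set(db_tokens) - set(store_tokens)


# Made-up tokens for illustration
in_store = ["t1", "t2", "t3"]
in_db = ["t2", "t3", "t4"]
print(find_orphaned_blobs(in_store, in_db))  # {'t1'}
print(find_missing_blobs(in_store, in_db))   # {'t4'}
```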
RBS
• Not unique to SharePoint, available to any application
• A Provider Library can be associated with each database
[Diagram: SharePoint WFE and SharePoint Object Model use Relational Access and the RBS Client Library; in SQL Server, Content DB X and Content DB Y each have their own Provider Library (X and Y) and BLOB Store; BLOB Metadata stays in SQL Server]
RBS
• Implemented by SQL
• Only 1 RBS Provider per Content DB
• Orphaned BLOBs much less of an issue
• Can lock down operations, from a unified storage perspective
• Can be managed via PowerShell
RBS: SQL Server 2008 Feature Pack API
Handled natively by database
Default Provider: FILESTREAM
1. Enable FILESTREAM provider on SQL
2. Provision data store and set storage location
3. Install RBS on all SP Web and App servers
4. Enable RBS
RBS versus SQL FILESTREAM
• FILESTREAM storage must be on a file system locally attached to the SQL Server
• RBS is an API set that allows storage on external stores: physically separate machines that may be running custom storage code, for instance EMC Centera
EBS versus RBS, which is better?
EBS: Tighter integration with application, allows for more rules and settings
RBS: Simpler, allows unified storage architecture across applications
http://www.codeplex.com/sqlrbs
EBS versus RBS, which is better?
It looks like RBS has won…
[Timeline diagram: SharePoint External BLOB Service (EBS) and SQL Remote BLOB Service (RBS) plotted against SharePoint 2007, SharePoint 2010, and a Future SharePoint Release (SPS 5?), and against SQL Server 2005, SQL Server 2008, and Future SQL Releases]
Microsoft will provide a PowerShell solution to migrate from EBS to RBS
Benefits of Extending BLOBs
• Backup & Recovery operations
  – Databases are 60-80% smaller
  – Need a method to back up BLOBs synchronously
• Performance
  – Databases are 60-80% smaller
  – Performance improvement increases as the file/BLOB size increases. Microsoft research indicates:
    • <256 KB, SQL better
    • 256 KB to 1 MB, SQL and file system comparable
    • >1 MB, file system better
• Storage Cost
  – “Not as expensive” storage
  – Archiving still needed for true savings
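The size thresholds quoted above translate directly into a small decision function. This is a sketch of the guidance only, assuming 1 KB = 1024 bytes; the function name and return labels are invented for the example.

```python
KB = 1024
MB = 1024 * KB


def preferred_store(blob_size_bytes: int) -> str:
    """Apply the slide's thresholds: <256 KB SQL, 256 KB to 1 MB comparable, >1 MB file system."""
    if blob_size_bytes < 256 * KB:
        return "sql"
    if blob_size_bytes <= 1 * MB:
        return "comparable"
    return "file system"


print(preferred_store(100 * KB))  # sql
print(preferred_store(512 * KB))  # comparable
print(preferred_store(5 * MB))    # file system
```

A real provider would typically expose such a threshold as a setting, so that small files stay in the database while large ones are externalized.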
RBS is Completely Seamless for Users
• Users can access contents by:
  – Clicking and downloading directly through SharePoint
  – Opening the file using their Office client
  – Referencing the URL
  – Searching for contents natively in SharePoint
• Users can interact with contents by:
  – Modifying metadata and content types
  – Modifying permissions
  – Applying alerts
  – Using workflows or publishing templates
  – Using site Quotas and Locks
Cloud Storage Use Case: SharePoint “Overdraft Protection”
DB alert set at 80 GB, limit at 100 GB
[Diagram: DB size gauge from 0 to 100 GB; at 80 GB an alert is sent to the admin; no action taken: Cloud Storage]
• Could be any storage
• Cloud is ideal “insurance”--cheap to setup, expensive to use
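The "overdraft protection" idea can be sketched as a routing decision: stay on primary storage below the limit, alert from 80 GB, and spill to an overflow store once the 100 GB limit would be exceeded. The names `Decision` and `route_upload` are hypothetical; the 80/100 GB figures come from the slide.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    target: str        # "primary" or "overflow"
    alert_admin: bool  # whether the admin should be notified


def route_upload(current_gb: float, upload_gb: float,
                 alert_gb: float = 80.0, limit_gb: float = 100.0) -> Decision:
    """Decide where new content goes once the upload is added to the DB size."""
    projected = current_gb + upload_gb
    if projected > limit_gb:
        # Over the limit: send new BLOBs to the overflow (for example cloud) store
        return Decision("overflow", True)
    return Decision("primary", projected >= alert_gb)


print(route_upload(50, 10))  # Decision(target='primary', alert_admin=False)
print(route_upload(75, 10))  # Decision(target='primary', alert_admin=True)
print(route_upload(99, 5))   # Decision(target='overflow', alert_admin=True)
```

Because the overflow store is only used when the alert was ignored, its setup cost matters more than its per-GB usage cost.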
Content Access
Where is it in its lifecycle? Do you want to expose it in SharePoint?
• BCS is intended for connecting LOB systems (databases, Windows Communication Foundation (WCF) or Web services, .NET connectivity assemblies, custom data sources) into SharePoint, without migrating the data
• No OOTB solutions for getting content out of users’ desktops, file shares, or other ECM systems
Connecting Legacy Data
SharePoint 2010 Support
Options for Exposing Legacy Data
(File Shares, Notes, Exchange Public Folders, eRoom Documentum, LiveLink… etc?)
• Migrate
  – Manually download/upload, losing author, time, security, history, other metadata
  – 3rd Party Tool
• Connect
  – BCS Mechanisms
  – Most major ECM Vendors
  – AvePoint’s DocAve Connector
EBS/RBS APIs preferred
Which option is better?
Connecting vs. Migrating
– Value add of legacy system
– Maintenance costs
  • Hardware
  • Licensing and support
  • Knowledge
– Migration costs
  • Migration process
  • Tools
  • Training
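One hedged way to weigh the cost categories above is a simple break-even calculation: migration is treated as a one-time cost, while connecting keeps the legacy system's yearly maintenance bill alive. This cost model and all figures are assumptions for illustration, and it ignores the value add of the legacy system.

```python
def break_even_years(migration_cost: float,
                     annual_legacy_maintenance: float,
                     annual_connector_cost: float = 0.0) -> float:
    """Years after which a one-time migration costs less than staying connected.

    migration_cost: process + tools + training (one-time)
    annual_legacy_maintenance: hardware + licensing/support + knowledge (yearly)
    annual_connector_cost: optional yearly cost of the connector itself
    """
    yearly = annual_legacy_maintenance + annual_connector_cost
    if yearly <= 0:
        return float("inf")  # connecting costs nothing per year, so never breaks even
    return migration_cost / yearly


# Made-up figures: 120,000 to migrate versus 40,000 per year to keep the legacy system
print(break_even_years(120_000, 40_000))  # 3.0
```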
Migrating vs. Connecting
Migrating
• Data is available in SharePoint
• Data is moved into SharePoint
• SharePoint replaced legacy system
• Burden of storage is on SharePoint
• Changes saved in SharePoint
• Migrate and decommission
Connecting
• Data is available through SharePoint
• Data is left in source (legacy) system
• Give legacy system second life by increasing its value
• Burden of storage is on legacy system
• Changes propagate to source
• Connect and forget
Connect to SharePoint: BCS Mechanisms
• .NET Assembly Connector
  – Provided with Microsoft Business Connectivity Services (BCS)
  – Each .NET connectivity assembly is specific to an external content type
  – Provides no Administration interface integration
• Custom Connector
  – Connect to external systems not directly supported by Business Connectivity Services
  – Agnostic of external content types that connect to a kind of external system (all databases or all Web services)
  – Provides an Administration UI integration
http://msdn.microsoft.com/en-us/library/ee554911.aspx
Which BCS Mechanism Should I Use?
• The .NET Assembly Connector approach is recommended if the external system is static. Otherwise, for every change in the back end, you must make changes to the .NET connectivity assembly DLL. This, in turn, requires recompilation and redeployment of the assembly and the models.
• Custom connector approach is recommended if the back-end interfaces frequently change. By using this approach, only changes to the model are required.
http://msdn.microsoft.com/en-us/library/ee554911.aspx
Connecting: 3rd Parties
(File Shares, Notes, Exchange Public Folders, eRoom Documentum, LiveLink… etc?)
• Most major ECM Vendors
• Other 3rd Parties
EBS/RBS APIs preferred
Options for Exposing Legacy Data: Migration
Questions to ask yourself…
• How much content needs to be migrated?
• How long will this take? How much downtime can you tolerate?
• How much customization do you have?
• Is this a “big bang” migration or can you migrate in a scaled/phased approach?
• Can you accept loss of metadata and securities?
• Can you engage other members to assist in the process and arrange for proper training?
• What minimal requirements do you have for this migration?
• Can you properly map non-SharePoint related assets into SharePoint?
• etc…
SharePoint Migration Strategies
User-Powered Manual Migration
• SharePoint Administrator installs the new version on separate hardware or a separate farm and allows Power Users to manually recreate content
Best For
• Environments retaining ample amounts of outdated information
• Moving to new hardware or new architecture
Pros
• Puts Power Users in charge to recreate and manage sites
• Migrate relevant content to avoid import of old data
• Completely retains old environment
• Virtually no downtime – requires user switch to new environment
Cons
• Manual process, very resource intensive
• Requires willing participants and intensive training
• Requires additional steps to retain original URLs
• Requires new server farm and additional SQL Server storage space for new content
SharePoint Migration Strategies
Migration via 3rd Party Tool
• SharePoint Administrator installs the new version on separate hardware or a separate farm, and migrates content and users using 3rd Party Tool
Best For
• Any size environment, from single server environments to large, distributed farms
Pros
• Granular migration
• Retains all metadata
• Virtually no downtime
• Applicable to non-SharePoint repositories
Cons
• Costs associated with purchasing of additional software
• Requires new server farm
What About Access for Geo-Dispersed Users?
• Centralized environment, accessed globally
• Centralized environment plus local content (sites, etc)
• Fully distributed, replicated architecture accessed locally
  – Centralized or cloud storage backup for high redundancy
Global Architectures: Single Centralized Environment
• Out of the box SharePoint
• Lowest complexity, least costly
• Varied User Experience
• Evaluate bandwidth and usage patterns
Global Architectures: Centralized plus local content
• Local services and sites, in addition to main farm
• Increased infrastructure complexity
• Governance can be an issue
• Relocating teams/users is a pain
Global Architectures: Fully distributed
• Fast local access to SharePoint content
• Replicate only what is relevant
• Ability to handle remote locations
Global Architectures: Distributed w/ Centralized Backup
• Backup locally or to alternative sites
• Consider cloud storage
• Can be used for high redundancy
[Diagram label: Cloud Storage]
Archiving: Adding Lifecycle Management to the picture
Lifecycle of a Typical Item
[Chart: Access / SLA Requirements (Low to High) versus Time; labeled phases: Initial content creation, Moderate content retrieval]
Storage in a Content Repository: Increase in % of inactive data over time
[Chart: Data in SQL versus time in years (0 to 4), with series Active Data and Total Data]
Data Lifecycle Management
• Records Center
  – Another SharePoint site
  – Higher % inactive content
  – Consider separate Content DB, with an RBS provider implemented for this DB
• Archiving
  – Backup and delete
  – Workflow (Expirations)
  – 3rd party tool solutions
3rd Party Archiving Tools
• What rules are available?
  – Last modified time
  – Author
  – Versions
• What scope can I apply rules to? (farm to item)
• Does it use RBS/EBS APIs?
• Does it integrate with other infrastructure management tools? (backup, replication, etc.)
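To show what rule-based archiving on last modified time, author, and versions might look like, here is a small sketch. The `Item` fields, the default thresholds (365 days, 10 versions), and the choice to archive when any rule matches are all assumptions; real tools define their own rule semantics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Item:
    path: str
    last_modified: datetime
    author: str
    version_count: int


def should_archive(item: Item, now: datetime,
                   max_age_days: int = 365,
                   archived_authors: frozenset = frozenset(),
                   max_versions: int = 10) -> bool:
    """Return True if the item matches any of the three example rules."""
    too_old = now - item.last_modified > timedelta(days=max_age_days)
    by_author = item.author in archived_authors
    too_many_versions = item.version_count > max_versions
    return too_old or by_author or too_many_versions


now = datetime(2010, 6, 1)
old_doc = Item("/sites/hr/policy.doc", datetime(2008, 1, 15), "alice", 3)
new_doc = Item("/sites/hr/memo.doc", datetime(2010, 5, 20), "bob", 2)
print(should_archive(old_doc, now))  # True
print(should_archive(new_doc, now))  # False
```

Applying the same function at farm, site, library, or item scope is a matter of which items are fed into it.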
Summary
1. Think carefully about organization and storage
Consider where content will be stored and how it will grow over time
2. Leverage BLOB Services APIs to optimize SharePoint storage
EBS/RBS APIs can be leveraged to store BLOBs outside of SQL with little impact on end-users, to save $$ and optimize storage
3. Content access is key
Develop strategies to handle access to legacy data and content access from remote locations
4. Archive content
Plan for long term growth and optimal system performance
AvePoint – Who we are
Global Leader in SharePoint Infrastructure Management
Backup & Recovery, Administration, Replication, Migration, Compliance, Storage Optimization
• Founded in 2001
• Headquartered in Jersey City, NJ, with global offices in:
  – USA: Chicago, San Jose, Houston, Washington D.C., Redmond
  – International: UK, Germany, Australia, Japan, Singapore, Canada
• R&D team of 350+, largest SharePoint team outside of Microsoft
• Winner of 2008 Best of Tech Ed Award for Best SharePoint Product
• Exclusive OEM relationships with IBM and NetApp
• A Depth Managed Microsoft Gold Certified ISV Partner
  – MTC Alliance Member; Notes Transition Partner; Office TAP 14 Member; BPOS TAP Member
Applicable Features of AvePoint Tools
• DocAve Report Center
  – Storage growth and trending
  – Server performance and monitoring
• DocAve Administrator
  – Manage site quotas and alerts
  – Move sites between Content DBs
• DocAve Replicator
  – Fully mapped, live or scheduled replication of all SharePoint contents
Applicable Features of AvePoint Tools
Connecting
• DocAve Connectors
  – Leverage EBS/RBS APIs to expose File Share content as fully functional SharePoint objects
  – Content works with Office applications, alerts, workflows, 3rd party applications, etc…
Migrating
• DocAve Migrators for SharePoint
  – From previous versions of SharePoint, File Shares, Exchange Public Folders, Lotus Notes, Documentum eRoom, EMC Documentum, Livelink, Oracle/Stellant, Vignette
  – Offers granular selection of content, full graphical user/domain/properties mapping
• DocAve Content Manager
  – Consolidates existing SharePoint instances (other sites or farms that are the same SharePoint version) into a single SharePoint instance, while maintaining all metadata
  – Offers granular selection of content, full graphical user/domain/properties mapping
Demo?
Thank You!
Q&A
Resources - www.AvePoint.com
Visit us: http://www.AvePoint.com
Email us: [email protected]@avepoint.com
Follow us: @AvePoint_Inc, @mlmackie
Download a FREE, fully-enabled 30 Day trial of DocAve at www.avepoint.com/download
Additional Resources
• Storage Optimization for SharePoint Whitepaper: http://www.avepoint.com/assets/pdf/sharepoint_whitepapers/Storage_Optimization_Technical_Advisor.pdf
• Configure Content Database for RBS: http://technet.microsoft.com/en-us/library/ee748641(office.14).aspx
• FILESTREAM RBS: http://blogs.msdn.com/opal/archive/2009/12/07/sharepoint-2010-beta-with-filestream-rbs-provider.aspx
• Whitepaper about FILESTREAM: http://msdn.microsoft.com/en-us/library/cc949109.aspx
Backup Slides
SharePoint Migration Strategies
Engage Power Users In Content Migration:
• Create a dedicated Power Users group - have a Power Users SharePoint Site so that all the power users can share best practices and lessons learned with one another
• Provide extensive training on SharePoint to all Power Users
• Request Power Users to migrate content – they should be empowered and proactive about content migration and administration
• Request Power Users to train new SharePoint users to properly use their specific sites – provide training materials, videos, etc. to new users to lower TCO for IT training
TIP: A Power User should be very familiar with SharePoint and have either Full Control or Design permissions (or their equivalent) for the site they will manage. (Restrict Site Deletion Permission)
Connecting to SharePoint: .NET Assembly
• Write code as Microsoft .NET Framework classes and compile the classes into a primary DLL and multiple dependent DLLs.
• Publish the DLLs into the Business Data Connectivity (BDC) service database.
• Use Microsoft SharePoint Designer to discover the .NET Connectivity Assembly and create a model.
• Map each entity to a class in the DLL, and map each BDC operation in that entity to a method inside that "Class".
At run time, when a user executes a BDC operation, the corresponding method in the primary DLL is executed.
http://msdn.microsoft.com/en-us/library/ee554911.aspx
Connecting to SharePoint: Custom
• Implement ISystemUtility, IConnectionManager, and ITypeReflector interfaces.
• Implementing IAdministrableSystem provides Administration UI property management support and implementing ISystemPropertyValidator provides import time validation of LobSystem properties (not on the Microsoft Office client).
• Compile the code into a DLL and place it in the global assembly cache (GAC) on the server and clients.
• Author the model XML for the custom data source (SharePoint Designer 2010 does not support a model authoring experience for custom connectors).
At run time when a user executes a BDC operation, this invokes the Execute method in the ISystemUtility class. The responsibility of executing the back-end method is given to the Execute method.
http://msdn.microsoft.com/en-us/library/ee554911.aspx
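The difference between the two run-time behaviors described in these backup slides can be shown with a conceptual sketch. This is not BCS code: real connectors are .NET classes, and every name below (`CustomerAssembly`, `GenericConnector`, `GetCustomerV2`, the dict used as a "model") is hypothetical. It only contrasts "one method per operation" with "one execute entry point driven by the model".

```python
class CustomerAssembly:
    """Stands in for a class in a .NET connectivity assembly (hypothetical entity)."""

    def read_item(self, customer_id):
        return {"id": customer_id, "source": "assembly"}


def run_assembly_operation(entity, method_name, *args):
    # .NET Assembly Connector idea: each BDC operation is mapped to one named method,
    # so a back-end change means changing and redeploying this code
    return getattr(entity, method_name)(*args)


class GenericConnector:
    """Stands in for a custom connector with a single execute entry point."""

    def __init__(self, backend_calls):
        self._backend_calls = backend_calls  # back-end call name -> callable

    def execute(self, model_operation, *args):
        # The model (XML in BCS, a dict here) names the back-end call to make,
        # so a back-end change only requires editing the model
        return self._backend_calls[model_operation["backend_call"]](*args)


backend = {"GetCustomerV2": lambda cid: {"id": cid, "source": "backend v2"}}
connector = GenericConnector(backend)
model = {"operation": "ReadItem", "backend_call": "GetCustomerV2"}

print(run_assembly_operation(CustomerAssembly(), "read_item", 7))
print(connector.execute(model, 7))
```

In the second style, pointing `backend_call` at a different back-end method changes behavior without touching the connector class, which mirrors why the custom connector suits frequently changing back ends.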