repository scalability - comparing sharepoint 2010 with oracle ucm 11g

Post on 22-Nov-2014

DESCRIPTION

Report on testing comparing the performance and scalability of SharePoint 2010 and Oracle UCM 11g - presented at Collaborate 2011

TRANSCRIPT

Repository Scalability: Comparing Microsoft SharePoint 2010 and Oracle UCM 11g

Raoul Miller and Brent Seaman, TEAM Informatics, Inc.

Outline

• Our ingestion rate experiments
• Hardware and Software setup
• Experimental design
• Observations and Conclusions from these tests
• Implications for Repository Sizing and Organization in SharePoint and UCM
• Lessons Learned and Recommendations
• Q and A

Overall aims of this research

• Apply real-world scenarios to ingestion testing
  – Rather than ultra high performance / ultra high cost
• Determine actual ingestion rates for different scenarios on identical hardware
• Expose weaknesses / issues in large imports
• Derive recommendations for best practices in importing existing content into new CMS repositories

Experimental Approach

• Import existing files from file system into newly-installed CMS
  – Standard configurations
  – Commodity hardware
  – No specialized tuning or optimizations
  – Vendor recommended OS and databases
• Four scenarios
  – 20,000 files @ 40kB
  – 20,000 files @ 100kB
  – 1,000,000 files @ 40kB
  – 1,000,000 files @ 100kB
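To put the four scenarios in perspective, the raw payload of each one can be worked out from the file count and file size. The sketch below does only that arithmetic; it assumes decimal units (1 kB = 1,000 bytes), which the slides do not specify, and it ignores metadata, indexes and database overhead.

```python
# Scenario sizes taken from the slide: (number of files, size of each file in kB).
# Assumption: decimal units, i.e. 1 GB = 1,000,000 kB.
SCENARIOS = [
    (20_000, 40),
    (20_000, 100),
    (1_000_000, 40),
    (1_000_000, 100),
]


def total_gb(file_count: int, size_kb: int) -> float:
    """Return the raw payload of one scenario in gigabytes."""
    return file_count * size_kb / 1_000_000


for count, size in SCENARIOS:
    print(f"{count:>9,} files @ {size:>3} kB = {total_gb(count, size):6.1f} GB")
```

Under these assumptions the scenarios range from under 1 GB to 100 GB of raw content.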

Are these Scenarios Realistic?

• >80% of single instance CMS repositories contain 50-200,000+ items

• Average “document” size in most industries is ~100kB.

• Most projects need to import existing content from file shares or other systems

Commodity Hardware

• Dell PowerEdge R710s server
• Dual Intel Xeon 5560 CPUs (each quad core) running at 2.8 GHz
• 16GB RAM
• Eight 146GB 10K RPM SAS drives

UCM Installation

• Operating System: RedHat Enterprise 5 (specifically the CentOS 5 build)
• Database: Oracle 11g Standard Edition database
• Web / Application Server: WebLogic 11gR1 (10.3.3)
• Content Management System: UCM 11gR1 (11.1.1.4.0)
• Java Runtime Environment: Sun HotSpot SDK (1.6.0_11) & JRockit R28
• File storage: File system (default) and JDBC (SecureFiles)

SharePoint Installation

• Operating System: Windows Server 2008 Std Edition for Partners
• Database: Microsoft SQL Server 2008 R2 Enterprise
• Web Server: IIS7 (Standard with Windows Server 2008 - specifically v 7.5.76)
• Content Management System: SharePoint Server 2010 Enterprise for Partners
• File storage: Database Storage in SQL Server

Ingestion Approaches

• UCM
  – Used BatchBuilder and BatchLoader
• SharePoint
  – Had to use third-party tool (UploadZen by Roxority)
  – Need to organize content before import
  – Limited flexibility in directory size

Supported SharePoint 2010 bulk import strategies

• Multiple file upload applet
  – Silverlight; supports up to 100 docs, does not support subdirectories
• Windows Explorer view
  – Extension of WebDAV
  – Limited performance
• SharePoint Workspace
  – Client integration
  – Only supports up to 500 documents

Differences between Import Strategies

• BatchLoader
  – Supported system tool
  – Allows automated file system crawl (BatchBuilder)
  – Storage / browse location in repository unrelated to source location
  – Supports high volume
• UploadZen
  – Third-party application
  – Requires organization and sizing of import directories
  – Organization within repository reflects import location
  – Major challenges with high volume imports

Considerations for Repository Sizing

1. Should be primarily driven by business / infosec needs
2. Practicality
  – Import / migrate
  – Search / organize
  – Backup / DR
3. Flexibility
  – Growth in content volume / size
  – Leverage HSM / partitioning
  – Provide options for storage strategies

Ingestion Rate Testing

• Major things to test:
  – Overall rate of ingestion with different sized files and different sized collections
  – Ease of use of import tools
  – Flexibility in organization of content during / after import

20,000 files – each 40kB

• First set of tests
• Single directory for SharePoint source
• UCM – File System storage – 198,000 docs/hr
• UCM – JDBC storage – 156,000 docs/hr
• SharePoint – 153,000 docs/hr

20,000 files – each 100kB

• UCM – File System storage – 171,000 docs/hr
• SharePoint – 138,000 docs/hr
• Ingestion rates fell 10-15% for larger file size
• SharePoint RAM usage higher, primarily in database
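The size-related drop can be checked directly from the rates reported for the two 20,000-file tests. This is a minimal sketch of that arithmetic; the dictionary layout and function name are illustrative, and the rates are the rounded figures from the slides.

```python
# Ingestion rates in docs/hr for the 20,000-file tests: (40 kB files, 100 kB files).
RATES = {
    "UCM, file system storage": (198_000, 171_000),
    "SharePoint": (153_000, 138_000),
}


def percent_drop(rate_40kb: int, rate_100kb: int) -> float:
    """Return the relative decrease in ingestion rate, in percent."""
    return (rate_40kb - rate_100kb) / rate_40kb * 100


for system, (small, large) in RATES.items():
    print(f"{system}: {percent_drop(small, large):.1f}% slower with 100 kB files")
```

Both results (about 13.6% for UCM and 9.8% for SharePoint) fall in or very near the 10-15% range quoted on the slide.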

1,000,000 files – each 40kB

• Need to organize files in directories for SharePoint
  – 50 folders each with 20,000 items – failed
  – 2,000 folders each with 500 items – succeeded
• UCM – FS storage & Sun JRE – 205,000 docs/hr
• UCM – FS storage & JRockit JRE – 212,000 docs/hr
• UCM – JDBC storage & Sun JRE – 171,000 docs/hr
• SharePoint w/ 50 import folders – failed
• SharePoint w/ 2,000 import folders – 217,000 docs/hr
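The slides do not say how the source files were arranged into 2,000 folders of 500 items. One way such pre-staging could be scripted is sketched below: it moves files from a flat directory into numbered subfolders with a fixed maximum size. The function name, the `batch_NNNNN` folder naming and the default of 500 are assumptions for illustration only, and the script has no connection to UploadZen itself.

```python
import shutil
from pathlib import Path


def split_into_folders(source: Path, target: Path, max_items: int = 500) -> int:
    """Move the files in a flat source directory into numbered subfolders of target.

    Each subfolder receives at most max_items files.
    Returns the number of subfolders created.
    """
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    # List the files up front so newly created folders are never picked up.
    files = sorted(p for p in source.iterdir() if p.is_file())

    folder_count = 0
    for start in range(0, len(files), max_items):
        # Hypothetical naming scheme: batch_00000, batch_00001, ...
        folder = target / f"batch_{folder_count:05d}"
        folder.mkdir(parents=True, exist_ok=False)
        for path in files[start:start + max_items]:
            shutil.move(str(path), str(folder / path.name))
        folder_count += 1
    return folder_count
```

Called as `split_into_folders(Path("export"), Path("staged"))` (both paths hypothetical), a flat directory of 1,000,000 files would yield 2,000 folders, matching the layout that succeeded in the test.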

1,000,000 files – each 40kB (contd.)

• Substantial work to organize content for SharePoint import
• SharePoint much more RAM intensive
  – Primarily with database process
• UCM more CPU intensive
  – Much more linear response

1,000,000 files – each 100kB

• UCM – FS storage & Sun JRE – 179,000 docs/hr
  – 15% decrease in rate due to file size
• Unable to complete test with SharePoint

Conclusions

• SharePoint requires 3rd party tools and substantial work before import
• SharePoint has limited flexibility in terms of repository sizing, content organization, and import strategies
• With optimized import, SharePoint ingestion rates are comparable to UCM
• UCM has much more flexibility in import strategies
• UCM has consistent import rates between 156,000 and 212,000 docs/hr (OOTB)
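For planning purposes, the reported rate range translates into an import window for a given repository size. The sketch below assumes the rate stays constant for the whole import, which is a simplification; the function name is illustrative.

```python
def hours_to_import(item_count: int, docs_per_hour: int) -> float:
    """Estimate import duration in hours, assuming a constant ingestion rate."""
    return item_count / docs_per_hour


# Slowest and fastest UCM rates reported in the tests.
for rate in (156_000, 212_000):
    hours = hours_to_import(1_000_000, rate)
    print(f"{rate:,} docs/hr -> {hours:.1f} hours for 1,000,000 items")
```

At these rates a 1,000,000-item import would take roughly 4.7 to 6.4 hours.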

Conclusions (contd.)

• Ingestion rates are dependent on average file size (10-15% decrease in rate between 40kB and 100kB file size)
• UCM can be deployed on commodity hardware for repositories of 1,000,000 items
• SharePoint has challenges importing 1,000,000 files on commodity hardware
• Both systems function well on this hardware after import
• SharePoint import is much more RAM intensive whereas UCM import is CPU intensive

Q&A

Comments / Questions / Feedback

Contact us:

Raoul Miller
Raoul.miller@teaminformatics.com

Brent Seaman
Brent.seaman@teaminformatics.com
