Repository Scalability: Comparing Microsoft SharePoint 2010 and Oracle UCM 11g Raoul Miller and Brent Seaman, TEAM Informatics, inc.


Post on 22-Nov-2014


DESCRIPTION

Report on testing comparing the performance and scalability of SharePoint 2010 and Oracle UCM 11g - presented at Collaborate 2011

TRANSCRIPT

Page 1: Repository Scalability - comparing SharePoint 2010 with Oracle UCM 11g

Repository Scalability: Comparing Microsoft SharePoint 2010 and Oracle UCM 11g

Raoul Miller and Brent Seaman, TEAM Informatics, Inc.

Page 2

Outline

• Our ingestion rate experiments
• Hardware and Software setup
• Experimental design
• Observations and Conclusions from these tests
• Implications for Repository Sizing and Organization in SharePoint and UCM
• Lessons Learned and Recommendations
• Q and A

Page 3

Overall aims of this research

• Apply real-world scenarios to ingestion testing
• Rather than ultra high performance / ultra high cost
• Determine actual ingestion rates for different scenarios on identical hardware
• Expose weaknesses / issues in large imports
• Derive recommendations for best practices in importing existing content into new CMS repositories

Page 4

Experimental Approach

• Import existing files from file system into newly-installed CMS
  – Standard configurations
  – Commodity hardware
  – No specialized tuning or optimizations
  – Vendor recommended OS and databases
• Four scenarios
  – 20,000 files @ 40kB
  – 20,000 files @ 100kB
  – 1,000,000 files @ 40kB
  – 1,000,000 files @ 100kB
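For a sense of scale, the four scenarios above imply roughly the following raw content volumes. This is a back-of-the-envelope sketch: it assumes 1 kB = 1,000 bytes and that every file is exactly the nominal size, neither of which is stated on the slide.

```python
# Raw content volume implied by each ingestion scenario.
# Assumptions (not from the slides): 1 kB = 1,000 bytes and every file
# is exactly the nominal size.
SCENARIOS = [
    (20_000, 40),
    (20_000, 100),
    (1_000_000, 40),
    (1_000_000, 100),
]


def total_gb(file_count: int, size_kb: int) -> float:
    """Return the total volume in decimal GB for file_count files of size_kb each."""
    return file_count * size_kb * 1_000 / 1_000_000_000


for count, size_kb in SCENARIOS:
    print(f"{count:>9,} files @ {size_kb:>3} kB = {total_gb(count, size_kb):6.1f} GB")
```

On these assumptions the smallest scenario is under 1 GB and the largest is about 100 GB, which the eight 146GB drives described later can hold.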

Page 5

Are these Scenarios Realistic?

• >80% of single instance CMS repositories contain 50-200,000+ items

• Average “document” size in most industries is ~100kB.

• Most projects need to import existing content from file shares or other systems

Page 6

Commodity Hardware

• Dell PowerEdge R710 servers
• Dual Intel Xeon 5560 CPUs (each quad core) running at 2.8GHz
• 16GB RAM
• Eight 146GB 10K RPM SAS drives

Page 7

UCM Installation

• Operating System: Red Hat Enterprise Linux 5 (specifically the CentOS 5 build)
• Database: Oracle 11g Standard Edition database
• Web / Application Server: WebLogic 11gR1 (10.3.3)
• Content Management System: UCM 11gR1 (11.1.1.4.0)
• Java Runtime Environment: Sun Hotspot SDK (1.6.0_11) & JRockit R28
• File storage: File system (default) and JDBC (SecureFiles)

Page 8

SharePoint Installation

• Operating System: Windows Server 2008 Std Edition for Partners
• Database: Microsoft SQL Server 2008 R2 Enterprise
• Web Server: IIS7 (Standard with Windows Server 2008 - specifically v 7.5.76)
• Content Management System: SharePoint Server 2010 Enterprise for Partners
• File storage: Database Storage in SQL Server

Page 9

Ingestion Approaches

• UCM
  – Used BatchBuilder and BatchLoader
• SharePoint
  – Had to use third-party tool (UploadZen by Roxority)
  – Need to organize content before import
  – Limited flexibility in directory size

Page 10

Supported SharePoint 2010 bulk import strategies

• Multiple file upload applet
  – Silverlight; supports up to 100 docs, does not support subdirectories
• Windows Explorer view
  – Extension of WebDAV
  – Limited performance
• SharePoint Workspace
  – Client integration
  – Only supports up to 500 documents

Page 11

Differences between Import Strategies

• BatchLoader
  – Supported system tool
  – Allows automated file system crawl (BatchBuilder)
  – Storage / browse location in repository unrelated to source location
  – Supports high volume
• UploadZen
  – Third-party application
  – Requires organization and sizing of import directories
  – Organization within repository reflects import location
  – Major challenges with high volume imports

Page 12

Considerations for Repository Sizing

1. Should be primarily driven by business / infosec needs
2. Practicality
  – Import / migrate
  – Search / organize
  – Backup / DR
3. Flexibility
  – Growth in content volume / size
  – Leverage HSM / partitioning
  – Provide options for storage strategies

Page 13

Ingestion Rate Testing

• Major things to test:
  – Overall rate of ingestion with different sized files and different sized collections
  – Ease of use of import tools
  – Flexibility in organization of content during / after import

Page 14

20,000 files – each 40kB

• First set of tests
• Single directory for SharePoint source
• UCM – File System storage – 198,000 docs/hr
• UCM – JDBC storage – 156,000 docs/hr
• SharePoint – 153,000 docs/hr
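The hourly rates above are extrapolations from a 20,000-file run, so it can help to convert them back into elapsed time. The sketch below does that conversion; it assumes the rate was constant over the whole import, which the slides do not state.

```python
# Elapsed import time implied by the reported rates for the 20,000-file test.
# Assumption (not from the slides): the rate is constant for the whole run.
FILE_COUNT = 20_000
RATES_DOCS_PER_HR = {
    "UCM, file system storage": 198_000,
    "UCM, JDBC storage": 156_000,
    "SharePoint": 153_000,
}


def minutes_to_import(file_count: int, docs_per_hr: int) -> float:
    """Return the minutes needed to import file_count documents at docs_per_hr."""
    return file_count / docs_per_hr * 60


for system, rate in RATES_DOCS_PER_HR.items():
    print(f"{system}: {minutes_to_import(FILE_COUNT, rate):.1f} min")
```

Under that assumption each 20,000-file run takes roughly six to eight minutes, so the differences between systems in this test amount to a minute or two of wall-clock time.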

Page 15

20,000 files – each 100kB

• UCM – File System storage – 171,000 docs/hr
• SharePoint – 138,000 docs/hr
• Ingestion rates fell 10-15% for larger file size
• SharePoint RAM usage higher, primarily in database
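The 10-15% figure can be checked against the rates reported for the two 20,000-file tests. The sketch below compares the 40kB and 100kB results for the two configurations that appear on both slides; the function name is illustrative only.

```python
# Relative drop in ingestion rate when moving from 40kB to 100kB files,
# using the 20,000-file results reported on the slides.
def percent_decrease(rate_small_files: int, rate_large_files: int) -> float:
    """Return the percentage drop from rate_small_files to rate_large_files."""
    return (rate_small_files - rate_large_files) / rate_small_files * 100


ucm_drop = percent_decrease(198_000, 171_000)         # UCM, file system storage
sharepoint_drop = percent_decrease(153_000, 138_000)  # SharePoint

print(f"UCM (file system): {ucm_drop:.1f}% lower")
print(f"SharePoint:        {sharepoint_drop:.1f}% lower")
```

Both results, about 13.6% for UCM and about 9.8% for SharePoint, fall in or very near the quoted 10-15% range.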

Page 16

1,000,000 files – each 40kB

• Need to organize files in directories for SharePoint
  – 50 folders each with 20,000 items – failed
  – 2,000 folders each with 500 items – succeeded
• UCM – FS storage & Sun JRE – 205,000 docs/hr
• UCM – FS storage & JRockit JRE – 212,000 docs/hr
• UCM – JDBC storage & Sun JRE – 171,000 docs/hr
• SharePoint w/ 50 import folders – failed
• SharePoint w/ 2,000 import folders – 217,000 docs/hr
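The pre-import organization step described above (splitting a flat collection into folders of 500 items) can be sketched as follows. This is not the procedure the authors used, which the slides do not describe; the function names, the `batch_NNNN` folder naming, and the choice to move rather than copy files are all assumptions made for illustration.

```python
# Hypothetical staging step: split a flat directory into subfolders of at most
# 500 files each before handing them to an import tool.
import shutil
from pathlib import Path


def plan_batches(file_names, batch_size=500):
    """Group file names into consecutive batches of at most batch_size items."""
    names = sorted(file_names)
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


def stage_for_import(source_dir, staging_dir, batch_size=500):
    """Move files from a flat source_dir into numbered subfolders of staging_dir.

    Returns the number of subfolders created.
    """
    source, staging = Path(source_dir), Path(staging_dir)
    files = [p.name for p in source.iterdir() if p.is_file()]
    batches = plan_batches(files, batch_size)
    for index, batch in enumerate(batches, start=1):
        target = staging / f"batch_{index:04d}"  # folder naming is an assumption
        target.mkdir(parents=True, exist_ok=True)
        for name in batch:
            shutil.move(str(source / name), str(target / name))
    return len(batches)
```

With `batch_size=500`, a collection of 1,000,000 files yields the 2,000 folders used in the successful SharePoint run.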

Page 17

1,000,000 files – each 40kB (contd.)

• Substantial work to organize content for SharePoint import
• SharePoint much more RAM intensive
  – Primarily with database process
• UCM more CPU intensive
  – Much more linear response

Page 18

1,000,000 files – each 100kB

• UCM – FS storage & Sun JRE – 179,000 docs/hr
  – 15% decrease in rate due to file size
• Unable to complete test with SharePoint
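Document counts per hour hide how much data is actually being moved. As a rough sketch, the UCM rates for the two 1,000,000-file runs (file system storage, Sun JRE) can be converted into byte throughput; this assumes 1 kB = 1,000 bytes, 1 MB = 1,000,000 bytes, and files of exactly the nominal size, none of which is stated on the slides.

```python
# Approximate byte throughput implied by the UCM document rates.
# Assumptions (not from the slides): decimal units and exact nominal file sizes.
def mb_per_second(docs_per_hr: int, size_kb: int) -> float:
    """Convert a docs/hr rate at a given file size (kB) into MB per second."""
    return docs_per_hr * size_kb / 1_000 / 3_600


small_files = mb_per_second(205_000, 40)   # 1,000,000 files @ 40kB
large_files = mb_per_second(179_000, 100)  # 1,000,000 files @ 100kB

print(f"40kB files:  {small_files:.2f} MB/s")
print(f"100kB files: {large_files:.2f} MB/s")
```

On these assumptions the 100kB run moves more than twice as many bytes per second as the 40kB run, even though it handles fewer documents per hour.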

Page 19

Conclusions

• SharePoint requires 3rd party tools and substantial work before import
• SharePoint has limited flexibility in terms of repository sizing, content organization, and import strategies
• With optimized import, SharePoint ingestion rates are comparable to UCM
• UCM has much more flexibility in import strategies
• UCM has consistent import rates between 156,000 and 212,000 docs/hr (OOTB)

Page 20

Conclusions (contd.)

• Ingestion rates are dependent on average file size (10-15% decrease in rate between 40kB and 100kB file size)
• UCM can be deployed on commodity hardware for repositories of 1,000,000 items
• SharePoint has challenges importing 1,000,000 files on commodity hardware
• Both systems function well on this hardware after import
• SharePoint import is much more RAM intensive whereas UCM import is CPU intensive

Page 21

Q&A

Page 22

Comments / Questions / Feedback

Contact us:

Raoul Miller: [email protected]

Brent Seaman: [email protected]