Documentum xPlore: Fundamentals and Troubleshooting
Chad Peek
Designated Support Engineer
Worldwide Technical Support
EMC – Information Intelligence Group
xPlore: Fundamentals and Troubleshooting
• Dial-in numbers:
– U.S. dial-in numbers: (toll free) 888-643-3084, (toll) 857-207-4204
– Passcode: 34856332
• Country-specific dial-in numbers: http://www.emcconferencing.com/globalaccess/index.asp?bid=302
• Click the handouts icon to download the presentation.
• Separate Q&A session after the webinar
• The webinar is being recorded
• Follow us at: http://community.emc.com/blogs/iigsupportwebinars
Agenda
• Terminology
• Overview of xPlore
• Architecture and Dataflow
• Installation and Migration
• Troubleshooting
• Resources
Terminology
• Instance: An xPlore instance is one deployment of the xPlore WAR file to an application server container. You can have multiple instances on the same host (vertical scaling), although it is more common to have one xPlore instance per host (horizontal scaling).
• Domain: A separate, independent group of collections within an xPlore deployment that maps to a single Documentum repository.
Terminology
• Collection: A logical group of XML documents that is physically stored in an xDB library. It represents the most granular data management unit within xPlore.
• Category: Defines how a class of documents is indexed. The definition specifies the processing and semantics that are applied to an ingested XML document.
• XQuery: A language for finding and extracting elements and attributes from XML documents and anything that can appear as XML including databases.
Documentum xPlore Overview
• Documentum xPlore is the next-generation search for Documentum
– FAST is no longer supported. Existing implementations must migrate to xPlore.
• Technology Foundation: EMC xDB (native XML database) and Lucene
• Compatible with D6.5 SP2 and later
– Clients supported by D6.5 SP2 will work without change (with a few exceptions)
• Dual mode migration
– xPlore & FAST both active (index & query) on the same repository
New Features in xPlore 1.2
• 1.2 is the latest version (Dec 2011)
• Thesaurus Support
– Synonyms, alternate spellings, acronyms
• Indexing and query performance
– Wildcard performance improvements
– Automatic query warmup feature
• Language Certifications
– Russian, Arabic, Hebrew, and Brazilian Portuguese
• Administration Features
– Silent installer
Architecture
[Diagram: Content Server and Repository; Index Agent; Index Server containing xDB/Lucene and CPS]
High Level Data Flow
[Diagram: data flow between the Content Server and Repository, the Index Agent, and the Index Server (xDB/Lucene, CPS)]
1. Content is added to or modified in the repository.
2. Index Agent polls the server for changes.
3. IA transforms metadata and content into XML that the Index Server can process.
4. IS processes the XML document.
5. IA reports on indexing status.
6. Content Server queries the index (DFC/plugin).
Content Server
• Fulltext Objects
– dm_ftengine_config
– dm_ftindex_agent_config
– dm_fulltext_index
– dm_ftfilter_config
– dm_ftwatermark
• For indexing, events like dm_save, dm_checkin and dm_destroy trigger the creation of queue objects
– dmi_queue_item
– Monitored by the IA
Content Server (Cont.)
• These events are registered for object types. By default: dm_sysobject, dm_acl, dm_group and dm_folder
– All the subtypes of dm_sysobject inherit the indexing setting
– Indexing dm_acl and dm_group allows xPlore to enforce security
– The dm_folder type must be indexed for location-based queries (e.g. FOLDER DESCEND)
• Enabling and disabling indexing for object types
– You can use DA to enable or disable indexing.
– The following query can be used to quickly review the registered events:
select registered_id, event, user_name from dmi_registry where user_name like 'dm_fulltext_index_user%'
• The Content Server uses XQuery to query the xPlore index
– Query plugin is used pre 6.6
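The registry query above returns one row per registered event. The following is a minimal Python sketch of how those rows could be checked, not a Documentum API. It assumes the rows have already been fetched as (registered_id, event, user_name) tuples by whatever client you use. The function name and the sample IDs are hypothetical, and dm_save, dm_checkin and dm_destroy are treated as the expected events only because the slide names them as examples.

```python
from collections import defaultdict

# Events the slide names as examples; the real default set may differ (assumption).
EXPECTED_EVENTS = {"dm_save", "dm_checkin", "dm_destroy"}


def missing_events(rows, expected=EXPECTED_EVENTS):
    """Group dmi_registry rows by registered_id and report missing events.

    rows: iterable of (registered_id, event, user_name) tuples, i.e. the
    columns selected by the DQL query on the slide.
    Returns {registered_id: sorted list of expected events not registered}.
    """
    seen = defaultdict(set)
    for registered_id, event, _user_name in rows:
        seen[registered_id].add(event)
    return {
        rid: sorted(expected - events)
        for rid, events in seen.items()
        if expected - events
    }


# Hypothetical sample rows; the IDs are made up for illustration.
sample_rows = [
    ("0300000080000101", "dm_save", "dm_fulltext_index_user"),
    ("0300000080000101", "dm_checkin", "dm_fulltext_index_user"),
    ("0300000080000101", "dm_destroy", "dm_fulltext_index_user"),
    ("0300000080000105", "dm_save", "dm_fulltext_index_user"),
]

if __name__ == "__main__":
    print(missing_events(sample_rows))
```

An ID that appears in the result has at least one of the example events unregistered, which is a reasonable prompt to review that type's indexing configuration in DA.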
Index Agent
• Uses DFC to interact with the CS and the xPlore API to interact with xPlore
• Migration Mode
– Repository is indexed in batches using the dm_ftwatermark object
• Normal Mode
– Polls dmi_queue_item table for new entries
– dm_fulltext_index_user
• Selective Indexing
– DQL query
– Object file (a list of r_object_ids)
Index Agent (Cont.)
• Shared Content Storage
– By default the IA pulls the content from the file system.
– Mapping the filestore in the indexagent.xml allows CPS to have direct read access, which can improve performance.
– The content storage area must be mountable as read-only by the Index Agent and xPlore hosts.
• Filters
– Exclude objects from the index by cabinets, folders or object types.
– Filters must be in place before migration.
Index Server
• Components
– EMC xDB
– Lucene
– CPS
• xPlore Administration Tool
– Enable/disable auditing
– Monitor and increase logging
– OOTB reports
– Testing features
Index Server (Cont.)
• xDB
– EMC's XML database
– Tracks indexing and update requests
– Records the location of indexed content
• Lucene
– Open source tool that is embedded into xDB
– Performs query lookup and retrieval of facet and security information
• CPS (Content Processing Services)
– Retrieves indexable content from content sources
– Determines the format and language
– Parses the content into index tokens
Things to Remember
• Using DQL hints with 6.6 and 6.7 clients
– Newer versions of DFC automatically generate XQuery
– Turn off XQuery generation by adding the following setting to the dfc.properties file:
dfc.search.xquery.generation.enable=false
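To confirm the setting is actually in place on a client, the file can be inspected with a few lines of Python. This is a rough sketch under stated assumptions: it handles only plain key=value lines (not the full Java properties syntax with ":" separators or line continuations), and the helper names and sample file content are hypothetical. Only the property name comes from the slide.

```python
def parse_properties(text):
    """Parse simple key=value lines; lines starting with '#' or '!' are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def xquery_generation_disabled(text):
    """Return True only if the setting is present and explicitly set to false."""
    value = parse_properties(text).get("dfc.search.xquery.generation.enable")
    return value is not None and value.lower() == "false"


# Hypothetical dfc.properties content for illustration.
sample = """\
# search settings
dfc.search.xquery.generation.enable=false
"""

if __name__ == "__main__":
    print(xquery_generation_disabled(sample))
```

In practice you would pass in the text of the real dfc.properties file, for example read with `open(path).read()`. The function deliberately returns False when the key is absent, since the slide states that newer DFC versions generate XQuery unless told otherwise.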
Installation Overview
• Read the entire manual before starting the install
– Are you migrating or is this a fresh install?
– If you are using FAST, will you use dual mode migration?
– Understand which set of steps applies to your situation.
• Work through the sizing calculator and the other performance tools available on the EDN
– https://community.emc.com/docs/DOC-8922
– Typically you can expect xPlore to use the same resources as FAST, but you should still work through the sizing exercise.
• Use the latest xPlore patch
– Match xPlore patch with Content Server patch
Migration Overview
• Migration Options
– Immediate replacement of FAST
– Variations of the dual mode feature
• Dual Mode
– Migration feature for customers currently running FAST
– Allows you to run FAST and xPlore at the same time for the same repository
– Provides simultaneous indexing and query execution (in some cases)
Dual Mode
• Benefits
– End-user experience remains unchanged.
– Validation of xPlore index can occur without disrupting business-critical processes.
– Can slowly migrate users over from FAST to xPlore and eventually turn off FAST
– Minimizes disruption to business and can be done with limited downtime
• Drawbacks of Dual Mode
– Added complexity
– Could potentially require additional hardware
Dual Mode Migration: Single Content Server
[Diagram: Webtop, CS, and two IA/IS pairs]
• Single Content Server and client, both FAST and xPlore actively indexing.
• Queries are sent to one IS.
• Advantages
– No additional Content Servers or clients required
– Easily toggle query execution between xPlore and FAST
• Disadvantage
– Requires a restart of the repository to toggle query target
Dual Mode Migration: Single CS Validation
[Diagram: Webtop, CS, and two IA/IS pairs]
• During maintenance window toggle from FAST to xPlore
– ftengine_to_use parameter in the server.ini
• Validate the index
– Execute queries
– Run FTStateOfIndex job
• Toggle back to FAST if needed
Dual Mode Migration: Two Content Servers
[Diagram: Webtop 1, CS1, IA, IS; Webtop 2, CS2, IA, IS]
• Two Content Servers and two clients, both FAST and xPlore actively indexing.
• Queries sent to FAST index and xPlore index.
• Advantage
– No toggling for query
• Disadvantages
– Additional CS and client
– More maintenance
Dual Mode Migration: Two CSs Validation
[Diagram: Webtop 1, CS1, IA, IS; Webtop 2, CS2, IA, IS]
• Testers use Webtop 1 to validate searches against the xPlore index.
• Webtop 1 connects to CS1 which is configured to send queries to xPlore.
– ftengine_to_use
• Slowly migrate users.
• Once validated, turn off FAST; the extra CS and Webtop instance can then be removed.
Migration Considerations
• A single xPlore deployment can be shared by multiple repositories. Data is separated into domains.
– Ties the future upgrades of the repositories together
• By default, queries utilize native xPlore security [RECOMMENDED]
– ACLs & Groups replicated as part of IA re-indexing operation
• All configuration options in terms of filters, special character configuration, etc. should be done before the migration to avoid re-indexing.
[Diagram: Consolidated Deployment. Repository A and Repository B, each with an Index Agent, share one xPlore deployment.]
Things to Remember
• Do not run setup scripts twice.
• Switch from migration mode to normal mode when the IA completes the initial indexing of the repository.
• Stay current with the xPlore software if possible
– xPlore 1.2 P01 is targeted for release at the end of February.
– Patches are generally available for all xPlore versions at the end of each month.
Things to Remember (Cont.)
• Performance test your environment using Production data
• Use the sizing calculator and other tools available on the EDN
– https://community.emc.com/docs/DOC-8922
• Do NOT mix migration and Production usage. Sending queries to xPlore during migration will slow down the process.
• The resources required during migration are greater than the resources required for normal indexing in most cases.
• FAST was I/O intensive and xPlore is more CPU intensive
– Add more CPUs during migration and then remove them afterwards
• SANs generally provide the best performance.
Troubleshooting Strategy
• Strategy to help troubleshoot the most common indexing issues
• Start the indexing process with a test document
• Follow the document through the indexing process all the way to the client
• Identify the problematic component
• Use troubleshooting concepts specific to the component
Content Server
• Import New Object
• Execute the following query to view the queue:
select item_name, item_id, task_state, date_sent
from dmi_queue_item
where name like 'dm_fulltext_index_user%'
order by date_sent desc
• dmi_queue_item entry does not exist
– This issue is CS-related
– Is the type registered properly?
– Use DA to check the type's configuration
Index Agent
• If the queue object exists, what is the task_state?
• A blank state means the object is waiting to be processed
– Refresh the query
– Are other dmi_queue_item objects building up in the queue as you continue to execute the query?
• If the task state never changes, the Index Agent is the source of the issue
– Is the IA started?
– Check the log if it will not start
– Is the IA running in migration mode?
Index Agent and Index Server
• When the IA starts to process a queue object the task state will change
• There are multiple task states
– Acquired: Object is being processed
– Warning: Failed to index the content, but the metadata was indexed (metadata will be searchable but not the content)
– Failed: Indexing failed
• If the object is successfully indexed the IA will remove the queue object
• For failures or warnings, check the Index Agent log and the Index Server logs
– dsearch_home\jboss5.1.0\server\DctmServer_Indexagent\logs
– dsearch_home\jboss5.1.0\server\DctmServer_PrimaryDsearch\logs
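When many queue items are involved, it can help to count them per task state rather than read the query output row by row. The sketch below is illustrative only: it assumes the rows of the dmi_queue_item query have been fetched as (item_name, item_id, task_state, date_sent) tuples, and the exact spelling of the state values stored in the repository is an assumption (they are compared case-insensitively here). The function name and sample data are hypothetical.

```python
from collections import Counter

# Meanings paraphrased from the slides; the stored state spellings are assumptions.
STATE_MEANINGS = {
    "": "waiting to be processed",
    "acquired": "being processed",
    "warning": "metadata indexed, content not indexed",
    "failed": "indexing failed",
}


def summarize_queue(rows):
    """Count queue items per task state meaning.

    rows: iterable of (item_name, item_id, task_state, date_sent) tuples,
    matching the columns of the queue query on the earlier slide.
    """
    counts = Counter()
    for _item_name, _item_id, task_state, _date_sent in rows:
        state = (task_state or "").strip().lower()
        counts[STATE_MEANINGS.get(state, "unrecognized state: " + state)] += 1
    return counts


# Hypothetical rows for illustration; IDs and dates are made up.
sample_queue = [
    ("doc1.docx", "0900000080000a01", "", "2012-01-10 09:00:00"),
    ("doc2.pdf", "0900000080000a02", " ", "2012-01-10 09:01:00"),
    ("doc3.txt", "0900000080000a03", "acquired", "2012-01-10 09:02:00"),
    ("doc4.zip", "0900000080000a04", "Warning", "2012-01-10 09:03:00"),
    ("doc5.bin", "0900000080000a05", "failed", "2012-01-10 09:04:00"),
]

if __name__ == "__main__":
    for meaning, count in summarize_queue(sample_queue).items():
        print(count, meaning)
```

Running the summary twice a few minutes apart gives a quick view of whether the count of waiting items is growing, which is the backlog symptom the slides describe.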
Index Server: Test
• Error messages in the log should give you a good indication of the root cause
– File type is not indexable
– Content is larger than current settings allow
• The xPlore Administration and Development Guide has a troubleshooting section that lists common errors.
• The Search Development Guide has a troubleshooting section as well that is specific to search
• Powerlink is also a good resource for finding information related to the error codes.
Index Server: Test
• In some cases the queue object will be removed as if indexing was successful but you cannot find the document
• Use the ‘Test Search’ option in the xPlore Admin tool to confirm whether or not the object exists in the index
– Search using the object's r_object_id
Configuration and the Client
• If you see a result your focus should shift to the configuration of your Documentum environment and your client applications
– Does the user searching for the content have access to the object?
– Check the repository log to see if the query plugin was loaded
Mon Jun 14 21:53:50 2010 031000 [DM_FULLTEXT_T_QUERY_PLUGIN_VERSION]info:"Loaded FT Query Plugin: ...C:\Documentum\product\6.5/bin/DSEARCHQueryPlugin.dll
• Is the associated format object fulltext enabled?
Configuration and the Client
• Is the client using a fulltext query?
• Using Webtop, review the query generated by your search
• On the search results page, the native query can be viewed by clicking on the magnifying glass
• If the native query is DQL then fulltext searches could be disabled
– Check the following in your dfc.properties files:
dfc.search.fulltext.enable = false
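The troubleshooting slides amount to a short decision sequence for the test document. A minimal Python sketch of that sequence follows, assuming you record three observations by hand. The function name, parameter names and return strings are hypothetical; this is not part of xPlore or DFC.

```python
def identify_component(queue_item_created, task_state, found_in_test_search):
    """Suggest which component to investigate next for a test document.

    queue_item_created: was a dmi_queue_item created after the import?
    task_state: the current task_state string, or None once the IA has
        removed the queue item (which normally signals success).
    found_in_test_search: does 'Test Search' by r_object_id return the object?
    """
    if not queue_item_created:
        # No queue item: check type registration on the Content Server.
        return "Content Server"
    if task_state is not None:
        state = task_state.strip().lower()
        if state == "":
            # Assumes the state stayed blank across repeated refreshes.
            return "Index Agent"
        if state in ("warning", "failed"):
            return "Index Agent and Index Server logs"
        return "still processing"
    # The queue item was removed, so indexing was reported as successful.
    if not found_in_test_search:
        return "Index Server"
    return "configuration and client"


if __name__ == "__main__":
    print(identify_component(True, None, True))
```

The order of the checks mirrors the path the document takes from the repository to the client, so the first failing check points at the earliest component that could be at fault.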
Posted on the Developer Network
xPlore Landing Page on EDN: https://community.emc.com/docs/DOC-8945
xPlore Tools (sizing calculator, Bonnie, etc.): https://community.emc.com/docs/DOC-8922
xPlore HA Setup and Administration:
https://community.emc.com/docs/DOC-10111
https://community.emc.com/docs/DOC-10978
Search for ‘xplore’ for several helpful articles, videos and presentations.
Other Resources
• xPlore Administration (http://education.emc.com)
– How to install, configure, and maintain a Documentum xPlore full-text Index Server and Index Agent
• Several xPlore related white papers including migration, disaster recovery and best practices
– http://powerlink.emc.com/km/appmanager/km/secureDesktop?_nfpb=true&_pageLabel=default&internalId=0b0140668053a3a8&_irrt=true
• xPlore specific offerings from Professional Services
THANK YOU