Adapting AWStats for IBM® WebSphere® Portal 6.0.x and virtual portals

Markus Moltenbrey
IT Specialist, IBM Software Group
IBM Deutschland GmbH
August 2008

© Copyright International Business Machines Corporation 2008. All rights reserved.

Abstract: The functionality to track site usage is still provided as of IBM® WebSphere® Portal version 6.0.x, but the Tivoli® Web Site Analyzer that charts the resulting log files from version 5.1 was deprecated. As a result, the open-source solution AWStats has become increasingly popular with WebSphere Portal customers. The free software product creates extensive charts formatted for browser access with minimal effort; however, it cannot discern between virtual portals and does not display the human-readable part of the site URL string. This white paper explains how to extend AWStats to provide the missing functionality.


Page 1: Adapting AWStats for IBM® WebSphere® Portal 6.0.x and ...public.dhe.ibm.com/software/dw/websphere/AWStatsforPortalFinal.pdf · Figure 2. AWStats output example for WebSphere Portal


Contents

1 Introduction
2 Colors & naming conventions
3 Process overview
4 Site analyzer
5 Prerequisites
    5.1 Installing perl
    5.2 Installing Apache
    5.3 Installing Java® Runtime Environment (JRE)
6 AWStats installation and configuration
    6.1 Installation
    6.2 Configuration
7 Pre-AWStats script installation
    7.1 Script description
    7.2 Installation
    7.3 URLs and tasks
    7.4 Script configuration
    7.5 Cronjob configuration
8 The script package
    8.1 variables.sh
    8.2 awstats_update.sh
    8.3 prepare_parsing.sh
    8.4 parseSALog.pl
    8.5 exprequest.xml
    8.6 transform.xsl
    8.7 AWStats sample configuration (truncated)
9 Configuration alternatives
    9.1 Non-Portal-Server hosting AWStats
    9.2 Anonymous usernames
    9.3 XMLAccess export
10 Conclusion
11 About the author
12 Resources


1 Introduction

Many customers require a basic site analysis of their IBM® WebSphere® Portal environment. Here we examine a cost-effective solution that combines the integrated site analyzer engine, which provides information comparable to an Apache access.log, with AWStats, an open-source Perl-based application that visualizes the resulting logs (see figure 1).

Figure 1. AWStats example

In WebSphere Portal 6.0 the log entries of accessed pages consist of long, cryptic strings in which the page title is shown at the very end of the request URL. The graphs and statistical data that AWStats aggregates from them are of little use, because only the first part of each string is displayed in the graphs; the page title is not displayed (see figure 2).

Figure 2. AWStats output example for WebSphere Portal 6.0

Moreover, accesses to different virtual portal sites are logged in the same file and represented by one resulting graph. If the statistics for each virtual portal are to be presented separately and the accessed pages are to be human readable, then some adaptations are required. This white paper explains a solution that was implemented at one customer site. Note that it does not include detailed installation instructions for Apache, AWStats, or WebSphere Portal; for information on these topics, see the developerWorks article, “IBM WebSphere Developer Technical Journal: Using portal analytics with open-source reporting tools.”

2 Colors & naming conventions

The color codes we’ll use in this white paper are as follows:

• User input
• Variable name
• Directory
• Filename
• Code/Command (example)

Table 1 explains the site-specific parameters we’ll use. You’ll need to replace them with your environment’s site-specific values.

Table 1. Our site-specific parameters

| Parameter | Description |
|---|---|
| $vp_name | Virtual portal URI context, e.g. http://myportal.server.com/wps/portal/VP1 |
| $awstats_install_root | Installation directory of AWStats. Default: /usr/local/awstats |
| $script_root | Root directory where the provided scripts are located, e.g. /usr/local/scripts |
| $portal_server_root | Default product location, e.g. C:\IBM\WebSphere\PortalServer |
| $portal_admin_id | WebSphere Portal Server administrator user ID (short) |
| $portal_admin_pw | WebSphere Portal Server administrator user password |
| $portal_hostname | The fully qualified domain name of WebSphere Portal |
| $wc_default_port | WC_DEFAULT port of the WebSphere Portal Server instance |
| $was_root | Default product location of WebSphere Application Server |


3 Process overview

We did not change the AWStats source code, although that would have been possible. To remain compatible with later releases and to keep the process flexible, we implemented a pre-processor and scripts that automatically adapt the site-analyzer log files and deliver better data to AWStats (see figure 3).

Figure 3. Process overview

In brief, the requirements for this customer’s configuration are:

1. Completely separated statistical information for the several different virtual portal sites.
2. The displayed WebSphere Portal pages must be human readable.
3. The statistics must be updated frequently and automatically.
4. Users accessing WebSphere Portal must be tracked to deliver data on how many users accessed the site while keeping the users' names anonymous.

4 Site analyzer

Site analysis is not enabled by default. To activate the data collection for WebSphere Portal, you must set the parameters shown in table 2 in the WP_SiteAnalyzerLogService. See the IBM WebSphere Portal information center for more information on how to set configuration properties.

Table 2. Parameter values in WP_SiteAnalyzerLogService

| Parameter | Value |
|---|---|
| SiteAnalyzerFileHandler.fileName | log/sa_$APPSERVER_NAME.log |
| SiteAnalyzerFileHandler.backupFileName | log/sa_$APPSERVER_NAME_$CREATE_TIME.log |
| SiteAnalyzerFileHandler.dateFormat | yyyy.MM.dd-HH.mm.ss |
| SiteAnalyzerFileHandler.daysPerLogFile | 1 |

To activate selected loggings, set the corresponding values to true (see table 3).

Table 3. Selected loggings

| Parameter | Value |
|---|---|
| SiteAnalyzerSessionLogger.isLogging | true |
| SiteAnalyzerUserManagementLogger.isLogging | false |
| SiteAnalyzerPageLogger.isLogging | true |
| SiteAnalyzerPortletLogger.isLogging | true |
| SiteAnalyzerPortletActionLogger.isLogging | false |
| SiteAnalyzerErrorLogger.isLogging | false |
| SiteAnalyzerApplicationActionLogger.isLogging | false |

Here is an example log entry:

10.15.100.76 - Mark Mustermann [20/Jun/2008:04:36:23 +0000] "GET /Page/[ObjectIDImpl_'6_QRQMLMFAUFUJ902T20FMDT1I92'_[-7351565280287691910:5300871843438323840@-1196_/_CONTENT_NODE],_Domain:_[Domain:_rel],_DB_representation:_54FB-7A6B5BED53FEF99980E8023CDB7A9049]/News HTTP/1.1" 200 -1 "" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)" "JSESSIONID=00018zhK4eeMpyo6GJsB-UK0Zhq:-3483ID;LtpaToken=pwTAXkGiuWs4MYE4FKKb7WCPfFnwThtigHGXlc/yASknHmWCKXp3v4IDkNapj2Ws2y8fbqx/TwRqtF/KxUBoys9cWOU8GYSUhM+OUizUXFcWrGVwX11vMGZU/lv66m64RjbX0znh85oWfFmK2LP7hG8G/cV8eINA0VdQvVVOw9FXkJ8qnyXoTOqNY+njGv1b4mVl9AJagoMR5pl5XTXav+nUMJ10bR+DSYgPX3Lirw0u4LuaQjuU/KMEArFGB3aA2kslPEUSapJBdvH4opq5FS0NsA0VHvMR+avG8BKh2Zsx/lHBLzu4oxndva7F9RsOBxmCrgv+jDBUQ523LOR/mhkJ3uwC3rb3uj5/4+reZG0="

The following data can be derived from the log entry:

• The request was made from IP 10.15.100.76.
• The authenticated user for this request was Mark Mustermann.
• Handling the request was finished at 20/Jun/2008:04:36:23, GMT +0000.
• The page with title "News" was requested.
• The request was successful (HTTP response code 200).
• The size of the returned markup is unknown to the logger (-1).
• The request was made by a browser that identifies itself as "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)".
• The request was made within a session, identified by the id "00018zhK4eeMpyo6GJsB".
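As an illustration of the field breakdown above, the following sketch pulls the same fields out of one log line with sed. It is not part of the script package: the function name sa_fields is our own, the browser and session strings of the sample line are shortened for readability, and the pattern only handles page requests in the format shown here.

```shell
# sa_fields LINE
# Prints ip, user, time, page title, status and size of one Site Analyzer
# page request. Hypothetical helper for illustration only.
sa_fields() {
    printf '%s\n' "$1" | sed -n \
        's|^\([^ ]*\) - \(.*\) \[\([^]]*\)\] "GET /Page/.*/\([^/ ]*\) HTTP/[0-9.]*" \([0-9]*\) \(-\{0,1\}[0-9]*\) .*$|ip=\1;user=\2;time=\3;title=\4;status=\5;bytes=\6|p'
}

# Sample entry from above; user agent and cookie are shortened (assumption).
sample_line=$(cat <<'EOF'
10.15.100.76 - Mark Mustermann [20/Jun/2008:04:36:23 +0000] "GET /Page/[ObjectIDImpl_'6_QRQMLMFAUFUJ902T20FMDT1I92'_[-7351565280287691910:5300871843438323840@-1196_/_CONTENT_NODE],_Domain:_[Domain:_rel],_DB_representation:_54FB-7A6B5BED53FEF99980E8023CDB7A9049]/News HTTP/1.1" 200 -1 "" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" "JSESSIONID=0001"
EOF
)

sa_fields "$sample_line"
```

The page title is taken as the last path segment before " HTTP/", which is where WebSphere Portal 6.0 places it in the request URL.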

After modification with the scripts, the log entry looks similar to the following:


10.15.100.76 - $1$Ved3sWIq$DL/0e1WN35lumC2RfHUIb/ [20/Jun/2008:04:36:23 +0000] "GET /Page/(news.unique.name)/News HTTP/1.1" 200 -1 "" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)" "JSESSIONID=00018zhK4eeMpyo6GJsB-UK0Zhq:-3483ID;LtpaToken=pwTAXkGiuWs4MYE4FKKb7WCPfFnwThtigHGXlc/yASknHmWCKXp3v4IDkNapj2Ws2y8fbqx/TwRqtF/KxUBoys9cWOU8GYSUhM+OUizUXFcWrGVwX11vMGZU/lv66m64RjbX0znh85oWfFmK2LP7hG8G/cV8eINA0VdQvVVOw9FXkJ8qnyXoTOqNY+njGv1b4mVl9AJagoMR5pl5XTXav+nUMJ10bR+DSYgPX3Lirw0u4LuaQjuU/KMEArFGB3aA2kslPEUSapJBdvH4opq5FS0NsA0VHvMR+avG8BKh2Zsx/lHBLzu4oxndva7F9RsOBxmCrgv+jDBUQ523LOR/mhkJ3uwC3rb3uj5/4+reZG0="

Table 4 explains the adapted sample log entry in more detail.

Table 4. Detail of the sample log entry

Original string: Mark Mustermann
Replacement:     $1$Ved3sWIq$DL/0e1WN35lumC2RfHUIb/

Original string: GET /Page/[ObjectIDImpl_'6_QRQMLMFAUFUJ902T20FMDT1I92'_[-7351565280287691910:5300871843438323840@-1196_/_CONTENT_NODE],_Domain:_[Domain:_rel],_DB_representation:_54FB-7A6B5BED53FEF99980E8023CDB7A9049]/News HTTP/1.1
Replacement:     GET /Page/(news.unique.name)/News HTTP/1.1
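The two replacements in table 4 can be sketched in a few lines of shell. This is only an illustration of the idea, not the script from the package (which is written in Perl): the function name rewrite_line is hypothetical, md5sum from GNU coreutils is assumed to be installed, the ObjectID-to-name mapping is passed in by hand instead of being read from the XML export, and the sample line is shortened.

```shell
# rewrite_line LINE OID UNIQUENAME TITLE
# Prints LINE with the user name replaced by its MD5 hex digest and the
# ObjectID part of the page URL replaced by "(UNIQUENAME)/TITLE".
# Assumption: OID, UNIQUENAME and TITLE contain no characters that are
# special to sed (such as | or &).
rewrite_line() {
    line=$1; oid=$2; uname=$3; title=$4
    # the user name sits between "<ip> - " and " [timestamp]"
    user=$(printf '%s\n' "$line" | sed -n 's|^[^ ]* - \(.*\) \[[^]]*\] ".*$|\1|p')
    hash=$(printf '%s' "$user" | md5sum | cut -d' ' -f1)
    printf '%s\n' "$line" | sed \
        -e "s|^\([^ ]* - \).* \(\[[^]]*\] \"\)|\1$hash \2|" \
        -e "s|\"GET /Page/.*$oid.*\( HTTP/[0-9.]*\"\)|\"GET /Page/($uname)/$title\1|"
}

# Shortened sample entry (user agent and cookie abbreviated; assumption).
entry=$(cat <<'EOF'
10.15.100.76 - Mark Mustermann [20/Jun/2008:04:36:23 +0000] "GET /Page/[ObjectIDImpl_'6_QRQMLMFAUFUJ902T20FMDT1I92'_[-7351565280287691910:5300871843438323840@-1196_/_CONTENT_NODE],_Domain:_[Domain:_rel],_DB_representation:_54FB-7A6B5BED53FEF99980E8023CDB7A9049]/News HTTP/1.1" 200 -1 "" "Mozilla/4.0 (compatible; MSIE 7.0)" "JSESSIONID=0001"
EOF
)

rewrite_line "$entry" "6_QRQMLMFAUFUJ902T20FMDT1I92" "news.unique.name" "News"
```

Because the same user name always yields the same digest, AWStats can still count distinct users without seeing their clear-text names.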

5 Prerequisites

To run AWStats, you must have a running Web server, for example, Apache, with the Perl module (mod_perl) enabled, to host the analyzer application. In addition to the HTTP server, the provided scripts require a Linux/UNIX-based operating system and a Perl installation with the following additional modules (available from CPAN):

• XML::XPath
• XML::Parser

5.1 Installing perl

If you are running a Linux system (or most UNIX systems, including Mac OS X), you probably already have an installation of perl that was packaged with it. Determine the version by issuing the following command:

    perl -v

The output reports the installed perl version.

If perl is not installed on your current environment, install it using one of the following methods:

• Via precompiled packages of your Linux distribution (for example, RPM or DEB). The appropriate files should be located on the installation CDs or on one of the distributor's package servers. On SuSE try YaST, and on Red Hat try yum to install perl. For example, on Red Hat, issue the following command as root user to install perl:

    yum install perl

• Download and install ActivePerl:

    http://www.activestate.com/store/productdetail.aspx?prdGuid=81fbce826bd549bca91508d58c2648ca

• Download, compile, and install the perl source code from CPAN:

    http://www.cpan.org/authors/id/R/RG/RGARCIA/perl-5.10.0.tar.gz

Installation steps for additional perl modules:

1. Download the package from CPAN:

    wget http://download.server.url

   | Package | Download URL |
   |---|---|
   | XML-Parser-2.36.tar.gz | http://search.cpan.org/CPAN/authors/id/M/MS/MSERGEANT/XML-Parser-2.36.tar.gz |
   | XML-XPath-1.13.tar.gz | http://search.cpan.org/CPAN/authors/id/M/MS/MSERGEANT/XML-XPath-1.13.tar.gz |

2. Extract the package on your server to a temporary directory:

    tar zxvf packagename.tar.gz

3. Switch to the package directory and issue the following commands to compile and install the module:

    cd package_directory
    perl Makefile.PL
    make
    make test
    # install as root without leaving the package directory
    su -c "make install"

4. Repeat the above steps for each module.


5.2 Installing Apache

Some distributions include a prepackaged version of Apache on the installation discs or package server. To install Apache via the distributor's packages, use yum, YaST, or apt-get (depending on your distributor) to install Apache HTTP Server, including the required perl module extension. If you cannot install Apache by this method, or the version provided is not the desired one, download and install the Web server according to the instructions on the Apache Web site.

5.3 Installing Java® Runtime Environment (JRE)

Either use the IBM JRE delivered with WebSphere Application Server, located in $was_root/java, or download an RPM package from either the Sun or the IBM developerWorks Linux Download information Web site.

6 AWStats installation and configuration

AWStats is available for download from the AWStats Official Web site.

6.1 Installation

Since AWStats is a perl application, no compilation is required. Just download it, extract the downloaded package, move the content to /usr/local/awstats, and run the configuration wizard from inside /usr/local/awstats/tools:

    tar zxvf awstats-x.x.tar.gz -C /tmp/
    mkdir /usr/local/awstats
    mv /tmp/awstats-x.x/* /usr/local/awstats
    cd /usr/local/awstats/tools
    ./awstats_configure.pl

| Config wizard question | Description | Answer (example) |
|---|---|---|
| Config file path ('none' to skip Web server setup): | Provide the full path to your HTTP server's configuration file | /etc/httpd/conf/httpd.conf |
| Do you want me to build a new AWStats config/profile? | Answer this question with No, since we will create the configuration profiles later | N |

In the Apache Webserver's configuration file the following changes should be visible:

NOTE: If you do not want the wizard to adapt your httpd.conf (for example, because additional configurations are located in /etc/httpd/conf.d and are only included via httpd.conf), you do not need to run the configuration wizard for installation. Instead, copy httpd_conf from $awstats_install_root/tools to /etc/httpd/conf.d/awstats.conf and, if you installed AWStats to a nonstandard directory, adapt the paths within the file.

6.2 Configuration

To create an AWStats configuration profile, re-run the AWStats configuration wizard (awstats_configure.pl). If you want to create one report per virtual portal, you must create one configuration profile for each virtual portal. Repeat the following steps for each profile you want to create:

1. Start the wizard with the following command within the AWStats installation directory:

    cd /usr/local/awstats
    perl /usr/local/awstats/tools/awstats_configure.pl

2. Answer the questions as shown in table 5:

Table 5. Configuration wizard questions & answers

| Question | Description | Answer |
|---|---|---|
| Config file path ('none' to skip Web server setup): | Only specify a configuration file here if this is the first time you run the wizard. | none |
| Do you want me to build a new AWStats config/profile file (required if first install) [y/N] ? | Answer this question with 'Yes' to start the creation of a new configuration profile | Y |
| Your Web site, virtual server, or profile name: | Use the URI context name of your virtual portal(s) here instead of a fully qualified hostname, e.g. VPx for http://myportal.server.com/wps/portal/VPx | VPx |

3. Now edit the configuration files (found in /etc/awstats) for each virtual portal, using the values shown in table 6.

Table 6. Configuration file values

| Parameter | Description | Value |
|---|---|---|
| LogFile | The path of the log file to be analyzed. $vp_name must be replaced by the name of the corresponding virtual portal, e.g., VPx | "/usr/local/awstats/salogs/$vp_name/$vp_name.log" |
| LogType | The type of log file that will be analyzed ('W' means Web log file) | W |
| LogFormat | The format of one log entry in the log file | "%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot" |
| DirData | Only change this value if you do not want to place the AWStats data in the default directory. (If the directory is not present, create it with 'mkdir /var/lib/awstats') | "/var/lib/awstats" |

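With several virtual portals, editing each profile by hand becomes repetitive. The following sketch applies the table 6 values to one profile file. The helper names set_param and configure_vp are hypothetical, and the example path /etc/awstats/awstats.VPx.conf assumes the wizard's usual awstats.<profile>.conf naming; check the file names on your system before use.

```shell
# set_param FILE KEY VALUE
# Replace the line "KEY=..." in FILE, or append it if FILE has no such line.
# Assumption: VALUE contains no | or & characters (they are special to sed).
set_param() {
    file=$1; key=$2; value=$3
    if grep -q "^$key=" "$file"; then
        sed "s|^$key=.*|$key=$value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# configure_vp CONFFILE VPNAME
# Write the table 6 values for virtual portal VPNAME into CONFFILE.
configure_vp() {
    conf=$1; vp=$2
    set_param "$conf" LogFile "\"/usr/local/awstats/salogs/$vp/$vp.log\""
    set_param "$conf" LogType W
    set_param "$conf" LogFormat '"%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"'
    set_param "$conf" DirData '"/var/lib/awstats"'
}

# Example call (not executed here):
#   configure_vp /etc/awstats/awstats.VPx.conf VPx
```

The functions only touch the four parameters from table 6 and leave every other line of the profile unchanged.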

7 Pre-AWStats script installation

7.1 Script description

Table 7 lists the scripts necessary for installation and what each one does.

Table 7. Script descriptions

| Script | Description |
|---|---|
| awstats_update.sh | Main script; executes the other required processes. |
| variables.sh | Stores required environment variables and properties; it is sourced by prepare_parsing.sh, which awstats_update.sh invokes, and read by parseSALog.pl. |
| prepare_parsing.sh | Invokes the xmlaccess task to export the portal site structure of each virtual portal; transforms the XML using transform.xsl; merges the site analyzer log files; backs up and deletes old files; invokes the sub-script parseSALog.pl. |
| parseSALog.pl | Parses the original Site Analyzer log file and splits it into one file per virtual portal; replaces the page URLs with a more human-readable version; anonymizes the username component. |
| exprequest.xml | XML input file for the xmlaccess request to export the page structure of the (virtual) portal. |
| transform.xsl | XSL to transform the XML exported with exprequest.xml into a simpler form containing only AWStats-relevant information. |

7.2 Installation

To use the scripts, extract the package and copy the files to a directory, for example, /usr/local/scripts, on your WebSphere Portal server (see table 8).

Table 8. Pre-AWStats scripts and their default installation directory

| File | Directory |
|---|---|
| awstats_update.sh | /usr/local/scripts |
| prepare_parsing.sh | /usr/local/scripts/awstats |
| variables.sh | /usr/local/scripts/awstats |
| parseSALog.pl | /usr/local/scripts/awstats |
| exprequest.xml | $awstats_install_root/salogs/ |
| transform.xsl | $awstats_install_root/salogs/ |

7.3 URLs and tasks

To manually update the AWStats configuration of the new sites, enter the following command at the command prompt, ensuring that the required pre-scripts have been executed as well:

    perl awstats.pl -update -config=VPx

You can view the new statistics in a browser via the following URL:

    http://myportal.server.com/awstats/awstats.pl?config=VPx


7.4 Script configuration

To use the scripts, replace the parameters marked with $parameter as described in Section 8 below. After the primary setup, you can add as many virtual portal instances as you want by creating an AWStats configuration for each new instance and adding the instance name (URL context) to the VPS list in variables.sh.

NOTE: If you modified the scripts in a Windows editor, execute the command dos2unix on the Linux/UNIX shell to remove Windows line endings and convert the files to a Linux/UNIX format.

7.5 Cronjob configuration

To run the scripts automatically, add the following lines to your cron configuration:

    # ------------------------------------------------------------------
    # Extensions for AWSTATS
    #
    # Update awstats database every day at 01:00
    00 01 * * * $script_root/awstats_update.sh

8 The script package

In this section the source code of each script is listed and briefly explained. The parts that must be adapted for your environment are also marked.

8.1 variables.sh

Table 9 explains the parameters of the variables.sh script.

Table 9. Default values and descriptions of the variables.sh script

| Parameter | Description | Default value |
|---|---|---|
| datefunc | Full path to system tool 'date' | /usr/bin/date (UNIX), /bin/date (Linux) |
| mk | Full path to system tool 'mkdir' | /usr/bin/mkdir (UNIX), /bin/mkdir (Linux) |
| rm | Full path to system tool 'rm' | /bin/rm (UNIX & Linux) |
| java | Full path to java installation | /usr/java14/jre/bin/java (UNIX), /usr/bin/java (Linux) |
| perl | Full path to perl installation | /usr/bin/perl (UNIX & Linux) |
| tar | Full path to GNU Tar | /usr/bin/tar (UNIX), /bin/tar (Linux) |
| merger | Full path to the AWStats log merger script | /usr/local/awstats/tools/logresolvemerge.pl |
| parsesa | Full path to the perl script parseSALog.pl | /usr/local/scripts/awstats/parseSALog.pl |
| awstats | Full path to AWStats | /usr/local/awstats/wwwroot/cgi-bin/awstats.pl |
| awstats_script_root | Installation directory of the scripts preparing the SA logs for AWStats (except: awstats_update.sh) | /usr/local/scripts/awstats |
| WPBIN | The binary directory of the local WebSphere Portal installation directory | /usr/IBM/WebSphere/Portal/bin |
| CP | System library path | /usr/lib |
| log_base_dir | Log and working directory of the scripts | /usr/local/awstats/salogs |
| SALOGDIR | Directory where the original site analyzer logs are located | /usr/local/awstats/salogs |
| CURRDATE | Holds information about the current date | `${datefunc} +%d%m%Y_%H%M` |
| WPADMIN | WebSphere Portal administrator | wpsadmin |
| WPADMINPWD | WebSphere Portal administrator password | Passw0rd |
| WPURL | WebSphere Portal URL for the xmlaccess configuration interface | http://my.portal.server:10038/wps/config |
| INXML | Full path to the XML file for issuing the xmlaccess page export | ${awstats_script_root}/exprequest.xml |
| XSL | Full path to the transformation XSL | ${awstats_script_root}/transform.xsl |
| NEWSA | Full path to the merged site analyzer log file | /usr/local/awstats/salogs/samerge.log |
| VPS | List of the virtual portal context URIs separated by spaces (for the main portal enter 'root') | root |

The source code is as follows:

NOTE: Blue-colored elements in this script (the $-prefixed placeholders such as $awstats_install_root) are variable replacements for this document only; you cannot keep these during script execution. Change the values according to your environment.

    #!/bin/bash
    #Setup environment variables...

    #Applications
    datefunc=/bin/date
    mk=/bin/mkdir
    rm=/bin/rm
    java=/usr/bin/java
    perl=/usr/bin/perl
    merger=$awstats_install_root/tools/logresolvemerge.pl
    parsesa=$script_root/awstats/parseSALog.pl
    awstats=$awstats_install_root/wwwroot/cgi-bin/awstats.pl
    tar=/bin/tar
    awstats_script_root=/usr/local/scripts/awstats

    #Directories
    WPBIN=$portal_server_root/bin
    CP=/usr/lib
    #Please do not use any variable replacements here, since these values are required for perl as well
    #Directory of new SA logs
    log_base_dir=$awstats_install_root/salogs
    #Directory of current SA logs
    SALOGDIR=$awstats_install_root/salogs

    #Other variables
    #%M is the minute (%m would repeat the month)
    CURRDATE=`${datefunc} +%d%m%Y_%H%M`
    WPADMIN=$portal_admin_id #Portal Administrator
    WPADMINPWD=$portal_admin_pw #Portal Admin Password
    WPURL=http://$portal_hostname:$wc_default_port/wps/config #Websphere Portal configuration XML
    INXML=${awstats_script_root}/exprequest.xml #XML Request file and path
    XSL=${awstats_script_root}/transform.xsl #XSL Transformation filename and path
    logfile=/usr/local/awstats/log/log_${CURRDATE}
    #Path to and filename of merged SA log
    NEWSA=$awstats_install_root/salogs/samerge.log #Do not use any variable replacements here
    #Names of virtual Portals separated by a space character
    VPS="root $VPx" #entries must be entered case-sensitive
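Because parseSALog.pl (section 8.4) reads three of these settings with regular expressions instead of running the shell script, the exact layout of those lines matters. The following sketch, with the hypothetical function name check_variables_file, tests a variables.sh file against our reading of those expressions: VPS must be double-quoted, NEWSA must end in .log and be followed by a comment, and log_base_dir must not carry a trailing comment.

```shell
# check_variables_file FILE
# Exit status 0 if the three lines read by parseSALog.pl look parseable.
check_variables_file() {
    file=$1
    rc=0
    # VPS: double-quoted, space-separated list
    if ! grep -Eq '^VPS="[^"]*"' "$file"; then
        echo 'VPS: expected a double-quoted list, e.g. VPS="root VP1"' >&2
        rc=1
    fi
    # NEWSA: path ending in .log, then whitespace and a comment
    if ! grep -Eiq '^NEWSA=.*\.log[[:space:]]+[^[:space:]]' "$file"; then
        echo 'NEWSA: expected a path ending in .log followed by a comment' >&2
        rc=1
    fi
    # log_base_dir: everything up to the end of the line is taken as the path
    if ! grep -Eiq '^log_base_dir=[^[:space:]#]+[[:space:]]*$' "$file"; then
        echo 'log_base_dir: expected a plain path without a trailing comment' >&2
        rc=1
    fi
    return $rc
}

# Example call (not executed here):
#   check_variables_file /usr/local/scripts/awstats/variables.sh
```

Running such a check after every edit of variables.sh is cheaper than discovering an empty log directory in the next nightly run.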

8.2 awstats_update.sh

NOTE: Be careful when adapting the variable in green to your environment in the script:

    #!/bin/bash
    #set -x
    #######################################################################
    #
    # This script initiates the pre-AWStats scripts and the update process
    # of AWStats
    #
    # Name : awstats_update
    #
    # PreReq : The appropriate configuration on awstats has to be prepared
    #
    # This productivity aid is provided on an "AS-IS" basis
    # WITHOUT ANY WARRANTY OF ANY KIND
    #
    ########################################################################
    awstats_bin=$script_root/awstats
    datefunc=/bin/date
    #%M is the minute (%m would repeat the month)
    currdate=`${datefunc} +%F_%H%M`
    logfile=$awstats_install_root/log/cronlog_${currdate}

    #creating log header
    echo "------------------------------------------" >> ${logfile} 2>> ${logfile}
    echo "Execution date: `${datefunc} +%F`" >> ${logfile} 2>> ${logfile}
    echo "Execution time: `${datefunc} +%T`" >> ${logfile} 2>> ${logfile}
    echo "Used config: $1" >> ${logfile} 2>> ${logfile}
    echo "" >> ${logfile} 2>> ${logfile}

    sh ${awstats_bin}/prepare_parsing.sh

    echo "Process ended at `date +%T` on `date +%F` " >> ${logfile} 2>> ${logfile}
    echo "------------------------------------------" >> ${logfile} 2>> ${logfile}
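The repeated redirections in the log header can be wrapped in a small helper. The following is only a sketch of one possible simplification, not part of the delivered package; the function names stamp and log and the temporary log location are our own assumptions.

```shell
# stamp: current date and time as YYYY-MM-DD_HHMM.
# %M is the minute; %m would insert the month a second time.
stamp() {
    date +%F_%H%M
}

# log MESSAGE...: append one line to the file named in $logfile.
log() {
    printf '%s\n' "$*" >> "$logfile"
}

# Example: write a run header to a temporary file (location is an assumption).
logfile="${TMPDIR:-/tmp}/cronlog_$(stamp)"
log "------------------------------------------"
log "Execution date: $(date +%F)"
log "Execution time: $(date +%T)"
```

Keeping the file name in one variable and the redirection in one function makes it harder for a single echo line to write to the wrong place.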

8.3 prepare_parsing.sh

Adapt the second line in the script according to your environment (in green text):

    #!/bin/bash
    source $script_root/awstats/variables.sh

    echo "Export site structure from ${WPURL}..." >> ${logfile} 2>> ${logfile}
    for VP in ${VPS}
    do
        echo "Connecting to ${VP}..." >> ${logfile} 2>> ${logfile}
        LOGDIR=${log_base_dir}/${VP}
        OUT=${LOGDIR}/${VP}.xml
        CURRDATE=`${datefunc} +%F_%H%M`
        if [ ! -e ${LOGDIR} ]; then
            ${mk} -p ${LOGDIR}
        fi
        ${rm} -f ${OUT}
        #The main portal ("root") is exported via the plain config URL,
        #a virtual portal via the config URL plus its URL context
        if [ "${VP}" = "root" ]; then
            ${WPBIN}/xmlaccess.sh -user ${WPADMIN} -password ${WPADMINPWD} -url ${WPURL} -in ${INXML} -out ${OUT}~
        else
            ${WPBIN}/xmlaccess.sh -user ${WPADMIN} -password ${WPADMINPWD} -url ${WPURL}/${VP} -in ${INXML} -out ${OUT}~
        fi
        ${java} -cp ${CP} org.apache.xalan.xslt.Process -IN ${OUT}~ -XSL ${XSL} -OUT ${OUT}
        ${rm} -f ${OUT}~
    done

    echo "Merging site SA log files..." >> ${logfile} 2>> ${logfile}
    ${perl} ${merger} ${SALOGDIR}/sa_WebSphere_Portal_*-*.log > ${NEWSA}

    echo "Deleting original files..." >> ${logfile} 2>> ${logfile}
    ${tar} -cf ${SALOGDIR}/logs_${CURRDATE}.tar ${SALOGDIR}/sa_WebSphere_Portal_*-*.log
    ${rm} -f ${SALOGDIR}/sa_WebSphere_Portal_*-*.log

    echo "Start AWStats pre-process..." >> ${logfile} 2>> ${logfile}
    ${perl} ${parsesa}

    echo "Run AWStats configuration..." >> ${logfile} 2>> ${logfile}
    for VP in ${VPS}
    do
        LOGDIR=${log_base_dir}/${VP}
        OUT=${LOGDIR}/${VP}.log
        echo "Updating awstats config for ${VP}..." >> ${logfile} 2>> ${logfile}
        ${awstats} -update -config=${VP} -LogFile=${OUT}
        echo "Deleting log for ${VP}" >> ${logfile} 2>> ${logfile}
        #Remove the processed log file (rm -f cannot remove the directory itself)
        ${rm} -f ${OUT}
    done
    echo "GENERATE_VP FINISHED ${CURRDATE}" >> ${logfile} 2>> ${logfile}
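The only difference between exporting the main portal and a virtual portal is the URL handed to xmlaccess. A minimal sketch of that rule follows; the function name vp_config_url and the example host are our own assumptions.

```shell
# vp_config_url BASE_URL VP
# Prints the xmlaccess URL for one entry of the VPS list: the plain
# configuration URL for "root", otherwise the URL plus the portal's context.
vp_config_url() {
    if [ "$2" = "root" ]; then
        printf '%s\n' "$1"
    else
        printf '%s/%s\n' "$1" "$2"
    fi
}

for vp in root VP1 VP2; do
    echo "$vp -> $(vp_config_url http://myportal.server.com:10038/wps/config "$vp")"
done
```

The comparison is case-sensitive, which is why the entries of the VPS list must be entered exactly as the URL contexts are defined in WebSphere Portal.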


8.4 parseSALog.pl Adapt the line in green to your environment: #!/usr/bin/perl use Digest::MD5 qw(md5 md5_hex); use XML::XPath; # Declare the subroutines sub trim($); sub ltrim($); sub rtrim($); $anonymous = md5_hex("Anonymous"); $variables = "$script_root/awstats/variables.sh"; print STDOUT "Opening properties file " . $variables . "...\n"; open (PROP, "<$variables") or die "an error occurred: $!"; @properties = <PROP>; close(PROP); foreach (@properties) { $propline = trim($_); if ( $propline =~ /^VPS="(.*)".*$/ ) { $vps = trim($1); } if ( $propline =~ /^NEWSA=(.*.log)\s.*/i ) { $filename = $1; } if ( $propline =~ /^log_base_dir=(.*)$/i ) { $logdir = $1; $xmldir = $1; } } print STDOUT "Properties read: \n"; print STDOUT "Logdir: " . $logdir . "\n"; print STDOUT "XMLdir: " . $xmldir . "\n"; print STDOUT "Logfile: " . $filename . "\n"; @portals = split(/ /, $vps); foreach $portal (@portals) { print STDOUT "VP: " . $portal . "\n"; } #Create Hash / Mappings new sa-log-name => Input Site XML print STDOUT "Creating list of virtual portals...\n"; foreach $portal (@portals) { $xmlvpmap{$portal} = $portal . ".xml"; } print STDOUT "Opening sa log " . $filename . "...\n"; open (SALOG, "<$filename") or die "an error occured: $!"; @lines = <SALOG>; # Read it into an array close(SALOG); # Close the file #Parse the SA Log for each XML File print STDOUT "Start processing each portal...\n"; foreach $key (keys %xmlvpmap) { $counter = 0; print STDOUT "Processing virtual portal " . $key . "...\n"; $logfile = $logdir . "/" . $key . "/" . $key . ".log"; $xmlfile = $xmldir . "/" . $key . "/" . $xmlvpmap{$key};


    $xp = XML::XPath->new(filename => $xmlfile) or die ("Cannot open" . $xmlfile);
    open (FILEHANDLE, ">$logfile") or die ("Cannot open" . $key);

    #Preparse XML file
    print STDOUT "Parsing xmlfile " . $xmlfile . "...\n";
    $nodeset = $xp->find('/export/page');
    foreach $node ($nodeset->get_nodelist) {
        $objectid = $node->find('@objectid');
        $titles{$objectid} = $node->find('@title');
        $unids{$objectid} = $node->find('@uniquename');
    }

    print STDOUT "Parsing log file...\n";
    #Go through sa.log
    foreach (@lines) {
        $counter++;
        if ( $counter % 100 == 0 ) {
            print STDOUT "Parsed " . $counter . " lines so far! \n";
        }
        $line = trim($_);
        #Only GET Page Requests are required
        if ( $line =~ /"GET \/Page\//i ) {
            #Extract username from line
            $line =~ /.*-\s(.*)\s\[/i;
            $username = $1;
            #Value for logname/username is not empty or '-'
            if ( $1 eq '-' ) {
                #Replace empty entry with md5 hash for anonymous
                $line =~ s/\s-\s-/ - $anonymous/g;
            } elsif ( $1 eq '' ) {
                #Replace empty entry with md5 hash for anonymous
                $line =~ s/\s-\s-/ - $anonymous/g;
            } else {
                #Replace clear text username with a md5 hash
                $username_md5 = md5_hex($username);
                $line =~ s/$username/$username_md5/g;
            }
            #Extract objectID
            $line =~ /\[ObjectIDImpl_'(.*)'/i;
            $oid = $1;
            #Check for OID in XML
            if ( defined($titles{$oid}) ) {
                $title = $titles{$oid};
                $uniquename = $unids{$oid};
                #Replace spaces in title & uniquename
                $title =~ s/\s/_/g;
                $uniquename =~s/\s/_/g;
                $oidreplacement = '"("' . $uniquename . '")"' .$title;
                $line =~ s/(\"GET \/Page\/)(.*)($oid)(.*)(\sHTTP.*)/\1$oidreplacement\5/g;
                print FILEHANDLE $line . "\n";
            }
        }
    }
    #Clear hashes containing OID and title mappings


    undef %titles;
    undef %unids;
    close (FILEHANDLE);
}

# Perl trim function to remove whitespace from the start and end of the string
sub trim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

# Left trim function to remove leading whitespace
sub ltrim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    return $string;
}

# Right trim function to remove trailing whitespace
sub rtrim($) {
    my $string = shift;
    $string =~ s/\s+$//;
    return $string;
}

8.5 exprequest.xml

<?xml version="1.0" encoding="UTF-8"?>
<request xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="PortalConfig_1.4.xsd"
         type="export">
    <portal action="locate">
        <content-node action="export" uniquename="wps.content.root" export-descendants="true"/>
    </portal>
</request>


8.6 transform.xsl

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text" />

    <xsl:template match="/">
        <xsl:text>&lt;?xml version='1.0'?&gt; </xsl:text>
        <xsl:text>&lt;export&gt; </xsl:text>
        <xsl:apply-templates select="request/portal/content-node" />
        <xsl:text> &lt;/export&gt; </xsl:text>
    </xsl:template>

    <xsl:template match="request/portal/content-node">
        <xsl:text> &lt;page </xsl:text>
        <xsl:value-of select="request/portal/content-node[@type='page']"/>
        <xsl:text> uniquename=&quot;</xsl:text><xsl:value-of select="@uniquename"/>
        <xsl:text>&quot; objectid=&quot;</xsl:text>
        <xsl:value-of select="@objectid"/>
        <xsl:text>&quot; title=&quot;</xsl:text>
        <xsl:value-of select="localedata[@locale='de']/title"/>
        <xsl:text>&quot; /&gt;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
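The stylesheet above reduces the xmlaccess export to a flat list of page elements that carry the attributes uniquename, objectid and title, in that order. As a quick sanity check before parseSALog.pl reads such a file, the following sketch lists the object ID and title of every page. The function name list_pages and the sample IDs and titles are hypothetical; real object IDs look different.

```shell
#!/bin/bash
# Hypothetical sample of a transformed export (illustrative values only).
sample_xml='<?xml version="1.0"?> <export> <page  uniquename="wps.content.root" objectid="ID_ROOT" title="Content Root" /> <page  uniquename="" objectid="ID_HOME" title="Home" /> </export>'

# Print "objectid => title" for every <page> element read from stdin.
# Assumes the attribute order written by the stylesheet
# (uniquename, objectid, title) and no '>' inside attribute values.
list_pages() {
    grep -o '<page [^>]*>' |
        sed 's/.*objectid="\([^"]*\)" title="\([^"]*\)".*/\1 => \2/'
}

printf '%s\n' "${sample_xml}" | list_pages
```

To inspect a real file, the function could be fed with the transformed XML of one virtual portal, for example `list_pages < root.xml`.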

8.7 AWStats sample configuration (truncated)

This is merely an excerpt of a configuration file. Only non-standard parameters are listed.

LogFile="/usr/local/awstats/salogs/VPx/VPx.log"
LogType=W
LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot"
LogSeparator=" "
SiteDomain="VPx"
HostAliases="VPx 127.0.0.1 localhost"
DNSLookup=2
DirData="/var/lib/awstats"
DirCgi="/awstats"
DirIcons="/awstatsicons"
AllowToUpdateStatsFromBrowser=0
AllowFullYearView=2
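Because prepare_parsing.sh calls AWStats with -config=${VP}, each virtual portal needs its own configuration file, and AWStats resolves -config=NAME to a file named awstats.NAME.conf. Assuming that VPx in the excerpt above stands for the name of a virtual portal, the per-portal files could be generated from one template. The following is only a sketch: the function render_config, the shortened template and the portal names are hypothetical, and it writes to a temporary directory rather than to the real AWStats configuration directory.

```shell
#!/bin/bash
# Shortened, hypothetical template using the VPx placeholder from the excerpt.
template='LogFile="/usr/local/awstats/salogs/VPx/VPx.log"
SiteDomain="VPx"
HostAliases="VPx 127.0.0.1 localhost"'

# Replace every VPx placeholder with the given virtual portal name.
# Assumes the name contains no '/' or other characters special to sed.
render_config() {
    local vp="$1"
    printf '%s\n' "${template}" | sed "s/VPx/${vp}/g"
}

VPS="root sales"        # example virtual portal names (assumption)
outdir=$(mktemp -d)     # stand-in for the AWStats configuration directory

for VP in ${VPS}; do
    render_config "${VP}" > "${outdir}/awstats.${VP}.conf"
    echo "wrote ${outdir}/awstats.${VP}.conf"
done
```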

9 Configuration alternatives

This section discusses some alternative configurations.

9.1 Non-Portal-Server hosting AWStats

In the example environment, both the HTTP server hosting AWStats and the Portal server instance are installed on the same physical server. If the Portal server and the Web server are hosted on different machines in your infrastructure, you must modify the scripts slightly, because the prepare_parsing.sh script requires xmlaccess to export the page hierarchies.


In this case, copy the scripts to the server hosting WebSphere Portal and comment out the following lines in prepare_parsing.sh:

echo "Run AWStats configuration..." >> ${logfile} 2>> ${logfile}
for VP in ${VPS}
do
    LOGDIR=${log_base_dir}/${VP}
    OUT=${LOGDIR}/${VP}.log
    echo "Updating awstats config for ${VP}..." >> ${logfile} 2>> ${logfile}
    ${awstats} -update -config=${VP} -LogFile=${OUT}
    echo "Deleting log for ${VP}" >> ${logfile} 2>> ${logfile}
    ${rm} -f ${LOGDIR}
    echo "Now removing ${OUT}..." >> ${logfile} 2>> ${logfile}
    ${rm} -f ${OUT}
done

Provide the new site analyzer logs to the AWStats server, for example by establishing an NFS share, and then create another script on the server hosting AWStats to execute the code above.

9.2 Anonymous usernames

To display the usernames in clear text rather than as MD5 hashes, comment out the username handling in parseSALog.pl shown below. Leave the enclosing if ( $line =~ /"GET \/Page\//i ) { line itself active, because its closing brace follows later in the script:

if ( $line =~ /"GET \/Page\//i ) {
    #Extract username from line
    $line =~ /.*-\s(.*)\s\[/i;
    $username = $1;
    #Value for logname/username is not empty or '-'
    if ( $1 eq '-' ) {
        #Replace empty entry with md5 hash for anonymous
        $line =~ s/\s-\s-/ - $anonymous/g;
    } elsif ( $1 eq '' ) {
        #Replace empty entry with md5 hash for anonymous
        $line =~ s/\s-\s-/ - $anonymous/g;
    } else {
        #Replace clear text username with a md5 hash
        $username_md5 = md5_hex($username);
        $line =~ s/$username/$username_md5/g;
    }

9.3 XMLAccess export

The frequent xmlaccess export of the Portal site structure can be skipped if (1) you provide the XML manually or (2) no virtual portals are used (see footnote 1). The following adaptations are required to switch off this part of the script:

1. Adapt parseSALog.pl as follows:

with URL replacement using xmlaccess:

Footnote 1: Actually, the whole URL string could be shortened to display only the page title, but if virtual portals are used and a separate statistic is to be created for each portal, the xmlaccess export is required. Without the XML files of each portal, the script cannot parse the logs correctly.


#Extract objectID
$line =~ /\[ObjectIDImpl_'(.*)'/i;
$oid = $1;
#Check for OID in XML
if ( defined($titles{$oid}) ) {
    $title = $titles{$oid};
    $uniquename = $unids{$oid};
    $oidreplacement = '"(" . $uniquename . ")" .$title . "\"';
    $line =~ s/(\"GET \/Page\/)(.*)($oid)(.*)(\sHTTP.*)/\1$oidreplacement\5/g;
    print FILEHANDLE $line . "\n";
}

without xmlaccess:

#Extract the URL part that follows the closing "]/"
$line =~ /(.*\]\/)(.*)(\sHTTP.*\")/i;
$replacement = $2;
$line =~ s/(\"GET.*)(\/Page.*)(\sHTTP.*)/\1$replacement\3/g;
print FILEHANDLE $line . "\n";

2. Comment out the following lines in parseSALog.pl:

$xmlfile = $xmldir . "/" . $key . "/" . $xmlvpmap{$key};
$xp = XML::XPath->new(filename => $xmlfile) or die ("Cannot open" . $xmlfile);

#Preparse XML file
print STDOUT "Parsing xmlfile " . $xmlfile . "...\n";
$nodeset = $xp->find('/export/page');
foreach $node ($nodeset->get_nodelist) {
    $objectid = $node->find('@objectid');
    $titles{$objectid} = $node->find('@title');
    $unids{$objectid} = $node->find('@uniquename');
}

3. Comment out the following lines in prepare_parsing.sh:

${rm} -f ${OUT}
if [ ${VP} = "root" ]; then
    ${WPBIN}/xmlaccess.sh -user ${WPADMIN} -password ${WPADMINPWD} -url ${WPURL} -in ${INXML} -out ${OUT}~
else
    ${WPBIN}/xmlaccess.sh -user ${WPADMIN} -password ${WPADMINPWD} -url ${WPURL}/${VP} -in ${INXML} -out ${OUT}~
fi
${java} -cp ${CP} org.apache.xalan.xslt.Process -IN ${OUT}~ -XSL ${XSL} -OUT ${OUT}
${rm} -f ${OUT}~


4. Set the value of the property VPS in variables.sh to root, so that only the main portal is processed:

VPS="root" #entries are case-sensitive
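Steps 1 to 4 switch the export off by commenting out lines. As an alternative sketch, the export block in prepare_parsing.sh could be wrapped in a guard so that it can be toggled without editing the script body. The variable EXPORT_SITE and the function export_enabled are hypothetical; they do not exist in the original variables.sh, and the echo statements only stand in for the real xmlaccess and Xalan calls.

```shell
#!/bin/bash
# EXPORT_SITE is a hypothetical switch (assumption): "yes" enables the
# xmlaccess export, any other value skips it. Defaults to "no" here.
EXPORT_SITE="${EXPORT_SITE:-no}"

# Succeeds (exit status 0) only when the export is enabled.
export_enabled() {
    [ "${EXPORT_SITE}" = "yes" ]
}

VPS="root"
for VP in ${VPS}; do
    if export_enabled; then
        # Placeholder for the xmlaccess.sh call and the XSL transformation
        echo "exporting site structure for ${VP}"
    else
        echo "skipping site structure export for ${VP}"
    fi
done
```

With such a guard, the changes to parseSALog.pl described in steps 1 and 2 would still be needed, since the Perl script reads the XML files independently of the shell script.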

10 Conclusion

AWStats is a free and powerful solution for tracking user activities in WebSphere Portal 6.0 and displaying the information graphically. This paper has shown how to implement an add-on that extends AWStats with support for virtual portals and with human-readable page descriptions in the resulting statistical analysis.

11 About the author

Markus Moltenbrey is an IT Specialist for WPLC infrastructure and architecture within IBM's Software Services for Lotus®. He began his career with IBM four years ago, working on Lotus Workplace and WebSphere Portal customer projects. The author can be reached at [email protected].

12 Resources

• AWStats installation instructions: http://awstats.sourceforge.net/docs/awstats_setup.html
• developerWorks WebSphere Developer Technical Journal: Using portal analytics with open-source reporting tools: http://www.ibm.com/developerworks/websphere/techjournal/0609_liesche/0609_liesche.html
• WebSphere Portal Information Center: http://publib.boulder.ibm.com/infocenter/wpdoc/v6r0/index.jsp
• Apache HTTP Server: http://httpd.apache.org/
• CPAN: http://search.cpan.org/
• IBM JRE Download page: http://www.ibm.com/developerworks/java/jdk/linux/download.html

Trademarks

IBM, Lotus, Tivoli, and WebSphere are trademarks or registered trademarks of IBM Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.