Web Analytics: A Brief Tutorial
by
Dr. Robert J. Boncella
Professor of Information Systems & Technology
School of Business
Washburn University
Presented March 2008
To SAIS 2008
Introduction
• Web analytics is the study of the behavior of website visitors.
• In a commercial context, web analytics refers to the use of data collected from a web site to determine which aspects of the website achieve the business objectives.
• Tutorial Outline
– Web Analytics: Context
– Web Analytics: Technology & Terminology
– Web Analytics: Tools and Case Studies
Context for Web Analytics
• DSS - Decision Support System
– A conceptual framework for a process of supporting managerial decision-making, usually by modeling problems and employing quantitative models for solution analysis
• BI - Business Intelligence, a subset of DSS
– An umbrella term that combines architectures, tools, databases, applications, and methodologies
• BA - Business Analytics, a subset of BI
– The application of models directly to business data
– Assists in making strategic decisions
• WA - Web Analytics, a subset of BA
– The application of business analytics activities to Web-based processes, including e-commerce
Web Analytics - Details
• Relevant Technology
– Internet & TCP/IP
– Client / Server Computing
– HTTP (HyperText Transfer Protocol)
– Server Log Files & Cookies
– Web Bugs
• Data Collection - The Clickstream
– Server Log Files
– Page Tagging
• Data Analysis
– Data Preparation
– Pattern Discovery
– Pattern Analysis
Client/Server Computing
[Figure: a client sends a message to a server ("This is a request"), and the server sends a message back to the client ("This is a response").]
Internet & TCP/IP
• The Internet
– The infrastructure that provides for the delivery of data between computer-based processes
• TCP/IP
– The protocols that provide for reliable delivery of data on the Internet
HTTP Protocol
• Client sends a request to a server
• Server sends a response to client
• Connectionless (in the simplest case, one connection per request)
– Client:
  • Opens connection to server
  • Sends request
– Server:
  • Responds to request
  • Closes connection
• Stateless
– Client and server have no memory of prior connections
– Server cannot distinguish one client's requests from another client's
Cookies
• Used to solve the “statelessness” of the HTTP protocol
• Used to store and retrieve user-specific information on the web
• When an HTTP server responds to a request, it may send additional information that is stored by the client: “state information”
• When the client makes a later request to this server, the client returns the “cookie” that contains its state information
• State information may be a client ID that can be used as an index to a client data record on the server
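The cookie round trip described above can be sketched in a few lines of Python using the standard `http.cookies` module. This is only an illustration: the cookie name `client_id`, the in-memory `client_records` store, and the `server_respond` helper are hypothetical, and no real network traffic is involved.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side store: client ID -> list of pages requested.
client_records = {}

def server_respond(cookie_header, page):
    """Handle one request; return (Set-Cookie value or None, client record)."""
    cookie = SimpleCookie(cookie_header or "")
    if "client_id" in cookie:
        # Returning client: the cookie carries the state the server lacks.
        client_id = cookie["client_id"].value
        set_cookie = None
    else:
        # First request: issue a client ID and ask the browser to store it.
        client_id = f"c{len(client_records) + 1}"
        out = SimpleCookie()
        out["client_id"] = client_id
        set_cookie = out["client_id"].OutputString()
    record = client_records.setdefault(client_id, [])
    record.append(page)
    return set_cookie, record

# First request: no cookie yet, so the server issues one.
set_cookie, record = server_respond(None, "/Page_A.html")
# The browser stores the cookie and returns it with the next request.
_, record = server_respond(set_cookie, "/Page_B.html")
```

Here the client ID acts as the index into the server-side record, so both requests end up in the same record even though HTTP itself keeps no state between them.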
Web Bug Process
[Figure: web bug tracking across three sites.]
• Participants: a client browser (My_Brwsr); Server A, Server B, and Server C; and a web bug server (WBS) at WBS.TRKSTRM.COM
• Pages A, B, and C each contain URLs and image sources, including a web bug image hosted at WBS.TRKSTRM.COM
• For each page:
– The browser requests the page (Req: Page_A.html) and receives it (Res: Page_A.html)
– 1. Render page: the browser requests the web bug image from the WBS; the request carries the Referer header and any cookie for TRKSTRM.com
– The WBS responds with the web bug image and, on the first request, a cookie for the client browser
– 2. Click on URL: the user moves on to the next page (Page_B.html, then Page_C.html)
• Record associated with cookie My_Brwsr: Pg A - Server A, Pg B - Server B, Pg C - Server C
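The web bug process can be imitated with a toy tracking server. The sketch below is an assumption-laden illustration, not the actual mechanism of any vendor: the `WebBugServer` class, the `visitor-N` cookie values, and the `example` host names are all hypothetical stand-ins for the WBS and Servers A, B, and C.

```python
from itertools import count

class WebBugServer:
    """Toy stand-in for the tracking server (WBS)."""

    def __init__(self):
        self._ids = count(1)
        self.trails = {}  # cookie value -> list of referring pages

    def serve_image(self, referer, cookie=None):
        """Log the referring page; issue a cookie on the first request."""
        if cookie is None:
            cookie = f"visitor-{next(self._ids)}"
        self.trails.setdefault(cookie, []).append(referer)
        return cookie  # the browser stores this and resends it next time

wbs = WebBugServer()
cookie = None  # the browser has no cookie for the tracker yet
pages = [
    "http://server-a.example/Page_A.html",
    "http://server-b.example/Page_B.html",
    "http://server-c.example/Page_C.html",
]
for page in pages:
    # Rendering each page triggers a request for the web bug image,
    # with the page URL in the Referer header.
    cookie = wbs.serve_image(referer=page, cookie=cookie)
```

Although the three pages live on three different servers, the tracker ends up with a single trail for the browser, because the same cookie accompanies every image request.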
Common Clickstream Data Sources
• Server Log Files
– Passive data collection
– Normal part of the web browser / web server transaction
• Page Tagging
– Active data collection
– Often requires a third party (a vendor) to implement
Server Log Files
Each time a client requests a resource, the server of that resource may record the following in its log files:
• The name & IP address of the client computer
• The time of the request
• The URL that was requested
• The time it took to send the resource
• If HTTP authentication is used, the username of the user of the client
• Any errors that occurred
• The referer link
• The kind of web browser that was used
Server Log Files
• Example
– 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
  • 127.0.0.1 - remote host
  • frank - user name
  • [10/Oct/2000:13:55:36 -0700] - date & time
  • "GET /apache_pb.gif HTTP/1.0" - request
  • 200 - status
  • 2326 - bytes
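As a rough sketch, the example entry can be split into its fields with a regular expression. The pattern below assumes the entry follows the Common Log Format, in which the field between the remote host and the user name (shown as "-" in the example) is the remote logname; the names `CLF` and `parse_clf` are illustrative only.

```python
import re
from datetime import datetime

# One named group per field of a Common Log Format entry.
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_clf(line):
    """Return a dict of fields, or None if the line does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" in the size field means no bytes were returned.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record

entry = parse_clf(
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
```

Logs that also record the referer and browser need a longer pattern; this sketch covers only the six fields listed in the example.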
Server Log Files
• Technical issues for server log data
– Data Preparation
– Pageview Identification
– User Identification
– Session Identification
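Session identification is often approached with an inactivity timeout. The sketch below assumes that users have already been identified (the `user` key is hypothetical, for example a cookie value) and that a gap of more than 30 minutes starts a new session; both the threshold and the `sessionize` helper are assumptions for illustration, not something the tutorial prescribes.

```python
from datetime import datetime, timedelta

def sessionize(requests, timeout=timedelta(minutes=30)):
    """Group (user, timestamp, url) tuples into per-user sessions."""
    sessions = []       # list of (user, [urls]) in order of session start
    open_session = {}   # user -> the session list currently being filled
    last_seen = {}      # user -> timestamp of that user's previous request
    for user, ts, url in sorted(requests, key=lambda r: r[1]):
        # A first request, or a long gap, opens a new session.
        if user not in last_seen or ts - last_seen[user] > timeout:
            open_session[user] = []
            sessions.append((user, open_session[user]))
        open_session[user].append(url)
        last_seen[user] = ts
    return sessions

t0 = datetime(2008, 3, 1, 9, 0)
requests = [
    ("u1", t0, "/a"),
    ("u1", t0 + timedelta(minutes=10), "/b"),
    ("u2", t0 + timedelta(minutes=5), "/a"),
    ("u1", t0 + timedelta(hours=2), "/c"),
]
sessions = sessionize(requests)
```

In this example user u1 ends up with two sessions, because the last request arrives two hours after the previous one.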
Page Tags as Data Source
• Provided by Third Party - Vendor
– Vendor Supplies Page Tags
– Vendor Collects the Data
– Vendor Analyzes the Data
– Business Accesses the Data
  • Online or
  • Reports sent to Business
Web Data Abstractions
• Abstractions concerning Web usage, Content, and Structure
• Establishes precise semantics for the concepts
– Web site
– Users or Visitors
– User Sessions
– Server Sessions or Visits
– Pageviews
– Clickstreams
Data Abstractions
• Web Site - collection of interlinked Web pages, including a host page, residing at the same network location.
• User or Visitor - principal using a client to interactively retrieve and render resources or resource manifestations
– an individual that is accessing files from a Web server, using a browser.
• User Session - a delimited set of user clicks across one or more Web servers
Data Abstractions
• Server Session or Visit - a collection of user clicks to a single Web server during a user session
• Pageview - the visual rendering of a Web page in a specific environment at a specific point in time
– a pageview consists of several items
  • frames, text, graphics, and scripts that construct a single Web page
• Clickstream - a sequential series of pageview requests made by a single user
Web Data Abstractions (High Level)
• Abstractions concerning Visitors
• Establishes precise semantics for the concepts
– Unique Visitor
– Conversion Rate
– Abandonment Rate
– Attrition
– Loyalty
– Frequency
– Recency
Data Abstractions
• Unique Visitor
– A unique visitor is counted when a human being uses a web browser to visit a web site.
– A visitor may be “unique” for different periods of time.
– The individual is defined by a cookie in the visitor’s web browser
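One way to read "unique for different periods of time" is that the same cookie is counted once per reporting period, so the count depends on the period chosen. The following sketch uses made-up visit data and a hypothetical `unique_visitors` helper to show the effect.

```python
from datetime import date

# Hypothetical visit log: (cookie ID, day of visit).
visits = [
    ("cookie-a", date(2008, 3, 1)),
    ("cookie-a", date(2008, 3, 2)),
    ("cookie-b", date(2008, 3, 2)),
    ("cookie-a", date(2008, 3, 2)),
]

def unique_visitors(visits, period):
    """Count distinct cookie IDs per period; `period` maps a day to a bucket."""
    buckets = {}
    for cookie_id, day in visits:
        buckets.setdefault(period(day), set()).add(cookie_id)
    return {key: len(ids) for key, ids in buckets.items()}

daily = unique_visitors(visits, period=lambda d: d)
monthly = unique_visitors(visits, period=lambda d: (d.year, d.month))
```

The daily counts add up to three, while the monthly count is two, so daily unique visitors cannot simply be summed to obtain a monthly figure.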
Data Abstractions
• Conversion Rate
– A conversion rate is the number of “completers” divided by the number of “starters” for any online activity that is more than one logical step in length
– Starting and finishing any activity
  • Purchase
  • Download a research article
  • Etc.
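The definition above translates directly into a small calculation. In this sketch the visitor IDs and the `conversion_rate` helper are hypothetical; only visitors who actually started the activity are counted as completers.

```python
def conversion_rate(started, completed):
    """Share of starters who also completed; both arguments are sets of IDs."""
    if not started:
        return 0.0  # no starters, so nothing to convert
    return len(started & completed) / len(started)

started = {"v1", "v2", "v3", "v4"}    # visitors who began a purchase
completed = {"v2", "v4", "v9"}        # v9 never appears among the starters
rate = conversion_rate(started, completed)
```

With four starters and two of them completing, the rate is 0.5.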
Data Abstractions
• Abandonment Rate
– The abandonment rate for any step in a multi-step process is one minus the number of units that make it to “step n+1” divided by those at “step n”
– The formula is $1 - N_{n+1} / N_n$, where $N_n$ is the number of units at step n
– Consider a 10 step process to acquire a resource
  • How many quit after step 1 or 2 or 3 or 4 or …
– Consider a 5 step process to acquire a resource
  • How many quit after step 1 or 2 or 3 or 4 or …
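For a 5 step process, the per-step abandonment rate can be computed from the number of units reaching each step. The counts below are invented for illustration, and `abandonment_rates` is a hypothetical helper that assumes every count is greater than zero.

```python
def abandonment_rates(step_counts):
    """Return 1 - (units at step n+1) / (units at step n) for each step n."""
    return [
        1 - nxt / cur
        for cur, nxt in zip(step_counts, step_counts[1:])
    ]

# Units reaching steps 1 through 5 of a hypothetical process.
funnel = [1000, 600, 450, 360, 324]
rates = [round(r, 2) for r in abandonment_rates(funnel)]
```

Here 40% quit after step 1, 25% after step 2, 20% after step 3, and 10% after step 4, which points to the first step as the one most worth examining.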
Data Abstractions
• Attrition
– Attrition is a measurement of people you have been able to successfully convert but are unable to retain to convert again
– Consider the eBay web site vs. a web site for technical information
Data Abstractions
• Loyalty
– Loyalty is a measure of the number of visits any visitor is likely to make over their lifetime as a visitor
– Reported as number of visits per visitor
  • 100 visitors made 3 visits each, 87 visitors made 4, etc.
  • Avoid double counting (i.e., do not count the 87 in with the 100)
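A loyalty report of this kind is a histogram of visits per visitor. The sketch below assumes a visit log with one visitor ID per visit (the IDs and the `loyalty_distribution` helper are hypothetical); because each visitor is placed in exactly one bucket, the double counting mentioned above cannot occur.

```python
from collections import Counter

def loyalty_distribution(visit_log):
    """Map 'number of visits' -> 'number of visitors with exactly that many'."""
    visits_per_visitor = Counter(visit_log)        # visitor -> visit count
    return Counter(visits_per_visitor.values())    # visit count -> visitors

# One entry per visit, identified by a hypothetical visitor ID.
visit_log = ["a", "b", "a", "c", "b", "a", "d"]
distribution = dict(loyalty_distribution(visit_log))
```

In this log two visitors made one visit, one made two visits, and one made three.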
Data Abstractions
• Frequency
– Frequency is a measure of the activity a visitor generates on a web site in terms of time between visits
– Measured in terms of “days between visits”
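For a single visitor, "days between visits" can be derived from the dates of that visitor's visits. This is a minimal sketch with made-up dates; `days_between_visits` is a hypothetical helper that treats several visits on the same day as one.

```python
from datetime import date

def days_between_visits(visit_dates):
    """Return the gaps, in days, between consecutive distinct visit days."""
    ordered = sorted(set(visit_dates))
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

visit_dates = [date(2008, 3, 1), date(2008, 3, 4), date(2008, 3, 11)]
gaps = days_between_visits(visit_dates)
average_gap = sum(gaps) / len(gaps)
```

The visitor returned after 3 days and then after 7 days, for an average of 5 days between visits.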
Data Abstractions
• Recency
– Recency is the number of days since the last visit (or purchase)
– Reported as the number of visitors who returned after “n” days.
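A recency report can be approximated by bucketing visitors by the number of days since their last visit. The sketch below measures recency against a fixed reference date, which is an assumption made for illustration (a report could equally be built at the moment each visitor returns); the visitor IDs and helper names are hypothetical.

```python
from collections import Counter
from datetime import date

def recency_days(last_visit, today):
    """Days elapsed between a visitor's last visit and the reference date."""
    return (today - last_visit).days

def recency_report(last_visits, today):
    """Map 'n days since last visit' -> number of visitors with that recency."""
    return Counter(recency_days(day, today) for day in last_visits.values())

last_visits = {
    "a": date(2008, 3, 10),
    "b": date(2008, 3, 10),
    "c": date(2008, 3, 1),
}
report = dict(recency_report(last_visits, today=date(2008, 3, 12)))
```

Two visitors were last seen 2 days before the reference date and one was last seen 11 days before it.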
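Pyramid Model of Web Analytics Data
[Figure: pyramid with five levels, ordered from the greatest volume of available data to the greatest value of data: Hits, Page Views, Visits, Unique Visitors, Uniquely Identified Visitors. One axis is labeled "Volume of Available Data" and the other "Increasing Value of Data".]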
Web Usage Mining
• Web usage mining applies statistical and data mining techniques to the processed server log data in order to discover useful patterns
• Data mining methods and algorithms that have been adapted for the Web domain
– Association rules
– Sequential pattern discovery
– Clustering
– Classification
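As a small illustration of the first method, association rules between pairs of pages can be scored by support and confidence. The sketch below is a simplified brute-force version restricted to page pairs, not a full algorithm such as Apriori; the sessions, the 0.5 support threshold, and the `pair_rules` helper are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def pair_rules(sessions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for page pairs."""
    n = len(sessions)
    page_count = Counter()   # sessions containing each page
    pair_count = Counter()   # sessions containing both pages of a pair
    for pages in sessions:
        unique = sorted(set(pages))
        page_count.update(unique)
        pair_count.update(combinations(unique, 2))
    rules = []
    for (a, b), both in pair_count.items():
        support = both / n
        if support >= min_support:
            # Confidence of "a implies b" is P(b in session | a in session).
            rules.append((a, b, support, both / page_count[a]))
            rules.append((b, a, support, both / page_count[b]))
    return rules

sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
rules = pair_rules(sessions)
```

In this data every session that includes /cart also includes /products, so the rule from /cart to /products has confidence 1.0 at a support of 0.5.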
Web Usage Data Mining
• After patterns are discovered from usage data, a further analysis has to be conducted.
• Common ways of analyzing such patterns
– Using a query mechanism on a database where the results are stored
– Loading the results into a data cube and then performing OLAP operations
– Using visualization techniques for easier interpretation of the results
• Combining these results with content and structure information about the Web site yields useful knowledge for modifying the site according to the correlation between user and content groups.
Web Analytics: Tools and Case Studies
• Tools
– VisiStat - www.visistat.com
• Web Analytics Case Studies
– Communications Provider - TuVox.com
– Online Retailer - TicketsByInternet.com
– Winery & Entertainment Venue - The Mountain Winery
– Non-Profit Organization - SFBallet.org
– Public Relations & Media Agency - BLASTmedia
– Technology Provider for Real Estate Professionals - Pullan.com
– Real Estate Agency - Intero Real Estate
– Start-Up Online Business - GuruPrint.com