Monitoring Apache


Upload: everett-parker

Post on 24-Dec-2015


TRANSCRIPT

Page 1: Monitoring Apache There are many ways to examine apache’s status and performance – apachectl –v – tells you the version number – apachectl –V – gives you

Monitoring Apache
• There are many ways to examine Apache's status and performance
  – apachectl -v: tells you the version number
  – apachectl -V: gives you the complete compile-time settings
  – apachectl status: gives you the server's status in the form of a "scoreboard" where, for each Apache child, you see its status as one of these characters:
    • _ waiting for connection
    • S starting up
    • R reading a request
    • W sending a reply
    • K keepalive
    • D performing DNS lookup
    • C closing connection
    • L logging information
    • G gracefully finishing
    • I idle cleanup
    • . open slot with no current process

_____CCCCCCC_____RR………………
_CCCCCRR_________CC_CCC__.......
_____CCCCCCCRW______..................
____CCCCLLCCCCCR____..................

Page 2:

Extended Status
• You can obtain even more information (including PID) using ./apachectl fullstatus
  – this gives you a snapshot of the current status of each child
  – to use fullstatus
    • load the mod_status.so module, which provides the server-status handler (not needed if the module is already compiled into the server, as in a default Apache 2.2 build)
    • add the directive ExtendedStatus On to your httpd.conf file
    • add a <Location> container for the address /server-status in your httpd.conf file that has the directive SetHandler server-status
  – now, when you type ./apachectl fullstatus, the listing gives you more details:
    • Srv: child server number and generation (in the form 5-1), and PID
    • Accesses for this connection and for this child
    • Mode (as on the previous slide: _, C, R, W, etc.)
    • CPU usage, number of seconds
    • Seconds since beginning of most recent request
    • Milliseconds required to process most recent request
    • Kilobytes transferred for the connection
    • Mbytes transferred for this child
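
The setup steps on this slide could be written as the httpd.conf fragment below. This is a minimal sketch in Apache 2.2 syntax, not a complete configuration: the server-status handler comes from mod_status, the module path modules/mod_status.so is an assumption about a typical layout, and the LoadModule line should be left out when the module is compiled into the server.

```apache
# Hypothetical module path; omit this line if mod_status is compiled in
LoadModule status_module modules/mod_status.so

# Collect the per-request details that "apachectl fullstatus" reports
ExtendedStatus On

# Map the URL /server-status to the status handler
<Location /server-status>
    SetHandler server-status
</Location>
```

After a restart, ./apachectl fullstatus should report the extra per-child columns listed on this slide.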

Page 3:

server-status and server-info
• You can also view this information via a web browser
• You can view either server status information (as on the previous slide) or server information
  – for either or both of these, add a <Location /server-status> or <Location /server-info> container
    • NOTE: the URL for these is simply http://ipaddress/server-status or http://ipaddress/server-info
  – also add to the container the proper handler, SetHandler server-status or SetHandler server-info
• Information available from server-info (provided by the mod_info module) includes
  – version, compilation date
  – modules loaded, directives of each
  – hostname, port
  – timeout, keep-alive directives
  – server root, configuration file location

Page 4:

Security
• Making this information available presents a security risk
  – knowing the version of Apache makes it easier to attack the server and manipulate or destroy files
  – yet this information is useful for a web administrator who wants to check status or server information at any time, either locally or remotely
• In the <Location> container from the previous slide, let's add proper allow/deny statements to limit who can access this information
  – deny access to all except the specific IP address or subnet from which our webadmin will access the server information
    • Order deny,allow
    • Deny from all
    • Allow from 10.2.3.0/24
  – by using 0 as the last octet together with /24, we are allowing access to anyone on this subnetwork (10.2.3)
    • the 24 is the length of the network mask in bits, indicating how much of the address must match (8 for the first octet, 16 for the first two, 24 for the first three)
  – do this for both the <Location /server-status> and <Location /server-info> containers (if we use both)
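
Put together, a restricted container might look like the sketch below. It uses the Apache 2.2 access-control directives shown on this slide; in Apache 2.4 the same restriction is normally written as Require ip 10.2.3.0/24. The subnet is the example value from the slide, not a recommendation.

```apache
<Location /server-info>
    SetHandler server-info
    # Deny directives are evaluated first, then Allow can override them
    Order deny,allow
    Deny from all
    # Only clients whose first 24 bits match 10.2.3 are let through
    Allow from 10.2.3.0/24
</Location>
```

The same three access-control lines go inside the <Location /server-status> container if both are in use.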

Page 5:

Error Pages
• Apache is configured to generate a generic page on an error based on the status code
  – these response pages may lack useful information, so Apache allows you to alter the default configuration for errors
  – you can
    • create your own error pages
    • create your own error scripts
      – for instance, a PHP script
    • generate a short automated message
    • use a multi-language error page available in the error directory
    • redirect the attempt to a local URL
      – see for instance what happens at www.nku.edu when you specify any incorrect URL/filename
    • redirect the attempt to an external URL
  – in your httpd.conf file, you set these up using the ErrorDocument directive of the form:
    • ErrorDocument error-code document-name (or "message")

Page 6:

Examples
• ErrorDocument 401 /subscribe.html
  – here, presumably the user was not able to log in successfully and thus generated a 401 error, so we bring up the page /subscribe.html
• ErrorDocument 404 /cgi-bin/notfound.php
  – here, we run a script that we set up to handle any 404 (URL not found) errors (this is what NKU does)
• ErrorDocument 500 "Server Error!!"
  – here, we return a page with the text "Server Error!!"
• ErrorDocument 410 /var/web/errors/HTTP_GONE.html.var
  – here, we use one of the error pages made available in Apache
  – these can respond differently based on several situations
    • the language is chosen through language negotiation
    • the response can include the values of environment variables such as $HTTP_REFERER
• ErrorDocument 505 http://www.errors.org/error505.cgi
  – redirect to an external URL because of an unsupported HTTP version

Page 7:

Using the Multi-Language Files
• To use the multi-language error document files available in your error directory, there are several steps you will have to take
  – create an alias from /error/ to the actual location in your filespace of your error documents
    • Alias /error/ "/usr/local/apache2/error/"
      – notice the use of the trailing / here!
  – create a <Directory> container for that directory containing at a minimum
    • Options IncludesNoExec
    • AddOutputFilter Includes html
    • AddHandler type-map var
      – the files in this directory end with a .var extension
    • Order deny,allow
    • Allow from all (this is needed since / (root) is denied to all)
  – add your ErrorDocument directive
    • e.g., ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
    • these already exist in the file httpd-multilang-errordoc.conf
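
The three steps above can be sketched as a single httpd.conf fragment. It reuses the paths from this slide and Apache 2.2 access syntax, and it assumes that mod_alias, mod_include, and mod_negotiation are available, since the Alias, Includes, and type-map features depend on them.

```apache
# Step 1: map the URL prefix /error/ to the directory on disk
Alias /error/ "/usr/local/apache2/error/"

# Step 2: allow server-side includes and type maps in that directory
<Directory "/usr/local/apache2/error">
    Options IncludesNoExec
    AddOutputFilter Includes html
    AddHandler type-map var
    Order deny,allow
    Allow from all
</Directory>

# Step 3: point an error code at one of the .var files
ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
```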

Page 8:

More on Multi-Language Pages
• The nice thing about the multi-language error pages that are available in Apache is that, based on browser information, the actual language returned can be specialized
  – if you look at any of these files, you see Content-language entries for a number of different languages
  – based on the Accept-Language header sent by the browser, the Body whose Content-language matches is selected and returned
    • further, an if statement allows for a more specialized message depending on whether the page was reached directly or from a referer (a link)
• In order to get the language selected appropriately, you might want to include two additional directives in your <Directory> container from the previous slide:
  – LanguagePriority list (of languages here, e.g., en cs de es …)
  – ForceLanguagePriority Prefer Fallback

Page 9:

External Redirects on Errors
• An external redirect is not a matter of simply "passing the buck"
  – recall from chapter 5 that a redirect sends a response to the web browser with a redirect status code (30x) and a new URL
    • the web browser then sends out a new HTTP request for the new URL
  – this can confuse crawlers and other agents that were expecting either content back from their requests or error codes if the request could not be fulfilled; instead, they are given a new URL to pursue
    • the redirection can also cause problems if it arises during authentication: the browser does not receive a 401 code and so will not prompt the user for a password, potentially leaving the user confused as to why the original request was not fulfilled and why they were taken to a different location

Page 10:

Automatic Logging
• There are two forms of logging that are taken care of automatically
  – access logging: logging every request sent by clients (browsers, users, software)
  – error logging: logging any request that results in an error
• Either type of event will place a new entry into the appropriate log file
• Each entry will contain at a minimum
  – the time/date of the request
  – the URL
  – the IP address of the requester
• For errors, the status code will be included with the entry
• For accesses, the command serviced (e.g., GET), the status code, and the browser's specification (type, OS, HTTP version) will be included with the entry
• Typically, Apache performs the logging itself rather than invoking syslogd or klogd as other Linux services do

Page 11:

Error Logging
• Errors can be written to a log file, sent to a pipe (that is, piped to a Linux command), or written to the Linux syslog service
• There are two Apache directives to control error logging
  – ErrorLog: specify the file, pipe, or syslog
    • if you do not set ErrorLog, it defaults to writing to the file error_log (in the logs directory under ServerRoot)
    • if you specify a filename, it is assumed to be under ServerRoot unless you specify the full path
    • if the filename starts with | then the information is piped to the command that follows |
      – as in | cat, which would display the error information in the terminal window, probably a poor option
    • if you specify syslog, the syslogd service is used and follows the action in the /etc/syslog.conf file for local7 messages
  – LogLevel: one of emerg, alert, crit, error, warn, notice, info, debug (see table 7-2 on page 182 for more detail)
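
As a rough illustration of the three ErrorLog targets, the sketch below shows one active setting and two commented-out alternatives, since only one would normally be used for a given server. The program path /usr/local/bin/error-handler is hypothetical and stands in for whatever command should receive the entries.

```apache
# Option 1: a file; a relative path is resolved against ServerRoot
ErrorLog logs/error_log

# Option 2: pipe each entry to a program (hypothetical path)
# ErrorLog "|/usr/local/bin/error-handler"

# Option 3: hand entries to syslog (facility local7 unless another is named)
# ErrorLog syslog

# Record messages of severity warn and more severe
LogLevel warn
```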

Page 12:

IE Browser Error Pages
• IE tends to ignore the error pages sent by Apache and displays its own, more generic page
  – MS considers its own pages to be more user friendly
  – the problem is that the error page sent by Apache might include some useful content that an IE user will not see
  – IE will only display the server's error pages for
    • 403, 405, 410 errors if the page's size > 256 bytes
    • 400, 404, 406, 408, 409, 500, 501, 505 errors if the page's size > 512 bytes
    • but these pages, as generated by Apache, tend to be smaller than the byte sizes listed above
  – there is a way to force IE to display the sent error page using the Windows Registry, but most users will not be aware of this
  – or, you could create your own error pages and make sure that they are > 512 bytes to force IE to display your pages
    • I tried both of these and I could not get IE to display the Apache page, so I'm unsure if it's even possible!

Page 13:

I/O Logging
• Aside from logging requests and errors, you can log regular Apache I/O if desired
  – this requires the use of the mod_dumpio module
    • this is not part of the base Apache, so it must be separately compiled
  – add the LoadModule statement to httpd.conf
  – there are three directives
    • DumpIOInput on (or off, the default)
    • DumpIOOutput on (or off, the default)
    • LogLevel value, where value is one of emerg, alert, crit, error, warn, notice, info, debug; here, you need to use debug
  – the I/O logging is sent to your error log file, and because this generates an enormous number of messages, you will probably not want to use this feature at all, or at least not for very long
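
A minimal sketch of the settings described above, in Apache 2.2 syntax. The module path is an assumption about a typical layout. In Apache 2.4 the dump is instead enabled with a per-module level, LogLevel dumpio:trace7, so treat this fragment as specific to the older series.

```apache
# Hypothetical module path
LoadModule dumpio_module modules/mod_dumpio.so

# Dump everything the server reads and everything it writes
DumpIOInput On
DumpIOOutput On

# The dump is written at debug level, so the error log must accept it
LogLevel debug
```

Because LogLevel debug also turns on debug messages from every other module, this is best enabled briefly while diagnosing a problem and then switched off again.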

Page 14:

Access Logging
• All HTTP requests to your server are logged in the access log
  – these include requests that result in errors
• Unlike error logging, these can only be logged to a specified file or written to a pipe; they cannot be sent to syslogd
• You can specialize the access log using the mod_log_config module, which offers two directives
  – CustomLog, which allows you to specify a new place for the output (a different file or a pipe)
  – LogFormat, which allows you to specify how accesses are logged in terms of what types of information (we will see details on this in the next slide)
    • in addition, the mod_setenvif module can be used to set various environment variables based on attributes of a request

Page 15:

Log Formatting
• The LogFormat directive allows you to specify how you want your log entries to appear
  – you are able to define different formats and have them sent to different files, although this may not be useful
    • LogFormat "format" name
    • CustomLog location name
  – format is a specification of the type of information to record and in what order it should be recorded (covered over the next few slides)
  – location is the location in your file system where you want the log file to be written
    • if you specify a relative path, it is relative to ServerRoot
  – name is the same on both lines and is used to link a specified format to a log file
    • you can shorten this by just writing CustomLog location "format" and omitting the LogFormat directive and the name, but this means that you cannot share a format between two or more different log files
• You can also specify under what condition(s) a format might be used (for instance, if the access resulted in a 200 status)
  – therefore, you can specify multiple logs, each with its own format
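
The two ways of attaching a format to a log file can be compared in a short sketch. The nickname brief and all of the log file names are hypothetical, chosen only for illustration.

```apache
# Named format: defined once, then shared by two log files
LogFormat "%a %U" brief
CustomLog logs/brief_log brief
CustomLog /var/log/httpd/brief_copy_log brief

# Inline format: no nickname, so it cannot be reused by another CustomLog
CustomLog logs/time_log "%t %U"
```

The first CustomLog path is relative and so lands under ServerRoot, while the second is absolute and is used as written.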

Page 16:

More
• The "format" will comprise a series of percent directives (covered on the next slide) that specify what information should be logged (recorded)
  – these include such pieces of information as the requestor's IP address, URL requested, time of request, etc.
  – the entire format is placed inside of ""
    • for example, "%a %U" means "IP address of client and URL requested"
• Conditional directives allow you to specify what status code(s) you desire for that piece of information to be logged
  – multiple status codes are separated by commas, and the code(s) appear between % and the directive
    • %200a means to log %a (IP addr of client) if the status code is 200
    • %400,401,402,403,404U means to log %U (URL) if the status code is any of 400-404
  – you can also place ! in front of the number, as in %!200a, to log the value only when the status code is not 200
  – if the condition is not met, the requested value is replaced by a hyphen (-) in the log file
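
To see how the conditions interact, consider the sketch below. The nickname conditional and the file name are hypothetical; the comments state what each field would hold under the rules on this slide.

```apache
# Field 1: client IP, only for status 200
# Field 2: URL, only for status 400-404
# Field 3: time, only when the status is NOT 200
LogFormat "%200a %400,401,402,403,404U %!200t" conditional
CustomLog logs/conditional_log conditional

# A request that ends in 200 is logged as: <client IP> - -
# A request that ends in 404 is logged as: - <URL> <time>
```

Note that a line is still written for every request; the conditions only decide whether each field holds a value or a hyphen.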

Page 17:

Useful Percent Directives
• The full set of percent directives is given in table 7-3 on page 188; here, we look at the most useful
  – %a: remote IP address
  – %A: server IP address
  – %B: bytes sent, excluding HTTP headers
  – %X: connection status when the response completes (this was %c in Apache 1.3)
  – %D: duration of request, in microseconds
  – %f: filename (resource)
  – %H: request protocol
  – %m: request method
  – %P: PID of child servicing request
  – %s: status
  – %t: time of request
  – %u: remote user (only available if user has authenticated)
  – %U: requested URL
  – %{X}e: output the value of environment variable X
  – %{X}i: output the value of request header X (X might be User-agent or Referer)
  – %{format}t: output the time using the provided format

Page 18:

Examples
• The Common Log Format is a standard format developed for NCSA servers
  – this format string is "%h %l %u %t \"%r\" %>s %b", which is
    • the host, remote logname (or - if not known/supported), user name (if known through login), date, request (inside of "", with each " preceded by \ as an escape character), final status (a 3-digit code; the > selects the status of the final request after any internal redirect), and the number of bytes transferred, excluding the headers
• Imagine that your website is linked from other sites and you want to know how often a visitor has reached your site through one of those links (referers)
  – use "%{Referer}i -> %U"
  – this records into your log file the referer and the URL (how they got here and where they tried to go)
• Or you might want to know the web browser of a visitor
  – use "%{User-agent}i"
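
The three formats on this slide could be wired to separate log files as sketched below. The nicknames and file names are illustrative choices rather than anything the slides prescribe.

```apache
# Define the three formats discussed above under nicknames
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

# Send each format to its own file under ServerRoot
CustomLog logs/access_log common
CustomLog logs/referer_log referer
CustomLog logs/agent_log agent
```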

Page 19:

Multiple Logs
• Let's imagine that we want to have one log for all successful operations, one log for redirections, and one log for 40x errors
  – LogFormat "%200a %200U %200t" success
  – CustomLog logs/success_log success
  – LogFormat "%301,302,303a %301,302,303U %301,302,303t" redirection
  – CustomLog logs/redirection_log redirection
  – LogFormat "%401,403,404,410a %401,403,404,410U %401,403,404,410t" error40x
  – CustomLog logs/error_log_40x error40x
• We could change the format so that each log file logs different types of information
  – for instance, we might want to know the specific error for the error_log_40x file by adding %s
    • note that %s will return the original request's status in the case of a redirection (e.g., 30x); if we want the final status, use %>s
  – or the size of the file (%B) on a 200 success
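
The two variations just mentioned might be written as follows. This is a sketch that extends the success and error40x formats from this slide; nothing else is changed.

```apache
# Successful requests: also record the body size in bytes
LogFormat "%200a %200U %200t %200B" success
CustomLog logs/success_log success

# 40x errors: also record the final status code
LogFormat "%401,403,404,410a %401,403,404,410U %401,403,404,410t %>s" error40x
CustomLog logs/error_log_40x error40x
```

Because %>s carries no condition here, it is written for every request, so lines for non-40x requests appear in error_log_40x as three hyphens followed by the status.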

Page 20:

SetEnvIf Directives
• This directive allows you to set an environment variable which you can then use for your logging
  – the format of the directive is
    • SetEnvIf attribute regex env-variable[=value]
      – you can set multiple variables if desired
  – the attribute is usually a field from the request header (e.g., Host, User-Agent, Referer, Range), or it can be one of Request_Method, Request_Protocol, Remote_Addr, Remote_User, Request_URI, or it can be an already defined environment variable
    • example: SetEnvIf Referer www\.nku\.edu internal
      – this sets the variable internal (to true)
    • example: SetEnvIf Remote_Addr 127\.0\.0\.1 self
    • example:
      – SetEnvIf Request_URI "\.gif$" type=gif
      – SetEnvIf Request_URI "\.jpg$" type=jpg
      – this will set the variable type to the type of image requested
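
Once set, these variables can drive logging in two ways: as a value inside a format (%{X}e) and as a condition on CustomLog (env=X or env=!X). The sketch below reuses the examples from this slide; the log file names are hypothetical.

```apache
SetEnvIf Request_URI "\.gif$" type=gif
SetEnvIf Request_URI "\.jpg$" type=jpg
SetEnvIf Remote_Addr 127\.0\.0\.1 self

# Log image requests only, recording which kind of image was requested
CustomLog logs/image_log "%h %U %{type}e" env=type

# Log everything except requests that come from the server itself
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common env=!self
```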

Page 21:

Log Rotation
• If you are running a web server for even a modest-sized web site, you may receive thousands of hits a day
  – each of these is logged in the access_log file, and the error_log file may become large as well
  – log rotation is the process of moving the current log file into a "retired" log file
    • these typically appear with .# after their name, as in access_log, access_log.1, access_log.2, access_log.3, with the previous access_log.3 being deleted and the new access_log starting blank
    • depending on how quickly a log file fills up, you may want to rotate the files every day, every week, or every month
  – while you might write your own script to handle this and then schedule it as a cron job, there is a built-in Apache program called rotatelogs that does this for you
    • this program is typically in the same directory as apachectl
    • you run it as rotatelogs filename rotationtime (in seconds; 86400 is every day)
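
In practice rotatelogs is usually started by Apache itself through a piped log rather than typed at a shell prompt. The sketch below assumes an installation under /usr/local/apache2, which is an assumption borrowed from the paths used earlier in these slides.

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Start a new access log and a new error log every 86400 seconds (daily)
CustomLog "|/usr/local/apache2/bin/rotatelogs /usr/local/apache2/logs/access_log 86400" common
ErrorLog "|/usr/local/apache2/bin/rotatelogs /usr/local/apache2/logs/error_log 86400"
```

One difference from the .1, .2, .3 scheme described above: rotatelogs names each file by appending a numeric start time to the base name, and it does not delete old files, so removing them is left to the administrator.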

Page 22:

Favicon.ico
• The favicon is an icon (the site's "logo") that is displayed in the browser's address bar next to the URL (you can also see these in bookmarks)
  – the icon will reside in the web site's home directory (DocumentRoot)
• If a site does not have a favicon.ico in that directory, typically the error and access logs fill up with error messages
  – you have three ways to prevent this
    • create an icon and put it in this directory
    • create a 0 byte file whose name is favicon.ico in this directory
    • suppress the access log messages as follows:
      – SetEnvIf Request_URI favicon\.ico favicon
      – CustomLog logs/access_log common env=!favicon
    • this says "for any request for favicon.ico, set the variable favicon to true, and log a request only when favicon is not set"

Page 23:

Reporting Programs
• You want to search your log files for useful information
  – how many people are visiting? what errors are arising? is the same IP address sending numerous requests (e.g., a denial of service attack)?
• Wading through thousands of entries can be time consuming
  – you have many choices, such as using awk or writing your own shell scripts
    • with awk, you could count the number of times each unique IP address is found to see if you are being attacked
    • with your own script, you could generate a report that lists all of the 404 errors by URL so that you could see if there are URLs that are being misinterpreted by the users
  – AWStats is a reporting tool that can dig through your file(s) for useful information, like trends, that you might want to share with your marketing department; this is open source software