http /apache
DESCRIPTION
HTTP /Apache. Intro to Apache / HTTP / Modules. Introduction to Apache. Two thirds of the web servers apache Excellent tool for WWW hosting May not perform well in some bench mark tests – does perform in the field. It is stable, heavily tested in different environments and platforms and - PowerPoint PPT PresentationTRANSCRIPT
HTTP /Apache
Intro to Apache / HTTP / Modules
Introduction to Apache
• Two thirds of the web servers apache• Excellent tool for WWW hosting• May not perform well in some bench
mark tests – does perform in the field.• It is stable, heavily tested in different
environments and platforms and• It’s FREE!
A little bit of History
• Apache is based on NCSA’s httpd, a daemon that was first used in early years of the Internet.
• Rob McCool – creator of NCSA Web Server til 1994 – then no single programmer took his place.
• Server cont’d to grow in popularity but incompatibilities between versions began to develop.
History Cont’d
• Eventually – a group of administrators began working together to regain control.
• Single path developed project came to be know as “ a patchy server “ or Apache server.
• Today – Apache posseses a level of complexity that easily surpasses some OS’s.
• Not necessary to learn every feature but a functional understanding is needed.
TCP / IP
• Transmission Control Protocol / Internet Protocol
• Main protocol suite that allows computer on the Internet to communicate.
• Name to number translation is possible with Domain Name System (DNS)
• A group of distributed servers that exist to translate the string of numbers into names.
• Every time the name of a Web site is entered, the name must be resolved to an IP address through a look-up on a DNS server to determine where to send the request.
• Apache relies on TCP \ IP for :– Communication with Internet browsers– Depends on TCP / IP as a language between
web servers– TCP / IP Port 80 used to carry the request /
response protocol – known as http
HTTP
• Hyper Text Transfer Protocol protocol used to submit and serve requests.
• Uniform Resource Identifier (URI), Uniform Resource Locator (URL), and Uniform Resource Name (URN) – all part of the communication between clients and servers using HTTP.
• Each request is preceded by a bit of information about the request/response that is contained n a header.
Headers
• The client and the sever know they are actually dealing with hypertext when they transfer info back and forth.
• When they send data – start transmission with headers.
• It is a piece of info that tells the client or server what kind of info it is receiving.
• Knows the diff between text, html, jpeg images and …..
Headers Cont’d
• Apache makes significant work of headers.
• Headers can indicate what kind of Web browser is hitting the Web site.
• Apache can use this info to do specific things based on browser info contained within the headers.
• More later…
Apache
• On a Web server – server process listens for incoming requests on a specified port of the network interface – responds appropriately.
• At first – web sites consisted of static web pages in a directory tree and special functions could be handled by CGI scripts – executed on request.
• Today – variations have been developed making it necessary for developers to rethink and retool the original Web server definition.
Apache Cont’d
• Most expect a certain level of participation by the server.
• Dynamic content – is everywhere.• Serve web pages as well and create
them.• Greater demands on the server
Apache Conventions
• How does Apache work?– By replicating itself– Let’s the Apache copy – called a child
process – handle the process– These are the processes that dole out the
content– Apache rec’s a request, it chooses one of its
idle child processes – handles the request.– When it’s done – returns to an idle state
ready for more work.
• Every child handles a certain number of requests before dying.
• Killing a child process after it handles a # of requests also kills any side effects – exp memory leaks.
• Use can determine # of children Apache spawns at one time, as well as the life span of the child.
• Normally – child handles unlimited # of req’s.
Apache Configurations
• Good way to learn about apache is – apache configuration
• First level of configuration occurs during compilation.– Set :
• Whether Apache uses Dynamic Shared Objects (DSO’s) or statically linked modules
• If static modules – tell Apache which modules to compile into the server at this pt.
• Default locations• Layout of Apache’s files and directories
Http.conf
• Next level of configuration – httpd.conf file• After installation – is a global perferences
file that controls all aspects of the server:– Who can access the Web server– What dynamic modules the server loads– Where certain functions occur– What kind of info apache will log– The location of the virtual web sites apache
controls– What content is allowed
.htaccess file
• Last level of configuration occurs here• These are small, local files that contain
specific orders that apply to the directories in which they reside.
• Httpd.conf contains great deal of info about Apache’s users.
• Comments are marked using #’s.• Apache offers diff levels of control within
httpd.conf, based on user’s needs.
• These levels are called scopes.• A scope can apply to an entire system,
a virtual host, an absolute directory path, or even a single file.
• Wild cards are used with scopes to give greater freedom.
• Containers are used to define each scope
Four Main Containers
<Directory /path>#This container applies to the filesystem
location of a directory.#For example, /home/user/public_html
would#pt to user’s public_html directory.Directive</Directory>
<Location /~*>#Location applies to a url. For example
~/user would pt to user’s public_html directory
Directives</Location>
<VirtualHost host_name>#VirtualHost applies to a virtual web site
you are hosting, such as mysite.comDirectives</VirtualHost>
<Files file_name>#Applies to a file name, no matter where it
is located. If file_name were index.html, then all instances of index.html ( no matter what their location) would be affected by your directives.
Directives</Files>
Directives
• Is a command give to Apache that controls its behavior.
• They are a single line of text with the following format:– Directive [option [option] ]More than one option for your key word. Can
be more than one directive in a given scope. But each directive must occur on its own line.
Directives that do not reside in any container apply to all scopes
More on directives later…..
Handlers
• When a user requests a certain file, Apache decides what to do with it.
• If the user accesses a file called hiccup.cgi ---
Apache must know how to execute the .cgi or not try to display it as an HTML file.
Handlers help you direct Apache to the correct helper app.
Exp: add a handler called cgi-script and apply it to all files with an ext. of .cgi
Logging
• Apache logs almost everything that happens to it – errors to HTTP transactions and even cookies.
• User can set the level of logging and what is logged in httpd.conf
• More on logging later….
Modules
• Are chunks of software (perl or C) – allow customization of Apache.
• For instance – for Apache to use Server side Includes – user must install that module.
• Kernel modules might enable support for a nic card, some protocol, or other resource.
• Linux kernel modules can be loaded and unloaded for quick system customization.
• Apache was originally developed with above concept in mind – not quite there yet but can easily incorporate integration of dynamic module loading.
• Static Versus Dynamic Modules– Early versions – modules were statically linked
into the httpd app during compilation.– Specific configuration of a server was defined
during the config process before compilation– Using this method, provide for development
of numerous modules – customization of the server.
• Since version 1.3 – it is possible to compile the server and the modules to support dynamic loading.
• With this option compiled, configuration of the server can be changed by simply restarting the server after editing the config file – rather than recompiling the server.
• Today – two modules must be compiled into the server: http_core.c provides the core functionality of the server and mod_so.c which supports DSO.
Static Module Advantages
• It may be desirable to compile other modules into the server so that they are always available, giving a slightly faster response and guaranteeing that a specific version of the module is present at all times.
Static Module Disadvantages
• Compiling a module statically into the server takes up space within the server, even if its features are not used.
• Also – upgrading requires the user to recompile the server.
DSO Advantages
• Benefits are being able to add or remove a feature or to update a feater to a newer version without having to recompile the server.
DSO Disadvantages
• DSO files – cause the server to be approx 20% slower at startup – cause the module loader must find all the modules – load them – resolve relocation symbols.
• However – startup is infrequent.• On some platforms – performance is 5%
slower – due to relative addressing of position-independent code – slower than absolute addressing.
• Few other potential problems – apply to platforms other than Linux
Using DSO Files
• Since advent of DSO – more common for distros to include Apache server with binaries already installed and make installation of source code opt.
• This means that a full set of files necessary to compile a new module acquired from another source is not initially present on the system.
• To prevent this problem, authors developed Apache eXenSion (APXS)
APXS
• A Perl program• Created at installation and gives access to all
Apache header files – as well as platform-dependent compiler and linker flags.
• User is then able to compile Apache modules without the Apache source tree and without struggling with platform-dep linker and compiler flags.
• Using APXS$cd /path/to/the_module$apxs –c mod_new_module.c$apxs –i –a –n foo mod_new_module.so
HTTP
• Broken down – Hypertext is a format of content viewed by Web browser.
• Tranfer Protocol is means by which Web servers and clients share hypertext.
• Version HTTP/1.1• Lynx – www.slcc.edu/lynx/release• Info:• www.lynx-brower.org