![Page 1: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/1.jpg)
Web servers
Guntis Bārzdiņš, Artūrs Lavrenovs
![Page 2: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/2.jpg)
What do web servers do?
![Page 3: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/3.jpg)
What web servers do
● Implement the HTTP protocol
● Listen for HTTP requests from browsers
● Try to fulfill them with static content from the file system
● Modern web servers also
– Forward dynamic content requests to other systems
– Do lots of useful tasks using modules
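The "static content from the file system" step can be sketched in a few lines of Python. This is only an illustration, not how any particular server implements it: the names `resolve_static` and `handle_get`, the `index.html` default, and the temporary document root are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit


def resolve_static(doc_root: Path, url: str) -> Optional[Path]:
    """Map a request URL to a file under doc_root, or None if nothing is servable."""
    root = doc_root.resolve()
    relative = unquote(urlsplit(url).path).lstrip("/")
    candidate = (root / relative).resolve()
    # Refuse paths that escape the document root, e.g. "/../etc/passwd".
    if candidate != root and root not in candidate.parents:
        return None
    if candidate.is_dir():
        candidate = candidate / "index.html"  # assumed default document
    return candidate if candidate.is_file() else None


def handle_get(doc_root: Path, url: str) -> Tuple[int, bytes]:
    """Return (status code, body) for a GET request."""
    path = resolve_static(doc_root, url)
    if path is None:
        return 404, b"Not Found"
    return 200, path.read_bytes()


# Demo against a throwaway document root.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "index.html").write_bytes(b"<h1>It works</h1>")
found = handle_get(demo_root, "/")
missing = handle_get(demo_root, "/nope.html")
escaped = handle_get(demo_root, "/../etc/passwd")
```

A real server adds content types, caching headers and range requests on top of this lookup.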
![Page 4: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/4.jpg)
What are some of the web servers
![Page 5: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/5.jpg)
C10K problem
● Dan Kegel, 1999
● Web servers should handle ten thousand clients simultaneously
● Operating system kernel limitations
● Operating system provided functionality
● Web server design flaws
![Page 6: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/6.jpg)
C10K problem solution – OS kernel
● The open source nature of Unix kernels made it possible to quickly identify all C10K bottlenecks and fix them
● Networking related algorithms and data structures in Unix kernels were originally implemented with complexities O(n|n^2|...), which were fixed to O(1|n)
● As a result, the networking capabilities of Unix kernels are virtually limitless (limited by hardware resources)
![Page 7: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/7.jpg)
C10K - OS functionality
● Implemented new scalable I/O event notification mechanisms (epoll – Linux, kqueue - *BSD)
– Better performance than traditional poll/select
– Can receive all pending events using one system call
● AIO - The POSIX asynchronous I/O (AIO) interface allows applications to initiate one or more I/O operations that are performed asynchronously (i.e., in the background). The application can elect to be notified of completion of the I/O operation in a variety of ways: by delivery of a signal, by instantiation of a thread, or no notification at all.
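As a rough illustration of event notification, Python's standard `selectors` module wraps epoll on Linux and kqueue on the BSDs. The sketch below registers three connections and collects every pending event with a single `select()` call. The helper name `drain_ready` and the use of `socketpair()` instead of real client connections are assumptions made to keep the example self-contained.

```python
import selectors
import socket


def drain_ready(sel: selectors.BaseSelector, timeout: float = 1.0) -> dict:
    """Wait once for readiness events and read from every ready socket."""
    received = {}
    # One call returns all sockets that currently have data pending.
    for key, _mask in sel.select(timeout):
        received[key.data] = key.fileobj.recv(4096)
    return received


sel = selectors.DefaultSelector()  # picks epoll, kqueue, or a fallback
pairs = [socket.socketpair() for _ in range(3)]
for i, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=f"conn{i}")

# Two of the three "clients" send something; the third stays idle.
pairs[0][1].send(b"hello")
pairs[2][1].send(b"world")
events = drain_ready(sel)

sel.close()
for server_side, client_side in pairs:
    server_side.close()
    client_side.close()
```

The idle connection costs nothing here: it never shows up in the result, so the loop does work proportional to the number of active sockets rather than the number of registered ones.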
![Page 8: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/8.jpg)
C10K – web server design
● Non-blocking I/O for networking and disk
– Don't block waiting for an action to complete; serve other requests and wait for notifications about I/O completion
● Many threads
– Use all available CPU cores to achieve maximum concurrency; avoid locking data structures
● Each thread serves many requests
– Don't create a thread per request; reuse threads, and process other requests while a non-blocking action completes
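The "each thread serves many requests" idea can be sketched with `asyncio`, where one thread interleaves many requests while each waits on I/O. This is a simulation under stated assumptions: `asyncio.sleep` stands in for a non-blocking disk or network wait, and `handle_request` and `serve_all` are hypothetical names.

```python
import asyncio
import threading
import time


async def handle_request(request_id: int, seen_threads: set) -> str:
    """Serve one request; the await yields the thread to other requests."""
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.05)  # stands in for waiting on non-blocking I/O
    return f"response {request_id}"


async def serve_all(count: int):
    """Run `count` requests concurrently on the event loop's single thread."""
    seen_threads = set()
    started = time.perf_counter()
    responses = await asyncio.gather(
        *(handle_request(i, seen_threads) for i in range(count))
    )
    return responses, seen_threads, time.perf_counter() - started


responses, seen_threads, elapsed = asyncio.run(serve_all(200))
```

Handled one after another, 200 waits of 0.05 s would take about 10 s. Interleaved on one thread, the total stays close to a single wait.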
![Page 9: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/9.jpg)
C10M problem – Next decade
● 10 million concurrent connections per server
● Current Unix kernels can't handle that
– Application thread locks in kernel
– Hardware drivers (NIC)
– Memory management
● Solution: new generation of high-load Unix kernels
– 1 main application per server
– Minimize the number of system calls
– Minimize kernel work
![Page 10: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/10.jpg)
Dynamic content
![Page 11: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/11.jpg)
Dynamic content
● Web servers can't create dynamic content themselves
● We need an application written in some programming language
● We need some way for the web server to communicate with the application
– CGI
– Apache modules
– FastCGI, SCGI, ...
– WSGI, PSGI, JSGI, ...
![Page 12: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/12.jpg)
CGI - Common Gateway Interface
● Oldest method of getting dynamic content from web servers
● For each browser request, the web server defines a set of environment variables derived from the request and the server configuration
● The web server starts the application in the prepared environment
● Sends POST data (if any) as standard input
● Waits for standard output from the executed file and returns it to the browser
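The server side of this exchange can be sketched as follows, assuming a Unix-like system. `run_cgi` is a hypothetical helper, not part of any real web server, and the CGI program is an inline Python one-liner so the example stays self-contained.

```python
import os
import subprocess
import sys


def run_cgi(argv, method, query_string, body=b"", remote_addr="127.0.0.1"):
    """Run a CGI program the way a web server would, then split its output."""
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "CONTENT_LENGTH": str(len(body)),
        "REMOTE_ADDR": remote_addr,
    })
    # POST data goes to the program's stdin; its stdout is the response.
    result = subprocess.run(argv, input=body, env=env,
                            stdout=subprocess.PIPE, check=True)
    raw_headers, _, document = result.stdout.partition(b"\n\n")
    return raw_headers.decode(), document


# A tiny stand-in CGI program: echoes the method, query string and body.
DEMO_SCRIPT = (
    "import os, sys\n"
    "body = sys.stdin.read()\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print(os.environ['REQUEST_METHOD'], os.environ['QUERY_STRING'], body)\n"
)
headers, document = run_cgi([sys.executable, "-c", DEMO_SCRIPT],
                            "POST", "id=7", b"name=Janis")
```

Every request pays for one `subprocess.run`, which is the per-request process cost that CGI is known for.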
![Page 13: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/13.jpg)
CGI application
● Can be ANY script or binary file executable in UNIX
● No libraries required
● Use request information from environment variables
● Or ignore it completely if not needed
● Process standard input if needed
● Output additional HTTP headers and then the generated document body on standard output
![Page 14: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/14.jpg)
CGI environment variables
● REQUEST_METHOD: name of HTTP method
● PATH_INFO: path suffix, if appended to URL after program name and a slash
● PATH_TRANSLATED: corresponding full path as supposed by server, if PATH_INFO is present
● SCRIPT_NAME: relative path to the program, like /cgi-bin/script.cgi
● QUERY_STRING: the part of URL after ? character. The query string may be composed of name=value pairs separated with ampersands (such as var1=val1&var2=val2...) when used to submit form data transferred via GET method as defined by HTML application/x-www-form-urlencoded
● REMOTE_HOST: host name of the client, unset if server did not perform such lookup
● REMOTE_ADDR: IP address of the client (dot-decimal)
● Variables passed by user agent (HTTP_ACCEPT, HTTP_ACCEPT_LANGUAGE, HTTP_USER_AGENT, HTTP_COOKIE and possibly others) contain values of corresponding HTTP headers
● Only a few more
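On the application side, decoding QUERY_STRING is a one-liner with the standard library. The function name `form_fields` and the sample environment below are made up for illustration. A real CGI program would pass `os.environ`.

```python
from urllib.parse import parse_qs


def form_fields(environ):
    """Decode application/x-www-form-urlencoded pairs from QUERY_STRING."""
    return parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)


# Sample environment as a web server might prepare it for a GET form submit.
sample_environ = {
    "REQUEST_METHOD": "GET",
    "SCRIPT_NAME": "/cgi-bin/script.cgi",
    "QUERY_STRING": "var1=val1&var2=val2&var1=a%20b",
}
fields = form_fields(sample_environ)
```

Values come back as lists because a name may legitimately appear more than once in a query string.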
![Page 15: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/15.jpg)
CGI example
#!/bin/bash
echo "Content-type: text/plain"
echo ""
echo "Hello world!"
echo "Today is:" `date`
![Page 16: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/16.jpg)
CGI issues
● Each request forces the creation of a new process; process creation and destruction add big overhead
● All script files must be interpreted on each request, another big overhead
● Not scalable
● Not suitable for modern web servers
● Still widely used in embedded systems (e.g. wifi router web management console), which serve only occasional requests
![Page 17: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/17.jpg)
FastCGI
● Multiple processes started
● Web server communicates with them over Unix sockets or TCP
● Each process serves many requests
● Good performance
● Complete separation of web server and dynamic content system
● Great scalability – put FastCGI processes across a server farm
![Page 18: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/18.jpg)
Other communication methods
● Integrate the dynamic content generation system with the web server process (Apache modules)
● CGI derivatives (SCGI)
● *SGI interfaces implement a programming-language-specific method of communication between the web server and the application (WSGI – Python, PSGI – Perl)
● Proxy requests to applications that implement communication via HTTP
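To make the WSGI case concrete, here is a minimal sketch of a WSGI application and a direct call to it, without starting a server. `application` follows the standard WSGI callable signature. `call_app` is a hypothetical test helper that plays the role of the web server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """WSGI callable: the server passes CGI-style variables in `environ`."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(path):
    """Invoke the application the way a WSGI server would, in-process."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


status, body = call_app("/hello")
```

To serve it over HTTP for local experiments, the standard library's `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` is enough. Unlike CGI, the application is loaded once and called as a function for every request.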
![Page 19: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/19.jpg)
LAMP
● Linux Apache MySQL PHP
● Most common web server stack
● Simple to install and configure
● Simple to develop web applications
● Acceptable performance and security
![Page 20: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/20.jpg)
Apache
● One of the oldest web servers
● Still actively developed
● Most popular web server today and in recorded web server history
● Highly configurable and extensible using modules
● All in one solution
● Runs on many OS, most often on Unix servers
![Page 21: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/21.jpg)
PHP
● One of the most popular web application programming languages
● Easy to learn (bad coding practices)
● Interpreted language
● Functions from Unix libraries and tools
● Huge amount of ready-made applications, libraries and modules
![Page 22: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/22.jpg)
MySQL
● Unix distributions are moving towards MariaDB (a MySQL fork) after the acquisition by Oracle
● Fast relational DB implementation
● Fairly easy to use
● Different storage engines (faster without transactions, slower with, memory based, etc.)
● Query caching
● User quotas
![Page 23: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/23.jpg)
Historical installation
● Acquire source files for all required software (Apache MySQL PHP)
● Acquire all dependencies and install them
● Configure make files via ./configure
● Compile everything
● Configure each piece of software so it works with the others
● Use it
![Page 24: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/24.jpg)
Modern installation
● Use OS package manager
– root@server# apt-get install libapache2-mod-php5 apache2 php5 mysql-server
● Use it
![Page 25: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/25.jpg)
Simple web site example
● Create database user, database, table structure and maybe some data
● Using MySQL command prompt accessed by
– $ mysql -u root -p
– > CREATE DATABASE `example` COLLATE 'utf8_general_ci';
– > CREATE TABLE `posts` (...)
– > CREATE USER 'example'@'localhost' IDENTIFIED BY PASSWORD '…'
– > GRANT ... ON `example`.* TO 'example'@'localhost';
– > INSERT INTO `posts` (`title`, `info`) VALUES ('a', 'a');
![Page 26: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/26.jpg)
Simple web site example II
● Or be lazy and use some web interface like phpMyAdmin or Adminer
– Download single file adminer.php
– Drop it into /var/www
– Navigate your browser to http://localhost/adminer.php
– Do all the tasks in browser without really knowing SQL
![Page 27: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/27.jpg)
Simple web site example III
● Create file example.php in /var/www
● Write your HTML and PHP code inside
– Connect to database
– Select data
– Show data
● Your simple web site is ready
● Navigate your browser to http://localhost/example.php
● Enjoy result
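The connect / select / show flow can be sketched as below. Note the substitutions: this is Python with an in-memory SQLite database standing in for PHP and MySQL, so it shows the shape of the logic rather than the code from the slides. Only the `posts` table with its `title` and `info` columns comes from the example above. `render_posts` is a hypothetical name.

```python
import html
import sqlite3


def render_posts(conn: sqlite3.Connection) -> str:
    """Select all posts and show them as an HTML list."""
    rows = conn.execute("SELECT title, info FROM posts ORDER BY title").fetchall()
    # Escape values so stored text cannot inject markup into the page.
    items = "".join(
        f"<li><b>{html.escape(title)}</b>: {html.escape(info)}</li>"
        for title, info in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


# Connect (SQLite in memory here, as a stand-in for the MySQL `example` database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT, info TEXT)")
conn.execute("INSERT INTO posts (title, info) VALUES (?, ?)", ("a", "a"))
page = render_posts(conn)
conn.close()
```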
![Page 28: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/28.jpg)
Simple web site example - Source
![Page 30: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/30.jpg)
nginx
● Contestant for 2nd place in web server rating
● Event-driven
● High-performance (thousands req/s)
● Small memory footprint per request
● Efficient CPU usage
● Advanced configuration and functionality via modules
● Often used as FrontEnd to big websites
● CloudFlare built on top of it
![Page 31: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/31.jpg)
High-load web systems
● A big dynamic web site can't reside on a single server
● Need some strategy for splitting load across multiple web servers
● One possible strategy
– One entry point “FrontEnd” which receives all requests and can handle the load (e.g., Varnish, nginx)
– Backends process requests from FrontEnd (nginx, Apache)
![Page 32: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/32.jpg)
What is Varnish?
● Proxy server
– Reverse
– Caching
– Programmable
● Load balancer
● Dynamic content generator
● Tools – logging, debugging, monitoring
![Page 33: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/33.jpg)
Why Varnish?
● Fantastic performance even on low-end servers – 1,000 to 10,000 requests per second per server is the norm
● C + GOOD C programmers
● Takes advantage of the Unix architecture
● After tuning, tens of thousands of requests per second; 100k/s exceeded in testing
● Free open source software
● Request-oriented domain-specific configuration/programming language VCL
● Almost everything needed for a high-load web, in one package
![Page 34: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/34.jpg)
Caching
● Generating any dynamic web page is very slow - depending on the environment, hundreds or thousands of times slower than returning static content
● A low-end server can generate a couple of hundred such dynamic pages per second
● Any development framework makes dynamic page generation tens or hundreds of times slower still (especially Java EE, Zend Framework)
● Down to just a few tens of requests per second
● Rough math: 100 × 100 = 10,000 times slower than a static page
![Page 35: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/35.jpg)
Caching II
● Idea – ideally, dynamic content would be returned with performance close to that of static pages
● We can store the pages that are the same for every user and do not change significantly within a given period of time
● Hard disks are slow; good practice is to keep all cached content only in RAM or on a server SSD
● A caching strategy has to be designed for each specific case, and it can be very subjective
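The idea of keeping generated pages in RAM for a fixed time can be sketched as a small time-to-live cache. This is a toy model of the principle, not how Varnish stores objects. `TTLCache`, the 60-second lifetime and the injected fake clock are assumptions made for the example.

```python
import time


class TTLCache:
    """Keep generated pages in memory until their time-to-live expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # url -> (expires_at, body)

    def get(self, url, generate):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: skip the slow page generation
        body = generate(url)  # miss or expired: regenerate and remember
        self.store[url] = (now + self.ttl, body)
        return body


# Demo with a controllable clock so expiry can be shown without waiting.
now = [0.0]
calls = []


def generate(url):
    calls.append(url)  # count how often the slow path runs
    return f"<html>page for {url}</html>"


cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
first = cache.get("/news", generate)
second = cache.get("/news", generate)  # served from cache
now[0] = 61.0
third = cache.get("/news", generate)   # expired, generated again
```

Three requests trigger only two generations here. With real traffic the ratio is what brings dynamic pages close to static-file performance.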
![Page 36: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/36.jpg)
Varnish caching
● The request URL (full, or matched by a regular expression) determines which requests to cache, and how long to cache a particular element or whether to cache it at all – the standard caching approach practically everywhere
● Users – Facebook, Twitter, WikiLeaks, ThePirateBay
● Developed in Norway
● Advertised as speeding up page delivery by 300 to thousands of times, i.e., only up to about 10 times slower than static content
● Fast compared with other caching approaches
![Page 37: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/37.jpg)
DSL VCL
● Simple syntax (similar to C), which is translated to C and then compiled to machine code
● =, ==, !=, ~, !~, !, &&, ||, +, "string"
● if () {} else {}, set, unset, return
● 9 subroutines, which are the different stages of processing each request, where behavior can be influenced
● Only predefined objects - client, server, req, bereq, beresp, obj, resp
sub vcl_recv {
    if (req.request == "GET" && req.url ~ "\.js$") {
        return (lookup);
    }
}
![Page 38: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/38.jpg)
VCL processing architecture
![Page 39: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/39.jpg)
Integration
● A fixed caching time may not be optimal
● Content may change more often than the configured time - users get stale information
● Or less often – servers do unnecessary work
● Solution – the server must be notified that the content needs to be refreshed
acl purge { "192.168.0.0"/24; }
sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) { error 405 "Not allowed."; }
        return (lookup);
    }
}
sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
![Page 40: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/40.jpg)
Dynamic content generation with ESI
● Web pages often consist of blocks that change at different rates
● Or there is a small block of information specific to each user (for example, "Hello, Jānis Bērziņš | You have [0] new messages")
● We can load it after the page loads, using JSON, or generate the content in Varnish
<TABLE><TR><esi:include src="sveiks.html"/></TR>
<TR><TD><esi:include src="index.html"/></TD>
<TD><esi:include src="article.html"/></TD></TR>
</TABLE>
● Varnish parses the <esi> tags and puts the elements together; all elements are configured and cached independently
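The assembly step can be imitated in a few lines. This is a simplified sketch that handles only self-closing `<esi:include src="..."/>` tags with straight quotes, not a full ESI processor. The fragment names and the `assemble` function are hypothetical.

```python
import re

# Matches only the simple self-closing form used in the example above.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(template: str, fetch_fragment) -> str:
    """Replace each include tag with the fragment returned for its src."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), template)


# Each fragment could come from its own cache entry with its own lifetime.
fragments = {
    "greeting.html": "<TD>Hello, user</TD>",
    "article.html": "<p>Article</p>",
}
template = (
    '<TABLE><TR><esi:include src="greeting.html"/></TR>'
    '<TR><TD><esi:include src="article.html"/></TD></TR></TABLE>'
)
page = assemble(template, fragments.__getitem__)
```

Because `fetch_fragment` is called per tag, a rarely changing article and a per-user greeting can be cached under completely different rules.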
![Page 41: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/41.jpg)
![Page 42: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/42.jpg)
Load balancing
● One URL can be handled by several backends
● Different URLs can be handled by different backends
● Monitoring
● Taking dead servers out of rotation (restart, upgrade, repair)
● Bringing revived servers back in (new ones too)
● In practice this means a pile of CHEAP desktop-grade hardware can be used to generate dynamic content
● Adding one more frontend gives high but cheap fault tolerance
● With NoSQL, or a replicated database obtained some other way, expensive servers are not needed at all
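Round-robin selection that skips dead backends can be sketched as follows. This is one possible scheme under stated assumptions, not Varnish's actual director logic. The `Balancer` class and the backend names are made up, and health changes are signalled by hand instead of by a monitoring probe.

```python
import itertools


class Balancer:
    """Round-robin over backends, skipping those currently marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)  # e.g. restart, upgrade, repair

    def mark_up(self, backend):
        self.healthy.add(backend)      # revived server rejoins the rotation

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


lb = Balancer(["web1", "web2", "web3"])
before = [lb.pick() for _ in range(4)]
lb.mark_down("web2")
during = [lb.pick() for _ in range(4)]
lb.mark_up("web2")
after = [lb.pick() for _ in range(3)]
```

While `web2` is down, requests simply flow to the remaining machines, which is what makes cheap, individually unreliable hardware usable.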
![Page 43: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/43.jpg)
Varnish usage in Latvia
$ curl -I www.tvnet.lv
HTTP/1.1 200 OK
Server: Apache
Last-Modified: Wed, 07 Nov 2012 20:09:08 GMT
Expires: Wed, 07 Nov 2012 20:10:08 GMT
Cache-Control: max-age=60
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8
Content-Length: 185924
Date: Wed, 07 Nov 2012 20:10:15 GMT
X-Varnish: 2025605055 2025545136
Age: 67
Via: 1.1 varnish
Connection: keep-alive
$ curl -I www.delfi.lv
HTTP/1.1 200 OK
X-Fe-Node: nuffy
Content-type: text/html; charset=utf-8
Server: lighttpd/1.4.31 (PLD Linux)
Content-Length: 159097
Date: Wed, 07 Nov 2012 20:20:58 GMT
X-Varnish: 734492112 734450241
Age: 58
Via: 1.1 varnish
Connection: keep-alive
![Page 44: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/44.jpg)
Non-standard uses - WAF
● Programmability enables non-standard uses, for example a WAF
● Define the URLs and methods of accepted requests as precisely as possible
– req.url ~ "^/topic/([0-9])$" rather than "^/topic"
– req.request == "GET"
● At the end, use return(error);
● Restrict access to backend servers (or disconnect them from the internet)
● Attackers now attack the frontend, so protect it
● Does not help against logic (and many other) vulnerabilities
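The allow-list principle (accept only precisely described requests, reject everything else) can be sketched outside VCL as well. The rule table and the `is_allowed` function below are hypothetical. The topic pattern uses `[0-9]+` to accept multi-digit ids, which is an assumption that differs from the single-digit pattern shown above.

```python
import re

# (method, URL pattern) pairs that are explicitly permitted; all else is denied.
ALLOWED = [
    ("GET", re.compile(r"/")),
    ("GET", re.compile(r"/topic/[0-9]+")),
]


def is_allowed(method: str, url: str) -> bool:
    """True only if the whole URL matches a rule for this exact method."""
    return any(
        method == allowed_method and pattern.fullmatch(url) is not None
        for allowed_method, pattern in ALLOWED
    )


ok = is_allowed("GET", "/topic/42")
injected = is_allowed("GET", "/topic/42' OR '1'='1")
wrong_method = is_allowed("POST", "/topic/42")
```

Anchoring the whole URL is the point: a loose prefix rule such as `^/topic` would have let the injected request through to the backend.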
![Page 45: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/45.jpg)
New trend
● The web application is the central thing
● Develop the application in some framework
● No separate web server; it is now just a part of the application (a library from the framework used)
● Extremely customizable
![Page 46: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/46.jpg)
Current situation
● The standard web development solution is an HTTP server plus some classic dynamic content generation system (PHP, ASP, Python, etc.); problems exist:
– Long-running requests and persistent connections
– Number of clients served simultaneously
– Compatibility with other technologies
– Future development possibilities
![Page 47: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/47.jpg)
Event-driven programming frameworks
● Neither the idea nor the implementations are new (Python Twisted, Perl Object Environment, Ruby EventMachine, Node.js)
● Not widely used in web solutions
● Solve the problems of standard technologies
● Reactor design pattern, C10K problem
● Let web programmers build network solutions
![Page 48: Web servers](https://reader036.vdocument.in/reader036/viewer/2022070405/56814023550346895dab8143/html5/thumbnails/48.jpg)
Node.js
● A set of libraries for building network solutions in the JavaScript programming language; runs on the V8 engine
● Performance evaluation of JavaScript engines
● New related technologies – Socket.IO, CoffeeScript
● Problematic aspects - package management, application hosting