ISA 563: Fundamentals of Systems Programming

Spring 2013
Muhammad Abdulla
Project 1
Due: 11:59 PM, May 7, 2013 
Reminders: Follow the submission instructions. Remember to comment your code!

Total Points: 500

   In this project you will implement a standards-compliant, multi-threaded
   HTTP web server that supports dynamically loadable modules.

   Standards

   Your web server should be able to communicate with HTTP clients using
   the HTTP 1.0 standard.

   The protocol specifications for versions 1.0 and 1.1 of HTTP are defined in
   RFC 1945 and RFC 2616 respectively. Note that while these specifications
   are quite detailed, you only need to be concerned with a small subset of
   the HTTP protocol. Besides, your web server is only required to support
   HTTP 1.0, which uses a simpler (and not optimal) non-persistent connection
   with HTTP clients.

   For this project, you will need to understand the formats of the request
   and response messages used by HTTP since the web server will need to
   parse the contents of the messages it receives from the clients, and send
   back an appropriate response that the clients can understand.

   You may want to read a basic overview to get familiar with the HTTP 1.0
   protocol, especially the basic request/response methods and formats. You
   may find the HTTP Protocol entry on Wikipedia useful for its concrete
   examples. You should also be able to find a variety of resources on the web.

   A full-fledged Web server supports many methods, such as HEAD, POST, GET,
   PUT, etc. Your web server only needs to support the GET and HEAD methods.
   To serve each request, we first need to parse the request line and headers
   sent by the client. The request line the web server receives typically
   looks like this:

   GET /somepage.html HTTP/1.1
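
   As a rough illustration of the parsing step, the sketch below splits a
   request line into its method, URI, and optional version. The struct and
   function names are hypothetical, and the buffer sizes are arbitrary
   assumptions; it accepts only GET and HEAD, as this project requires.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three request-line fields. */
struct request_line {
    char method[8];    /* "GET" or "HEAD" */
    char uri[1024];    /* e.g. "/somepage.html" */
    char version[16];  /* e.g. "HTTP/1.1"; empty if the client sent none */
};

/* Returns 0 on success, -1 if the line is malformed or uses a method
 * other than GET or HEAD. */
int parse_request_line(const char *line, struct request_line *out)
{
    out->version[0] = '\0';

    /* Each field width is one less than its buffer size, so sscanf
     * cannot write past the end of any buffer. */
    int n = sscanf(line, "%7s %1023s %15s",
                   out->method, out->uri, out->version);
    if (n < 2)
        return -1;
    if (strcmp(out->method, "GET") != 0 && strcmp(out->method, "HEAD") != 0)
        return -1;
    if (out->uri[0] != '/')
        return -1;
    return 0;
}
```

   Because the version field is optional here, a bare "GET /test.html" line
   parses the same way as "GET /somepage.html HTTP/1.1".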

   
   For a request such as the one above, your server's response may look like
   this:

   HTTP/1.0 200 OK
   Date: Thu, 04 Mar 2010 16:20:06
   Last-Modified: Mon, 15 Feb 2010 15:23:51
   Content-Length: 177
   Content-Type: text/html

   ...


   The first line is the status line. The web server should return an
   appropriate status line based on whether the file is found or not.

   Besides the status line, your web server should send the following
   response headers:

   Date:
   Last-Modified:
   Content-Length:
   Content-Type:
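
   One possible way to assemble the status line and these four headers is
   sketched below. The function names are hypothetical. The sketch appends
   "GMT" to the dates, as RFC 1945 specifies for HTTP dates; the samples in
   this description omit the zone suffix, so treat that detail as an
   assumption to confirm.

```c
#define _POSIX_C_SOURCE 200809L  /* for gmtime_r */
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Formats t as an RFC 1123 style date such as
 * "Thu, 04 Mar 2010 16:20:06 GMT". */
static void http_date(time_t t, char *buf, size_t size)
{
    struct tm tm;
    gmtime_r(&t, &tm);
    strftime(buf, size, "%a, %d %b %Y %H:%M:%S GMT", &tm);
}

/* Writes the status line, the four headers, and the blank line that
 * ends the header section. Returns what snprintf returns: the number
 * of characters needed, which exceeds size - 1 on truncation. */
int format_headers(char *buf, size_t size, int status, const char *reason,
                   time_t now, time_t mtime, long length, const char *type)
{
    char date[64], modified[64];

    http_date(now, date, sizeof date);
    http_date(mtime, modified, sizeof modified);
    return snprintf(buf, size,
                    "HTTP/1.0 %d %s\r\n"
                    "Date: %s\r\n"
                    "Last-Modified: %s\r\n"
                    "Content-Length: %ld\r\n"
                    "Content-Type: %s\r\n"
                    "\r\n",
                    status, reason, date, modified, length, type);
}
```

   For a HEAD request the same headers would be sent without the body that
   normally follows them.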

   Your server will return a "Content-Type" header based on the file extension.
   For content types, please refer to the MIME specification. A multimedia MIME
   reference might be much easier to follow. Your web server should be able to
   handle .txt, .htm/.html, .jpg/.jpeg, .gif, and .png files.
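
   The extension lookup can be sketched as a small table. The function name
   is hypothetical, and both the case-insensitive match and the
   "application/octet-stream" fallback for unknown extensions are
   assumptions, not requirements stated here.

```c
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Maps a path to a Content-Type value using its file extension. */
const char *content_type_for(const char *path)
{
    static const struct { const char *ext; const char *type; } table[] = {
        { ".txt",  "text/plain" },
        { ".htm",  "text/html"  },
        { ".html", "text/html"  },
        { ".jpg",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
        { ".gif",  "image/gif"  },
        { ".png",  "image/png"  },
    };

    /* Look for the dot only in the last path component, so a dot in a
     * directory name is not mistaken for an extension. */
    const char *slash = strrchr(path, '/');
    const char *dot = strrchr(slash ? slash : path, '.');

    if (dot != NULL) {
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
            if (strcasecmp(dot, table[i].ext) == 0)
                return table[i].type;
    }
    return "application/octet-stream";  /* assumed fallback */
}
```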

   Note that the request was in HTTP/1.1, but your server returns a response marked
   HTTP/1.0, which should be understood by most HTTP clients.

   Multi-threading

   The server should be able to handle multiple simultaneous HTTP requests in
   parallel using threads. In the main thread, the server listens at a fixed
   port. When it receives a TCP connection request, it sets up a TCP
   connection socket and services the request in a separate thread. 
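
   A minimal sketch of this thread-per-connection structure follows. All
   function names are hypothetical, the backlog of 16 is an arbitrary
   choice, and handle_client() is a stub where request parsing and the
   response would go.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Worker thread: services one connection, then closes it. */
static void *handle_client(void *arg)
{
    int fd = (int)(intptr_t)arg;
    /* ... read the request and write the response here ... */
    close(fd);
    return NULL;
}

/* Creates a TCP socket listening on the given port on all interfaces.
 * Returns the socket descriptor, or -1 on failure. */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Main-thread loop: one detached thread per accepted connection.
 * Returns only when accept() fails for a reason other than EINTR. */
void accept_loop(int listen_fd)
{
    for (;;) {
        int cfd = accept(listen_fd, NULL, NULL);
        if (cfd < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        pthread_t tid;
        if (pthread_create(&tid, NULL, handle_client,
                           (void *)(intptr_t)cfd) != 0) {
            close(cfd);
            continue;
        }
        pthread_detach(tid);  /* no join needed; resources free on exit */
    }
}
```

   Passing the descriptor by value inside the pointer argument avoids
   sharing a stack variable between the main thread and the worker.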

   Your web server should keep a log of incoming requests. Since separate
   threads are going to handle separate requests, you should synchronize
   logging among the threads so that log entries do not become
   intermingled.
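
   One common way to keep entries from interleaving is to guard each write
   with a mutex, as in the sketch below. The names log_open() and
   log_line() are hypothetical.

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

static pthread_mutex_t log_mutex = PTHREAD_MUTEX_INITIALIZER;
static FILE *log_fp;  /* opened once at startup, shared by all threads */

/* Opens the log in append mode. Returns 0 on success, -1 on failure. */
int log_open(const char *path)
{
    log_fp = fopen(path, "a");
    return log_fp != NULL ? 0 : -1;
}

/* Writes one formatted entry followed by a newline. The mutex makes
 * the entry and its newline a single unit with respect to other
 * threads calling log_line(). */
void log_line(const char *fmt, ...)
{
    va_list ap;

    if (log_fp == NULL)
        return;
    pthread_mutex_lock(&log_mutex);
    va_start(ap, fmt);
    vfprintf(log_fp, fmt, ap);
    va_end(ap);
    fputc('\n', log_fp);
    fflush(log_fp);
    pthread_mutex_unlock(&log_mutex);
}
```

   Each worker thread would call log_line() once per request, after it
   knows the status and the number of bytes sent.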

   Dynamically Loadable Modules

   Your web server should support dynamically loadable modules. Each module is
   a shared object file (*.so) placed in the modules directory of the web server.
   Each module is supposed to have an implementation of a function "def_mod"
   with the following prototype:

   char *def_mod(char *arg);

   When a request comes in requesting a module in the following form:

   GET /mod/xyz?str1=foo&str2=bar

   the web server should load the def_mod() function from the shared library
   file xyz.so (if it exists), pass the string after the question mark as
   the argument to def_mod(), and send the string returned by def_mod() as a
   text/html response. In this example, the argument string passed to the
   function would be "str1=foo&str2=bar". If no question mark is found, or
   there is no string after the question mark, the web server should call
   def_mod() with an empty string (""). The modules directory may or may not
   be named "mod", but whenever a request arrives with a URI starting with
   "/mod/", the server should look for the module in the modules directory,
   whatever that directory is actually named.
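
   The URI handling described above might be sketched as follows. The
   function name is hypothetical, and rejecting module names that contain
   a slash or begin with a dot is an added precaution assumed here, not
   something this description mandates.

```c
#include <string.h>

/* Splits "/mod/xyz?str1=foo&str2=bar" into the module name "xyz" and
 * the argument string "str1=foo&str2=bar". *args points into uri, or
 * at an empty string when there is no '?' or nothing after it.
 * Returns 0 on success, -1 if uri is not a usable module request. */
int split_module_uri(const char *uri, char *name, size_t name_size,
                     const char **args)
{
    static const char prefix[] = "/mod/";
    size_t plen = sizeof prefix - 1;

    if (strncmp(uri, prefix, plen) != 0)
        return -1;

    const char *start = uri + plen;
    size_t len = strcspn(start, "?");  /* length of the name part */

    /* Reject empty or oversized names, and names that could escape
     * the modules directory. */
    if (len == 0 || len >= name_size || start[0] == '.' ||
        memchr(start, '/', len) != NULL)
        return -1;

    memcpy(name, start, len);
    name[len] = '\0';
    *args = (start[len] == '?') ? start + len + 1 : "";
    return 0;
}
```

   The caller would then append ".so" to the name and look for that file
   in the configured modules directory.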

   A sample ".so" file, rev.so, is provided for testing. This module returns
   a "malloc"ed character array containing the reverse of the original argument
   string. The source code for this shared library file is listed here.

   Note:
   The def_mod() function is expected to return a malloced string, and
   the web server is then expected to deallocate the string using free().
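
   Loading and calling a module could look roughly like the sketch below,
   which uses the POSIX dlopen(), dlsym(), and dlclose() calls. The
   function name run_module() is hypothetical. On older glibc versions the
   program must be linked with -ldl.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>

typedef char *(*def_mod_fn)(char *);

/* Loads the shared object at so_path, calls its def_mod() with a
 * private copy of args, and returns the malloc'ed result string.
 * Returns NULL if the module or the symbol cannot be loaded.
 * The caller must free() the returned string. */
char *run_module(const char *so_path, const char *args)
{
    void *handle = dlopen(so_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    char *result = NULL;
    /* POSIX permits converting the void * from dlsym to a function
     * pointer, although strict ISO C does not. */
    def_mod_fn fn = (def_mod_fn)dlsym(handle, "def_mod");
    if (fn != NULL) {
        /* def_mod takes a non-const char *, so hand it a copy it is
         * free to modify. */
        char *arg_copy = strdup(args);
        if (arg_copy != NULL) {
            result = fn(arg_copy);
            free(arg_copy);
        }
    }
    dlclose(handle);  /* the heap-allocated result remains valid */
    return result;
}
```

   After sending the result as the response body, the server would call
   free() on it, as the note above requires.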

   Miscellaneous

   The default web server document root would be the current execution
   directory of the web server. This should be configurable through --rootdir
   or -r switches.  No file outside the document root should be served to any
   request. The only exception is when the modules directory is specified to
   be outside of the document root. Even then, no file should be served
   outside of that modules directory.
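
   One way to enforce this restriction is to canonicalize the requested
   path with realpath() and then check that the result still lies under
   the canonical root, as sketched below. Both function names are
   hypothetical. The boundary check matters: a plain prefix comparison
   would wrongly accept "/var/www-private" for the root "/var/www".

```c
#define _XOPEN_SOURCE 700  /* for realpath and PATH_MAX */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Both arguments must already be canonical absolute paths.
 * Returns 1 if path is root itself or lies beneath it, else 0. */
int path_is_under(const char *root, const char *path)
{
    size_t n = strlen(root);

    if (strncmp(root, path, n) != 0)
        return 0;
    if (n > 0 && root[n - 1] == '/')  /* root is "/" */
        return 1;
    /* The match must end at a path component boundary. */
    return path[n] == '\0' || path[n] == '/';
}

/* Joins root and uri, resolves symlinks and ".." components, and
 * stores the canonical result in out (at least PATH_MAX bytes).
 * Returns 0 if the file exists and lies under root, -1 otherwise. */
int resolve_under_root(const char *root, const char *uri, char *out)
{
    char canon_root[PATH_MAX], joined[PATH_MAX];

    if (realpath(root, canon_root) == NULL)
        return -1;
    int n = snprintf(joined, sizeof joined, "%s%s", canon_root, uri);
    if (n < 0 || (size_t)n >= sizeof joined)
        return -1;
    if (realpath(joined, out) == NULL)
        return -1;
    return path_is_under(canon_root, out) ? 0 : -1;
}
```

   The same check, applied with the modules directory as the root, would
   cover a modules directory that sits outside the document root.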

   The web server should check if the requested resource is a file or a
   directory.  If it is a file, it should be served as specified in this
   project description. If it is a directory, the web server should serve
   "index.html" or "index.htm" files (in this order) if they exist. In other
   words, it should serve "index.html" file under that directory, and if it
   is not found, the "index.htm" file. If neither of these two files is
   available, an error should be returned.

   The default modules directory would be $DOCUMENT_ROOT/mod, where
   $DOCUMENT_ROOT is the document root directory. This should be configurable
   by --moddir or -m switches.

   The default port number is 1080 (TCP), which should be configurable with
   the --port or -p switches.

   The default log file should be "ihttpd.log" under the execution directory.
   This option should be configurable with the --logfile or -l switches.
   The log format should be as follows:

   client_ip [time] method uri status bytes-sent
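
   A log entry in this format could be produced as in the sketch below.
   The function name is hypothetical, and the quoting of the method and
   URI follows the sample log entries shown later in this description.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Formats one log entry such as
 *   127.0.0.1 [04/Mar/2010:17:04:40 +0000] "GET /test.html" 200 237
 * into buf. Returns the snprintf result, or -1 if the timestamp
 * could not be formatted. */
int format_log_entry(char *buf, size_t size, const char *client_ip,
                     const struct tm *when, const char *method,
                     const char *uri, int status, long bytes_sent)
{
    char stamp[64];

    /* %z prints the UTC offset recorded in the struct tm. */
    if (strftime(stamp, sizeof stamp, "%d/%b/%Y:%H:%M:%S %z", when) == 0)
        return -1;
    return snprintf(buf, size, "%s [%s] \"%s %s\" %d %ld",
                    client_ip, stamp, method, uri, status, bytes_sent);
}
```

   Passing the result of localtime_r() rather than gmtime() would record
   the server's local time and offset instead of UTC.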

   The log file should be locked when the server starts.

   The web server should also be configurable through a configuration file.
   A configuration file is used only when specified through the --config or
   -c command line switches; otherwise the default values described above
   should be used. The format of a configuration file entry is:

   parameter="value" 

   where parameter can take the values corresponding to the long options
   given above. Blank lines and lines starting with a # sign should be
   ignored, and any invalid lines should be ignored with an error message. An
   example configuration file is given as follows:

   # an example web server configuration file

   rootdir="/var/www"
   moddir="/var/www/dynamic/modules" 
   logfile="/var/tmp/ihttpd.log" 
   port="2080"

   # end of config file
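
   Parsing a single line of such a file might be sketched as follows. The
   enum and function names are hypothetical, and treating unknown
   parameter names and unquoted values as invalid is an assumption drawn
   from the format shown above.

```c
#include <ctype.h>
#include <string.h>

enum line_kind { LINE_SKIP, LINE_PAIR, LINE_INVALID };

/* Classifies one configuration line. For LINE_PAIR, the parameter
 * name and the unquoted value are copied into key and value. */
enum line_kind parse_config_line(const char *line,
                                 char *key, size_t key_size,
                                 char *value, size_t value_size)
{
    while (isspace((unsigned char)*line))
        line++;
    if (*line == '\0' || *line == '#')
        return LINE_SKIP;  /* blank line or comment */

    const char *eq = strchr(line, '=');
    if (eq == NULL || eq == line || eq[1] != '"')
        return LINE_INVALID;

    const char *val = eq + 2;
    const char *endq = strchr(val, '"');
    if (endq == NULL)
        return LINE_INVALID;

    /* Only whitespace may follow the closing quote. */
    for (const char *p = endq + 1; *p != '\0'; p++)
        if (!isspace((unsigned char)*p))
            return LINE_INVALID;

    size_t klen = (size_t)(eq - line);
    size_t vlen = (size_t)(endq - val);
    if (klen >= key_size || vlen >= value_size)
        return LINE_INVALID;

    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(value, val, vlen);
    value[vlen] = '\0';

    /* Accept only parameters that match the long options. */
    if (strcmp(key, "rootdir") != 0 && strcmp(key, "moddir") != 0 &&
        strcmp(key, "logfile") != 0 && strcmp(key, "port") != 0)
        return LINE_INVALID;
    return LINE_PAIR;
}
```

   The caller would print an error message for each LINE_INVALID result
   and continue with the next line.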

   If both the configuration file and the command line specify the same
   parameter, the command line option should override the configuration
   file option.

   
   If started with a configuration file option, the server should be able
   to re-read the configuration file upon receiving a SIGHUP signal, update
   all of its parameters (rootdir, moddir, logfile, and port number), and
   adjust itself accordingly (close the current port and open a new one if
   the port is different, unlock and close the log file if a new log is
   specified, etc.).
   It should allow all requests currently being serviced to finish, change
   its state based on the new configuration options, and then begin spawning
   new threads. You may want to use a synchronized access/update to a thread
   counter to implement this feature/requirement.
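
   The reload sequence could be built from two pieces, sketched below: a
   signal handler that only sets a flag, and a synchronized counter of
   active worker threads that the main thread can wait on. All names are
   hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction */
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int signo)
{
    (void)signo;
    reload_requested = 1;  /* the handler does nothing beyond this */
}

/* Installs the SIGHUP handler. SA_RESTART is deliberately not set,
 * so a blocking accept() fails with EINTR and the main loop gets a
 * chance to notice the flag. Returns 0 on success. */
int install_sighup_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_zero  = PTHREAD_COND_INITIALIZER;
static int active_workers = 0;

/* Called by the main thread just before it spawns a worker. */
void worker_started(void)
{
    pthread_mutex_lock(&count_mutex);
    active_workers++;
    pthread_mutex_unlock(&count_mutex);
}

/* Called by a worker as its last action. */
void worker_finished(void)
{
    pthread_mutex_lock(&count_mutex);
    if (--active_workers == 0)
        pthread_cond_broadcast(&count_zero);
    pthread_mutex_unlock(&count_mutex);
}

/* Blocks until every request in progress has finished. */
void wait_for_idle(void)
{
    pthread_mutex_lock(&count_mutex);
    while (active_workers > 0)
        pthread_cond_wait(&count_zero, &count_mutex);
    pthread_mutex_unlock(&count_mutex);
}
```

   When the main loop sees reload_requested set, it would stop accepting
   connections, call wait_for_idle(), re-read the configuration, clear the
   flag, and resume.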
   

   In summary, the following are the options that should be supported by
   your web server:

   --rootdir, -r        document root directory (default is current execution path)
   --moddir,  -m        modules directory (default is $DOCUMENT_ROOT/mod)
   --port,    -p        TCP port opened for the server (default is 1080)
   --logfile, -l        web server log file (default is ihttpd.log under execution path) 
   --config,  -c        configuration file name
   --help,    -h        help message
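
   These switches map naturally onto getopt_long() from the GNU getopt
   library. In the sketch below, the struct and function names are
   hypothetical, and the port range check is an assumption.

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct options {
    const char *rootdir;  /* "." stands for the current execution path */
    const char *moddir;   /* NULL means $DOCUMENT_ROOT/mod */
    const char *logfile;
    const char *config;   /* NULL means no configuration file */
    int port;
    int help;
};

/* Fills *opt with the defaults, then applies the command line.
 * Returns 0 on success, -1 on an unknown option or a bad port. */
int parse_options(int argc, char **argv, struct options *opt)
{
    static const struct option longopts[] = {
        { "rootdir", required_argument, NULL, 'r' },
        { "moddir",  required_argument, NULL, 'm' },
        { "port",    required_argument, NULL, 'p' },
        { "logfile", required_argument, NULL, 'l' },
        { "config",  required_argument, NULL, 'c' },
        { "help",    no_argument,       NULL, 'h' },
        { NULL, 0, NULL, 0 }
    };
    int c;

    opt->rootdir = ".";
    opt->moddir = NULL;
    opt->logfile = "ihttpd.log";
    opt->config = NULL;
    opt->port = 1080;
    opt->help = 0;

    optind = 1;  /* start scanning at the first argument */
    while ((c = getopt_long(argc, argv, "r:m:p:l:c:h", longopts, NULL)) != -1) {
        switch (c) {
        case 'r': opt->rootdir = optarg; break;
        case 'm': opt->moddir = optarg;  break;
        case 'l': opt->logfile = optarg; break;
        case 'c': opt->config = optarg;  break;
        case 'h': opt->help = 1;         break;
        case 'p': {
            char *end;
            long v = strtol(optarg, &end, 10);
            if (end == optarg || *end != '\0' || v < 1 || v > 65535)
                return -1;
            opt->port = (int)v;
            break;
        }
        default:
            return -1;  /* getopt_long has already printed a message */
        }
    }
    return 0;
}
```

   To let command line options override the configuration file, one
   approach is to load the file named by opt->config first and then apply
   only the values that were actually given on the command line.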

   Attacks

   The main attack channel is the network socket. Your attacker program should
   also exercise your program through command line options. You may write two
   different attacker programs for these two input channels.

   Examples

   A sample execution session may look something like this:

   $ ./ihttpd   # start the ISA 563 web server as a daemon (should go to background) 
   $
   $ telnet localhost 1080 
   Trying 127.0.0.1...
   Connected to localhost.
   Escape character is '^]'.
   GET /test.html
   HTTP/1.0 200 OK
   Date: Thu, 04 Mar 2010 17:56:18
   Last-Modified: Thu, 04 Mar 2010 17:05:08
   Content-Length: 240
   Content-Type: text/html

   <html>
   <body>
   <center>
   <h2>Test page for the ISA 563 web server.<h2>

   Let's try putting some images: 
   <br/>
   <br/>
   <img src="mason_logo.png" alt="GMU logo"/>
   <br/>
   <img src="research1.jpg" alt="Research I"/>
   </center>

   </body>
   </html>
   Connection closed by foreign host.


   Sample log file entries: 

   ...
   127.0.0.1 [04/Mar/2010:17:04:40 -0005] "GET /test.html" 200 237
   127.0.0.1 [04/Mar/2010:17:04:40 -0005] "GET /mason_logo.png" 200 12349
   127.0.0.1 [04/Mar/2010:17:04:40 -0005] "GET /research1.jpg" 200 110657
   ...



   You should also be able to test your web server with HTTP clients:

   $ lynx localhost:1080/test.html

   Output:

                                      Test page for the ISA 563 web server.

                                         Let's try putting some images:
                                                   GMU logo
                                                  Research I

   

   $ firefox localhost:1080/test.html

   Output:

   Firefox connecting to ihttpd

   __________________________________________________________________________
   Note:

   You may want to consider using the GNU getopt library for processing
   command line options.

     
© 2008-2013 Muhammad Abdulla
Last Modified: January 21, 2013