ISA 563: Fundamentals of Systems Programming

Spring 2013
Muhammad Abdulla
Homework 2
Due: 11:59 PM, Mar. 5, 2013
Reminders: Follow the submission instructions. Remember to comment your code!

Total Points: 100 

0) Stand-alone Spelling Checker Program (100 pts)

   Spelling checkers are used frequently in text editors to flag spelling
   mistakes in documents and, possibly, to provide a set of suggestions.
   Although conceptually simple, the procedure of finding close matches for
   a string among the strings in a dictionary extends to other tasks, such
   as search suggestions or a shell offering similarly spelled commands
   when a command is not found.

   In this assignment, you will implement a stand-alone spelling checker
   program that reads its input from standard input and prints its results.
   By default, the program should use the standard dictionary provided
   by the system: for example, /usr/share/dict/words under Linux, or
   /usr/share/lib/dict/words under Solaris. Your program should accept
   the -d or --dictionary command line switch to let the user override
   the default dictionary.

   The program should expect one word per line of input, and should skip
   blank lines. If the newly read word is found in the dictionary, the
   word itself should be printed. If the word is not found, your spelling
   checker should print suggestions. A suggested word must not have an
   edit distance larger than 2 from the input word. If no word with an
   edit distance of 2 or less is found, the program should print a blank
   line. If one or more words have an edit distance of 2 or less, the
   spelling checker should print those words as suggestions, sorted by
   edit distance (ascending). No more than 12 words should be printed as
   suggestions by default.
   The logic is as follows:

   1. If there is an exact match (edit distance 0), print only that string,
      on a line by itself.

   2. If there is no match within edit distance 2, print a blank line.

   3. If there are 12 or fewer matches (with edit distance 1 or 2) after
      searching, print all of them. Strings with edit distance 1 should be
      printed before strings with edit distance 2. There is no required
      order among strings with the same edit distance.

   4. If there are more than 12 matches:

      4.a) If there are at least 12 matches with edit distance 1, print 12
           matches from among them. Order does not matter. Therefore, you can
           stop searching for more matches once 12 matches with edit distance
           1 are found.

      4.b) If there are fewer than 12 matches with edit distance 1, print
           them first (if any), and print strings with edit distance 2 until
           a total of 12 matches are printed. As above, order does not matter
           among strings with the same edit distance.

   Your spelling checker should print out the following line to mark the end
   of its output (even if it is blank) after processing each input word:

   <end>

   After this output, the spelling checker should wait for more user input
   until the user signals end-of-file (by pressing Ctrl-D, for example) or
   enters the following line:

   <quit>

   The spelling checker should exit after such input. 
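
   One possible shape for this input loop is sketched below. The function
   name run_checker, the callback type check_fn, and the 256-byte line
   buffer are assumptions made for illustration; echo_word is only a
   stand-in for the real lookup and suggestion step.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: prints the result lines for one input word. */
typedef void (*check_fn)(const char *word, FILE *out);

/* Stand-in callback that simply echoes the word back. */
void echo_word(const char *word, FILE *out)
{
    fprintf(out, "%s\n", word);
}

/* Reads one word per line from in until EOF or a "<quit>" line.
 * Blank lines are skipped; every processed word is followed by "<end>".
 * Returns the number of words processed.
 * Assumption: lines fit in the buffer; longer lines would be split. */
int run_checker(FILE *in, FILE *out, check_fn check)
{
    char line[256];
    int processed = 0;

    assert(in != NULL && out != NULL && check != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                         /* skip blank lines */
        if (strcmp(line, "<quit>") == 0)
            break;
        check(line, out);
        fputs("<end>\n", out);
        fflush(out);   /* make the output visible before reading again */
        processed++;
    }
    return processed;
}
```

   A main function could then call run_checker(stdin, stdout, echo_word),
   replacing echo_word with the real checking function. Taking the streams
   as parameters keeps the loop independent of the terminal, which may help
   when the program is later reworked.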

   A good spelling checker should account for both lower and upper case in
   user input. However, for simplicity, the spelling checker is only
   expected to handle strings in a case-sensitive manner. You may want to
   improve your spelling checker to handle strings in a case-insensitive way,
   but this is not required.

   Your program should have the following command line options:

   1. -d, --dictionary    override default dictionary with another file
   2. -n, --count         override the default maximum number of suggestions
   3. -v, --version       print version information
   4. -h, --help          show help
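
   These four options could be parsed with getopt_long, as in the sketch
   below. The struct options type and the parse_options function are
   hypothetical names chosen for this example, and rejecting negative or
   non-numeric counts is an assumption the assignment does not spell out.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical container for the parsed settings. */
struct options {
    const char *dictionary;    /* path given with -d / --dictionary */
    long max_suggestions;      /* value given with -n / --count */
    int show_version;          /* set by -v / --version */
    int show_help;             /* set by -h / --help */
};

/* Parses argv into *opts, which the caller pre-fills with defaults.
 * Returns 0 on success, -1 on an unknown option or a bad -n value. */
int parse_options(int argc, char **argv, struct options *opts)
{
    static const struct option long_opts[] = {
        {"dictionary", required_argument, NULL, 'd'},
        {"count",      required_argument, NULL, 'n'},
        {"version",    no_argument,       NULL, 'v'},
        {"help",       no_argument,       NULL, 'h'},
        {NULL, 0, NULL, 0}
    };
    int c;

    assert(opts != NULL);
    while ((c = getopt_long(argc, argv, "d:n:vh", long_opts, NULL)) != -1) {
        switch (c) {
        case 'd':
            opts->dictionary = optarg;
            break;
        case 'n': {
            char *end;
            long n = strtol(optarg, &end, 10);
            /* Assumption: reject empty, non-numeric, or negative counts. */
            if (end == optarg || *end != '\0' || n < 0)
                return -1;
            opts->max_suggestions = n;
            break;
        }
        case 'v':
            opts->show_version = 1;
            break;
        case 'h':
            opts->show_help = 1;
            break;
        default:               /* getopt_long already printed a message */
            return -1;
        }
    }
    return 0;
}
```

   A caller would initialize the struct with the defaults (the system
   dictionary path and 12 suggestions) before calling parse_options, then
   act on show_version or show_help before reading any input.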

   An example run of the program may look like this:

   $ ./speller
   rever                        # user input
   river                        # program output
   sever
   lever
   never
   fever
   ...
   <end>
   speling                      # more user input
   spelling                     # program output
   swelling
   ...
   <end>
   wrooongg                     # more user input
                                # program output (blank line)
   <end>


   $ ./speller -d ./my_dictionary
   ...


   $ ./speller -v
   My spelling checker, version 0.1


   $ ./speller -n 2
   rever                        # user input
   river                        # program output (only two suggestions)
   sever
   <end>
   

   You are encouraged to use hash tables for your implementation. Please make
   your program as modular as possible: you will later turn this program
   into a spelling checker daemon.
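
   As a starting point, a small chained hash table for exact-match lookups
   might look like the sketch below. All names here (struct table,
   table_new, table_insert, table_contains, table_free) are hypothetical,
   and the djb2 string hash is just one common choice, not a requirement.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char *word;
    struct node *next;
};

struct table {
    struct node **buckets;
    size_t nbuckets;
};

/* djb2 string hash (an assumed choice; any reasonable hash would do). */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Creates an empty table, or returns NULL if allocation fails. */
struct table *table_new(size_t nbuckets)
{
    assert(nbuckets > 0);
    struct table *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->buckets = calloc(nbuckets, sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    t->nbuckets = nbuckets;
    return t;
}

/* Returns 1 if word is stored in the table (case-sensitive), else 0. */
int table_contains(const struct table *t, const char *word)
{
    const struct node *n = t->buckets[hash_str(word) % t->nbuckets];
    for (; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return 1;
    return 0;
}

/* Stores a copy of word. Returns 0 on success (also when the word is
 * already present) and -1 if allocation fails. */
int table_insert(struct table *t, const char *word)
{
    if (table_contains(t, word))
        return 0;
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    size_t len = strlen(word) + 1;
    n->word = malloc(len);
    if (n->word == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->word, word, len);
    size_t i = hash_str(word) % t->nbuckets;
    n->next = t->buckets[i];      /* push onto the front of the chain */
    t->buckets[i] = n;
    return 0;
}

/* Releases every stored word, every node, and the table itself. */
void table_free(struct table *t)
{
    for (size_t i = 0; i < t->nbuckets; i++) {
        struct node *n = t->buckets[i];
        while (n != NULL) {
            struct node *next = n->next;
            free(n->word);
            free(n);
            n = next;
        }
    }
    free(t->buckets);
    free(t);
}
```

   Keeping the table behind a handful of functions like these is one way
   to stay modular: the code that reads input and prints suggestions never
   touches the buckets directly.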

   
   Since the attacker program is expected to test input from both the
   command line and stdin, you may write two separate attacker programs, one
   for each input channel. If you do this, please briefly document your
   attacker programs in the README file. Both channels can certainly be
   handled by a single attacker program, but we have not yet covered how to
   do that.

   __________________________________________________________________________
   Note:

   You may want to consider using the GNU getopt library for processing
   command line options.

     
© 2008-2012 Muhammad Abdulla
Last Modified: Feb. 14, 2012