Programming Project 6

This semester you will be writing a program that models elements of recognizing and creating characters. Optical character recognition is an important area of research that allows photographs or printed documents to be digitized, which makes these documents available for machine-based searching. On the flip side, CAPTCHA (http://en.wikipedia.org/wiki/CAPTCHA) is a system for differentiating between humans and computers: the goal is to generate an image that a human can identify but a machine cannot read. CAPTCHA helps reduce the amount of spam on the Internet.

We will implement a highly limited type of image matching, processing, and creation this semester. Rather than write this project at once, we will break the project down into several two-week sub-projects that are due throughout the semester. The rest of this document will detail the first such assignment.

PROJECT UPDATES AND CLARIFICATIONS

This final assignment will ask you to write some code that will help you determine if a pixel is part of a number. You will need to copy your project 3, 4, and 5 solutions into the same directory where you save your files for this project.

You will have to do the following for this assignment:

Step 1: Writing Test Cases
Ideally, you should write your test cases before you write your code. However, testing objects, such as those of the Number class, is more difficult due to their stateful nature. We can no longer test each method individually without at least calling the constructor. For this reason, we will not be writing formal test cases for the project; we need a better testing infrastructure, which you will learn about in CS211 if you take it. Instead, we will provide some tests for you; the rest you will need to write on your own. For this project, we will not check your tests on Marmoset, to give you the experience of testing in the real world.

You will be creating a Banner class in a banner.py file with the following methods:
A constructor that takes a message. The constructor creates the message attribute, and sets it to the incoming argument. The constructor also initializes an empty dictionary.
A method called addNumber that takes a number object and a color as arguments, and adds the number to the dictionary of numbers. The dictionary uses the number type ("one", "two", etc) as keys to store the number objects. For example, if we have previously created a number object called num1, and a banner called banner, then calling banner.addNumber(num1,'G') will add the number with the color G to the banner. It will not change the color of the incoming number.

If a number already exists in the dictionary, it will overwrite the old value with the newer one.
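The overwrite rule matches ordinary dictionary assignment in Python. The following is a minimal sketch only: the variable name numbers is an assumption, and plain strings stand in for the number objects.

```python
# Hypothetical stand-in for a banner's dictionary of numbers.
numbers = {}

# Keys are number types; assigning to an existing key replaces its value.
numbers["two"] = "first two"
numbers["three"] = "a three"
numbers["two"] = "second two"   # overwrites the earlier "two"

print(len(numbers))     # 2
print(numbers["two"])   # second two
```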

To test that a number has been added to an existing banner, you can add the following to your tests.txt:

Banner.addNumber Number G Banner.__str__
'hello GGG\n G \nG  \nGGG\n '

The example above added a 3x4 number "two" to a banner previously created.
A to-string method that can be called with str(...), which returns the message of the banner, followed by a space, followed by the numbers in sorted order according to their keys (use the natural string sorting), using the draw method. Each number is followed by a space.
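Natural string sorting orders the keys alphabetically, not by numeric value. A small sketch, using hypothetical keys and color letters as placeholder values:

```python
# Hypothetical keys, as they might appear in a banner's dictionary.
numbers = {"two": "B", "one": "R", "three": "G"}

# sorted() on a dictionary returns its keys in ascending string order.
ordered_keys = sorted(numbers)
print(ordered_keys)   # ['one', 'three', 'two']
```

Note that "three" comes before "two" because the strings are compared character by character.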

You will also write a main function in project5.py that has the following functionality:
The main takes as argument the name of a file in the current directory. It will process the contents of the file to create a list of Banner objects. These Banner objects are then represented as strings and returned.

The file contents will be as follows:
  • Each banner's message will start with a hashtag, followed by the message, followed by a newline.
  • A message will be followed by one or more numbers belonging to that banner. The type of the number (one, two, etc.) will be on the first line, followed by the color of the number of the next line. A number is added to the banner using its addNumber method. You may assume all numbers have a width and height of 3.
  • One or more banners may be in the file; a number belongs to the most recent banner above that number.
  • We will guarantee that each banner message will be well-formed, and each number will 1) belong to a banner and 2) will consist of two lines in the file, as mentioned above.
  • If the number type is not recognized, or the length of the color is not 1, the function stops processing the input file and prepends the word EXCEPTION, followed by a space, to the returned string, before appending the banners read in (e.g. "EXCEPTION hello {.......").
  • If multiple banners exist in the file and need to be returned, they should be separated with a space in the string returned (e.g. "hello {'three':'G'} bye {'two':'B'}").
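The file layout described above can be walked one banner at a time. The sketch below is one possible approach, not the required one. It assumes the lines are already in a list, that the message follows the hashtag immediately, and that there are no blank lines. The function name group_lines is hypothetical, and the sketch deliberately leaves out the EXCEPTION handling and the Banner objects.

```python
def group_lines(lines):
    """Group raw file lines into (message, [(type, color), ...]) tuples."""
    banners = []
    i = 0
    while i < len(lines):
        line = lines[i].rstrip("\n")
        if line.startswith("#"):
            # A hashtag starts a new banner; the rest of the line is its message.
            banners.append((line[1:], []))
            i += 1
        else:
            # Otherwise this line is a number type and the next line is its color.
            color = lines[i + 1].rstrip("\n")
            banners[-1][1].append((line, color))
            i += 2
    return banners


sample = ["#hello\n", "three\n", "G\n", "#bye\n", "two\n", "B\n"]
print(group_lines(sample))
# [('hello', [('three', 'G')]), ('bye', [('two', 'B')])]
```

The sketch relies on the guarantee that every number belongs to a banner and occupies exactly two lines.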

If the file cannot be opened, the function raises a ZeroDivisionError. Otherwise, the function returns the banners in the order read in. Each banner's numbers are listed in sorted order according to their keys in the dictionary, similar to the to-string method above. See the public tests for an example of formatting; in addition, if multiple numbers are listed, they are separated by commas, but no spaces, in the returned string (e.g. "hello {'three':'G','two':'R'}").
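The example strings in this section suggest the following shape for the returned text. This is a sketch under assumptions: the helper name format_banner is hypothetical, and a plain dictionary from number type to color stands in for the banner's contents. The public tests remain the authority on formatting.

```python
def format_banner(message, colors):
    """Build text such as hello {'three':'G','two':'R'} for one banner."""
    # Keys are visited in sorted order; entries are joined with commas, no spaces.
    parts = ["'%s':'%s'" % (key, colors[key]) for key in sorted(colors)]
    return message + " {" + ",".join(parts) + "}"


one_banner = format_banner("hello", {"two": "R", "three": "G"})
print(one_banner)   # hello {'three':'G','two':'R'}

# Several banners are separated by a single space.
joined = " ".join([format_banner("hello", {"three": "G"}),
                   format_banner("bye", {"two": "B"})])
print(joined)       # hello {'three':'G'} bye {'two':'B'}
```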

You should use the python documentation, or the Internet, to figure out how to open and read a file to place its contents into a list. Make sure to cite your resources. Please note, you may only look for code online for how to open and read a file, and NOT any other part of this project.
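As a starting point, here is a minimal sketch that uses only the built-in open function. The function name read_lines is hypothetical. The sketch also turns a failed open into the ZeroDivisionError that this assignment asks for.

```python
def read_lines(filename):
    """Return the file's lines as a list of strings, without trailing newlines."""
    try:
        with open(filename) as handle:
            return handle.read().splitlines()
    except IOError:
        # The assignment asks for ZeroDivisionError when the file cannot be opened.
        raise ZeroDivisionError("cannot open " + filename)
```

The with statement closes the file automatically, even if reading fails partway through.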

You will NOT be submitting your tests to Marmoset for this project. This also means that you cannot check them on Marmoset. Similarly, you should not ask the professor or GTAs to look over your test cases for you, as this defeats the purpose of not having access to Marmoset for testing your test suite. The public tests are in tests.txt, and you will need file1.txt in the same directory to run it.

Step 2: Writing Code
You should get started writing your python code as soon as all of you finish your test suite and it passes both the public and release tests on Marmoset -- do not wait!

You must come up with your own formula for the methods. You will need to derive this formula on your own; discussing it with other students (including Piazza) is considered an Honor Code violation.

You will need the return statement to get your functions to return a value - DO NOT use the print statement for this!
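The difference matters because a test compares the value a function hands back, not the text it displays. A short sketch with two hypothetical functions:

```python
def doubled_with_return(value):
    # Hands the result back to the caller.
    return value * 2


def doubled_with_print(value):
    # Displays the result, but the function itself returns None.
    print(value * 2)


result = doubled_with_return(4)
missing = doubled_with_print(4)

print(result)    # 8
print(missing)   # None
```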


Step 3: Testing Your Code on Your Test Suite
Once you have finished writing your python code, you will first want to test your python code with the test suite you wrote, before testing your python code on Marmoset. Testing your python code at home will give you instant results, while you'll have to wait a bit to test it on Marmoset. Ideally, your test suite is well written, and it's testing for almost everything (or even everything!) the release tests for the python code on Marmoset are checking too. To help you use your test cases, I have included a new driver.py file that will use your tests.txt file and print out whether the tests you wrote passed or failed on the python code you wrote. To use this driver, make sure your project3.py, project4.py, number.py, tests.txt, and driver.py are all in the same directory, and from a terminal in that directory, type:

python driver.py

If you want this driver to stop running after a certain test (so you don't have to scroll through everything when you debug), simply put the word stop on a single line after the last test you want run.


Step 4: Submitting Your Code to Marmoset.
Once your code passes all your tests at home, you're ready to submit to Marmoset. See the instructions at the bottom of this page for Marmoset submission.

In an effort to get students to do their testing at home using their test suite, rather than on Marmoset, you will need to comment out all of your debug print statements from your code before submitting to Marmoset. Do not delete your debugging statements (you may need them later); just comment them out using #. Note that you DO NOT have to comment out your print statements to run your test suite at home using driver.py. Marmoset submission should be a last and rare step!


Sample Input and Output

Your python code will be tested on Marmoset in the same manner as your test cases. Ten sample tests have been provided in the tests.txt above: these tests are the Public Tests and Release Tests on Marmoset.

Release tests for the coding portion will be made available the Thursday morning before projects are due. Coding projects are used as both learning and assessment tools, and in order to get students to test their own code, and come up with their own solutions in a timely manner, the professor or TAs will not answer questions about release tests until they are publicly posted. However, we are very happy to answer questions about why your code doesn't pass *your* tests, at any time. Please get started on projects early, and write and use a high-quality test suite of your own, so you pass most or all of the release tests before the Thursday the project is due.

Project Hints and Guidelines

Remember, when designing your own test cases, try to do so in a thoughtful and structured approach, as we have done in class. What is the smallest possible image you could call your functions on? What is the next smallest one? What are all the corner cases?

Other hints and guidelines:

Project Grading

The project will be worth 70 points:

Project Submission

Do not submit the same files multiple times to try to get Marmoset to grade them faster; it does not work that way. You will only slow down the results for yourself and the rest of the class. All requests are handled in order.

There are two due dates for the two different parts of the project (test cases and code). Once a due date passes, you cannot resubmit that part of the assignment. However, you may submit either part, or both parts, before the first due date if you are done early.

DUE DATE 1: Friday 2/8/2013 at 4:55pm: Test cases due. Submit ONLY your DriverJava.java file (see above for how to convert your tests.txt file into this DriverJava.java file) on Marmoset following the link to CS112-4T. (The T stands for TEST)

DUE DATE 2: Friday 5/1/2015 at 4:55pm: Code is due. In order to have your code work with Marmoset, you must also submit two files code.py (new) and SystemCall.java. Again, you do not need to know how these files work (or even open them); just make sure not to change them, and submit them with your number.py. Normally, Marmoset is set up to work with Java files, which is why we have this workaround. Submit ONLY your project3.py, project4.py, number.py, banner.py, project5.py, code.py and SystemCall.java files on Marmoset following the link to CS112-6C. (the C stands for CODE). Find the link on Marmoset to submit Project 6. Once you pass all the public tests, use your tokens wisely to start examining the release tests. Do not change the name of the files.

You may make as many submissions to Marmoset as you like before the due date, but we will only grade the highest score. Remember to read and adhere to all of the information regarding projects and their submission and grading on the course syllabus. Remember that Marmoset can be slow due to heavy load, and under no circumstances will project due dates be extended because of this: get started early.

Allowable resources: Class textbook, Python standard documentation, Lecture & Lab Instructors. You may not look at or share other students' code in any manner. You may not look at or share test cases with other students. You may NOT work together or talk to other people (including outside sources besides the professor and GTA/UTAs) about the project. All work must be your own.