Homework 6 CS 803, due: April 5th



  1. Implement the Normalized Cut algorithm from J. Shi and J. Malik to solve the following segmentation problem.
    Generate a synthetic data set to test your code, as follows. Your data should consist of two classes of 25 points each. Each class should be drawn from a 2-D Gaussian distribution with a randomly chosen mean and variance. Treat each data point as a node in the graph, and use the Euclidean distance between nodes as the edge cost. Plot the data points from each class with a different plot symbol (for example, o and x). After running your segmentation code, visualize the results by coloring each data point according to the segment it was assigned to. Repeat for three and four clusters of points (you may want to constrain the pairwise distances between the clusters so that the segmentation algorithm returns reasonable results). Discuss how you determined the number of clusters.
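    One possible starting point is sketched below in Python with NumPy. It is a minimal sketch under stated assumptions, not a reference solution. The function names (make_clusters, ncut_bipartition) and the parameters min_sep and sigma are hypothetical choices made for illustration. The sketch also assumes that Euclidean distances are converted to edge affinities with a Gaussian kernel, and that the two-way cut is read off the sign of the second-smallest generalized eigenvector of (D - W) y = lambda D y. Other splitting rules, such as searching for the threshold with the lowest Ncut value, are equally valid.

```python
import numpy as np


def make_clusters(k=2, n_per_class=25, min_sep=8.0, seed=0):
    """Draw k isotropic 2-D Gaussian classes with random means and variances.

    min_sep is an assumed constraint that keeps class means apart.
    """
    rng = np.random.default_rng(seed)
    means = []
    while len(means) < k:
        cand = rng.uniform(-10.0, 10.0, size=2)
        # Rejection step: accept a mean only if it is far from all earlier ones.
        if all(np.linalg.norm(cand - m) >= min_sep for m in means):
            means.append(cand)
    stds = rng.uniform(0.5, 1.0, size=k)  # random per-class standard deviation
    X = np.vstack([m + s * rng.standard_normal((n_per_class, 2))
                   for m, s in zip(means, stds)])
    labels = np.repeat(np.arange(k), n_per_class)
    return X, labels


def ncut_bipartition(X, sigma=2.0):
    """Split the points in X into two groups with a normalized-cut relaxation."""
    # Pairwise Euclidean distances between all nodes.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumption: a Gaussian kernel maps distances to affinities.
    W = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    d_isqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2).
    L_sym = np.eye(len(X)) - d_isqrt[:, None] * W * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map back to the generalized eigenvector of (D - W) y = lambda D y.
    y = d_isqrt * vecs[:, 1]
    return (y > 0).astype(int), vals


X, labels = make_clusters(k=2)
pred, vals = ncut_bipartition(X)
agreement = max((pred == labels).mean(), (pred != labels).mean())
print("fraction of points matching the true classes:", agreement)
print("smallest eigenvalues:", vals[:5])
```

    For three or four clusters, the same function can be applied recursively to each segment. The returned eigenvalues may help with the discussion of the number of clusters: one common heuristic is to look for a gap after the first few small eigenvalues. Plotting (for example with matplotlib's scatter and its marker and c arguments) is left out to keep the sketch dependency-free.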
Submit hard copies (paper copies) of your code and the resulting plots, and email me a tar file containing all of your code. Also submit a short report that describes your code and the results. Homework is due on April 5th (late homework will not be accepted).