Program 3

CS 310
Program 3
Nordstrom's sections only

due:
Sections 002 and 003 (Monday-Wednesday sections): May 1, 2017
Sectio 004 (Tuesday-Thursday section): May 2, 2017

The program

In this program you will create a direct access file on a simulated disk. A number of sectors will be allocated for the file and records will be written to the sectors using hashing. The sectors will be buckets, that is the hash function will give numbers which identify sectors, not individual records. When buckets fill, overflow buckets (sectors) will be allocated to hold the overflowing records.

The Disk class

You will create a Disk class. The disk used for this program will then be an instance of this class. A disk will be a two dimensional array of char, i.e. an array of char arrays. You should view this as an array of sectors where each sector is an array of characters. The number of sectors in a disk and the size of the sectors is arbitrary, but a default size is 10,000 sectors on a (rather small) disk with 512 characters in each sector. The class definition for Disk is:

public class Disk { private int sectorCount; // sectors on the disk private int sectorSize; // characters in a sector private char[][] store; // all disk data is stored here public Disk() // for default sectorCount and sectorSize {} public Disk(int sectorCount, int sectorSize) {} public void readSector(int sectorNumber, char[] buffer) // sector to buffer {} public void writeSector(int sectorNumber, char[] buffer) // buffer to sector {} public int getSectorCount() { return sectorCount; } public int getSectorSize() { return sectorSize; } }

sectorCount is the number of sectors in the disk and sectorSize is the number of characters in a sector. char[][] store is a reference to the 2-dimensional array which must be allocated by the constructor.

Reading and writing to the disk

You can only read and write to the disk a sector at a time. A disk buffer which is a character array of the same size as a sector must be available. The writeSector method will copy the contents of its second parameter (which should be your disk buffer) to the sector whose number is the first argument. The previous contents of the sector will be overwritten. The readSector method will copy the contents of the sector whose number is the first parameter to the character array which is the second parameter. All access to the disk will be through these two methods. (Note: The disk buffer you will use will be part of the DirectFile class described below.)

Direct access files

A direct access file stores records in such a way that a record can be read or written to the file without reading or writing the entire file. Each record must contain a unique key field. The value of this key field (the "key") will be used to determine where the record is stored on the disk. The record can then be read given only the key. The file is implemented as a hash table using sectors for buckets. The class definition for DirectFile is:

public class DirectFile { public DirectFile(Disk disk, int recordSize, int keySize, int firstAllocated, int bucketsAllocated) {} public boolean insertRecord(char[] record) {} public boolean findRecord(char[] record) {} private int hash(char[] key) {} private Disk disk; // disk on which the file will be written private char[] buffer; // disk buffer private int recordSize; // in characters private int keySize; // in characters private int recordsPerSector; private int firstAllocated; // sector number private int bucketsAllocated; // buckets (i.e. sectors) originally allocated private int firstOverflow; // sector number private int overflowBuckets; // count of overflow buckets in use }

To insert a record (insertRecord method) the hash method takes the key for its parameter and returns a hash value in the range 0 ... (bucketsAllocated - 1). The record is then placed in the sector corresponding to the hash value. This is done by reading the sector from disk into the disk buffer (the buffer field of the directFile), writing the record into the first available position in the buffer, and then writing the buffer back to disk.

If the sector is full, the record is placed in the first overflow sector which is not full. Overflow sectors are allocated when needed and records are stored in them in the order in which the insertions occur. To allocate an overflow sector just increment the overflowBuckets field. If it is the first bucket allocated you must also assign the sector number to the firstOverflow field.

If a record already exists in the file with the given key no insertion will be done but insertRecord will return false. If an insertion is done true will be returned.

To read a record (findRecord method) given its key, hash the key, read the corresponding sector, and copy the record from its place in the disk buffer into the (char[]) parameter of findRecord and true is returned. If the record is not found in this sector, but the sector is not full, the record is not in the table and false is returned. If the record is not found in this sector and the sector is full you must search the overflow sectors.

Note that the parameter passed to findRecord is a reference to a character array which can hold an entire record. findRecord will only use the first part of this array (the key) to do its search. It will then copy the entire record (if found) into the array when it returns.

For reasons of mercy, no delete operation will be implemented. Thus you will not need tombstones in this program.

The test program

You will create an instance of Disk and an instance of DirectFile which uses that disk. The disk will have 2000 sectors, each having 512 characters. The file will be created initially allocating 600 (non-overflow) sectors and will begin at sector 1024. When needed the first overflow sector will follow the last non-overflow sector and when needed more overflow sectors will be created in the following sectors.

The records you store will be mountain records. A record will consist of 60 characters and will have three fields:

name: 27 characters (the key field)
country: 27 characters
altitude (in feet): 6 characters

The characters will be either normal printing characters or nulls ('\000') which will fill out the remainder of the fields. If a name or country is too long to fit in the 27 character field it will be truncated to 27 characters. Altitudes will be stored as character strings, not ints.

Your program will open a (real) file provided for you containing records in the format:


Shasta, Mount#United States#14162

with each record on its own line. The program will reformat each line from the input into the 60-character record format and insert it into your direct file. The program will then go into a loop displaying a menu offering the choices:

Insert new record
Find record
Quit

and accept the user's choice. If the user chooses "Insert new record" the program will prompt for the three fields, format a 60-character record with the data, And insert the record. The program will display whether the insertion was successful or not. If the user chooses "Find record" the program will prompt for the mountain's name (the key) and will search for a record with that key. If a record is found its fields will be displayed in the format:


Shasta, Mount, United States, altitude: 14162 ft.

If no record is found a message to that effect will be displayed. The program will run until the user chooses "Quit."

To hand in

You will hand in hardcopy of your source code, a sample terminal session, and a readme file stating which parts of the assignment (if any) are incomplete. For the terminal session you will use "mountain data" in the format described above. A file containing the initial data to store on your disk (a large number of mountains) will be made available for download from the class website. A list of commands to use for your terminal session will also be posted there.

CS 310 Program 3 Nordstrom's sections only

due: Sections 002 and 003 (Monday-Wednesday sections): May 1, 2017 Sectio 004 (Tuesday-Thursday section): May 2, 2017