next up previous
Next: Assignments Up: Worksheets Previous: Worksheet 4

Worksheet 5

The aim of this workshop is to investigate the function and operating characteristics of hash tables.

First of all, make a new special directory for your work for this workshop, for example

  workshop5/
This will help to avoid name conflicts.

Hash table demo
Now follow these instructions:
  1. copy the file HashTableDemo.java to your new directory
  2. compile and run the program
  3. try to understand what each line of code is doing.
You will see in the lines
LookupTable t = new QuadraticProbingTable();
//LookupTable t = new DoubleHashingTable();
that the code initially uses a QuadraticProbingTable. Now change these lines to read
//LookupTable t = new QuadraticProbingTable();
LookupTable t = new DoubleHashingTable();
so that the program uses a DoubleHashingTable. The behaviour should, of course, be identical. Think about why this should be so.

Probing statistics

The purpose of this part is to investigate the efficiency of various different probing methods for a probing hash table.

Remember that, using the open addressing method of collision resolution, a sequence of ``probes'' has to be made to find a vacant space to insert the item to be stored. Ideally the number of probes should be as small as possible (for a given amount of space).

We mentioned in lectures three probing methods

The aim is to investigate, experimentally, how these compare in a real case. Remember that the performance will also depend on the loadfactor.

To investigate these issues, we need to make a modification to the code in the DataStructures package. The reason is that the code developed there provides no way of counting the number of probes, because this would normally not be the concern of a client.

This means that you will need to copy a few more files to your working directory, to provide this facility. First of all, copy the file ProbingHashTable.java to your workshop directory. This is the same as the ProbingHashTable.java file from the DataStructures package, except that it provides an additional public method

    public int nrProbes();
which returns the number of probes made by the most recent call to the protected findIndex method of the ProbingHashTable class. Recall that the findIndex method is called whenever an item is added to the hash table, or when it is removed, or when a key is looked up in the table, or when a query is made about whether a given key is present.

Now copy the three files

LinearProbingTable.java
QuadraticProbingTable.java
DoubleHashingTable.java
to your workshop directory. There is nothing really new here, except that the linear probing method has also been implemented for comparison. The specific implementations of the findIndex method now also count the numbers of probes. Note that the number of probes is initialised to 1. So one probe means that the item could be inserted immediately.

Finally copy the file HashStatsDemo.java to your workshop directory, together with the data file words. Now compile and run this program. This will show how many probes were needed for insertion, and then how many probes would be needed on average for looking up each word in the table--why are they different?

Experiment with this program by modifying the code. For example, you could concurrently explore the statistics for all three probing methods. You could try the larger data file words.big. You could experiment with different load factors (by using specific arguments to the constructors--there is now a constructor which just requires the load factor to be given). You could try different initial sizes for the hash table, etc. Try to reach some conclusions about the relative merits of these different methods from various points of view.


next up previous
Next: Assignments Up: Worksheets Previous: Worksheet 4
Peter Williams 2005-06-07