Suppose, for example, that the items are stored in an array. Then it would be sufficient to be able to calculate the array index at which the item would be located if it were present. Adding a fixed offset would then give the actual address. What we are looking for is a function which associates an integer with every data item of interest. This integer is called the hash code of the item, and the process of forming it is called hashing. The hash code determines the proper place of the data item in the underlying array, which is now called the hash table.
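As a minimal sketch of this idea, the following Python fragment computes a hash code for a string and turns it into an array index. The polynomial scheme with multiplier 31, the names `hash_code` and `table_index`, and the table size of 11 are all assumptions made for illustration. So is the reduction modulo the table size, which is one common way of keeping the index inside the array.

```python
TABLE_SIZE = 11  # hypothetical table size, chosen only for illustration


def hash_code(item: str) -> int:
    """Associate an integer with a string.

    Assumed scheme: read the characters as digits of a base-31 number.
    """
    code = 0
    for ch in item:
        code = code * 31 + ord(ch)
    return code


def table_index(item: str, table_size: int = TABLE_SIZE) -> int:
    """Reduce the hash code to a valid index into the hash table."""
    return hash_code(item) % table_size


# The index is computed from the item alone, so a lookup never has to
# search the array: it goes straight to the slot where the item would be.
hash_table = [None] * TABLE_SIZE
hash_table[table_index("cat")] = "cat"
found = hash_table[table_index("cat")] == "cat"
```

Here `found` is `True` because the same computation yields the same slot for both the insertion and the lookup.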
Different items should hash to different integers. Although theoretically possible given unlimited storage space, this aim is generally unrealistic in practice. For example, suppose we have to store strings of up to 16 characters in length. Even assuming that only the lower case characters a..z can occur in the string, there are still more than $26^{16} \approx 4.4 \times 10^{22}$ possible strings, far too many for each to have its own element in an array.
Nonetheless there is a useful idea here, provided we are willing to abandon the requirement that different data items always have different hash codes and map to different elements of the array. It might be sufficient if the most commonly occurring data items usually map to different indexes, provided we can devise a suitable way of dealing with the (hopefully) exceptional cases. The problem of dealing with these cases, where two objects to be inserted in the hash table initially map to the same array index, is called the problem of collision resolution. Note that we aim to choose a ``good'' hash function, one that minimises the cost of overcoming this problem.
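To make the notion of a collision concrete, the sketch below uses an assumed polynomial hash (multiplier 31) and a hypothetical table of 11 slots, under which the strings "dog" and "emu" map to the same index. It then resolves the collision by separate chaining, where each slot holds a list of items. Chaining is only one common strategy, picked here for illustration. The class name `ChainedHashTable` and its methods are likewise hypothetical.

```python
TABLE_SIZE = 11  # hypothetical; deliberately small so collisions are easy to see


def table_index(item: str, table_size: int = TABLE_SIZE) -> int:
    """Assumed hash function: base-31 polynomial code, reduced modulo the table size."""
    code = 0
    for ch in item:
        code = code * 31 + ord(ch)
    return code % table_size


class ChainedHashTable:
    """Hash table that resolves collisions by keeping a list (chain) per slot."""

    def __init__(self, size: int = TABLE_SIZE) -> None:
        self.buckets = [[] for _ in range(size)]

    def insert(self, item: str) -> None:
        bucket = self.buckets[table_index(item, len(self.buckets))]
        if item not in bucket:  # ignore duplicates
            bucket.append(item)

    def contains(self, item: str) -> bool:
        # Only the one chain selected by the hash code is searched.
        return item in self.buckets[table_index(item, len(self.buckets))]


table = ChainedHashTable()
for word in ("cat", "dog", "emu"):
    table.insert(word)

# "dog" and "emu" collide, yet both can still be found.
collided = table_index("dog") == table_index("emu")
```

The cost of a lookup grows with the length of the chain it has to scan, which is why a hash function that spreads the common items over different indexes keeps collision resolution cheap.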