What is .NET’s Object.GetHashCode() Used For?

Here is a great question from a visitor.

“What is the exact use of GetHashCode of an object in .net? Does it have any relation with garbage collection?”

Let’s answer the second question first. No, it has nothing to do with garbage collection.

According to the Microsoft documentation, “The GetHashCode method is suitable for use in hashing algorithms and data structures such as a hash table.”

I suppose that sentence isn’t very helpful if you don’t know what a hashing algorithm is or why you would use one.  So first let’s back up and examine those topics.

Hashing algorithms take data of any size and reduce it to a small, fixed-size “number” that can be used to represent that data.  While the result is not THE data, it is statistically unique enough that we can use the number in place of the data for specific purposes.

One place where hashing is used consistently is in storing passwords.  Not every system does this, but most secure systems do: instead of storing the password itself, we compute its hash and store that.  When the user logs in, we hash the password they give us and compare the result to the hash that is stored in the database.
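The store-and-compare flow described above can be sketched roughly like this. The class and method names are hypothetical, and plain SHA-256 is used only to keep the example short; production systems normally use a salted, deliberately slow algorithm such as PBKDF2.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PasswordHashSketch
{
    // Hypothetical helper: reduce a password to a fixed-size hash,
    // encoded as a string so it fits in a text column.
    // Assumption: unsalted SHA-256, for illustration only.
    public static string HashPassword(string password)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
            return Convert.ToBase64String(digest);
        }
    }

    // Hypothetical helper: hash the login attempt and compare it
    // to the hash that was stored earlier.
    public static bool Verify(string attempt, string storedHash)
    {
        return HashPassword(attempt) == storedHash;
    }

    public static void Main()
    {
        string stored = HashPassword("correct horse");      // what the database would hold
        Console.WriteLine(Verify("correct horse", stored)); // True
        Console.WriteLine(Verify("wrong guess", stored));   // False
    }
}
```

Notice that the original password is never stored or compared directly, only its hash.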

Another place hashing is used is when we want a collection whose items can be retrieved quickly, without having to create a huge array with a slot for every possible item that might be added.

For example, say we want a collection that holds only unique items.  For this we need a hashing algorithm that gives us statistically unique “numbers” for any element it is given.  Notice I say statistically unique.  This is because going from a large value to a smaller value always means there is a possibility that two different sets of data will produce the same hash value.  That’s the nature of hashing.

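But, assuming that all of our hash values ARE unique, we can compute the hash of the object, use it to find the location where that object would live in our array, and see if there is something already there.

This is, in fact, the basic idea behind most HashSet classes.  Because real hash values can collide, actual implementations also compare the items with Equals() before deciding that two of them are the same.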
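Here is a minimal sketch of that idea. TinyStringSet is a hypothetical class written for this post, not the real HashSet source, and it assumes a fixed number of buckets with a small list in each one to cope with collisions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified set of strings; not the real HashSet<T> source.
public class TinyStringSet
{
    private readonly List<string>[] buckets;

    public TinyStringSet(int bucketCount)
    {
        buckets = new List<string>[bucketCount];
    }

    // Map the item's hash code onto a position in the fixed-size array.
    private int IndexFor(string item)
    {
        // Mask off the sign bit so the index is never negative.
        return (item.GetHashCode() & 0x7FFFFFFF) % buckets.Length;
    }

    // Returns true if the item was added, false if it was already present.
    public bool Add(string item)
    {
        int index = IndexFor(item);
        if (buckets[index] == null)
        {
            buckets[index] = new List<string>();
        }
        // Different items can land in the same bucket, so confirm with Equals.
        if (buckets[index].Contains(item))
        {
            return false;
        }
        buckets[index].Add(item);
        return true;
    }

    public bool Contains(string item)
    {
        List<string> bucket = buckets[IndexFor(item)];
        return bucket != null && bucket.Contains(item);
    }
}

public static class TinyStringSetDemo
{
    public static void Main()
    {
        var set = new TinyStringSet(16);
        Console.WriteLine(set.Add("apple"));     // True
        Console.WriteLine(set.Add("apple"));     // False, already stored
        Console.WriteLine(set.Contains("pear")); // False
    }
}
```

The hash code only narrows the search down to one bucket; the final answer still comes from comparing the items themselves.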

One way of producing the hash value is to walk through the bytes or characters that make up the object in a loop, accumulating them into a number (in this case an integer): shift the running value left by one bit, then add the next character.

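// Assumes ToString() returns the characters that define this object's value.
public override int GetHashCode()
{
    int hash = 0;
    unchecked // let the arithmetic wrap around instead of throwing on overflow
    {
        foreach (char c in this.ToString())
        {
            hash = hash << 1; // shift the running value left by one bit
            hash += c;        // then add the next character
        }
    }
    return hash;
}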
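To see how a method like this gets used, here is a small sketch of a class that overrides both Equals() and GetHashCode() and is then stored in a HashSet.  The Tag class and its Name property are made up for illustration; the hash is the same shift-and-add idea as above, applied to the Name characters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: a tag identified only by its name.
public sealed class Tag
{
    public string Name { get; }

    public Tag(string name)
    {
        Name = name;
    }

    public override bool Equals(object obj)
    {
        Tag other = obj as Tag;
        return other != null && Name == other.Name;
    }

    // Shift-and-add hash over the characters of Name.
    public override int GetHashCode()
    {
        int hash = 0;
        unchecked
        {
            foreach (char c in Name)
            {
                hash = (hash << 1) + c;
            }
        }
        return hash;
    }
}

public static class TagDemo
{
    public static void Main()
    {
        var tags = new HashSet<Tag>();
        Console.WriteLine(tags.Add(new Tag("aa"))); // True
        Console.WriteLine(tags.Add(new Tag("aa"))); // False: same hash and Equals agrees
        // "aa" and "b_" both hash to 291 with this function...
        Console.WriteLine(new Tag("aa").GetHashCode() == new Tag("b_").GetHashCode()); // True
        // ...but Equals tells them apart, so both are stored.
        Console.WriteLine(tags.Add(new Tag("b_"))); // True
    }
}
```

This is also a small demonstration of why the hash is only statistically unique: two different names can produce the same integer, and the collection still has to fall back on Equals() to tell them apart.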

One thing I mentioned above probably needs a bit of clarifying.  I’ve placed the word “number” in quotes for much of this discussion because when we hash, we really aren’t too concerned with whether the result actually ends up being a number.  For the GetHashCode() method, yes, it is important, because it has to return an integer.  But if you wanted to hash a password, you’d want something much larger than an integer to reduce the chance that any two passwords produce the same hash.  In that case, you’d probably keep the hash as an array of bytes, which you could encode as a string to store in your database.
