Java Magic. Part 1: java.net.URL

Recently, I found a very interesting Java code snippet on reddit (slightly modified):

HashSet<URL> set = new HashSet<>();
set.add(new URL("http://google.com"));
System.out.println(set.contains(new URL("http://google.com")));
Thread.sleep(60000);
System.out.println(set.contains(new URL("http://google.com")));

What do you think the output for lines 3 and 5 will be?

Definitely not true, true, or the question wouldn’t have been asked. Think for two minutes.

Ok. In most cases it will be true, false, because you have an internet connection (how else could you be reading this?). Unplug your network cable or turn off Wi-Fi, and you’ll get true, true.

The reason lies in the implementation of the hashCode() and equals() methods of the URL class.

Let’s see how hashCode is calculated:

public synchronized int hashCode() {
  if (hashCode != -1)
    return hashCode;
  hashCode = handler.hashCode(this);
  return hashCode;
}

We can see that hashCode is an instance variable that is calculated once and then cached. Makes sense, since URL is immutable. What is handler? It’s an instance of one of the URLStreamHandler subclasses, chosen by protocol (file, http, ftp), which provides the helper hashCode implementation. Just look at the URL.hashCode() javadoc:

The hash code is based upon all the URL components relevant for URL comparison. As such, this operation is a blocking operation.

Stop! BLOCKING OPERATION?!

- Sorry, I couldn’t check email yesterday due to hashCode calculation.

or even better

- No, mom, I can’t watch porn, it’s hashCode, you know.

Ok, let it be blocking. The other exciting part is that the handler resolves the host’s IP address to calculate the hashCode. Tries to resolve, to be honest: if it cannot, it calculates the hashCode from the host name, which is google.com in our example. Shit happens when the IP is dynamic, or when the host sits behind a load balancer that hands out different IPs. In that case we get different hashCodes for one host name, and end up with two (or even more) instances in the HashSet. Not good at all. By the way, hashCode and equals performance is terrible, because URLStreamHandler does a DNS lookup for the host. But that’s another topic.
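To see the effect without touching the network, here is a toy model of it. HostKey and FAKE_DNS are names I made up for illustration, not JDK classes, and the model resolves the host in the constructor, whereas URL does it lazily inside hashCode(). Treat it as a sketch of the idea only:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: HostKey and FAKE_DNS are hypothetical stand-ins,
// not JDK classes. No network access is involved.
class ResolvingHashDemo {

    // Pretend DNS table: host name -> current IP address.
    static final Map<String, String> FAKE_DNS = new HashMap<>();

    static final class HostKey {
        // Resolved address, or the host name itself if resolution fails.
        private final String resolved;

        HostKey(String host) {
            this.resolved = FAKE_DNS.getOrDefault(host, host);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof HostKey && ((HostKey) o).resolved.equals(resolved);
        }

        @Override
        public int hashCode() {
            return resolved.hashCode();
        }
    }

    // Returns the two contains() results: before and after the address change.
    static boolean[] run() {
        FAKE_DNS.put("example.test", "10.0.0.1");
        Set<HostKey> set = new HashSet<>();
        set.add(new HostKey("example.test"));
        boolean before = set.contains(new HostKey("example.test"));

        FAKE_DNS.put("example.test", "10.0.0.2"); // the "DNS answer" changes
        boolean after = set.contains(new HostKey("example.test"));
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + ", " + r[1]); // prints "true, false"
    }
}
```

Same host name, different resolved address, different hash: the set no longer finds the key it already holds.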

How to avoid this?
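One option that should work in most cases, assuming you only need the address as a key, is to store java.net.URI instead. Its equals() and hashCode() compare the components of the address and do not resolve the host. A minimal sketch (UriSetDemo is just a name for this example):

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Sketch: the snippet from the top of the post, with URI instead of URL.
class UriSetDemo {

    // Returns the results of the two contains() calls.
    static boolean[] run() {
        Set<URI> set = new HashSet<>();
        set.add(URI.create("http://google.com"));
        boolean first = set.contains(URI.create("http://google.com"));
        // No sleep needed: nothing here depends on DNS.
        boolean second = set.contains(URI.create("http://google.com"));
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + ", " + r[1]); // prints "true, true"
    }
}
```

When you actually need to open a connection, uri.toURL() gives you a URL at that point.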

Finally, I’m pretty sure the java.net.URL class has a lot of useful applications. Just not this one.

mishadoff 11 October 2012