Tuesday, August 9, 2011


Diagnosing RMI Exception: java.rmi.ConnectException: Connection refused to host:

I have a service that runs on Linux under JBoss. This service uses JMX (over RMI) to talk to a Windows box running a Java service.

Everything worked fine until both the Linux and Windows boxes were moved to a different subnet. Then the connection started failing with the following exception:

Caused by: java.rmi.ConnectException: Connection refused to host:; nested exception is:
java.net.ConnectException: Connection refused
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110)
at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2312)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:277)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:248)
at com.zillow.core.jmx.ZillowJMXConnectorFactory.connect(ZillowJMXConnectorFactory.java:127)
at com.zillow.core.jmx.ZillowJMXConnectorFactory.connect(ZillowJMXConnectorFactory.java:53)
at com.zillow.bcpserver.BCPServerProxy.afterPropertiesSet(BCPServerProxy.java:123)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1369)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1335)
... 142 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:519)
at java.net.Socket.connect(Socket.java:469)
at java.net.Socket.<init>(Socket.java:366)
at java.net.Socket.<init>(Socket.java:180)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595)
... 154 more

I searched for this exception, and the only explanation I found was this: if the local box (i.e., Linux) does not have a correct IP address for its own hostname, it sends that wrong address as the "ContactMe" endpoint to the remote destination, and the connection back fails.

For example, the following links explained that issue:


However, in my case, that turned out not to be the problem. My Linux boxes (even after the move) had the correct hostname, and the `hostname` command was giving back the correct hostname (and not or localhost).
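To see which address a JVM on a given box resolves for itself, a small check like the one below can help. It is only a sketch: the class name LocalAddressCheck is made up for illustration. By default RMI advertises the address the JVM resolves for the local host, so a loopback result here would point to the hostname problem described above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not from the original post): prints what this JVM
// resolves as its own host name and address.
final class LocalAddressCheck {

    // True when the address is a loopback address such as 127.0.0.1 or ::1.
    static boolean isLoopback(InetAddress address) {
        return address.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("hostname : " + local.getHostName());
        System.out.println("address  : " + local.getHostAddress());
        System.out.println("loopback : " + isLoopback(local));
    }
}
```

Running it on both machines (the RMI client and the RMI server) shows which address each side is likely to hand out.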

At this point, there was no more information available through the search engines to diagnose the problem.

So, I decided to debug it myself. I first restarted the service, and used tcpdump to capture the network traffic on the Linux box.

Here, .175 is the Linux box, and .45 is the Windows box.

The following is the packet disassembly of the JRMI/ReturnData response being sent by the Windows box to the Linux box.

As you can see, the Windows server is sending back "" as the CallMe endpoint to the Linux box. The Linux box tries to connect to port 1099 on and fails.
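Once the advertised endpoint is known from the capture, the failing connection can be reproduced outside of RMI with a plain TCP probe. The following is a minimal sketch, not part of the original investigation; the class name EndpointProbe and the default host and port in main are assumptions for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: tries a plain TCP connect to host:port, which is the
// same step that fails inside TCPEndpoint.newSocket in the stack trace.
final class EndpointProbe {

    // Returns true if a TCP connection could be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers ConnectException (refused), timeouts and unknown hosts.
            System.err.println(host + ":" + port + " -> " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the endpoint seen in the capture instead.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1099;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 2000));
    }
}
```

If the probe is refused for the advertised address but succeeds for the server's real address, the stub is carrying a stale or wrong endpoint.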

Since the machines had been moved to a different network, it is possible that the Java service had lost its IP address and network registration settings.

So, I restarted the Java service on the Windows box. And voilà, that fixed the problem.
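A restart was enough in this case. A common alternative, which was not tried here, is to set the standard java.rmi.server.hostname system property on the server so that the address placed in RMI stubs is explicit rather than resolved by the JVM. The sketch below assumes a made-up class name (RmiHostnamePin) and a placeholder host name. The property has to be set before any remote object or JMX connector server is exported.

```java
// Hypothetical sketch: pin the host name that RMI puts into exported stubs.
final class RmiHostnamePin {

    static final String PROPERTY = "java.rmi.server.hostname";

    // Sets the property and returns the value now in effect.
    static String pin(String host) {
        System.setProperty(PROPERTY, host);
        return System.getProperty(PROPERTY);
    }

    public static void main(String[] args) {
        // Placeholder host name; use an address the clients can reach.
        String host = args.length > 0 ? args[0] : "server.example.com";
        System.out.println(PROPERTY + " = " + pin(host));
        // Export remote objects or start the JMX connector server after this.
    }
}
```

The same setting can be passed on the command line as -Djava.rmi.server.hostname=<address> when starting the service, which avoids any code change.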