X11 Display Forwarding Fails After Some Time
Does your ssh session log in to a remote system and forward the display back to your local machine just fine, but after a while stop being able to start new X11 applications? Specifically, X11 display forwarding fails after some time, leaving new X11 applications with no usable display.
That’s what happened to me recently, and through some debugging I found the solution, which I’ll get to in a minute. In keeping with the purpose of this blog, I’m writing this up because it took a bit of effort to solve, and I found virtually no useful information about it in web searches. Maybe this post will help someone else with a similar problem.
Symptoms of the Problem
My local system is a Mac running the latest Mac OS X (Lion, 10.7.2). I upgraded from 10.6 (Snow Leopard) a month ago, which may have been one source of the problem, but I can’t be certain the two are related. The remote system is Ubuntu 8.04. The network is hardwired gigabit ethernet. Neither system runs a firewall. Other client systems, such as Ubuntu 11.10, don’t have this problem, so it appears to be Mac-specific.
The symptoms are as follows:
- Log in from the Mac to the Ubuntu server using “ssh -X <remote system>”. This works fine and you get a new shell session on the remote system. Due to X11 forwarding, the Mac starts up its X11 server in response.
- From the ssh login shell, start a new X11 application, such as “xterm”. The application puts a window on the Mac’s screen, and it all works fine.
- Now wait a while… I had thought a couple of hours, but it may be less than that. (It may be 20 minutes, see below.)
- Try to open another xterm from the same ssh login shell. It fails with the message “xterm Xt error: Can’t open display: localhost:10.0”
Debugging Clues
After spending a few hours on this, the debug step that led me to the solution was to invoke ssh from the Mac with the highest level of debug logging: “ssh -vvv -X <remote system>”. Follow the same steps, and at the moment it fails to start a new xterm, ssh will report:
Rejected X11 connection after ForwardX11Timeout expired
It’s an X11 Forwarding Timeout
So there is apparently a timeout for forwarding the X11 display over SSH. Web searches for “ForwardX11Timeout” don’t help much; there is very little information about it. Neither the ssh_config nor the ssh man page on Mac OS X lists ForwardX11Timeout as a parameter. The few postings I found about this parameter are in the context of the Cygwin X server, but the symptoms reported there are the same as what I saw. In that context, the default value of the timeout is 20 minutes. It apparently comes into play only for untrusted connections, but I don’t have a full trust authority system set up, since I only do this locally.
Override the Timeout Default (and avoid a Mac OS X bug)
Even though the Mac’s man pages don’t list ForwardX11Timeout as a parameter, adding it to /etc/ssh_config does not cause an unrecognized option error, so it’s a legal option. Of course I wanted the longest possible timeout, so I started with very long times, like 10000 weeks. This caused the X11 server on the Mac to open and then immediately crash, sending a report to Apple. I tried a few other settings. For example, 0 made the timeout occur immediately (rather than never, which is what I would have expected), and 10s caused new xterm invocations to fail after about 10 seconds. So this did appear to be the right parameter to change.
Finally, through a binary search, I found that a setting of 596 hours worked: it didn’t crash the Mac’s X11 server, and new xterms no longer timed out as described above. Why 596 hours? Converted to milliseconds, 596 hours is 2,145,600,000, just under 2^31 (2,147,483,648), while 597 hours is 2,149,200,000, just over it. So there is some kind of signed 32-bit integer overflow problem somewhere along the line.
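For anyone who wants to check that arithmetic, here is a short Python sketch. It only verifies the numbers; it assumes (as I’m guessing above) that the timeout ends up stored as milliseconds in a signed 32-bit integer somewhere, which I haven’t confirmed in any source code. The helper names are my own and don’t correspond to anything in ssh or the X11 server.

```python
# Sanity check of the 596h vs. 597h boundary.
# Assumption (unconfirmed): the timeout is held in milliseconds in a
# signed 32-bit integer somewhere between ssh and the X11 server.

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def hours_to_ms(hours):
    """Convert whole hours to milliseconds."""
    return hours * 60 * 60 * 1000


def fits_in_int32(hours):
    """True if the timeout in milliseconds fits in a signed 32-bit integer."""
    return hours_to_ms(hours) <= INT32_MAX


# Largest whole number of hours that stays under the limit.
max_safe_hours = INT32_MAX // hours_to_ms(1)

for h in (596, 597):
    print(h, "hours =", hours_to_ms(h), "ms, fits:", fits_in_int32(h))
print("largest safe whole-hour value:", max_safe_hours)
```

The result lines up with what the binary search turned up: 596 hours fits and 597 hours does not, which is consistent with the overflow guess but doesn’t prove where the overflow happens.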
Solution Recap
The only change that needs to be made is to add the following line to the Mac client’s /etc/ssh_config:
ForwardX11Timeout 596h
It’s possible that using “ssh -Y <remote system>” may work as well, as it may not trigger the untrusted auth timeout. I haven’t tested this.