Saturday, 17 January 2015

ssh hangs at SSH2_MSG_KEX_DH_GEX_GROUP trying to connect to servers behind Cisco firewall

Today I was unable to ssh to some CentOS servers behind a Cisco firewall. I was connected using AnyConnect. When I ran ssh with -v, the debug output showed the client hanging while expecting SSH2_MSG_KEX_DH_GEX_GROUP.

A search on the Internet turned up this article: Natty Narwhal: Problems connecting to servers behind (Cisco) firewalls using ssh. Shortening the Cipher and MAC lists as suggested solved the problem. Apparently the full lists overflow some packet size limit somewhere along the way. I'll leave it to the experts to work out what it is about Cisco and ssh.
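For illustration, here is a minimal sketch of what shortening the lists can look like on the command line. The wrapper name ssh_short and the particular algorithms are assumptions, not the lists from the linked article; choose ones that both your client and the servers support.

```shell
#!/bin/sh
# Sketch: run ssh while offering only a short cipher and MAC list.
# ssh_short is a hypothetical name; the algorithm choices are assumptions.
ssh_short() {
    ssh -o Ciphers=aes128-ctr,aes256-ctr -o MACs=hmac-sha1 "$@"
}
```

Calling ssh_short -v user@server.example.com (a placeholder host) should then show whether the key exchange gets past the point where it hung. The same two settings can be made permanent with Ciphers and MACs lines under a Host entry in ~/.ssh/config.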

Friday, 16 January 2015

NVIDIA driver, libglx.so and hardware acceleration crashing X server

My users on CentOS complained that Firefox would crash the X server after I updated the package xorg-x11-server-Xorg.

A search returned suggestions to disable hardware acceleration in Firefox. However, one user mentioned that when he reinstalled the NVIDIA driver, the installer warned that libglx.so is not a symbolic link before setting things right.

An investigation showed that the NVIDIA installer replaces the file /usr/lib64/xorg/modules/extensions/libglx.so with a symlink to /usr/lib64/xorg/modules/extensions/libglx.so.X.Y, where X.Y is the NVIDIA package version. So every time the xorg-x11-server-Xorg package is updated or reinstalled, it overwrites the symlink with its own libglx.so, hardware acceleration fails, and the X server crashes.

The same problem for Debian is documented in this forum thread.

I could exclude xorg-x11-server-Xorg from updates in yum, but I'd rather not do that. Since I may not notice when an update of that package happens, I wrote a shell script to restore the symlink if it has been removed, and a cron job to call it periodically. I'm also looking into writing a Puppet stanza to do this.
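A script along these lines could do the job. This is a sketch under stated assumptions rather than the exact script in use: the function name restore_libglx is made up, it assumes only one NVIDIA driver version is installed, and it simply overwrites the Xorg copy of libglx.so.

```shell
#!/bin/sh
# Sketch: restore the NVIDIA libglx.so symlink if a package update replaced it.
# restore_libglx is a hypothetical name; the default directory is the one
# mentioned in the post.
restore_libglx() {
    dir=${1:-/usr/lib64/xorg/modules/extensions}
    link="$dir/libglx.so"
    # Find the versioned NVIDIA library (libglx.so.X.Y).
    # Assumes a single driver version; with several, the last one in sort order wins.
    target=$(ls "$dir"/libglx.so.* 2>/dev/null | sort | tail -n 1)
    [ -n "$target" ] || return 1    # no NVIDIA library found, nothing to restore
    [ -L "$link" ] && return 0      # already a symlink, leave it alone
    # Replace the regular file with a relative symlink to the NVIDIA library
    ln -sf "$(basename "$target")" "$link"
}
```

To run it from cron, the script would end with a call to restore_libglx and be referenced from a root crontab entry; the schedule and install path are up to you. A running X server will most likely need a restart before it picks up the restored library.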

Thursday, 15 January 2015

Squid failing to fetch IPv6 web resources and dns_v4_first

I had a strange symptom on my Chromebook on my home LAN. A particular Internet web page would not work because a jQuery file distributed by cdnjs.cloudflare.com could not be fetched. When I disabled the use of a proxy on my Chromebook, the web page worked. I have a squid caching proxy on my LAN, partly to be able to zap advertisements, so I knew it had something to do with squid.

Maybe it was a bad object in the cache? I knew there was a way to delete cached objects from the command line, and a search showed me that all I had to do was add these lines to /etc/squid/squid.conf:

acl purge method PURGE

http_access deny purge !localhost

and then use the command:

squidclient -m PURGE https://cdnjs.cloudflare.com/blah.js

to remove it. However when I ran

squidclient https://cdnjs.cloudflare.com/blah.js

to fetch it again, I saw that squid was trying to use the IPv6 address of cdnjs.cloudflare.com to get the resource and failing.

My LAN is fully IPv6 enabled: my Linux machines all support the IPv6 stack, I have a BIND server giving out IPv6 addresses, and in fact a lot of the internal traffic, such as Apache, goes over IPv6 transparently. I wish I had an IPv6 broadband connection, but I don't, so I cannot reach IPv6 addresses in the outside world.

So the problem boiled down to: how can I prevent squid from trying to reach IPv6 sites? It turns out that the directive dns_v4_first is intended for this. It makes squid try the IPv4 addresses of a dual-stack host first. Just adding:

dns_v4_first on

to /etc/squid/squid.conf worked and now I can view that Internet web page.
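If you want to script the change, for example across several proxies, a small helper can add the directive only when it is not already there. This is a sketch; the function name enable_dns_v4_first is hypothetical, and it deliberately leaves any existing dns_v4_first line (including an "off" setting) untouched.

```shell
#!/bin/sh
# Sketch: append "dns_v4_first on" to squid.conf unless a dns_v4_first
# line is already present. enable_dns_v4_first is a hypothetical name.
enable_dns_v4_first() {
    conf=${1:-/etc/squid/squid.conf}
    # Any existing dns_v4_first line is left as it is, whatever its value
    grep -q '^dns_v4_first' "$conf" || echo 'dns_v4_first on' >> "$conf"
}
```

After the edit, squid -k parse checks the configuration for errors and squid -k reconfigure makes the running squid re-read it.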

One symptom remained unexplained, though: why didn't the Chrome browser on the host running squid suffer the same problem? I can only surmise that in that case the proxy is specified as an IPv4 address and port, so squid thinks: the request is coming in from an IPv4 host, so I'll forward it to an IPv4 origin. The Chromebook, however, sends the request to the (internal) IPv6 address of squid, so squid thinks it's allowed to forward the request to an IPv6 origin.

Friday, 9 January 2015

Apache won't start on CentOS 5 because self-signed certificate used by mod_nss expired

You are not likely to hit this error with the usual configuration, since mod_ssl is the module normally used. However, mod_nss is used by the Fedora Directory Server (now known as 389 Directory Server) in CentOS for the LDAP administration console (centos-idm-console). It is possibly also used in RHEL.

One day I found that Apache would not start. The error log indicated that the certificate had expired. I searched and searched for how to generate a new certificate and tried various things, but every recipe had too many steps and was too hard.

Then I had an idea. The certificate must have been generated when the package mod_nss was installed. Let's try reinstalling it. First we move the old database directory out of the way:

mv /etc/httpd/alias /etc/httpd/alias.old

Then reinstall mod_nss:

yum reinstall mod_nss

and voila, after a while, a new database with a fresh certificate was generated and I could start Apache.
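To confirm the expiry before reinstalling, or to check the new certificate afterwards, the NSS certutil tool can print the validity period. The sketch below assumes that certutil is installed and that the certificate nickname is Server-Cert; the nickname is an assumption, so check the NSSNickname setting in your nss.conf. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: show the validity period of the mod_nss server certificate.
# The nickname "Server-Cert" is an assumption; see NSSNickname in nss.conf.
show_nss_cert_validity() {
    db=${1:-/etc/httpd/alias}
    nick=${2:-Server-Cert}
    # Print the "Validity" line and the two date lines that follow it
    certutil -L -d "$db" -n "$nick" | grep -A 2 'Validity'
}
```

Run with no arguments, it looks in /etc/httpd/alias, the database directory used above.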