More than two weeks ago I blogged about my server being down. After multiple emails, phone calls, and even a fax trying to reach the support team, the server is still dead. But at least I know (a little bit) more now.

The Good
I managed to get someone from support on the phone, and he fixed the system at least to the point where I could ssh into it again. I was able to pull a complete backup of the system, including a database dump.
That means that Unmaintained Free Software and all other sites hosted on the server will eventually return; no data will be lost.

The Bad
After I created the backup, I wanted to reinstall the whole system and then restore all services from the backup. As it turned out, the (automatic) reboot and reinstall script they use is apparently broken: I have not been able to reach the server since I initiated the reinstall. This is probably something more serious, as other people seem to be affected, too.

The Ugly
I have not the slightest idea what the hell happened on the server. There was something really, really strange going on. An example:

# ls -l /usr/bin/traceroute
-rw-rw---- 1 mysql mysql 310872 Jun 21 03:21 traceroute

Why the hell is traceroute not executable, and why does it belong to user/group mysql? There are several other anomalies: /usr/share/doc/apt is not a directory, as it is supposed to be, but a Perl script. /usr/bin/id is a directory. Multiple system tools (awk, sed, ...) are not executable, and some of them are directories with strange stuff in them. What gives?
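
For what it's worth, here is a minimal shell sketch of how one could check a list of paths for this kind of anomaly. The function name check_kind and its two kinds ("exec" and "dir") are made up for illustration; the paths are simply the ones mentioned above, and what counts as "expected" is an assumption about a normal Debian-style system.

```shell
#!/bin/sh
# check_kind TARGET EXPECTED: report TARGET if it is not what we expect.
# EXPECTED is "exec" (regular, executable file) or "dir" (directory).
# Hypothetical helper, not part of any standard tool.
check_kind() {
    target=$1
    expected=$2
    case $expected in
        exec)
            if [ ! -f "$target" ] || [ ! -x "$target" ]; then
                echo "ANOMALY: $target is not an executable regular file"
                return 1
            fi
            ;;
        dir)
            if [ ! -d "$target" ]; then
                echo "ANOMALY: $target is not a directory"
                return 1
            fi
            ;;
        *)
            echo "unknown kind: $expected" >&2
            return 2
            ;;
    esac
    return 0
}

# Paths taken from the post; extend the list as needed.
while read -r target expected; do
    check_kind "$target" "$expected" || true
done <<EOF
/usr/bin/traceroute exec
/usr/bin/id exec
/usr/share/doc/apt dir
EOF
```

This only flags wrong file types and missing execute bits. Ownership (like the mysql user/group above) would need a separate check, for example by looking at the `ls -l` output.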

One possible explanation is that the server was hacked and some rootkit wreaked havoc on it. After a quick glance at the logs, I couldn't find any hints of a successful break-in, though. Another possibility is that the hard drive simply died and/or the filesystem was (heavily) corrupted. I don't know...

Has anybody ever seen something like this? Please enlighten me as to what could have happened...


user mode linux vserver?

Hi Uwe,

Is your provider using UML (user mode linux) to host the vserver? I've been using UML for quite a long time on a server to separate some services.

I had some problems like the ones you described above. UML uses file images to hold the filesystems for the UML child kernels (if you don't want to use separate partitions for each UML child instance). If these files are mounted on the main system while the associated UML child kernel is still accessing the images, you get garbage like what you described.

Perhaps the service guy tried to mount your UML filesystem and forgot to unmount it again? Perhaps an fsck on the UML filesystem image file on the host system can rescue your data.
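
A minimal sketch of such a check, assuming the image holds an ext2/ext3 filesystem and the UML child kernel using it has been shut down first. The function name check_image and the image path in the usage line are hypothetical; only the provider would know the real location.

```shell
#!/bin/sh
# check_image IMAGE: run a read-only filesystem check on an image file.
# Assumption: IMAGE contains an ext2/ext3 filesystem and is not in use
# by a running UML instance or mounted on the host.
check_image() {
    image=$1
    if [ ! -f "$image" ]; then
        echo "not a regular file: $image" >&2
        return 1
    fi
    # -f forces a check even if the filesystem is marked clean;
    # -n opens it read-only and answers "no" to every question,
    # so nothing on the image is modified.
    e2fsck -fn "$image"
}
```

It could be called as `check_image /path/to/uml_root.img`. Running it read-only first shows how bad the damage is before anything gets repaired.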


Hi Claus!

Yes, I think they use UML, and that would probably explain the problems... My data is safe; fortunately, I was able to make a backup. The problem that remains is that I cannot reinstall; that may have to do with their UML images being broken or something. I'll wait and see.

Thanks for your help, Uwe.