Sunday, November 7, 2004

New box

Well, after looking at some more kernel panics with a friend, we came to the conclusion that the problem was not the usb kernel code as I had thought, but rather looked like the hd. With that in mind I disabled my software raid 1 setup, or at least I thought I did. It turned out that although I had disabled it at the software level, when I rebooted the kernel had "autodetected" the raid 1 configuration from the superblocks that raid had written to the disks, and had booted up with raid anyway! I then rebooted passing the raid=noautodetect option to the kernel, only to find that now lvm was seeing duplicate PVIDs and was effectively using both HDs anyway! At least I know that I had a really resilient setup :) Anyway, after finally turning off all raid / balancing I was able to boot off one HD and see how that went, under the impression that if it didn't fail it was the good HD. Of course it promptly failed, so I booted off the other HD, now confident that it was the good HD. You can imagine my perplexity when it promptly failed even more spectacularly. Now I was left with the idea that the problem might have been one of the following:
* bad CPU
* bad motherboard (as mentioned previously there were some broken fans on the MB)
* bad ram - after all, the problem was transient, which is the best indication of bad ram, and it often happened when compiling.
* bad hd - both were old scsi, in fact I have such an old scsi hd in it that it's 4gb!
* bad compile options - I was using gentoo hardened, which has some hardcore compile options, and then I had broken their guidelines and optimised things further. Finally I had reverted back off that and had cross compiled some binaries from my other amd box.

The more I thought about it the less sure I was that anything was fucking working as advertised, so I took the tried and tested option - I bought a new computer :) I briefly agonised about using this as an excuse to go the "full monty" and upgrade to 64 bit, but after thinking about it and talking to a friend I realised that it would be best to simply get an ultra cheap, but still very powerful, "normal" upgrade. So as I write this the order confirmations are coming in from komplett and ebuyers. The new terra will be an athlon xp 2800, 1 gig ram + other assorted goodies. I hope to have the parts and be building it by next weekend; now it's over to the British mail system. sigh

2 comments:

  1. RAM. Always do RAM first, I have found. Transient problems with hard drives will give you some indication it is the disk, whereas RAM will just fuck you around forever doing strange stuff. If not ram, I would say a deeper-rooted problem... too aggressive a kernel compile (try vanilla for a few days), or a rooted MB or CPU (last place I would look).

  2. [...] lo and behold its working perfectly! This is the system that gave me so much grief in the past. So now I have another system that I am bring into the network. Combine that with my fuji arriving and I am get [...]
