On Saturday 04 June 2005 13:12, Prateek Srivastava wrote:
> My machine is new and well configured.
> > Is your HD or memory starting to show its age?
I have seen problems like this many times on new machines running 
load-intensive software. It is common practice to build machines from very 
cheap components - why bother with ECC-RAM, good high-datarate cables, and 
server-grade harddisks if 95% of these machines will run Windows anyway and 
users are used to blaming it on the OS or their own inability?
Tips for diagnosis: 
*let a memory tester run overnight - if it does find something then you've 
got VERY bad memory chips, if it doesn't you unfortunately still can't be 
sure the memory is good (if you run SuSE: it is on the install DVD, just 
boot into memtest instead of into Linux)
*check that all disk cables are properly connected and don't have sharp 
bends or even loose connectors
*have you switched on ATA-133 or other high-speed modes? (Linux: hdparm) 
Switch them off and try again. 
*Often you get indications of problems if you do dmesg (Linux-specific 
though) - if you get lines like "hdb: lost interrupt" it means the 
communication between your mainboard and the disk is bad.
*does the same problem occur on a different system? (if you have a real 
server - not just a PC posing as one - use the server for this test) If 
yes: it is either a problem with the OS (faulty library) or coincidence 
(test on a third system) or really a problem with SVN. If no: you've got a 
problem with the hardware.
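
If you want to automate the dmesg check from the list above, a minimal 
Python sketch could look like the following. The function names and the 
device-name pattern (hd/sd plus letters) are my own assumptions, so adjust 
them to whatever your kernel actually prints:

```python
import re
import subprocess

# Matches lines such as "hdb: lost interrupt". The device-name pattern
# (hd/sd followed by letters) is an assumption; adapt it to your system.
LOST_IRQ = re.compile(r"\b((?:hd|sd)[a-z]+): lost interrupt")


def find_lost_interrupts(log_text):
    """Return a dict mapping device name -> number of 'lost interrupt' lines."""
    counts = {}
    for line in log_text.splitlines():
        match = LOST_IRQ.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts


def read_kernel_log():
    """Return the output of dmesg as text (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout
```

Calling find_lost_interrupts(read_kernel_log()) gives you a count per 
device. Any non-empty result points at the cable, controller or disk rather 
than at the software.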
Tips for the next system:
*never buy pre-configured systems from big end-consumer retailers - these 
are Cheap[tm]. I personally use very small computer shops, preferably those 
that regularly deliver small servers to business customers and do the 
support for those - these guys have the most experience with how to build a 
good system that is still within your budget
*use workstation or server boards - ok they are expensive, but they have the 
plus of requiring and supporting ECC-RAM:
*use ECC-RAM - DRAM cells are extremely susceptible to bit flips (the 
probability of a flip goes up exponentially with temperature and with the 
number of bits in the system). ECC-RAM is twice as expensive because it a) 
uses good chips (which passed ALL tests instead of only the barely 
necessary ones) and b) is able to fix almost all bit-flip situations (I've 
NEVER had problems with ECC, but constantly have them on normal cheap 
chips)
*while we are at it: use server processors - they are also twice as 
expensive because they had to endure twice as many tests and have twice 
the safety margin (a processor that would be sold as a 2.5GHz consumer 
part would be sold as a 2.0GHz server part)
*use cables that actually support high-data-rates and use server-grade 
harddisks (the lower ranges of server-disks are not much more expensive 
than consumer disks), don't use spin-down or disk suspend, since 
server-disks are not optimised for that mode
*let your supplier/retailer build in powerful fans - there are fans that are 
both powerful and silent today (I'll never do that myself again after I 
ruined two mainboards with underpowered CPU-fans).
Ok, this all sounds terribly paranoid and expensive. Actually: it is. Both. 
It comes from a lot of bad experience with "consumer grade" systems 
and some great experiences with "server grade" workstations. You pay a lot 
for that, but at least I can be sure that if something goes wrong now it is 
the software.
Even if you don't want to go to these extremes: use one of those small shops 
that do daily support for small business people and talk to them. They know 
which kind of consumer-systems are often returned for repair and which 
aren't... ;-)
        Konrad
 
Received on Sat Jun  4 16:19:23 2005