Even Computers Make Mistakes
Inside your PC, millions of electrical signals, traveling at nearly the speed of light, are sent and received every second.
Every minute, bits by the billions flow among your computer's CPU, RAM, static RAM (SRAM), adapter cards and peripherals. The complexity of this traffic is beyond human comprehension.
Ever wonder what would happen if any of this data were somehow garbled during transmission? What would be the consequences if a single electrical pulse were relayed incorrectly?
It's no big deal if a pixel on your screen appears one imperceptible shade bluer than it should be.
A moment later, the pixel is redrawn as screen information changes and the error disappears. But a one-bit error could easily change a $100 transaction to a $1 million one, or reduce it to zero. If either event happens while you're balancing your checkbook, it's a very big deal. Data errors can also make programs crash or behave erratically, and cause your computer to display almost any error message in its repertoire.
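To see how much damage one bit can do, here is a minimal Python sketch. It assumes the transaction amount is stored as a plain binary integer in whole dollars; the function name flip_bit and the choice of bit 20 are purely illustrative.

```python
def flip_bit(value: int, position: int) -> int:
    """Return value with the bit at the given position (0 = lowest) inverted."""
    return value ^ (1 << position)

amount = 100                      # a $100 transaction, in whole dollars (assumed)
corrupted = flip_bit(amount, 20)  # one bit, number 20, is relayed incorrectly

# Show both values in decimal and in binary so the single changed bit is visible.
print(f"{amount:>9,} = {amount:021b}")
print(f"{corrupted:>9,} = {corrupted:021b}")
```

Flipping that one bit adds 1,048,576 to the amount, so $100 becomes $1,048,676. Going the other way, a value with a single 1 bit, such as 64, drops to zero when that bit flips.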
The most common form of data error detection is parity checking.
This technique ensures the integrity of data stored on hard drives and floppy disks and sent via serial ports, and is often used to detect errors in RAM. Parity checking requires that each byte of data carry an extra bit, called the parity bit. The byte's original eight data bits determine the value of the extra bit whenever new data is stored.
For instance, if the number of 1's found in the original eight bits is odd (1, 3, 5 or 7), the parity bit will be set to 1. If an even number of 1's (0, 2, 4, 6 or 8) appears in the original eight bits, the parity bit will be set to 0. As a result, there is always an even number of 1's in the entire group of nine bits (eight data bits plus the parity bit). That's why this parity checking technique is called "even parity." Another technique, "odd parity," manipulates the parity bit to guarantee the number of 1's in the nine-bit group is always odd.
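The rule is easy to express in code. The following is a small sketch, not a description of any particular hardware; the function name parity_bit and its odd flag are made up for the illustration.

```python
def parity_bit(byte: int, odd: bool = False) -> int:
    """Return the parity bit for an 8-bit value.

    Even parity (the default): the bit makes the total count of 1's even.
    Odd parity: the bit makes the total count of 1's odd.
    """
    ones = bin(byte & 0xFF).count("1")   # number of 1's in the eight data bits
    even_bit = ones % 2                  # 1 if that count is odd, else 0
    return even_bit ^ 1 if odd else even_bit

for data in (0b00000000, 0b01000001, 0b11100000, 0b11111111):
    print(f"{data:08b}  even: {parity_bit(data)}  odd: {parity_bit(data, odd=True)}")
```

Whatever the eight data bits hold, adding the even parity bit always leaves an even number of 1's in the group of nine.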
You've probably already figured out how the parity bit allows you to detect errors. If the value of a data bit changes from 0 to 1, or from 1 to 0, the value of the parity bit must change, too. If your computer encounters a byte with a parity bit whose value is inconsistent with those of the other eight bits, it knows an error has occurred.
Unfortunately, that's all your computer knows. At least one bit is in error, but your machine can't determine which of the byte's bits has changed. Simple even and odd parity checking can also be fooled if more than one bit changes value. If the net effect of the changes doesn't invalidate the parity bit's setting, your computer will trust corrupt data.
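Here is a short sketch of both outcomes under even parity: a single garbled bit is caught, while two garbled bits cancel each other out. The helper names even_parity_bit and looks_valid are invented for this example.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the total number of 1's in the nine bits even."""
    return bin(byte & 0xFF).count("1") % 2

def looks_valid(byte: int, stored_parity: int) -> bool:
    """True if the data bits still agree with the stored parity bit."""
    return even_parity_bit(byte) == stored_parity

original = 0b01100100                 # the value 100
parity = even_parity_bit(original)    # stored alongside the data

one_flip = original ^ 0b00000001      # a single bit is garbled
two_flips = original ^ 0b00000011     # two bits are garbled

print(looks_valid(original, parity))   # True: data is intact
print(looks_valid(one_flip, parity))   # False: the error is detected
print(looks_valid(two_flips, parity))  # True: the corruption slips through
```

Note that even in the detected case, nothing in the check says which of the bits went bad.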
To make matters worse, parity checking can actually increase the number of data errors that occur. Parity checking requires 12.5 percent more bits to store and transmit a given amount of data. And the more bits and corresponding circuitry, wires and connectors, the greater the chance that something will go wrong.
This, coupled with the fact that a computer can't recover from an error caught by simple parity checking, indicates this type of checking isn't worth the cost.
Thus the need for more elaborate error-detection schemes. All of them rely on more than one extra bit. Some schemes allow the computer to detect all errors involving 1 or 2 bits, and even correct single-bit errors by identifying the balky bit. Fancier techniques that require even more extra bits can detect and correct more extensive errors.
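The article doesn't name a specific scheme, so as an illustration here is a sketch of one classic example, the Hamming (7,4) code, which protects four data bits with three check bits. The function names are hypothetical. This bare version corrects any single-bit error; reliably detecting two-bit errors as well takes one more overall parity bit, which is left out to keep the sketch short.

```python
def hamming_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (positions 1..7).

    Positions 1, 2 and 4 hold check bits; positions 3, 5, 6 and 7 hold data.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4    # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4    # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming_correct(code: list[int]) -> tuple[list[int], int]:
    """Return (data bits, position of the corrected bit, or 0 if none)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    bad = s1 + 2 * s2 + 4 * s4    # the three checks spell out the bad position
    if bad:
        c[bad - 1] ^= 1           # flip the balky bit back
    return [c[2], c[4], c[5], c[6]], bad

word = hamming_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                   # garble position 5
data, fixed_position = hamming_correct(damaged)
print(data, fixed_position)       # [1, 0, 1, 1] 5
```

The price is plain to see: three extra bits for every four bits of data, compared with one extra bit per eight for simple parity.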
Unfortunately, all the extra bits make these error-detection and -correction methods expensive. For now, you'll find error-correction circuits only inside your hard drive and in the RAM circuits of certain very expensive computers. In an attempt to shave costs, manufacturers are removing even simple parity checking of RAM from many computer models. An extra 12.5 percent of RAM is enough to make a noticeable difference in a PC's final selling price.
That unchecked RAM can cause a variety of problems. You might experience random program crashes, strange messages or images appearing on the screen or printed pages, or mysterious random hangs. If this sounds like an episode of This Is Your Life, here are several things you can do (all with the power off and your computer unplugged):
- If your computer uses SIMMs, remove and then reinsert each unit.
- When you handle SIMMs, keep one hand or arm in constant contact with your computer's case to prevent static electricity from damaging the RAM.
- If your computer uses RAM chips, firmly press down on the top of each chip to make sure it's securely inserted in its socket.
- Adjust the BIOS settings in CMOS memory that affect RAM access. Increase the number of RAM wait states as well as cache read and write wait states.
- Swap half of your SIMMs or chips with reliable RAM from another computer. If the problem goes away, one of the SIMMs or chips you've just removed may be bad. To find out which one, reinsert the suspect units, replacing a reliable one with a different suspect each time until you find the culprit. If the problem still doesn't go away, the group of SIMMs or chips you didn't swap may include a troublemaker. Repeat the process, swapping the other half of the memory group. Continue until you've confirmed that all is working properly.
- Disable your RAM cache, if your computer's BIOS setup program allows you to do so.
- Try each of these suggestions one at a time until you've solved the problem.
Adapter Card Culprits
Adapter cards are responsible for a surprisingly large number of data errors. Besides being complex circuits in their own right, adapter cards act as a crossroads for many cables and plug into large sockets. Both factors are likely sources of data errors.
In most cases, the data flowing to and from adapter cards isn't checked for errors. As a result, data miscues involving adapter cards often go undiagnosed for a long time. Symptoms include disk corruption and random program crashes, unreliable modem or printer connections, poor sound from your sound card, stray pixels on your screen and other bizarre behavior.
If one or more of these problems affects your computer, here are some steps that may help:
- Remove and then reattach all cables. Remove and then reinsert all adapter cards.
- While the cards are out of the sockets, use a slightly damp soft cloth or tissue to remove lint and dirt from the cards and sockets. You can use water or alcohol to wet the cloth or tissue. Let everything dry completely before you reinsert the adapter cards.
- Clean the cards and cable connectors, and apply Stabilant 22 to card, cable and socket contacts.
- Adjust the BIOS settings in CMOS memory that affect your adapter-card bus speed. All cards will work at 8 MHz or 10 MHz, but some may not be able to respond quickly enough to operate reliably at higher bus speeds, such as 12 MHz or 16 MHz. In most cases, you can reduce the bus speed by increasing a divisor used to derive the bus clock from the CPU clock.
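The divisor arithmetic in the last step can be sketched in a few lines. The 33 MHz CPU clock and the list of available divisors below are assumptions for the example; your BIOS setup program may offer different choices, and the function names are invented here.

```python
def bus_clock_mhz(cpu_clock_mhz: float, divisor: int) -> float:
    """Bus clock derived by dividing the CPU clock by a whole-number divisor."""
    return cpu_clock_mhz / divisor

def smallest_safe_divisor(cpu_clock_mhz: float, max_bus_mhz: float = 10.0,
                          divisors=(2, 3, 4, 5, 6, 8)) -> int:
    """Pick the smallest divisor that keeps the bus at or below max_bus_mhz."""
    for divisor in sorted(divisors):
        if bus_clock_mhz(cpu_clock_mhz, divisor) <= max_bus_mhz:
            return divisor
    raise ValueError("no available divisor is slow enough")

cpu = 33.0   # hypothetical 33 MHz CPU clock
for divisor in (2, 3, 4):
    print(f"CPU clock / {divisor} = {bus_clock_mhz(cpu, divisor):.2f} MHz")
print("suggested divisor:", smallest_safe_divisor(cpu))
```

With these assumed numbers, dividing by 2 or 3 gives 16.5 MHz or 11 MHz, both faster than the 10 MHz that all cards can handle, while dividing by 4 brings the bus down to 8.25 MHz.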
Handle the cards carefully, making sure they don't suffer damage from static electricity. And be sure to perform these steps only when your computer is off and unplugged. I don't want you to experience any high-voltage-induced data errors of your own!
Keep in Contact
Here are some ways to put the spark back in your system.
Electrical contacts occur when signal-carrying metal meets another piece of metal. In computers, these contacts include places where the edge of a SIMM meets the prongs of a SIMM socket, and where the pins of a chip meet the metal receptacles inside a chip socket. Other contacts occur where cords carrying electrical power plug into hard drives and other internal components; where cables connect to your drives and adapter cards; and where the edges of adapter cards meet those 8-, 16- and 32-bit card sockets along the back of your PC.
Normally, electrical signals jump from one metal surface to another without a hitch. But if a surface is dirty or corroded, the signal may become degraded. This can cause a number of problems, from intermittent data errors to the apparent failure of a drive, adapter card or chip.
If you suspect you're experiencing such a failure to communicate, there are ways to put the spark back in your system. First, clean the metal surfaces that meet at the corrupted contact point. The metal fingers on the edges of adapter cards and some hard drives are easy to clean. Just take a rubber eraser and gently "erase" the contact surface. The white synthetic erasers sold in art supply stores are best because they don't leave as many crumbs. Avoid very abrasive erasers, and be careful not to rub too hard or for too long. You want to remove the dirt and corrosion, not the metal underneath.
You can also clean many contacts with alcohol. Pure isopropyl alcohol is best because it doesn't leave a residue when it dries, but it's hard to find. Ordinary rubbing alcohol (70%), the kind you find in any drugstore, will do in a pinch. If the contact surface is hard to reach or attached to your motherboard or a card, wet a cotton or foam swab and gently wipe the contact's surface. Repeat the procedure, using a new swab for each treatment, until the swab comes out clean. If you're cleaning cables or other contacts that you can remove from the computer, you may be able to rinse the contact with alcohol to remove large amounts of debris.
It's not always possible or prudent to clean a contact. For instance, the contacts inside adapter card sockets are hard to reach.
And the risk of damaging chips by removing them from their sockets often outweighs the benefits of cleaning their contacts. But even when a contact bath is out of the question, there's still something you can do to improve a connection.
A small company named D.W. Electrochemicals (phone 905-508-7500, fax 905-508-7502) has developed a remarkable liquid called Stabilant 22 that allows even dirty contacts to perform properly. Stabilant 22 is an organic compound that allows electricity to flow where it should, not where it shouldn't. For instance, within your computer, Stabilant enables signals to travel from one contact surface to another, but not between adjacent pins on a chip.
Stabilant is a great conductor.
How does Stabilant pull off this trick? The explanation's a bit technical, but for the hard-core techies and terminally curious, here goes:
Normally, Stabilant is an insulator. But in the presence of a large electric field gradient, it becomes an excellent conductor. An electric field gradient is the "slope" of an electric field. It indicates to what degree voltage changes over distance (the voltage difference between two surfaces, divided by the distance between the surfaces). Within your computer, the distance between a pin and a socket is so small that the gradient is very large (on the order of thousands of volts per inch), causing the liquid to become a conductor. But the distance between adjacent contacts is great enough to keep the gradient low (on the order of tens of volts per inch), well below the level Stabilant needs to make the transition from insulator to conductor.
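For a feel for the numbers, here is a small sketch of that division. All three figures are assumptions chosen for illustration, not measurements: a 5-volt signal, a gap of one thousandth of an inch between pin and socket, and a tenth of an inch between adjacent pins.

```python
def field_gradient(volts: float, inches: float) -> float:
    """Voltage difference divided by the distance between two surfaces (V/inch)."""
    return volts / inches

signal_volts = 5.0     # assumed logic-level voltage
contact_gap = 0.001    # assumed gap between pin and socket, in inches
pin_spacing = 0.1      # assumed spacing between adjacent pins, in inches

print(f"pin to socket:   {field_gradient(signal_volts, contact_gap):,.0f} V/inch")
print(f"pin to neighbor: {field_gradient(signal_volts, pin_spacing):,.0f} V/inch")
```

Under these assumptions the gradient across the contact is about 5,000 volts per inch, while the gradient between neighboring pins is only about 50 volts per inch, a hundredfold difference from the same signal.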
The diluted form of Stabilant 22, called Stabilant 22A, is best for most computer uses. Apply a drop to the pins of a chip while it is still in its socket, and the liquid will penetrate the contacts. Use an eyedropper or swab to apply Stabilant to contacts inside edge card sockets, cables, and drive power and cable connectors. You need only a single drop, just enough to cover the contact surfaces to a depth of 1 or 2 millimeters (about 4 to 8 hundredths of an inch).
Contributing Editor Karen Kenworthy is the author of Visual Basic for Applications, Revealed! (Prima Publishing, 1994) and the manager of WINDOWS Magazine forums on America Online and CompuServe. Contact Karen in the "Power Windows" topic of these areas or care of the editor at the Windows Magazine addresses, typically on page 18.