Ask The Oracle

...A Tech blog for Engineers, Software Devs, and Computer Geeks

Why you should Avoid Checksums and Use CRCs Instead

Written by BobW on April 20th, 2010

Error detection is an important consideration for the designers of embedded systems. When the computer in your spacecraft sends a message to the main engine to shut off, it is important to make sure the message is received correctly. You want to make sure it can tell the difference between the commands to turn on, turn off, and self-destruct. You might choose to use a “standard CRC algorithm” to guarantee the integrity of your data transmission. However, you might be surprised to find out how many “standard algorithms” there are. Here is some information to help you choose your implementation wisely.

Checksums and why you should avoid them

Many non-critical systems use a simple checksum to detect a transmission error. On the transmitting side, all of the bytes in the message are summed and the result is stored in a checksum byte. The receiving end performs the same summing operation and compares its result to the stored checksum. If the values match, the message is assumed to be OK.
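As a concrete sketch, the byte-summing scheme described above might look like this in C (the function name is mine, not from any particular standard):

```c
#include <stdint.h>
#include <stddef.h>

/* Simple 8-bit additive checksum: sum every byte of the message,
   letting the result wrap modulo 256. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];   /* wraps around at 0xFF */
    return sum;
}
```

The transmitter appends `checksum8(msg, len)` to the message; the receiver recomputes it and compares.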

A problem with this approach is that a corrupted message has a 1 in 256 chance (8 bits gives 256 possible checksums) of passing the checksum test.  For example, if a message has a checksum of 0xE3, there are many possible erroneous messages that also have a checksum of 0xE3.

You might think you can solve this by making the checksum wider, maybe 16 bits wide. You figure that should give you a 1 in 65,536 chance that a bad message gets through. Unfortunately, that is not the case, because each byte in the message is only 8 bits wide and cannot affect the upper bits of the sum. Here is a simple example message with 2 corrupted bits to demonstrate this:

Original Message
0x00 0x00 0x00 0x02  (checksum = 0x02,   ENGINE_SHUTDOWN_MSG)

Corrupted Message
0x00 0x00 0x01 0x01  (checksum = 0x02, SELF_DESTRUCT_MSG)

Both the good message and the corrupted message have the same “correct” checksum. You could make the checksum 16 bits (0x0002) or 100 bits wide. It would still not detect the error in the bad message, because the 8-bit data does not affect the value in the upper bits of the checksum.
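You can verify the collision above with a short C sketch. The message names and the 16-bit variant are mine, for illustration; the byte values come from the example above:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical 16-bit additive checksum: wider accumulator,
   same byte-summing scheme as before. */
static uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

static const uint8_t engine_shutdown[] = { 0x00, 0x00, 0x00, 0x02 };
static const uint8_t self_destruct[]   = { 0x00, 0x00, 0x01, 0x01 };
/* Both messages sum to 0x0002, so the 2-bit error goes undetected
   no matter how wide the accumulator is. */
```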

That’s why we use CRCs

What you really want is an algorithm that creates large changes in the checksum value regardless of whether the data contains all zeroes or random data. CRCs (Cyclic Redundancy Checks) are algorithms that provide this kind of robust error detection. These algorithms may seem complex, but it turns out that it is not that hard to understand how and why CRCs work. In his article, “A Painless Guide to CRC Error Detection Algorithms”, Ross Williams walks the reader through the details of CRCs. He explains the “polynomial”, seeds, reflection, and other parameters you will encounter in the various CRC implementations.

Quick Guide to CRCs

For those who are more interested in using CRCs than understanding how they work, here are the important points. There is no single “standard CRC algorithm”; rather, there are a great many standard CRC algorithms. For example, there are many completely different, incompatible CRC implementations that are commonly called CRC-16.
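To make the parameters concrete, here is a minimal bitwise sketch of one common variant, CRC-16/CCITT-FALSE (polynomial 0x1021, seed 0xFFFF, no reflection, no final XOR). This is just one of the many incompatible algorithms called “CRC-16”; production code would typically use a table-driven version for speed:

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, seed 0xFFFF,
   no input/output reflection, no final XOR. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;                  /* the seed (initial value) */
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)data[i] << 8;      /* bring next byte into the top */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);  /* the polynomial */
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

For the standard check string “123456789”, this variant produces 0x29B1, which is a quick way to confirm you have implemented the parameters you think you have. Unlike the additive checksum, it also distinguishes the shutdown and self-destruct messages from the example above.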

Here is a list of some of the standard CRC implementations. Here is another list.

Pick one of the established standards. There is no need to reinvent these algorithms, as the source code for many of them is available on the web (here is a C source code generator for CRCs). Document in your code which algorithm you use. Don’t invent your own CRC algorithm unless you are an expert in mathematics. Not all implementations are equal.
