What is MD5 and why do I care?
MD5

Lance Spitzner

When you send data over a network, there are three issues most organizations have, security, authenticity, and integrity. The security of your data ensures that no one can read your data. This is important for the military, where secrets have to be kept from enemy hands. Authenticity guarantees the originator of the data, you know for certain who sent the data. This is important for the legal world, such as digital signatures. Integrity guarantees that the data has not been altered in transit, that the data you received is the data that was sent. This is important for many industries, such as the financial world. MD5 is such a tool, it guarantees the integrity of your data.

MD5 can help you in a variety of ways. When you download files from the Internet, you can use MD5 to guarantee you downloaded the correct file. This protects you from Trojans or corrupted files. If you uses tools such as Tripwire to protect the integrity of your filesystem, you are most likely using MD5. You are most likely using MD5 if you are using a public/private key infrastructure.

Developed in 1994, MD5 is a one-way hash algorithm that takes any length of data and produces a 128 bit "fingerprint" or "message digest". This fingerprint is  "non-reversible", it is computationally infeasible to determine the file based on the fingerprint. This means someone cannot figure out your data based on its MD5 fingerprint. Here is an example of a MD5 output for the binary /usr/bin/ls:

homer $md5 /usr/bin/ls

MD5 (/usr/bin/ls) = 1eabd3dbc0746c8a4b5467f99a4f8823

The actual finger print is

1eabd3dbc0746c8a4b5467f99a4f8823

Basically, what MD5 did was apply a mathematical algorithim to the "ls" binary to produce the fingerprint (to learn the gory mathematical details about the algorithim, check out RFC 1321 at http://www.cis.ohio-state.edu/rfc/rfc1321.txt.) Everytime you do a MD5 hash of the binary /usr/bin/ls, you should get the exact same fingerprint. If you get a different fingerprint, then the binary has been altered, maybe the result of a system patch or the binary has been trojaned.

When you download a new file or patch, one of the first things you can do is a MD5 hash of the file. Compare the fingerprint to a known good fingerpint (usually posted on remote site). If the fingerprints match, you can be assured of the file?s integrity. This is how the tool Tripwire works. It builds a database of fingerprints for all your binaries, then later on compares the binaries to that database. However, tripwire uses a variety of hash algorithms in addition to MD5, such as snefru.

Since MD5 does not encrypt data, it is not restricted by any exportation rules. You can freely use and distribute this tool anywhere in the world. To learn the history of MD5, check out http://www.rsa.com/rsalabs/faq/html/3-6-6.html You can download MD5 at http://www.leo.org/pub/comp/general/security/md5/index.html.
 
 

 Author's bio
Lance Spitzner enjoys learning by blowing up his Unix systems at home. Before this, he was an Officer in the Rapid Deployment Force, where he blew up things of a different nature. You can reach him at lance@spitzner.net .
 
 

Whitepapers / Publications