Understanding Encrypt, ToBase64, and Hash
by Mary Horvath, Allaire Documentation Group


Tales from the Cryptographer: Encrypt, ToBase64, and Hash

Abstract

This Technical Note is designed to give you greater understanding of cryptography in general, and, in particular, the Encrypt and Decrypt functions, the ToBase64 function, and the Hash function. It examines the algorithms used by each function and explains how to use each one.

Most of the general information about cryptography comes from Bruce Schneier's text Applied Cryptography, which I would highly recommend if you are working in the realm of cryptography and have not read it already.

Cryptography

Cryptography is the art and science of keeping messages secure. Cryptography at its best provides the following:

  • Confidentiality: the sender of a message should be able to limit access to a message.
  • Authentication: the receiver of a message should be able to ascertain the origin of the message. See the documentation of CFAUTHENTICATE in CFML Language Reference for related information.
  • Integrity: the receiver of a message should be able to ascertain whether a message has been modified in transit.

Encryption plays a key role in providing confidentiality. Encryption disguises the contents of a message by applying a mathematical algorithm to it. Not only is encryption valuable in ensuring the confidentiality of a message, it is also useful in maintaining the privacy of any data: data stored in files and data stored in databases.

There are many encryption algorithms providing varying degrees of security. The ColdFusion Encrypt function uses a symmetric key-based algorithm in which the key used to encrypt a string is the same key as the one used to decrypt the string. The security of the encrypted string hinges on maintaining the secrecy of the key. For more details on the encryption algorithm, see Algorithm Selection.

The one-way hash function is important in securing information and in providing some measure of data integrity. A one-way function performs a conversion that cannot be reversed. In particular, the MD5 hash algorithm that Allaire uses converts a variable-length string to a fixed-length string. The resulting string is always the same length: 32 bytes (32 hexadecimal characters representing a 128-bit value). Because different strings can, in principle, produce the same hash value, matching hash values do not prove that two strings are equal, but they do give you reasonable assurance that they are.

A hash function can be applied to a password, creating a "fingerprint" of the password that can be stored in a database. Because the conversion cannot be reversed, the stored fingerprint does not expose the password itself. Likewise, you can apply the hash function to a document that is to be stored in a content management system. Whenever someone adds a new document, the content management system compares the hash value of the new document with the hash values of the existing documents. This ensures that the document is not already in the system, without having to compare it with the complete contents of every other document. For details on the ColdFusion hash function and an example of its use, see Hash Function.
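The duplicate check described above can be sketched as follows. This is an illustration in Python rather than CFML, and the `DocumentStore` class and its `add` method are hypothetical names invented for the example, not part of any content management product.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the MD5 digest of text as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class DocumentStore:
    """Hypothetical in-memory store that rejects duplicate documents."""

    def __init__(self):
        # Maps each document's fingerprint to the document text.
        self._by_hash = {}

    def add(self, text: str) -> bool:
        """Store text and return True, or return False if it is already stored."""
        digest = fingerprint(text)
        if digest in self._by_hash:
            return False  # same fingerprint: treat as already present
        self._by_hash[digest] = text
        return True


store = DocumentStore()
first = store.add("Quarterly report")
second = store.add("Quarterly report")      # identical content
third = store.add("Quarterly report v2")    # different content
print(first, second, third)                 # True False True
```

Only the short fingerprints are compared, so the cost of the check does not grow with the size of the stored documents.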

Algorithm Selection

Although the Encrypt function provided by ColdFusion offers some degree of security, other public algorithms are available, such as DES (Data Encryption Standard), a symmetric key algorithm that is a U.S. and international standard.

One measure of the strength of an algorithm is how difficult it is for an attacker to figure out the key used to encrypt a string. The key used by the ColdFusion algorithm is 32 bits long; however, there are other algorithms that use keys up to 1024 bits long.
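To put key length in perspective, the following Python sketch counts the possible keys for a given length and estimates how long it would take to try them all. The rate of one million guesses per second is an assumption chosen only to make the arithmetic concrete, and the 56-bit row corresponds to the key length used by DES.

```python
def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length."""
    return 2 ** bits


def seconds_to_exhaust(bits: int, guesses_per_second: int) -> float:
    """Time needed to try every key at a fixed guessing rate."""
    return keyspace(bits) / guesses_per_second


RATE = 1_000_000  # assumed guesses per second, for illustration only

for bits in (32, 56):
    hours = seconds_to_exhaust(bits, RATE) / 3600
    print(f"{bits}-bit key: {keyspace(bits):,} keys, about {hours:,.1f} hours")
```

Each additional bit doubles the number of keys, so a 56-bit key space is 2^24 (roughly 16.7 million) times larger than a 32-bit one.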

One rule of thumb offered by Bruce Schneier in his book Applied Cryptography is:

"If the cost required to break an algorithm is greater than the value of the encrypted data, then you're probably safe."

Encrypt and Decrypt Functions

The ColdFusion Encrypt function uses an XOR-based algorithm with a pseudo-random 32-bit key that is derived from a seed passed by the user as a parameter to the function. The resulting data is UUencoded and may be as much as three times the original size. For more details, see What is UUencoding?.

Syntax

Encrypt(string, seed)
string

String to be encrypted.

seed

String specifying the seed that is used to generate the 32-bit key used to encrypt string.

You can use the Decrypt function to reverse the encryption by supplying the encrypted string and the seed that was used to encrypt the string.

Syntax

Decrypt(encrypted_string, seed)
encrypted_string

String to be decrypted.

seed

String specifying the seed that was used to generate the key that was used to encrypt encrypted_string.
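The general idea of a seed-derived key and XOR encryption can be sketched in Python. This is not ColdFusion's actual algorithm, whose details are not given here: the key derivation in `derive_key` (seeding Python's `random` generator) and the repeating-key XOR in `xor_bytes` are assumptions made for illustration. The sketch also UUencodes the result with the standard `binascii` module to show that the encoded text is longer than the input.

```python
import binascii
import random


def derive_key(seed: str) -> bytes:
    """Derive a 32-bit (4-byte) key from a seed string (hypothetical scheme)."""
    rng = random.Random(seed)            # the same seed always gives the same key
    return rng.getrandbits(32).to_bytes(4, "big")


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def uu_text(data: bytes) -> str:
    """UUencode data in lines of at most 45 input bytes each."""
    return "".join(
        binascii.b2a_uu(data[i:i + 45]).decode("ascii")
        for i in range(0, len(data), 45)
    )


plaintext = b"This string will be encrypted"
key = derive_key("foobar")

ciphertext = xor_bytes(plaintext, key)   # "encrypt"
encoded = uu_text(ciphertext)            # printable form of the ciphertext
recovered = xor_bytes(ciphertext, key)   # "decrypt": XOR with the same key

print(len(plaintext), len(encoded))
print(recovered == plaintext)            # True
```

Because XOR is its own inverse, applying the same key a second time restores the original bytes, which is why the same seed must be supplied for decryption.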

Example

The following example shows the use of the Encrypt and Decrypt functions and shows the size of a string before and after it has been encrypted.

Note: If you copy this example into ColdFusion Studio, name it encryptmystring.cfm or change the ACTION attribute of the FORM tag to match the name of your file.

<HTML>
<HEAD>
<TITLE>Encrypting and Decrypting Example</TITLE>
</HEAD>

<BODY>

<P>This example allows for the encryption and decryption of a 
string. Try it out by entering your own string and a key of your 
own choosing and seeing the results.
<CFIF IsDefined("FORM.myString")>
   <CFSET string = FORM.myString>
   <CFSET key = FORM.myKey>
   <CFSET encrypted = encrypt(string, key)>
   <CFSET decrypted = decrypt(encrypted, key)>
   <CFOUTPUT>
    <H4><B>The string:</B></H4> #string# <BR>
    <H4><B>The string's length:</B></H4> #len(string)# <BR>
    <H4><B>The key:</B></H4> #key#<BR>
    <H4><B>Encrypted:</B></H4> #encrypted#<BR>
    <H4><B>The encrypted string's length:</B></H4> #len(encrypted)# <BR>
    <H4><B>Decrypted:</B></H4> #decrypted#<BR>
    <H4><B>The decrypted string's length:</B></H4> #len(decrypted)# <BR>
    </CFOUTPUT>
</CFIF>
<FORM ACTION="encryptmystring.cfm" METHOD="post">
<P>Input your key:
<P><INPUT TYPE="Text" NAME="myKey" VALUE="foobar">
<P>Input your string to be encrypted:
<P><textArea NAME="myString" cols="40" rows="5" WRAP="VIRTUAL">
This string will be encrypted (try typing some more)
</textArea>
<INPUT TYPE="Submit" VALUE="Encrypt my String">
</FORM>
</BODY>
</HTML>       

What is UUencoding?

UUencoding is an algorithm for converting binary (8-bit) data into a sequence of 7-bit characters. It was designed to assist in the transport of binary files over communications media, such as netnews and early email systems, that could not directly transport raw binary data. UUencode was first developed for Unix systems (it was used in the UUCP, or Unix-to-Unix Copy, subsystem) and was later ported to many other platforms.

UUencode works by taking the binary data six bits at a time and using that numeric value as an offset into a list of designated characters. To interoperate among different systems, the list of characters used in the lookup table must be consistent. Some early UUencode implementations allowed a space as one of the replacement characters. This introduced problems when email and other transports truncated trailing spaces from message lines. Other implementations replaced the space with a backquote (`), introducing potential incompatibilities.
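The lookup described above can be sketched in Python. The helper `uu_encode_group` is a hypothetical name written for this illustration; it encodes one group of three bytes, and its `zero_char` parameter shows the space versus backquote difference. The last line calls the standard library's `binascii.b2a_uu` for comparison.

```python
import binascii


def uu_encode_group(group: bytes, zero_char: str = " ") -> str:
    """Encode exactly three bytes as four printable characters, UUencode style."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(group, "big")      # 24 bits held in one integer
    out = []
    for shift in (18, 12, 6, 0):             # split into four 6-bit values
        value = (bits >> shift) & 0x3F
        # Each value is an offset from the space character (ASCII 32).
        out.append(zero_char if value == 0 else chr(32 + value))
    return "".join(out)


print(uu_encode_group(b"Cat"))                   # 0V%T
print(uu_encode_group(b"\x00\x00\x00", "`"))     # ````
print(binascii.b2a_uu(b"Cat"))                   # b'#0V%T\n'
```

In the library output, the leading `#` is a length character for the line (3 bytes plus the offset of 32), followed by the same four characters the helper produces.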

To address the need to be consistent, the developers of the MIME email standard developed an encoding algorithm called "Base64." Although it is identical in procedure to UUencode, Base64 uses a carefully chosen set of standard characters designed to be transportable through all known gateways.

What is Base64?

Base64 provides 6-bit encoding of 8-bit ASCII characters. Base64 is a format that uses printable characters, allowing binary data to be sent in forms and email, and to be stored in a database or file. Because high ASCII values and binary objects are not safe for transport over internet protocols such as HTTP and SMTP, ColdFusion offers Base64 as a means of safely sending ASCII and binary data over these protocols.

In addition, Base64 allows you to store binary objects in a database if you convert the data into Base64 first.
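As a rough illustration of the 6-bit encoding, the Python sketch below encodes one three-byte group by hand with the standard Base64 alphabet and then uses the standard `base64` module on a block of arbitrary binary data. The helper name `b64_encode_group` is made up for this example.

```python
import base64
import string

# The standard Base64 alphabet: A-Z, a-z, 0-9, "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64_encode_group(group: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 bytes")
    bits = int.from_bytes(group, "big")      # 24 bits held in one integer
    # Each 6-bit value is an index into the 64-character alphabet.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(b64_encode_group(b"Cat"))              # Q2F0

raw = bytes(range(256))                      # every possible byte value
encoded = base64.b64encode(raw)              # printable ASCII only
print(len(raw), len(encoded))                # 256 344
```

Every three input bytes become four output characters, so Base64 text is about one third larger than the data it represents.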

ToBase64 Function

The ToBase64 function returns the Base64 representation of a string or binary object. In order to reverse Base64 encoding of a binary object, you can use the ToBinary function.

Note: To reverse the Base64 encoding of a string, you must first convert the encoded string into a binary object using ToBinary, and then convert the binary object into a string using ToString.
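The two-step reversal in the note can be mirrored in Python. The functions `to_base64`, `to_binary`, and `to_string` below are loose analogues written for this sketch, not the ColdFusion functions themselves, and the choice of Latin-1 as the character encoding is an assumption that lets character codes 32 to 255 map to single bytes.

```python
import base64


def to_base64(value) -> str:
    """Accept text or bytes and return Base64 text (analogue of ToBase64)."""
    data = value.encode("latin-1") if isinstance(value, str) else value
    return base64.b64encode(data).decode("ascii")


def to_binary(b64_text: str) -> bytes:
    """Decode Base64 text into raw bytes (analogue of ToBinary)."""
    return base64.b64decode(b64_text)


def to_string(data: bytes) -> str:
    """Convert raw bytes back into text (analogue of ToString)."""
    return data.decode("latin-1")


original = "".join(chr(code) for code in range(32, 256))
encoded = to_base64(original)
restored = to_string(to_binary(encoded))     # two steps: binary, then string

print(restored == original)                  # True
```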

Syntax

ToBase64(string or binary_object)

Example

This example shows the use of ToBinary and ToBase64.

<HTML>
<HEAD>
<TITLE>
ToBase64 Example
</TITLE>
</HEAD>

<BODY  bgcolor="#FFFFD5">

<H3>ToBase64 Example</H3>

<!----------------------------------------------------------------------
Initialize data.
----------------------------------------------------------------------->
<CFSET charData ="">
<!----------------------------------------------------------------------
Create a string of all ASCII characters (32-255) and concatenate them 
together.
----------------------------------------------------------------------->
<CFLOOP index="data" from="32" to="255">
    <CFSET ch=chr(data)>
    <CFSET charData=charData & ch>
</CFLOOP>
<P>
The following string is the concatenation of all characters (32 to 255) 
from the ASCII table.<BR>
<CFOUTPUT>#charData#</CFOUTPUT>
</P>
<!----------------------------------------------------------------------
Create a Base64 representation of this string.
----------------------------------------------------------------------->
<CFSET data64=toBase64(charData)>

<!----------------------------------------------------------------------
Convert the Base64 string to binary.
----------------------------------------------------------------------->
<CFSET binaryData=toBinary(data64)>
<!----------------------------------------------------------------------
Convert binary back to Base64.
----------------------------------------------------------------------->
<CFSET another64=toBase64(binaryData)>
<!----------------------------------------------------------------------
Compare another64 with data64 to ensure that they are equal.
----------------------------------------------------------------------->
<CFIF another64 eq data64>
    <H3>Base64 representations are identical.</H3>
<CFELSE>
    <H3>Conversion error.</H3>
</CFIF>
</BODY>
</HTML>

Hash Function

The Hash function takes a variable-length string and converts it into a 32-byte, hexadecimal string, using the MD5 algorithm developed by Ron Rivest. The MD5 algorithm is a one-way hash, meaning that there is no conversion possible from the hash result back into the source string.
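The fixed output length is easy to observe with any MD5 implementation. The Python sketch below uses the standard `hashlib` module; the helper name `md5_hex` and the UTF-8 encoding of the input are choices made for this example.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Inputs of very different lengths all produce 32 hexadecimal characters.
for sample in ("", "abc", "a" * 10_000):
    digest = md5_hex(sample)
    print(len(sample), len(digest), digest)

print(md5_hex("abc"))    # 900150983cd24fb0d6963f7d28e17f72
```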

Syntax

Hash(string)
string

Any string.

The result of the Hash function is a fingerprint of the original data. This fingerprint can be used for comparison and validation purposes. For example, a developer could store the hash of a password in a database without exposing the password itself. Later, the developer could check the validity of the password with the following code:

<CFIF Hash(Form.Password) IS NOT MyQuery.PasswordHash>
    <CFLOCATION URL="unauthenticated.cfm">
</CFIF>
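The same check can be sketched outside CFML. In the Python version below, `stored_hash` stands in for the value read from the database, and the sample password and function names are hypothetical.

```python
import hashlib


def fingerprint(password: str) -> str:
    """Return the MD5 fingerprint of a password as hexadecimal text."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical value that would have been saved when the password was set.
stored_hash = fingerprint("s3cret")


def is_valid(submitted: str, stored: str) -> bool:
    """Hash the submitted password and compare it with the stored fingerprint."""
    return fingerprint(submitted) == stored


print(is_valid("s3cret", stored_hash))   # True
print(is_valid("S3cret", stored_hash))   # False
```

Only fingerprints are compared, so the original password never has to be stored or retrieved.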

Example

The following example is designed to contrast the resulting data from the Hash function and the Encrypt function.

<HTML>
<HEAD>
<TITLE>Hash Example</TITLE>
</HEAD>

<BODY>

<P>This example shows the resulting data from the Hash function and from 
the Encrypt function. Try it out by entering your own string and a key of 
your own choice and see the results.
<CFIF IsDefined("FORM.myString")>
    <CFSCRIPT>
       string = FORM.myString;
       key = FORM.myKey;
       encrypted = encrypt(string, key);
       decrypted = decrypt(encrypted, key);
       hashed = hash(string);
    </CFSCRIPT>
    <CFOUTPUT>
    <H4><B>The string:</B></H4> #string# <BR>
    <H4><B>The string's length:</B></H4> #len(string)# <BR>
    <H4><B>The key:</B></H4> #key#<BR>
    <H4><B>Encrypted:</B></H4> #encrypted#<BR>
    <H4><B>The encrypted string's length:</B></H4> #len(encrypted)# <BR>
    <H4><B>Hashed:</B></H4> #hashed#<BR>
    <H4><B>The hashed string's length:</B></H4> #len(hashed)# <BR>
    </CFOUTPUT>
</CFIF>
<FORM ACTION="hashmystring.cfm" METHOD="post">
<P>Input your key:
<P><INPUT TYPE="Text" NAME="myKey" VALUE="foobar">
<P>Input your string to be encrypted:
<P><textArea NAME="myString" cols="40" rows="5" WRAP="VIRTUAL">
This string will be encrypted (try typing some more) and then the same 
string will be hashed.
</textArea>
<INPUT TYPE="Submit" VALUE="Hash my String">
</FORM>
</BODY>
</HTML>       

Summary

From this technical note, you should have learned a number of things:

  • CFML encryption UUencodes data.
  • The encrypted data can be as much as three times the size of the original data.
  • CFML encryption can be reversed by using the Decrypt function.
  • Base64 encoding is similar to UUencoding, but uses a fixed set of characters to perform encoding.
  • ToBase64 encodes data.
  • Base64 can be used to transport binary and ASCII data over HTTP and SMTP protocols.
  • ToBase64 converts binary data that can then be stored in a database.
  • Hash converts variable-length strings into 32-byte hexadecimal strings.
  • There is no function that reverses the conversion of the Hash function.
  • Hash takes a fingerprint of a password. You can use this fingerprint to safely store a password in a database.
  • Hash also takes a fingerprint of a document, which you can then use to check to see if a document is already stored in a database.

For the latest information from Allaire on security issues, see http://www.allaire.com/developer/securityzone/. Check this page often, because new security measures are documented there on a regular basis.

See also http://www.counterpane.com/crypto-gram.html for Bruce Schneier's Crypto-Gram newsletter. Crypto-Gram is a free monthly e-mail newsletter on computer security and cryptography.