Programmatic Caching in ColdFusion
By Ben Forta
Product Evangelist
Allaire Corp.

and

David Golden
Technical Writer
Allaire Corp.

Web applications are dynamic and should be data-driven wherever possible. After all, the major distinction between a Web site and a Web application is that the former is static while the latter is anything but. It is Web applications, the kind you build in ColdFusion, that deliver the true power and promise of the Web.

Unfortunately, all that extra power can come with a hefty price tag. Static pages can be served almost instantly: the server does nothing with the file other than open it and stream it to the client as is. Dynamic pages, on the other hand, require parsing and processing. That takes time.

This is where caching comes into play. Caching is a mechanism with which you, the developer, can control the degree to which your content is dynamic. Some information, like the location of a city's fire engines, must always be current. Other information, such as monthly interest rates on credit cards, can be a few hours or a few days old. Cached Web applications are still dynamic, but caching relieves some of the processing burden that comes with being dynamic.

Simply put, caching involves trade-offs. You must recognize that some operations are faster than others, and that you can sacrifice something (memory, disk space, or how current the data is) to gain performance.

ColdFusion supports several forms of caching. The caching options covered in this article are:

  • P-Code Caching
  • Page Output Caching
  • Database Query Caching
  • Client-Side Caching

P-Code Caching

When ColdFusion Server processes a template, it searches for an Application.cfm template. Next, ColdFusion reads the requested template from the disk, and then it reads the OnRequestEnd.cfm template (if one exists in the Application.cfm directory). If the processed pages use Custom Tags or includes (via the CFINCLUDE tag) then those files are read and processed too. Therefore, for any page request, ColdFusion Server usually has to make multiple disk requests.

There are two bottlenecks in this process. First, disk access is one of the slowest operations. (Memory access is many times faster.) Second, each time a file is read from disk ColdFusion parses it to perform basic error and syntax checking. Parsing takes time.

To address these problems, ColdFusion Server minimizes disk reads by not reloading templates it has already read. It reads each template once and caches the parsed result as p-code in the preprocessed template cache. ColdFusion uses the p-code version of a template from the cache as long as its time stamp matches that of the template on disk.

P-Code Cache Size

In the ColdFusion Administrator, you can configure how much server memory is allotted to the p-code cache. The default is 1 MB, but you can increase that if needed. Based on an MFU algorithm, ColdFusion Server automatically purges the cache to make space for new files.

Figure 1: ColdFusion Administrator Caching Settings

The right size for the p-code cache depends on the number of templates you would like to cache. In general, figure the p-code cache size using a 4x-5x ratio. The p-code version of a template contains much more information than the source file on disk, so a template that takes 1K on disk could take 4K in cache. The ideal cache size is therefore 4 or 5 times the total size of all CFM files on your system. While this does increase server memory load, RAM is cheap and the performance gain can be significant.
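
To put a number on the 4x-5x rule, you can total the size of your templates with CFDIRECTORY. The following is only a rough sketch: the directory path is a made-up example, and because this listing covers a single directory, you would repeat it for each subdirectory that holds templates.

```cfml
<!--- Hypothetical path: point this at your own template directory --->
<CFSET TemplateDir = "c:\inetpub\wwwroot\myapp">

<!--- List the .cfm files in that one directory (subdirectories are not included) --->
<CFDIRECTORY ACTION="LIST" DIRECTORY="#TemplateDir#" FILTER="*.cfm" NAME="Templates">

<!--- Add up the file sizes, which CFDIRECTORY reports in bytes --->
<CFSET TotalBytes = 0>
<CFLOOP QUERY="Templates">
    <CFSET TotalBytes = TotalBytes + Templates.Size>
</CFLOOP>

<!--- Apply the 4x-5x ratio described above --->
<CFSET TotalKB = Ceiling(TotalBytes / 1024)>
<CFSET LowKB = TotalKB * 4>
<CFSET HighKB = TotalKB * 5>

<CFOUTPUT>
Templates on disk: #TotalKB# KB<BR>
Suggested p-code cache size: #LowKB# KB to #HighKB# KB
</CFOUTPUT>
```

For instance, 600 KB of templates would suggest a p-code cache of roughly 2,400 KB to 3,000 KB, well above the 1 MB default.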

Trusted Cache

As previously stated, ColdFusion must still compare the time stamp of the cached version with the time stamp of the template on disk. This check is made for every file that a request needs to open.

Checking date and time stamps is not a terribly costly process, but it does add processing time, and as load increases those little stamp checks add up. To address this problem, ColdFusion Server is equipped with Trusted Cache. By activating the Trusted Cache option in the ColdFusion Administrator, you tell ColdFusion Server to assume that the template in cache is up to date and to forgo the time stamp check.

The trusted cache option is especially beneficial when a shared directory holds the templates for a server cluster. Of course, whenever you want to pick up a changed template, you must disable trusted cache so that the templates are reloaded.

With SP1 of ColdFusion Server 4.5.1, no time stamp checking of any kind will occur when trusted cache is used. In other words, the disk I/O is zero. That means if you have a template in p-code cache, you could delete it off the disk and continue to access the template.

Whether or not to use trusted cache is a question of the production process used by the application. If production is changed on an ad hoc basis and there is not a regimented update process in place, trusted cache can display erroneous data because it will not reload a fresh template. Trusted cache relies on having a structured deployment in which the server is recycled on a periodic basis within strict guidelines.

Page Output Caching

As already explained, static content can be delivered far quicker than dynamic content. It follows that if dynamic content did not need to be regenerated on each request, you'd have the benefit of dynamic content being served as if it were static. Page output caching enables you to process CFML pages this way — caching the generated output of a page so that it may be served without reprocessing.

At either the page level or the component level, the <CFCACHE> tag performs a <CFHTTP> request under the covers, and the resulting page is then saved to a file.

The caching engine is GET variable aware, so URL parameter differences are respected. Therefore, if the URL contains a record ID, you can potentially have a cached HTML version of each record ID. The caching engine is not POST variable aware. Therefore, if a page varies by passed form field, it should not be cached with CFCACHE.

To enable server-side caching with the <CFCACHE> tag, set the ACTION attribute to CACHE (i.e., ACTION="CACHE").

For a very simple example, take a look at the following code:

<!--- This example will produce as many cached files as there
    are possible URL parameter permutations. --->
<CFCACHE ACTION="CACHE">
<HTML>
<HEAD>
<TITLE>CFCACHE Example</TITLE> 
</HEAD>
<BODY>
<H1>CFCACHE Example</H1>

<H3>This is a test of some simple output</H3>
<CFPARAM NAME="URL.x" DEFAULT="no URL parm passed">
<CFOUTPUT>The value of URL.x = #URL.x#</CFOUTPUT>
</BODY>
</HTML>
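
The tag can also control how long a cached page stays valid and when it is discarded. The sketch below assumes the CFCACHE attributes as described in the ColdFusion 4.5 tag reference: TIMEOUT takes a date/time marking the oldest acceptable cached page, and ACTION="FLUSH" works with DIRECTORY and EXPIREURL. Check these against your server version. The directory path and the park.cfm file name are hypothetical.

```cfml
<!--- At the top of the page being cached:
      reuse the cached output only if it is less than four hours old.
      (TIMEOUT semantics are assumed from the 4.5 tag reference.) --->
<CFCACHE ACTION="CACHE" TIMEOUT="#DateAdd('h', -4, Now())#">

<!--- In a separate maintenance template (hypothetical):
      discard every cached copy of park.cfm, whatever its URL parameters. --->
<CFCACHE ACTION="FLUSH"
         DIRECTORY="c:\inetpub\wwwroot\parks"
         EXPIREURL="*/park.cfm?*">
```

The two tags belong in different templates. The first sits in the page that is cached, and the second runs only when you want to force regeneration, for example after updating the underlying data.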

For additional performance gains, you can cache generated output to memory. Of course, you have to do more work yourself.

With in-memory page output caching, the application checks whether a generated version of the requested page already exists in memory, stored in a variable named after the URL. If it exists, ColdFusion returns the variable's value to the client as static HTML. If it does not, the page is regenerated using <CFHTTP> and the result is stored in the cache as the variable's value.

These variables should be stored in the APPLICATION scope so that they remain cached for an extended period of time across different requests. Please note that, unless you specify TIMEOUT, the cache will be eternal.
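
One way to write that logic yourself is sketched below. It is an illustration, not a prescribed implementation: the structure name PageCache and the generating template render.cfm are invented for the example, and the code assumes a CFAPPLICATION tag in Application.cfm has already enabled the APPLICATION scope.

```cfml
<!--- Key the cache on the requested URL, as described above --->
<CFSET PageKey = CGI.SCRIPT_NAME & "?" & CGI.QUERY_STRING>
<CFSET PageContent = "">

<!--- Look for previously generated output under a read-only lock --->
<CFLOCK SCOPE="APPLICATION" TYPE="READONLY" TIMEOUT="10">
    <CFIF IsDefined("Application.PageCache")>
        <CFIF StructKeyExists(Application.PageCache, PageKey)>
            <CFSET PageContent = StructFind(Application.PageCache, PageKey)>
        </CFIF>
    </CFIF>
</CFLOCK>

<CFIF PageContent IS "">
    <!--- Cache miss: generate the page with CFHTTP.
          render.cfm is a hypothetical template that builds the real output. --->
    <CFHTTP URL="http://127.0.0.1/render.cfm?#CGI.QUERY_STRING#" METHOD="GET">
    </CFHTTP>
    <CFSET PageContent = CFHTTP.FileContent>

    <!--- Store the result under an exclusive lock --->
    <CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFIF NOT IsDefined("Application.PageCache")>
            <CFSET Application.PageCache = StructNew()>
        </CFIF>
        <CFSET Temp = StructInsert(Application.PageCache, PageKey, PageContent, TRUE)>
    </CFLOCK>
</CFIF>

<!--- Serve the cached or freshly generated HTML --->
<CFOUTPUT>#PageContent#</CFOUTPUT>
```

Keeping the generating template separate from the caching template prevents the CFHTTP call from requesting the caching page itself. Nothing in this sketch ever expires an entry, so a real application would also need a rule for removing or replacing stale pages.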

Database Query Caching

Database queries can slow an otherwise fast Web site to a crawl. Many developers are not aware of ColdFusion Server's features for caching database operations, yet the gains from query caching can be significant. For example, if a one-second query runs sixty times a minute and returns identical information every time, caching it saves the server 59 seconds of processing time per minute.

Query-based result caching means that you will not have the most recent information 100 percent of the time. Therefore, cache only queries whose results do not change very often, such as the names of the fifty U.S. states, or queries that are rerun unchanged by a "next n" style interface. For maximum flexibility, ColdFusion can also save query result sets to the SESSION and APPLICATION scopes.

A maximum of 100 cached queries may exist server-wide at any time (see Figure 3), and a new query is not cached until the oldest one is dumped. Query-based result caching works on a first-come, first-served basis, and it can be disabled.

CACHEDWITHIN and CACHEDAFTER

Use attributes of the <CFQUERY> tag to configure query-based result caching. Rather than writing the IF statements that variable-based result caching requires, you simply add an attribute to the tag.

However, if you need explicit management of caching activities and no maximum level of cache size, variable-based result caching may be for you. For instructions on how to use variable-based result caching, please refer to the ColdFusion Server documentation.
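
For comparison, variable-based result caching might look something like the sketch below. It is a simplified illustration rather than the documented procedure: the variable name Application.GetParks is chosen for the example, and the code assumes the APPLICATION scope has been enabled with CFAPPLICATION.

```cfml
<CFSET NeedQuery = TRUE>

<!--- Reuse the stored result set if an earlier request saved one --->
<CFLOCK SCOPE="APPLICATION" TYPE="READONLY" TIMEOUT="10">
    <CFIF IsDefined("Application.GetParks")>
        <CFSET GetParks = Application.GetParks>
        <CFSET NeedQuery = FALSE>
    </CFIF>
</CFLOCK>

<CFIF NeedQuery>
    <!--- Nothing stored yet: run the query once --->
    <CFQUERY NAME="GetParks" DATASOURCE="cfsnippets">
    SELECT ParkName, Region, State
    FROM Parks
    ORDER BY ParkName, State
    </CFQUERY>

    <!--- Save the result set for later requests --->
    <CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFSET Application.GetParks = GetParks>
    </CFLOCK>
</CFIF>
```

Because the result set lives in a variable you control, you decide when it is replaced, and it does not count against the server-wide limit on cached queries.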

The CACHEDWITHIN attribute specifies a relative time span within which cached query data will be used. Here is a <CFQUERY> in action with CACHEDWITHIN enabled:

<!--- This example shows the use of CFQUERY with CACHEDWITHIN enabled. --->

<HTML>
<HEAD>
    <TITLE>CFQUERY Example</TITLE>
</HEAD>

<BODY>
<H3>CFQUERY Example</H3>

<!--- define startrow and maxrows to facilitate
    'next N' style browsing --->
<CFPARAM NAME="MaxRows" DEFAULT="10">
<CFPARAM NAME="StartRow" DEFAULT="1">

<!--- query database for information --->
<CFQUERY NAME="GetParks" DATASOURCE="cfsnippets" CACHEDWITHIN="#CreateTimeSpan(0,0,10,0)#">
SELECT      PARKNAME, REGION, STATE
FROM         Parks 
ORDER BY ParkName, State
</CFQUERY>

<!--- build HTML table to display query --->
<TABLE cellpadding=1 cellspacing=1>
<TR>
    <TD colspan=2 bgcolor=f0f0f0>
    <B><I>Park Name</I></B>
    </TD>
    <TD bgcolor=f0f0f0>
    <B><I>Region</I></B>
    </TD>
    <TD bgcolor=f0f0f0>
    <B><I>State</I></B>
    </TD>
</TR>

<!--- Output the query, limited by the StartRow and MaxRows
      parameters. Use the query variable CurrentRow to
      keep track of the row you are displaying. --->
<CFOUTPUT QUERY="GetParks" StartRow="#StartRow#" MAXROWS="#MaxRows#">
<TR>
    <TD valign=top bgcolor=ffffed>
    <B>#GetParks.CurrentRow#</B>
    </TD>
    <TD valign=top>
    <FONT SIZE="-1">#ParkName#</FONT>
    </TD>
    <TD valign=top>
    <FONT SIZE="-1">#Region#</FONT>
    </TD>
    <TD valign=top>
    <FONT SIZE="-1">#State#</FONT>
    </TD>
</TR>
</CFOUTPUT>

<!--- If more records remain (StartRow + MaxRows is less than
or equal to the total number of records), offer a link to
the same page, with the StartRow value incremented by
MaxRows (in the case of this example, incremented by 10) --->
<TR>
    <TD colspan=4>
    <CFIF (StartRow + MaxRows) LTE GetParks.RecordCount>
        <a href="cfquery.cfm?startrow=<CFOUTPUT>#Evaluate(StartRow + 
        MaxRows)#</CFOUTPUT>">See next <CFOUTPUT>#MaxRows#</CFOUTPUT> 
        rows</A>
    </CFIF>
    
    </TD>
</TR>
</TABLE>
</BODY>
</HTML>

In this example, queries will be cached for ten minutes. Cached query data will be used if the original query date falls within the time span you define. The CreateTimeSpan function is used to define a period of time from the present backwards.
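
The four arguments to CreateTimeSpan are days, hours, minutes, and seconds, so other cache lifetimes are easy to express. As a small sketch, the States table and the GetStates query name below are hypothetical:

```cfml
<!--- CreateTimeSpan(days, hours, minutes, seconds) --->
<CFSET TenMinutes = CreateTimeSpan(0, 0, 10, 0)>
<CFSET OneDay = CreateTimeSpan(1, 0, 0, 0)>

<!--- State names rarely change, so a day-long cache is a reasonable choice --->
<CFQUERY NAME="GetStates" DATASOURCE="cfsnippets" CACHEDWITHIN="#OneDay#">
SELECT StateName
FROM States
ORDER BY StateName
</CFQUERY>
```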

CACHEDAFTER allows you to specify an absolute date and time for query caching. The code looks similar to CACHEDWITHIN:

CACHEDAFTER="10-15-00"

With CACHEDAFTER included in the <CFQUERY> tag, ColdFusion uses cached query data if the date of the original query is after the date specified. (Years from 0 to 29 are interpreted as 21st century values. Years from 30 to 99 are interpreted as 20th century values.)
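
In context, the attribute sits in the opening <CFQUERY> tag just as CACHEDWITHIN does. The following sketch reuses the query and data source names from the CACHEDWITHIN example:

```cfml
<!--- Use cached results as long as they were cached after October 15, 2000 --->
<CFQUERY NAME="GetParks" DATASOURCE="cfsnippets" CACHEDAFTER="10-15-00">
SELECT ParkName, Region, State
FROM Parks
ORDER BY ParkName, State
</CFQUERY>
```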

To use cached data in either CACHEDAFTER or CACHEDWITHIN, the current query must use the same SQL statement, data source, query name, user name, password, and DBTYPE. In addition, for native drivers it must have the same DBSERVER and DBNAME (Sybase only).

Nothing is perfect, including query-based result caching. First, flushing the cache is difficult. To refresh a cached <CFQUERY>, you must change one of the tag's attributes or make a change in the ColdFusion Administrator. Second, if the cached result sets are large enough, available memory can quickly evaporate. Finally, the cache cannot be explicitly managed.

Client-Side Caching

Client-side caching (browser-side caching) is almost the ideal form of caching. If the page has not changed on the server, the browser uses its cached version of the page without anything being sent from the server. This saves server resources and circumvents the potential bottleneck of Internet bandwidth.

Client-side caching involves embedding time stamp information in the URL requests to a ColdFusion server. The server then looks at the time stamp and compares it to the server-cached version. If the time stamp is identical, nothing is returned to the client. If the time stamp differs, the server will send the page to the client.

To enable client-side caching with <CFCACHE>, simply set the ACTION attribute to CLIENTCACHE (i.e., ACTION="CLIENTCACHE").

Because the browser stores the pages in its own cache and does not require anything from the server, ColdFusion Server resources are not consumed by client-side caching. To optimize performance, use both server-side caching and client-side caching; if the browser cache times out, the server can retrieve the cached data from its own cache. To do so, replace CLIENTCACHE or CACHE in the ACTION attribute with OPTIMAL (i.e., ACTION="OPTIMAL").
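
As a brief sketch, the two options look like this. They are alternatives: a page would carry only one <CFCACHE> tag, placed at the top of the template.

```cfml
<!--- Option 1: browser caching only; no cached file is kept on the server --->
<CFCACHE ACTION="CLIENTCACHE">

<!--- Option 2: browser caching backed by the server-side page cache --->
<CFCACHE ACTION="OPTIMAL">
```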