By Ben Forta
Product Evangelist
Allaire Corp.
and
David Golden
Technical Writer
Allaire Corp.
Web applications are dynamic and should be data-driven
wherever possible. After all, the major distinction between
a Web site and a Web application is that the former is static
while the latter is anything but. It is Web applications
that deliver the true power and promise of the Web,
the kind of applications you build in ColdFusion.
Unfortunately, all that extra power can come with a hefty
price tag. Static pages can be served up instantly. (Servers
don't have to do anything with the files other than open
them and stream them to the client as is.) Dynamic
pages, on the other hand, require parsing and processing.
That takes time.
This is where caching comes into play. Caching is a mechanism
with which you, the developer, can control the degree to
which your content is dynamic. Some information, like
the location of a city's fire engines, needs to be
kept as current as possible. Other information, such as
monthly interest rates on credit cards, can be a few hours
or a few days old. Cached Web applications are still
dynamic, but caching relieves some of the processing burden
associated with being dynamic.
Simply put, caching involves trade-offs. You
must recognize that some operations are faster
than others, and that it is possible to make sacrifices
(e.g., memory usage, disk usage, or how current the data is)
to gain performance.
ColdFusion supports several forms of caching. The caching
options covered in this article are:
- P-Code Caching
- Page Output Caching
- Database Query Caching
- Client-Side Caching
P-Code Caching
When ColdFusion Server processes a template, it searches
for an Application.cfm template. Next, ColdFusion reads
the requested template from the disk, and then it reads
the OnRequestEnd.cfm template (if one exists in the Application.cfm
directory). If the processed pages use Custom Tags or includes
(via the CFINCLUDE tag) then those files are read and processed
too. Therefore, for any page request, ColdFusion Server
usually has to make multiple disk requests.
There are two bottlenecks in this process. First, disk
access is one of the slowest operations. (Memory access
is many times faster.) Second, each time a file is read
from disk ColdFusion parses it to perform basic error and
syntax checking. Parsing takes time.
To address these problems, ColdFusion Server minimizes the
amount of reading from disk by avoiding the reload of previously
read templates. It accomplishes this by reading templates
once and caching them in the p-code cache (the preprocessed
template cache). ColdFusion will use the p-code version
of a template from cache if its time stamp matches that of
the template on disk.
P-Code Cache Size
In the ColdFusion Administrator, you can configure the
size of the p-code cache to be allotted in server memory.
The default is 1 MB, but you can increase that if needed.
Based on an MFU algorithm, ColdFusion Server will automatically
purge the cache to make space for new files.
Figure 1: ColdFusion Administrator Caching Settings
The size of the p-code cache depends on the number of templates
you would like to cache. In general, the p-code cache should
be sized using a 4x-5x ratio, because the processed version of a
template contains much more information than the source file
on disk. If a template takes 1K on disk, it could
take 4K in cache. The ideal cache size is then 4 or 5 times
the total size of all CFM files on your system. While this
does increase server memory load, RAM is cheap and the performance
gain can be significant.
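As a rough illustration of the 4x-5x rule, the following sketch totals the size of the CFM files in one directory and multiplies it out. The directory path is a placeholder, and only a single directory is listed (subdirectories are not included), so treat the result as an estimate rather than a measurement.

```cfml
<!--- Hypothetical path; substitute a real template directory. --->
<CFSET TemplateDir = "c:\inetpub\wwwroot\myapp">

<!--- List the CFM files in that one directory. --->
<CFDIRECTORY ACTION="LIST" DIRECTORY="#TemplateDir#" NAME="Templates" FILTER="*.cfm">

<!--- Add up the on-disk sizes, in bytes. --->
<CFSET TotalBytes = 0>
<CFLOOP QUERY="Templates">
    <CFSET TotalBytes = TotalBytes + Templates.Size>
</CFLOOP>

<CFOUTPUT>
Templates on disk: #NumberFormat(TotalBytes / 1024)# KB<BR>
Suggested p-code cache: #NumberFormat(TotalBytes * 4 / 1024)# KB
to #NumberFormat(TotalBytes * 5 / 1024)# KB
</CFOUTPUT>
```

By this rule, a site with 500 KB of templates would call for roughly 2,000 KB to 2,500 KB of p-code cache.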
Trusted Cache
As previously stated, ColdFusion must still compare the
time stamp of the cached version with the time stamp of
the template on disk. The time stamp must be checked for
every file that has to be opened.
Checking date and time stamps is not a terribly costly
process, but it does add processing time, and as load increases
those little stamp checks add up. To address this problem,
ColdFusion Server is equipped with Trusted Cache. When you activate
the Trusted Cache option in the ColdFusion Administrator,
ColdFusion Server assumes the template in cache is up
to date and forgoes the time stamp check.
The trusted cache option is especially beneficial when
templates are served from a shared directory across a server
cluster. Of course, whenever you want to purge the cache
and reload a template, you must disable trusted cache and
reload the templates.
With SP1 of ColdFusion Server 4.5.1, no time stamp checking
of any kind will occur when trusted cache is used. In other
words, the disk I/O is zero. That means if you have a template
in p-code cache, you could delete it off the disk and continue
to access the template.
Whether or not to use trusted cache depends on the
production process used by the application. If production
templates are changed on an ad hoc basis and there is no regimented
update process in place, trusted cache can serve stale
pages because it will not reload a fresh template. Trusted
cache relies on having a structured deployment in which
the server is recycled on a periodic basis within strict
guidelines.
Page Output Caching
As already explained, static content can be delivered far
quicker than dynamic content. It follows that if dynamic
content did not need to be regenerated on each request,
you'd have the benefit of dynamic content being served as
if it were static. Page output caching enables you to process
CFML pages this way, caching the generated output
of a page so that it may be served without reprocessing.
At either a page level or component level, the <CFCACHE>
tag performs a <CFHTTP> request under the covers,
and the resulting page is then saved off in a file.
The caching engine is GET variable aware, so URL parameter
differences are respected. Therefore, if the URL contains
a record ID, you can potentially have a cached HTML version
of each record ID. The caching engine is not POST variable
aware. Therefore, if a page varies by passed form field,
it should not be cached with CFCACHE.
To enable server-side caching in the <CFCACHE> tag,
the ACTION attribute must be set to CACHE (i.e., ACTION="CACHE"),
which is also the value used when ACTION is omitted.
For a very simple example, take a look at the following
code:
<!--- This example will produce as many cached files as there
are possible URL parameter permutations. --->
<CFCACHE ACTION="CACHE">
<HTML>
<HEAD>
<TITLE>CFCACHE Example</TITLE>
</HEAD>
<BODY>
<H1>CFCACHE Example</H1>
<H3>This is a test of some simple output</H3>
<CFPARAM NAME="URL.x" DEFAULT="no URL parm passed">
<CFOUTPUT>The value of URL.x = #URL.x#</CFOUTPUT>
</BODY>
</HTML>
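Cached pages do not have to live forever. The sketch below shows two further uses of the tag as we understand them: a TIMEOUT attribute that limits how old a cached copy may be, and ACTION="FLUSH" to discard cached copies. The four-hour limit is an arbitrary choice, and the two tags are assumed to sit in separate templates in the same directory.

```cfml
<!--- At the top of the page to be cached: serve the cached copy only
      if it is less than four hours old; otherwise regenerate the page
      and cache the new output. --->
<CFCACHE ACTION="CACHE" TIMEOUT="#DateAdd("h", -4, Now())#">

<!--- In a separate maintenance template in the same directory:
      discard the cached pages recorded for that directory so they
      are rebuilt on the next request. --->
<CFCACHE ACTION="FLUSH">
```

A flush template like this could be requested after a content update, so that visitors are not left waiting for the timeout to expire.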
For additional performance gains, you can cache generated
output to memory. Of course, you have to do more work yourself.
With in-memory page output caching, ColdFusion Server checks
to see if there is a generated version of a requested page
in memory with the URL as the variable name. If the requested
page exists, ColdFusion will return the variable value in
the form of static HTML to the client. If it doesn't exist
in the memory page output cache, that page will be regenerated
using <CFHTTP> and stored back to the cache as a variable
value.
These variables should be stored in the APPLICATION scope
to allow them to be cached for an extended period of time across
different requests. Please note that, unless you specify
TIMEOUT, the cache will be eternal.
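The following is a minimal sketch of that approach, not a definitive implementation. The application name, the source URL, and the PageCache structure are all hypothetical names introduced for illustration, and the four-hour APPLICATIONTIMEOUT is an arbitrary way of keeping the cache from living indefinitely.

```cfml
<!--- Application variables require a named application.
      "CacheDemo" is a placeholder name. --->
<CFAPPLICATION NAME="CacheDemo" APPLICATIONTIMEOUT="#CreateTimeSpan(0,4,0,0)#">

<!--- Hypothetical URL of the dynamic page whose output is cached. --->
<CFSET SourceURL = "http://127.0.0.1/myapp/report.cfm?id=1">

<!--- Look for a cached copy, keyed by URL, under a read-only lock. --->
<CFSET CachedPage = "">
<CFLOCK SCOPE="APPLICATION" TYPE="READONLY" TIMEOUT="10">
    <CFIF IsDefined("Application.PageCache")>
        <CFIF StructKeyExists(Application.PageCache, SourceURL)>
            <CFSET CachedPage = StructFind(Application.PageCache, SourceURL)>
        </CFIF>
    </CFIF>
</CFLOCK>

<CFIF CachedPage IS "">
    <!--- Not cached yet: generate the page once and keep the HTML. --->
    <CFHTTP URL="#SourceURL#" METHOD="GET"></CFHTTP>
    <CFSET CachedPage = CFHTTP.FileContent>

    <!--- Store the result under an exclusive lock. --->
    <CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFIF NOT IsDefined("Application.PageCache")>
            <CFSET Application.PageCache = StructNew()>
        </CFIF>
        <CFSET Temp = StructInsert(Application.PageCache, SourceURL, CachedPage, "Yes")>
    </CFLOCK>
</CFIF>

<!--- Serve the stored HTML as is. --->
<CFOUTPUT>#CachedPage#</CFOUTPUT>
```

Keeping every page in one structure, rather than in separate variables, makes it easy to discard the whole cache by replacing the structure with a new one.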
Database Query Caching
Database queries can slow an otherwise fast Web site to
a crawl. Many developers are not aware of ColdFusion Server's
features for caching database operations. The results of
database caching can be significant. For example, if a
one-second query runs sixty times a minute and returns identical
information every time, caching it so that it runs only once
a minute saves the server 59 seconds of processing time per minute.
Query-based result caching means that you will not have
the most recent information 100 percent of the time. Therefore,
cached queries should return information that does not change
very often, such as the names of the fifty U.S. states, or
should be queries that feed a "next n" style interface. For maximum
flexibility, ColdFusion can save query sets to SESSION and
APPLICATION scopes.
A maximum of 100 cached queries server-wide may exist at
any time (see Figure 3), and new queries are not cached
until the oldest query is dumped. Query-based result caching
is used on a first-come, first-served basis, and it can
be disabled.
CACHEDWITHIN and CACHEDAFTER
Use the <CFQUERY> tag to configure your query-based
result caching settings. Rather than working with the IF
statements in variable-based result caching, <CFQUERY>
allows you to add parameters quickly and easily.
However, if you need explicit management of caching activities
and no maximum level of cache size, variable-based result
caching may be for you. For instructions on how to use variable-based
result caching, please refer to the ColdFusion
Server documentation.
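For comparison, here is a minimal sketch of what variable-based result caching with IF statements can look like. It is an illustration under stated assumptions rather than the documented procedure: the application name, the States table, and the StateName column are hypothetical.

```cfml
<!--- Application variables require a named application.
      "CacheDemo" is a placeholder name. --->
<CFAPPLICATION NAME="CacheDemo">

<!--- Find out whether the query result has already been stored. --->
<CFSET NeedQuery = "Yes">
<CFLOCK SCOPE="APPLICATION" TYPE="READONLY" TIMEOUT="10">
    <CFIF IsDefined("Application.GetStates")>
        <CFSET GetStates = Application.GetStates>
        <CFSET NeedQuery = "No">
    </CFIF>
</CFLOCK>

<CFIF NeedQuery>
    <!--- Run the query once. Table and column names are placeholders. --->
    <CFQUERY NAME="GetStates" DATASOURCE="cfsnippets">
        SELECT StateName
        FROM States
        ORDER BY StateName
    </CFQUERY>

    <!--- Keep the result in the APPLICATION scope for later requests. --->
    <CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFSET Application.GetStates = GetStates>
    </CFLOCK>
</CFIF>

<CFOUTPUT QUERY="GetStates">#StateName#<BR></CFOUTPUT>
```

Because the cached result is an ordinary variable, your own code decides when to overwrite it, which is the explicit control that query-based caching does not offer.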
CACHEDWITHIN specifies a relative time span
in which query caching will be used. Here is a <CFQUERY>
in action with CACHEDWITHIN enabled:
<!--- This example shows the use of CFQUERY with CACHEDWITHIN enabled. --->
<HTML>
<HEAD>
<TITLE>CFQUERY Example</TITLE>
</HEAD>
<BODY>
<H3>CFQUERY Example</H3>
<!--- define startrow and maxrows to facilitate
'next N' style browsing --->
<CFPARAM NAME="MaxRows" DEFAULT="10">
<CFPARAM NAME="StartRow" DEFAULT="1">
<!--- query database for information --->
<CFQUERY NAME="GetParks" DATASOURCE="cfsnippets" CACHEDWITHIN="#CreateTimeSpan(0,0,10,0)#">
SELECT PARKNAME, REGION, STATE
FROM Parks
ORDER by ParkName, State
</CFQUERY>
<!--- build HTML table to display query --->
<TABLE cellpadding=1 cellspacing=1>
<TR>
<TD colspan=2 bgcolor=f0f0f0>
<B><I>Park Name</I></B>
</TD>
<TD bgcolor=f0f0f0>
<B><I>Region</I></B>
</TD>
<TD bgcolor=f0f0f0>
<B><I>State</I></B>
</TD>
</TR>
<!--- Output the query and define the startrow and maxrows
parameters. Use the query variable CurrentRow to
keep track of the row you are displaying. --->
<CFOUTPUT QUERY="GetParks" StartRow="#StartRow#" MAXROWS="#MaxRows#">
<TR>
<TD valign=top bgcolor=ffffed>
<B>#GetParks.CurrentRow#</B>
</TD>
<TD valign=top>
<FONT SIZE="-1">#ParkName#</FONT>
</TD>
<TD valign=top>
<FONT SIZE="-1">#Region#</FONT>
</TD>
<TD valign=top>
<FONT SIZE="-1">#State#</FONT>
</TD>
</TR>
</CFOUTPUT>
<!--- If more records remain beyond the rows shown on this
page, offer a link to the same page, with the StartRow
value incremented by MaxRows (in the case of this
example, incremented by 10) --->
<TR>
<TD colspan=4>
<CFIF (StartRow + MaxRows) LTE GetParks.RecordCount>
<a href="cfquery.cfm?startrow=<CFOUTPUT>#Evaluate(StartRow +
MaxRows)#</CFOUTPUT>">See next <CFOUTPUT>#MaxRows#</CFOUTPUT>
rows</A>
</CFIF>
</TD>
</TR>
</TABLE>
</BODY>
</HTML>
In this example, query results will be cached for ten minutes.
Cached query data will be used if the original query ran
within the time span you define. The CreateTimeSpan
function takes days, hours, minutes, and seconds, and is used
to define a period of time from the present backwards.
CACHEDAFTER allows you to specify an absolute date and
time for query caching. The code looks similar to the
CACHEDWITHIN example.
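As a sketch, the query from the CACHEDWITHIN example could be written with CACHEDAFTER like this. The date is an arbitrary example, and a four-digit year is used so that no two-digit year interpretation applies.

```cfml
<!--- Reuse cached results as long as the original query ran after
      June 1, 2000 (an arbitrary example date). --->
<CFQUERY NAME="GetParks" DATASOURCE="cfsnippets" CACHEDAFTER="#CreateDate(2000, 6, 1)#">
    SELECT PARKNAME, REGION, STATE
    FROM Parks
    ORDER BY ParkName, State
</CFQUERY>
```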
With CACHEDAFTER included in the <CFQUERY> statement,
ColdFusion uses cached query data if the date of the original
query is after the date specified. (Years from zero to 29
are interpreted as 21st century values. Years 30 to 99 are
interpreted as 20th century values.)
To use cached data in either CACHEDAFTER or CACHEDWITHIN,
the current query must use the same SQL statement, data
source, query name, user name, password, and DBTYPE. In
addition, for native drivers it must have the same DBSERVER
and DBNAME (Sybase only).
Nothing is perfect, including query-based result caching.
First, flushing the cache is difficult to do. To refresh
the cache in <CFQUERY>, you can specify a new tag
attribute or make a change in the ColdFusion Administrator.
Second, if the query data blocks are large enough, available
memory can quickly evaporate. Finally, the cache cannot
be explicitly managed in query-based result caching.
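One commonly used way to refresh a single cached query is to rerun it with a zero-length CACHEDWITHIN time span. The sketch below assumes that ColdFusion treats any cached copy as too old in that case, reruns the query, and caches the fresh result; verify this behavior on your server version before relying on it.

```cfml
<!--- A zero-length time span means no cached copy is recent enough,
      so the query runs against the database again. The SQL, data
      source, and query name must match the original query for the
      same cache entry to be affected. --->
<CFQUERY NAME="GetParks" DATASOURCE="cfsnippets" CACHEDWITHIN="#CreateTimeSpan(0,0,0,0)#">
SELECT PARKNAME, REGION, STATE
FROM Parks
ORDER by ParkName, State
</CFQUERY>
```

A template containing this query could be requested after the Parks data changes, so that visitors do not wait out the ten-minute window.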
Client-Side Caching
Client-side caching (browser-side caching) is almost the
ideal form of caching. If the page has not changed on the
server, the browser will use its cached version of the
page without having anything sent from the server. This
saves server resources and circumvents the potential
bottleneck of Internet bandwidth.
Client-side caching involves embedding time stamp information
in the URL requests to a ColdFusion server. The server then
looks at the time stamp and compares it to the server-cached
version. If the time stamp is identical, nothing is returned
to the client. If the time stamp differs, the server will
send the page to the client.
To enable client-side caching with <CFCACHE>, simply
set the ACTION attribute to CLIENTCACHE (i.e., ACTION="CLIENTCACHE").
Because the browser stores the pages in its own cache and
does not require anything from the server, ColdFusion Server
resources are not consumed by client-side caching. To optimize
performance, use both server-side caching and client-side
caching; if the browser cache times out, the server can
retrieve the cached data from its own cache. To combine the two,
set the ACTION attribute to OPTIMAL instead of CLIENTCACHE
or CACHE (i.e., ACTION="OPTIMAL").
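A minimal sketch of a page that combines both forms of caching follows. The page content is illustrative only.

```cfml
<!--- Combine server-side and browser caching for this page. --->
<CFCACHE ACTION="OPTIMAL">
<HTML>
<HEAD>
<TITLE>Optimal Caching Example</TITLE>
</HEAD>
<BODY>
<!--- The time shown is the time the page was generated, so a repeated
      value on reload indicates a cached copy was served. --->
<CFOUTPUT>Page generated at #TimeFormat(Now(), "HH:mm:ss")#</CFOUTPUT>
</BODY>
</HTML>
```

Printing the generation time is a simple way to check during testing whether a request was served from cache.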