By: Daryl Banttari
Senior Consultant
Allaire Consulting
Scalability means never having to say, "We can't handle
any more traffic."
Many projects start small: one ColdFusion Web server,
one database server. Then the development team encounters
one of two motivating factors--the need for more uptime
(availability) or the need for more capacity (scalability).
In this article, we discuss the latter, enhancing scalability
through the use of query caching.
Improving scalability is about finding and removing bottlenecks
that restrict the growth of a system. The most common bottlenecks
for Web systems include:
- Insufficient network bandwidth.
- Insufficient CPU resources.
- Inability to get data to/from the database.
Solving each problem seems simple:
- Call the ISP.
- Add servers.
- Add more database server(s).
Unfortunately, the cost of adding database server(s),
both in terms of money and administrative overhead, is very
high. So, it seems that maximizing your existing investment
in database hardware and software is warranted. Enter query
caching.
Query caching is designed to accomplish two goals:
- Decrease the time between a page request and the page
view.
- Reduce the amount of work generated for the database
server for each page view.
Implementing query caching is very simple. For example,
examine the query below, which might be used to retrieve
a list of states for a <SELECT> list.
|
| Before: |
<CFQUERY Name="qStates" DataSource="#Request.DSN#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>
|
| After: |
<CFQUERY Name="qStates" DataSource="#Request.DSN#"
CachedWithin="#CreateTimeSpan(0,1,0,0)#">
SELECT StateCode
FROM States
ORDER BY StateCode
</CFQUERY>
|
|
What Just Happened?
You just empowered ColdFusion to hold on to the results of
that query for up to an hour. ColdFusion will now stop repeatedly
asking the database for the results of this query. In fact,
ColdFusion will reuse the results of that query for up to
an hour before asking the database for that result set again.
The database is now relieved of the duty of fetching these
rows and sorting them (which usually involves creating and
dropping a temporary table) for each request to that page.
Before you get too excited, there are a few details to
consider:
- There is a limit to the number of queries you can cache.
This number is configured in the ColdFusion Administrator
under "Caching." In ColdFusion 4.01, you cannot set this
number higher than 100. This limitation was removed in
ColdFusion 4.5, but that is not an invitation to set the
value to 30,000. Caching too many result sets will cause
memory starvation and heavy virtual memory paging, negating
the benefits.
- If you have a dynamic query, such as "SELECT * FROM
Catalog WHERE CatalogNumber = #val(FORM.CatalogNumber)#",
each permutation of that query counts as one cached query.
Therefore, query caching should only be used for commonly
accessed result sets. The CFML Language Reference defines
a distinct result set by stating "…the current query must
use the same SQL statement, data source, query name, user
name, password, and DBTYPE. Additionally, for native drivers
it must have the same DBSERVER and DBNAME (Sybase only)."
- There is no easy way to invalidate a result set, if
you detect that a result set should be invalidated. Therefore,
the time span used for the result set cache should be
chosen carefully. However, if you have a result set that
is accessed four times per second, setting a timeout as
low as a minute reduces the load on the database (for
that query) by a factor of 240.
- You cannot use query caching for parameterized queries
(queries using <CFQUERYPARAM>.) Parameterized queries
should be used for common non-cached queries, since they
allow the query plan to be reused on some database systems
(such as Oracle), and they are virtually immune to malicious
query editing as documented in Allaire Security Bulletin
ASB99-04 (http://www.allaire.com/handlers/index.cfm?ID=8728&Method=Full)
Keep in mind that queries are also global to the server, so
if you use the same query with the same name in multiple pages,
that cached result set is shared between the pages.
Let's look at the "Tack2 Intranet" case study from
within the example applications optionally installed with
ColdFusion. The home page was altered to allow access from
machines other than localhost. Then a test was performed
with one computer simulating 20 users pressing "refresh"
100 times each in rapid succession against a Windows 2000-based
ColdFusion server with a Windows NT 4.0 SQL Server database
server. Each server is a Pentium 3, 450 MHz PC with at least
128MB of RAM. The queries are pretty light, but there are
four on that page. The following is a graph of CPU utilization
of both servers during the initial test:
Figure 1: CPU Utilization during initial test.
Now, the same test, after implementing query caching:
Figure 2: CPU utilization while using query caching.
Notice how the utilization of the SQL server dropped to
idle. Now let's look at the results of the change:
Figure 3: Comparison between tests, the second using query
caching.
Even on the heavily CPU-bound application server, we see
a 16% improvement in the ability of the ColdFusion server
to produce pages with a 14% improvement in page response
time. An even more compelling story is told in the standard
deviation--the "usual variation" in response time. While
the non-cached server produced pages where the response
time "usually" varied almost as much as the page time, the
cached version was very consistent in its response time.
For pages with more difficult queries, the difference becomes
particularly noticeable.
The real story, however, lies in the difference between
the CPU loads on the SQL server. With the average CPU utilization
during the test dropping from 23.4% to 4.2%, the site's
ability to scale by adding Web servers (for this particular
page, anyway) has increased dramatically.
Conclusion
The repetitive nature of Web pages often causes database servers
to work very hard at producing the same results sets over
and over again. However, by using the Query Caching capability
included with ColdFusion, a significant amount of work can
easily be moved from the database server to the ColdFusion
servers. This allows for a much higher ratio of ColdFusion
servers to database servers, enhances the performance and
scalability of your Web system, and maximizes your investment
in your database servers.