Locking in ColdFusion

by Ben Forta
Technical Evangelist
Allaire Corp.

This article originally appeared in the August issue of ColdFusion Developer's Journal, published by SYS-CON Media.

We all know that locking is important. Most of us even understand why locks are needed. But exactly where to use a lock, which lock type to use and what code to put within the lock remains confusing at best.

Part of the confusion stems from changes Allaire made in ColdFusion 4.5 that in turn changed the recommendations and suggested practices. Indeed, even my own recommendations changed with that release (as many of you CFUG members are quick to point out). And so, at the request of several of you, and because I've helped contribute to the confusion, I'll cover these topics in this column and try to set the record straight.

Variables

Locks are used primarily with variables, so let's start there. Variables are virtual containers in memory that are used to store data. Look at the following code:

<CFSET first_name="Ben">

The <CFSET> tag creates (or overwrites) a variable, in this case a variable named first_name. first_name can be thought of as a container located somewhere in the memory of the computer, a container that now has the name "Ben" in it. To access the data in the container you simply refer to the container by name, like this:

 
<CFOUTPUT>#first_name#</CFOUTPUT>

In this example I used a simple variable. I could just as easily have placed an array or list in that container, or even data as complicated as an array of structures containing arrays of structures, and so on.

Regardless of the type of data, one thing is consistent: you refer to the container (the variable) by a unique name, and that name provides access to the contents of the container at the moment it's requested.
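As a minimal sketch of a more complex container, the following builds a structure that holds a simple value and an array. The variable names author and topics and their contents are hypothetical, chosen only for illustration.

```cfml
<!--- Hypothetical example: build an array first --->
<CFSET topics = ArrayNew(1)>
<CFSET topics[1] = "Locking">
<CFSET topics[2] = "Variables">

<!--- Then place a simple value and the array into a structure --->
<CFSET author = StructNew()>
<CFSET author.first_name = "Ben">
<CFSET author.topics = topics>

<!--- The one name, author, provides access to everything in the container --->
<CFOUTPUT>#author.first_name# has #ArrayLen(author.topics)# topics</CFOUTPUT>
```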

Understanding Threads

Before I go further, one other topic must be mentioned briefly – threads. ColdFusion is a high-performance application server; it's designed to process many requests at once. It does this by running lots of concurrent tasks within the application server, and each one handles a single request at any given time. These tasks are known as "threads," and ColdFusion is a multithreaded application – in other words, ColdFusion is designed to perform multiple tasks concurrently. (There's actually much more to threads than that, but this explanation is adequate for the issue at hand.)

Simultaneous Variable Access

ColdFusion supports several different data scopes. "Scope" defines the life span (persistence) and visibility of data. Let's take a quick look at five scopes:

  1. VARIABLES: Used for data that needs no special persistence. Data in the VARIABLES scope persists for the duration of the processing of a request and is automatically destroyed once the request has completed. The data is visible only within the thread processing that request.
  2. SESSION: Used for session variables, for data that relates to many requests that together make up a session. Data in the SESSION scope persists until the session times out. The data is visible to any thread that processes requests for that session, and it's entirely possible that multiple threads will process requests for the same SESSION (even though a SESSION is mapped to a single client).
  3. CLIENT: Also used for session variables, but CLIENT is a little different from SESSION. Unlike SESSION variables, CLIENT variables aren't stored in memory. Rather, they're stored in a database (a database of your choice or the Windows registry, which is also a database of sorts). Data in the CLIENT scope persists until the client session times out. The data is visible to any thread that processes requests for that client session, and it's entirely possible that multiple threads will process requests for a single CLIENT session (even though a CLIENT session is mapped to a single client).
  4. APPLICATION: Used for variables that are shared across complete applications. Data in the APPLICATION scope is visible to all threads processing requests for that application.
  5. SERVER: Used for variables that are shared across all applications running on the ColdFusion server. Data in the SERVER scope is visible to every thread on the server.

The variable first_name, created earlier, was created in the VARIABLES scope. As explained above, this scope is processed by a single thread, and as soon as that thread has completed processing the request, the variable is destroyed. As such, there is absolutely no way more than one request could access the data in the VARIABLES.first_name container at the same time.

But other scopes behave differently. The following code creates a variable in the SESSION scope:

<CFSET SESSION.first_name="Ben">

As explained above, SESSION variables can indeed be processed by multiple threads at once. If you use frames, if the user hits the refresh button, if the underlying network makes retries - there are lots of conditions that could cause the same SESSION to be processed by more than one thread at any given time.

This is where things get dangerous. Let's go back to our container analogy. If you were to put data into a container at the exact moment someone else was doing so, what would happen? Both writes couldn't occur at the same time, so something would get lost – or worse, the container itself could become corrupted. If the <CFSET> statement above was executed at the exact same time as another <CFSET> statement that was updating the same variable, you'd likely corrupt server memory. If you're lucky, you'll just throw an error, but you could also negatively impact server operation as a whole.

And it's not just SESSION variables that are affected. APPLICATION and SERVER scope variables are even more susceptible to this corruption as they're always shared. (CLIENT variables, however, aren't susceptible as they're stored in a database; the database handles concurrency issues itself.)

Using Locks

How do you get around this problem? The answer is to use a lock. A lock does just that – it locks a block of code (a block containing a <CFSET> statement, for example). Going back to our container analogy, a lock acts as a guard monitoring access to the container's contents. The guard's job is to line up all access requests in the order they're received, granting admission one request at a time, and only after the previous access request has completed.

In other words, locks can arbitrate code execution across multiple threads, pausing execution as needed. And yes, this could slow down your application, but considering the alternative it's a small price to pay. Accessing a variable (writing or reading) while it's being written by another thread is asking for trouble.

The next code snippet sets the same SESSION variable once again, but this time locking it for the duration of the update:

<CFLOCK SCOPE="SESSION" TYPE="EXCLUSIVE" TIMEOUT="10"> 
<CFSET SESSION.first_name="Ben"> 
</CFLOCK>

Locking is implemented using the <CFLOCK> tag, and any code between the <CFLOCK> and </CFLOCK> tags will be locked. The SCOPE attribute specifies the scope to be locked. By specifying SESSION as the scope, we're instructing ColdFusion to lock only the code execution for a particular SESSION. We wouldn't want to lock all sessions as that would cause other operations to pause unnecessarily (they wouldn't be updating this SESSION anyway). The TYPE attribute specifies the lock type. EXCLUSIVE means that no other operations on the specified SCOPE will be allowed while the lock is being processed. The TIMEOUT attribute specifies the maximum time, in seconds, that ColdFusion should wait when trying to acquire a lock. If that timeout is reached before the lock can be acquired (perhaps because other threads have the same scope locked), the entire <CFLOCK> code block is skipped and, by default, an exception is thrown.
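If you'd rather handle a lock timeout yourself than let the error reach the user, one possible approach is to wrap the lock in <CFTRY>. This is only a sketch: it assumes the default THROWONTIMEOUT="Yes" behavior of <CFLOCK>, and the message text is a placeholder.

```cfml
<CFTRY>
  <CFLOCK SCOPE="SESSION" TYPE="EXCLUSIVE" TIMEOUT="10">
    <CFSET SESSION.first_name="Ben">
  </CFLOCK>

  <!--- TYPE="LOCK" catches failed locking operations, such as a timeout --->
  <CFCATCH TYPE="LOCK">
    <CFOUTPUT>The server is busy right now. Please try again.</CFOUTPUT>
  </CFCATCH>
</CFTRY>
```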

To lock the APPLICATION scope you'd simply specify SCOPE="APPLICATION". Doing so would lock the APPLICATION scope so any other attempt to access APPLICATION data would be paused. The same is true for SERVER.
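For illustration, here is a sketch of an APPLICATION scope lock protecting a hit counter. The variable name hit_count is hypothetical, and the code assumes the application has already been named with <CFAPPLICATION>.

```cfml
<!--- Hypothetical hit counter shared by every user of the application --->
<CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
  <!--- Create the counter the first time through --->
  <CFIF NOT IsDefined("APPLICATION.hit_count")>
    <CFSET APPLICATION.hit_count = 0>
  </CFIF>
  <!--- The read and the write happen inside the same lock --->
  <CFSET APPLICATION.hit_count = APPLICATION.hit_count + 1>
</CFLOCK>
```

The check, the read and the write all sit inside one lock, so no other locked request can change the counter between the read and the write.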

It's important to note that <CFLOCK> will do its job only if all appropriate code is enclosed within <CFLOCK> tags. If somewhere in the code you had a <CFSET> statement that didn't use a <CFLOCK>, it could access the variable even though it was locked. For locking to work, all accesses must be managed by <CFLOCK> statements.

SCOPE vs. NAME

In the example above I used SCOPE="SESSION" to lock my <CFSET> statement. Three scopes are supported: SESSION, APPLICATION and SERVER. Specifying a SCOPE of SESSION locks any other accesses for the same SESSION only. Specifying a SCOPE of APPLICATION locks any other accesses for the same APPLICATION (as named in the <CFAPPLICATION> tag, usually in APPLICATION.CFM). Specifying a SCOPE of SERVER locks any other SERVER scope accesses server-wide.

ColdFusion also supports locking by NAME. Using this method, you provide a name to identify the activity performed in the locked code, and only blocks of code locked with the same NAME will wait for one another. Exactly what operations are performed within the lock is entirely up to you. All <CFLOCK> does is ensure that no two blocks of code with the same NAME are executed at once. Using NAME gives you a greater level of control over lock granularity, but with that control comes additional risk. If you mistakenly use different names for two locks that access the same data, you won't be locking at all.
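A named lock might look something like the following sketch. The lock name featured_product_update and the variable it protects are hypothetical. The two blocks could live in different templates; what matters is that both use exactly the same NAME.

```cfml
<!--- Template A: updates the shared value under a named lock --->
<CFLOCK NAME="featured_product_update" TYPE="EXCLUSIVE" TIMEOUT="10">
  <CFSET APPLICATION.featured_product = "Widget">
</CFLOCK>

<!--- Template B: uses the SAME name, so it can never run at the same time as A --->
<CFLOCK NAME="featured_product_update" TYPE="EXCLUSIVE" TIMEOUT="10">
  <CFSET APPLICATION.featured_product = "Gadget">
</CFLOCK>
```

Had Template B used a different NAME, both blocks could have written to APPLICATION.featured_product at once, which is exactly the risk described above.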

The ability to lock code by scope was introduced in ColdFusion 4.5, and it's the preferred way to lock code that accesses potentially shared variables.

Read-Only Locks

Locking is really an issue only when variables are being written to. Going back to our container analogy, if multiple users looked into the container at the same time to see what was in it, no harm would be done. The same is true of read access to shared variables.

Some languages support the use of constants, special variables that are actually not variable at all as they can't be changed. ColdFusion has no concept of constants, so CF developers typically create variables in the APPLICATION scope (usually in the APPLICATION.CFM file surrounded by a check) and are careful never to overwrite them. If an application contained variables like this, variables that were never updated (after initial creation), you wouldn't really need to lock them at all. But you'd have to be 100% sure that an update wouldn't occur, realizing that there's nothing you can do programmatically to prevent that.
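The "surrounded by a check" idiom could be sketched as follows. The application name, variable names and values are all placeholders, and the initialized flag is just one possible way of writing the check.

```cfml
<!--- APPLICATION.CFM: create the "constants" once, then never write to them again --->
<CFAPPLICATION NAME="my_app">

<CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
  <!--- Only the first request finds the flag missing and creates the variables --->
  <CFIF NOT IsDefined("APPLICATION.initialized")>
    <CFSET APPLICATION.dsn = "my_datasource">
    <CFSET APPLICATION.admin_email = "admin@example.com">
    <CFSET APPLICATION.initialized = "Yes">
  </CFIF>
</CFLOCK>
```

The check itself reads the APPLICATION scope, which is why it sits inside the lock in this sketch.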

What to do? Locking all read accesses (every time you refer to #SESSION.first_name#, for example) with exclusive locks imposes a significant performance loss, and the protection may not be worth that cost. So you could opt not to lock variable reads.

But there's always the chance that someone will edit the code, and the variable that was never supposed to be updated ... well, what if some new code now updated it?

To address this problem, ColdFusion supports an additional lock type, READONLY. A READONLY lock doesn't actually lock anything unless an EXCLUSIVE lock is being processed at the same time. Only then will the READONLY lock pause until the EXCLUSIVE lock has completed. In other words, READONLY locks have no real performance hit associated with them. They are essentially ignored until an EXCLUSIVE lock is in effect.
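A read protected by a READONLY lock might look like this sketch, which assumes SESSION.first_name was set earlier (as in the EXCLUSIVE example above). The local variable name is arbitrary.

```cfml
<!--- Read the shared value under a READONLY lock and copy it to a local variable --->
<CFLOCK SCOPE="SESSION" TYPE="READONLY" TIMEOUT="10">
  <CFSET first_name = SESSION.first_name>
</CFLOCK>

<!--- The local copy lives in the VARIABLES scope, so no lock is needed to use it --->
<CFOUTPUT>Welcome back, #first_name#</CFOUTPUT>
```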

Other Operations Needing Locks

Variables aren't the only things that need locking. Any code with potential concurrency issues should be locked. Examples of this include:

  1. Accessing files with <CFFILE> if other processes could be accessing the same data file
  2. Calling code that isn't multithread safe (some CFX tags, for example)
  3. Connecting to remote sites via <CFHTTP> if those sites don't allow concurrent connections

In all of these examples, locking the code block can avoid concurrency problems. But instead of locking by SCOPE, these operations should be locked by NAME.
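As an illustration of the first case, a file write could be wrapped in a named lock along these lines. The lock name and file path are hypothetical. Keep in mind that the lock only coordinates ColdFusion code that uses the same NAME; it can't stop a process outside ColdFusion from touching the file.

```cfml
<!--- Every template that writes to this log file must use the same lock NAME --->
<CFLOCK NAME="write_visit_log" TYPE="EXCLUSIVE" TIMEOUT="30">
  <CFFILE ACTION="APPEND"
          FILE="c:\logs\visits.txt"
          OUTPUT="#Now()#: page requested">
</CFLOCK>
```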

Locking Tips

Locking is important and must be used. But locking slows your application, as already mentioned. Locks must be used carefully, and they must never be overused. Here are some pointers to keep in mind:

  • Don't lock code unnecessarily, but don't create and drop locks too frequently. It's a fine line to walk, but if you find yourself needing to lock two variables with some other lengthy processing in between them (that doesn't need locking), you might be better off using two locks so you don't keep locks active when they're not needed.
  • If you find yourself having to perform complex operations on locked variables (for example, complex string processing, looping or WDDX decoding), consider making local (VARIABLES) copies of the data and performing the processing on the local copy, then using a lock only when saving the local copy back to the shared variable.
  • APPLICATION locks should be used sparingly as they typically apply to lots of code. If you need to lock only part of the APPLICATION scope, consider using the NAME attribute instead of SCOPE. This will give you more granular control over exactly what gets locked and when, which in turn can prevent unnecessary locking (or unnecessarily long locking times). Of course, as explained earlier, using NAME comes with a risk. You must be sure that different names aren't used for code that accesses the same data. (The same thing applies to SERVER variables.)
  • ColdFusion 4.5 supports automatic locking modes in which ColdFusion locks variables for you. There's a bit of overhead in using auto-locking, plus you'll lose the potential performance gains that could be attained by more granular locking. As a rule, if performance is an issue (and when isn't it?), don't use these options. You'll be able to squeeze a bit more performance out of your application by doing it yourself.
  • ColdFusion 4.5 also supports a mode called Full checking. In this mode no locking occurs automatically. Instead, ColdFusion throws an error if a lock isn't used, helping you eliminate potentially missed locks. But this option can't be used with NAME locks - it'll always throw errors on those.
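The local copy technique from the second tip could be sketched like this. The variable SESSION.item_list (assumed to hold a comma-delimited list) and the cleanup being performed are hypothetical stand-ins for whatever lengthy processing your application needs.

```cfml
<!--- 1. Copy the shared value into a local variable under a brief READONLY lock --->
<CFLOCK SCOPE="SESSION" TYPE="READONLY" TIMEOUT="10">
  <CFSET local_items = SESSION.item_list>
</CFLOCK>

<!--- 2. Do the lengthy work on the local copy, with no lock held --->
<CFSET cleaned_items = "">
<CFLOOP LIST="#local_items#" INDEX="item">
  <CFSET cleaned_items = ListAppend(cleaned_items, UCase(Trim(item)))>
</CFLOOP>

<!--- 3. Save the result back under a brief EXCLUSIVE lock --->
<CFLOCK SCOPE="SESSION" TYPE="EXCLUSIVE" TIMEOUT="10">
  <CFSET SESSION.item_list = cleaned_items>
</CFLOCK>
```

Two caveats apply to this sketch. First, another request could change SESSION.item_list between steps 1 and 3, and that change would be overwritten, so the pattern suits data where this is acceptable. Second, a list is used here because strings are copied on assignment; structures are assigned by reference, so a structure would need a real copy (made with StructCopy, for instance) before being processed outside the lock.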

Conclusion

The common theme is that locks should be used, but they must be used carefully. And careful use requires a good understanding of what locks are, what they do and how your application should use them.

Locking is an important ColdFusion feature, and one that serious developers must use in their applications. Without locking there's a very real risk that data corruption will occur, and this can impact server stability.

Incorrect lock use, of course, can bring your application to its knees. That fine line must be walked. Yes, there are performance penalties involved, but every decision involves some kind of trade-off.

What you do is your choice. My advice? Lock it or lose it.

About the Author

Ben Forta is Allaire Corporation's product evangelist for the ColdFusion product line. Ben has over 15 years of experience in the computer industry and spent 6 years as part of the development team responsible for creating ONTime, one of the most successful calendar and group scheduling products, with over one million users worldwide.

Ben is the author of the popular The ColdFusion 4.0 Web Application Construction Kit (now in its third edition) and the more recent Advanced ColdFusion 4.0 Application Development (both published by Que). He co-authored the official Allaire ColdFusion training course, writes a regular column on ColdFusion development, and now spends a considerable amount of time lecturing and speaking on ColdFusion and Internet application development worldwide.

Born in London, England, and educated in London, New York, and Los Angeles, Ben now lives in Oak Park, Michigan, with his wife Marcy, and their four children. Ben welcomes your email at ben@forta.com and invites you to visit his own ColdFusion Web site.