Accessibility
Icon or Spacer
   

ColdFusion Timeouts and Unresponsive Requests

David Stanten
Senior Product Support Engineer
Macromedia, Inc.

Introduction

What is the difference between an unresponsive request, a request that has timed-out, a long-running request, and a cfquery timeout? A firm understanding how ColdFusion differentiates between these can really help when it comes to troubleshooting bottlenecks in your site, analyzing your ColdFusion log files, and improving overall site performance.

Request Timeouts vs. Unresponsive Requests

Let's start by taking a look at the "Timeout requests after ## seconds" setting in the main Settings page of the ColdFusion Administrator. The Timeout requests setting times-out threads doing processing which is internal to the ColdFusion server. For example, let's say you set this Timeout requests value to 30 seconds. If you had a long running (or infinite) loop in your code, ColdFusion will halt this request after 30 seconds.

So, if you do:

<CFLOOP CONDITION="10 GT 9">
</CFLOOP>

You should see a timeout error after 30 seconds:

"The time taken to process your request has exceeded the allowable limit of 30 seconds."

This error will also be written to the Application.log. Any sort of processing internal to the ColdFusion server, such as looping through a structure, calling a long sequence of custom tags, parsing through a result set, or various other tasks will timeout with this error if it exceeds the Timeout requests value you have set.

However, because ColdFusion can only time out processing occurring internal to the ColdFusion server, the Timeout requests setting does not time out actively running database queries or any other external calls (such as cfhttp, cfftp, cfobject, cfx, etc.)

Instead, if the Timeout requests setting is enabled, and if the "Restart at ## unresponsive requests" is also enabled, ColdFusion keeps track of the number of unresponsive threads involving external requests -- like database queries that haven't returned within the timeout period. ColdFusion does not kill these requests. It just marks them as unresponsive and continues to wait for them to return. You will see an entry in the Server.log each time a thread is marked as unresponsive. For example:

"Warning","TID=124","12/01/00","16:34:19","A request exceeded the timeout. The unresponsive thread count is up to 1."

A second long-running database query would produce this entry:

"Warning","TID=124","12/01/00","16:34:49","A request exceeded the timeout. The unresponsive thread count is up to 2."

If a long running query eventually completes, the unresponsive count will decrement back down one.

"Information","TID=480","12/01/00","16:35:14","A long-running request returned. The unresponsive thread count is down to 1."

At this point, ColdFusion will realize that the thread returned, and that it exceeded the Timeout. The Application.log will show:

"Error","TID=930","04/10/01","16:35:15","127.0.0.1","Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)","Request timed out

The time taken to process your request has exceeded the allowable limit of 30 seconds. Please notify the server administrator (using the mail link below) that this error has occurred."

Note that ColdFusion did not terminate the query in this case. It waited for the thread to return from waiting for the database, and only then timed out the ColdFusion thread. So, even though the error reports that the thread took longer than 30 seconds, it may in fact have taken a lot longer than 30 seconds for this to be reported. During this time, this thread has been effectively tied up and removed from the pool of possible threads available to process new incoming requests.

If this incrementing unresponsive thread count number eventually exceeds the number set in "Restart at ## unresponsive requests", the ColdFusion executive will restart the ColdFusion server. If this occurs, an entry will be made in the executive.log:

"Warning","TID=4","03/19/01","13:27:17"," Restarted ColdFusion Application Server."

The server.log may (but does not always) show:

"Fatal","TID=559","04/10/01","15:58:34","Unresponsive thread threshold reached. Restarting service." "Information","TID=918","04/10/01","15:58:36","The ColdFusion Application Server started."

Any requests made at this time may cause entries to be made in the webserver.log, like this:

"Error","03/19/01","13:27:16","An error occurred while attempting to establish a connection to the server.<P> The most likely cause of this problem is that the server is not currently running. Verify that the server is running and restart it if necessary. <P>Unix error number 146 occurred: Connection refused."

Or the last part of the error might say:

<P>"Unix error number 2 occurred: No such file or directory."

Or, on NT:

"Error","TID=622","04/10/01","15:58:35","Request canceled or ignored by server. Server busy or unable to fulfill request. The server is unable to fulfill your request due to extremely high traffic or an unexpected internal error. Please attempt your request again (if you are repeatedly unsuccessful you should notify the site administrator). (Location Code: 26)."

The Location Code might also be a different number, like 25.

It is recommended that "Restart at ## unresponsive requests", be set somewhat lower than "Limit simultaneous requests." The idea is that, if almost all of the active ColdFusion threads are waiting for an external call to return, then performance is likely to be very poor at that time. These threads will appear to be 'hung', when in fact, they may just be waiting for a query to return. If only one or two ColdFusion threads are available to process incoming requests, then it would probably be best to restart the ColdFusion server and thereby free up all the threads still waiting for the external request to return.

For example, if network or other problems caused the connections to the database to be lost, ColdFusion might otherwise wait indefinitely for these threads to return. If the number of threads waiting for external requests to return reached the number of "Simultaneous Requests," and you did not enable "Restart at ## unresponsive requests," your site would appear to be down, even though nothing would actually be wrong with the server itself.

Example

If you want to see an example of these settings in action, set Timeout requests to 30 seconds, and enable "Restart at 4 unresponsive requests." Then run the following query against a SQL Server database:


<CFQUERY Name="WaitFor3"
    	Datasource="mySQLServerDb"
    	Dbtype="ODBC">
	SET NOCOUNT ON
	WAITFOR DELAY '000:03:00'
	select getdate() as now
</CFQUERY> 

This query just waits for three minutes and then returns the current date. It will be marked as unresponsive in the Server.log after 30 seconds, but it will not time out until it returns after three minutes. If you request the same template from several different browsers at the same time, you should be able to reach the unresponsive threshold and force a restart of the ColdFusion server.

If you run this test several different times and closely examine the log files, you may also notice something else. ColdFusion never marks more than one thread as unresponsive in any 30-second period. For those of you that really like to know the details, here's what is going on.

  • When "Restart at ## unresponsive requests" is enabled, a separate thread of execution is started in addition to the threads used to process page requests. This thread does not execute any page requests, but sleeps for a number of seconds and then checks all currently processing pages to see if they have exceeded the "Timeout requests" setting.
  • The time that this thread sleeps (the "scan" time) is the same as the number of seconds specified in the "Timeout requests" setting. If this is less than 30 seconds, then a hard-coded scan-time of 30 seconds is used instead. In no case will a scan be made more frequently than every 30 seconds.
  • This mechanism will never count more than one unresponsive page per scan. That means that there is a minimum of 30 seconds (or the "Timeout requests" value if it is greater than 30) between the discovery of unresponsive threads by this mechanism.
  • If the maximum number of unresponsive threads specified for this mechanism is greater than the number of simultaneous threads specified in Administrator, then the number of simultaneous threads is used instead. It is impossible to have more unresponsive threads than there are simultaneous threads.

Logging Slow Pages

Now let's take a look at the "Log slow pages -- Log pages taking longer than ## seconds" setting in the Logging -> Settings section of the CF Administrator. This setting is one of the main tools ColdFusion provides to help identify bottlenecks in your site. It will work for any template request, whether it be a database query or a CF-processing-intensive template. However, it does not actually record the fact that the page took a long time until it completes. So, if you set this at five seconds, and the page takes seven seconds, the Server.log will show:

"Warning","TID=754","10/18/00","19:00:24","Template: C:\Inetpub\wwwroot \loop.cfm, Ran: 7 seconds."

How does this Log slow pages setting work if you create a loop which runs for longer than your timeout setting? Let's say your Timeout requests is set to 30 seconds, "Log slow pages" is set to 20 seconds and you have a loop that takes 40 seconds to complete. The request will timeout at 30 seconds, and the Application.log will show:

The time taken to process your request has exceeded the allowable limit of 30 seconds.

And the Server.log will show:

"Warning","TID=754","12/18/00","13:04:21","Template: C:\Inetpub\wwwroot \loop.cfm, Ran: 30 seconds."

This is because after it times out, it effectively has returned.

CFQUERY Timeout

In addition to the "Timeout requests after ## seconds" setting, some drivers support a timeout value for individual queries. This value is passed to the driver through the Timeout attribute of cfquery. For example, this query:


<CFQUERY NAME="WaitFor5Seconds"
         DATASOURCE="MySQLServerDb"
         DBTYPE="ODBC"
         TIMEOUT="2">
         SET NOCOUNT ON
         WAITFOR DELAY '000:00:05'
         select getdate() as now
</CFQUERY>

Should produce this error in the Application.log after 2 seconds:

"Error","TID=930","04/10/01","17:06:56","127.0.0.1","Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)","ODBC Error Code = S1T00" (Timeout expired)<P><P> [Microsoft][ODBC SQL Server Driver] Timeout expired.<P>

(Note that the documentation for cfquery incorrectly states that this timeout value is set in milliseconds. The correct units are seconds.)

This timeout setting is currently supported by the SQL Server ODBC driver, and the DB2 and Informix native drivers. It is not supported by the Oracle native driver, Sybase native driver, the Oracle ODBC driver, and many other ODBC drivers. You can check the documentation for your ODBC driver to determine if is supported.

Another detail to be aware of is that the Request timeout setting in the CF Administrator can be overridden for any given template request by appending a RequestTimeout URL parameter to the URL string. For example:

http://localhost/loop.cfm?RequestTimeout=45

Conclusion

Timeouts in ColdFusion is the subject of much confusion with many users. Understanding ColdFusion timeouts will aid your analysis of the entries in your log files, and will help you improve the overall performance of your site.