Chapter 12 - Understanding Transactions and Locking
A good understanding of transactions and locking is essential for anybody who is going to write database applications for more than one user. Even single-user applications require some understanding of locking, though the impact of locking yourself is not nearly as drastic as that of locking an enterprise network of hundreds of users. SQL Server has a number of different styles of locking available to the programmer. This chapter will provide you with the information required to make an accurate assessment of what is needed for your application in terms of transaction control and locking. You can never be too cautious in a multiuser application. As a programmer you should always concentrate on attempting to minimize the amount of locking that can occur so that there is less chance of users interfering with each other.
Defining TransactionsA transaction is a logical unit of work that you want the SQL Server to perform for you. That unit of work may include one or many SQL statements, provided the unit of work is delineated appropriately to the server.Single-statement transactions can be executed in ISQL just by entering their text and typing go. Single-statement transactions are ideal where the results required are simple and self-contained. For example, the following statement will return a list of tables from the database currently being used. (The text for this statement can be found in 12_01.SQL on the CD-ROM.)
Optimistic vs. Pessimistic LockingWhen you write multiuser database applications, you can take one of two approaches to transaction control: optimistic or pessimistic locking. Optimistic locking assumes that you are going to do nothing in your application code to explicitly enforce locks on records while you work on them. Instead, you will rely on the database to manage this on its own while you concentrate on application logic. Pessimistic locking assumes that the application code will attempt to enforce some type of locking mechanism.To implement optimistic locking in your application without having it grind to a halt under excessive locks on the server, you must take care to observe some simple rules, as follows:
Most marketing literature has attempted to tell us, as application developers, that SQL Server is going to manage locking and that there is nothing to worry about. This is a very optimistic locking approach. Unfortunately, it is not very pragmatic because it assumes that there is nothing a programmer or user can do to explicitly cause locking. In fact, there are many situations that will cause a large amount of locking to occur on a server, potentially disabling it for the enterprise that it is supporting.
Background Information on LockingI think it will be useful as background to first discuss some of the basics of locking as they pertain to (and are implemented by) SQL Server, so that some of the more detailed items discussed in the sections below are not without a base of understanding. Specifically, I want to focus on the following two key areas of locking:
Page Sizes and Granularity of DataSQL Server's internal basic unit of work is a 2K data page. What this means is that any activity that is executed on the server must do work on at least 2K of data. To further explain, a table has a number of pages of data associated with it (depending on the number and size of rows that it contains); SQL Server can only reference data in that table a page at a time. So, if an update hits a single record in a table and a lock is held for some period of time, it is more than likely that more than one row is in fact being locked.How does this affect a database application? One of the most important considerations when writing a multiuser application is that there must be a way for multiple users to work independently of one another. For example, two users must be able to update customer records at the same time while answering phone calls from customers. The greater the capability to manipulate data in the same table without affecting other users by locks, the greater the concurrency of an application and the greater the chance of being able to support a lot of users. A highly accessed table (such as a table of unique values for the rest of the system) should be made as concurrent as possible by forcing as few as possible rows of data onto the same data pagethereby limiting the number of coincidental rows locked as the result of a user action. Additionally, users transactions should be kept to a minimum duration when hitting these tables. Two other types of locks can occur that lock data more greatly than a singe data page: table and extent. Table locks occur because a user issued a query to update a table without including a WHERE clause (thereby implicitly saying that "I want to update every row"), and when the number of data pages locked exceeds the Lock Escalation Threshold defined for the particular table or database. Extent locks occur when SQL Server needs to create a new database extent (eight pages of data) to respond to a user query. Unfortunately, there are no controls at our disposal to handle or deal with extent locks, so you simply should know that they occur and what they mean. For more information, see the section entitled LE Thresholds later in this chapter.
Types of LocksSQL Server can place several types of locks on database pages and tables. The page locks that are possible are SHARED, EXCLUSIVE, and UPDATE. SHARED locks and EXCLUSIVE locks are reasonably self explanatory in that they either allow another process to acquire a lock on the same page or they don't.Multiple processes may have SHARED locks on the same data page, and they are usually acquired when data is being read. Importantly though, no other process may take an EXCLUSIVE lock (to perform DML) until all SHARED locks have been released. EXCLUSIVE locks of table pages are given to a process that is updating a record on a page, inserting a new record at the end of a page, or when a process deletes a record from a page. EXCLUSIVE locks disallow any other process from accessing the page. The UPDATE lock type is a middling lock. It sits in between SHARED and EXCLUSIVE in that it will allow a process to acquire a SHARE on the page until an actual update has occurred on it. UPDATE locks are acquired when a CURSOR is being built in the server. UPDATE locks are automatically promoted to EXCLUSIVE when an update occurs on one of the pages associated with the cursor. At the table level, SQL Server has SHARED and EXCLUSIVE locks that work in the same fashion as the page level. SQL Server also has INTENT locks. INTENT locks indicate that a table has a number of pages on it that SQL Server is intending to lock at the page level in response to a user process. SQL Server 6.5 has added insert row-level locking. This new lock allows multiple users to insert records into the same page. It was added because of a large amount of contention with inserts at the end of tables. For additional information, see Chapter 15, "Creating and Using Cursors"
Defining Isolation LevelsThere are a number of ways in SQL Server that you can cause locks to be held or released while querying the database. One of those ways is by setting a transaction's isolation level. As its name implies, an isolation level specifies to the database how "isolated" to keep the data that is currently being worked on by the other users and requesters of data on the server. SQL Server has three different types of isolation levels and they are documented in the following three sections.
To achieve the same effects as isolation levels for a single SELECT statement, refer to the section below, Holding a Lock Explicitly for more information.
If it is important that the query's results be completely unaffected by other users during the life of a particular transaction, so make sure that you use the Repeatable Read isolation level. To set your isolation level to Read Committed, perform the following SQL:
To set your isolation level to Read Uncommitted, perform the following SQL transaction:
To set your isolation level to Repeatable Read perform the following SQL transaction:
Creating and Working with TransactionsIn the opening section of this chapter, you saw how to delineate a transaction using BEGIN, COMMIT and ROLLBACK. SQL Server's keywords or Transact-SQL statements that are required for transaction control are described below for clear definition.
It is possible to issue operations without a BEGIN TRAN statement, and they will affect a database. However, you will not be able to conditionally undo the work that you sent to the server if it is not preceded by a BEGIN TRAN so that SQL Server knows to what state the database must be returned.
DDL and Database StatementsDDL (Data Definition Language) and database modification statements are now allowed inside a transaction. The following statements can appear in transactions:
Using Named Transactions and SavePointsOne thing that becomes obvious during the writing of large stored procedures and applications with large bodies of SQL code is that no matter how it is looked at, the code is pretty unreadable. It is text based and there is a great reliance on programmers all working with the same style of format and layout. When transactional programming is involved, it becomes even more important for people to use good indenting to clearly mark blocks of code.
However, even the most careful programmer will find that it becomes a bit of a nightmare to remember how many indents to ROLLBACK out of in the event of an error condition or some programmatic constraint. Named Transactions and SavePoints are used for just this purpose: they provide a way of rolling back work to a given named or saved portion of the code that has been executing (even if it is at a higher nesting level).
Named TransactionsNamed transactions provide a convenient way of attaching an identifier to a whole body of work. Use named transactions to make it easier to undo large portions of code. To create a named transaction, add the name of the transaction to the BEGIN TRAN statement, as follows. (The text for this statement can be found in 12_05.SQL on the CD-ROM.)
Using SavePointsSavePoints are really just another way of doing a named transaction. They provide a method of marking a place in the code to which a ROLLBACK may be used to undo work. To create a SavePoint, issue the SQL command
The text for this statement can be found in 12_06.SQL on the CD-ROM.
The most likely scenario is that you have issued more BEGIN TRANs than you have corresponding COMMIT TRANs or ROLLBACK TRANs. Remember that transactions must be enclosed in pairs of BEGIN and COMMIT/ROLLBACK. If you fail to do so, SQL Server will think that you want to keep the transaction open for a longer period.
To help identify your code problems, do a walk through of your application, and monitor error conditions carefully. Chances are that an error condition is occurring and some code is returning control before closing an open transaction. Also, check the value of the system variable @@trancount to tell you how deeply nested in transactions you really are.
Serialized Columns without IDENTITYSQL Server 6.0 introduced a new serial datatype, called the IDENTITY, in which SQL Server will automatically assign the next sequential value to a column in a table. IDENTITYs are very valuable in applications that have high transaction volume and want to identify each record uniquely.For some applications that must support multiple database back ends and for those applications that require SQL Server 4.x compatibility, it is possible to implement the same kind of feature as an IDENTITY column by performing the following steps:
Understanding LocksIn addition to the background information provided previously in the section entitled Types of Locks it is important to know how to handle locking when it occurs in your database.
Displaying Lock InformationThere are two ways to review information about locks held in the database: using the SQL Enterprise Manager or through the execution of the system stored procedure, sp_lock. SQL Enterprise Manager, under the covers, is calling sp_lock to get the information to display.
Using SQL Enterprise ManagerTo view information that is being locked using the SQL Enterprise Manager, perform the following steps:
To get more information about the individual statement that is causing locking (see fig. 12.4), you can either double-click the process that is in the Object Locks page or click the More Info toolbar button (see reference in fig. 12.3). Fig. 12.4 - The Process Details dialog box shows additional information about the SQL statement that is causing locks.
Using sp_lockThe sp_lock system stored procedure will return a list of processes and the types of locks that they are holding on the system. To get the locks held by a particular process, add the process ID to the command (sp_lock spid). Here is some example code to show you the output of sp_lock:
Killing a Locking ProcessBefore killing a process that is holding locks on the database, verify with the sp_who and sp_lock system procedures that the spid (server process id) that you are targeting to kill is in fact the user holding the locks.When reviewing the output from sp_who, look at the blk spid column to identify a user that is blocked. Trace the tree of the blocks back the parent spid, and kill that user. To kill a user process you can either use SQL Enterprise Manager, or execute the Kill command.
Using SQL Enterprise ManagerUsing SQL Enterprise Manager to kill a process involves first finding the process that is causing locking, and the steps to do this are outlined previously in the section entitled "Using sp_lock."Having found a process that needs to be killed, you can press the Kill Process button on the toolbar of the Current Activity window (see reference in fig. 12.5). Fig. 12.5 - The Kill Process toolbar button enables you to halt an activity. A warning dialog box appears so you can change your mind and undo your action (see fig. 12.6). Fig. 12.6 - The warning dialog box enables you to confirm whether or not you really want to kill a process.
Using KILLHaving identified the user process (spid) that you want to kill, execute the following SQL to kill it:
Holding a Lock ExplicitlyIf you have application code that really needs to explicitly hold locks on particular sets of data, SQL Server provides you with extensions to the basic SELECT statement that perform this functionality. SQL Server enables you to add optimizer hints or keywords to your SELECT statements that tell it how to process the data that matches your results. There are several kinds of hints that you can place on a set of data affected by a SELECT statement: NOLOCK, HOLDLOCK, UPDLOCK, TABLOCK, PAGLOCK, TABLOCKX. Some of these options are discussed below.
NOLOCK is a very useful option for those people writing applications in which the data is statistically unaffected by a small sample of records having fluctuating values (i.e., you are more interested in trends of data than in the actual values themselves). Care should be taken, and it is important to clearly differentiate between data fetched with the NOLOCK keyword and data that is legitimately accurate according to the known condition of the database as a whole.
If you are reading data and it is no longer available, you will receive error messages 605, 606, 624, or 625. It is recommended that you process these errors in the same way that you process a deadlock condition. That is, inform the users that an error has occurred and ask them to retry their operationsadvanced applications may want to auto retry the first time to avoid confusing the users unnecessarily.
Lock Escalation OptionsSQL Server locks data on the page level. Any query that you execute on the server will hold locks on at least one full page. If you start updating or locking multiple pages on a table SQL Server starts consuming resources to manage your requests. At a certain point (based on a percentage of pages locked per table) it becomes more efficient for the database to lock the entire table (a table lock) than to keep managing the individual pages being locked by a given transaction.Fortunately, SQL Server enables you to configure the way in which it chooses to escalate locks from page level to table level. These are options that are set at the server level with the server stored procedure, sp_configure.
LE ThresholdsUsing sp_configure, it is possible to set three different types of Lock Escalation (LE) thresholds: LE threshold maximum, LE threshold minimum, and LE threshold percent.The threshold maximum is used by the server to determine when to escalate a set of page locks to a table lock. The default for the server is 200 pages. To change this value, follow these steps. (The text for this statement can be found in 12_08.SQL on the CD-ROM.)
The threshold percentage is used to enable you to generically set a level at which you want to escalate a set of page locks to a single table lock relative to the number of rows in the table. The default value of this configuration option is zero (0), meaning that the LE threshold maximum should be used to determine escalation.
From Here...In this chapter you learned about the fundamentals of locking and transactions and how they will affect your application. In addition, you learned about the internals of SQL Server and how it manages many users hitting the same table.Take a look at the following chapters to further develop your SQL Server and application programming knowledge:
Table of Contents
11 - Managing and Using Rules and Defaults
13 - Managing Stored Procedures and Using Flow Control Statements |