All of us have experienced the incredible increase in the speed and capabilities of computing technology. Gordon Moore - one of the founders of Intel - famously remarked that the density of transistors that can be placed on an integrated circuit doubles every two years. This increasing transistor density, together with a number of other related trends, drives a similar growth in computer performance and “Moore’s law” has been remarkably accurate in forecasting the exponential growth of CPU, disk and memory capacities in computer equipment.
However, while this trend drives exponential improvements in the electronic aspects of computing performance, hard disk drive performance improves at a far slower pace. Hard disk drives are mechanical devices which do not benefit from the exponential growth promised by Moore's law. Indeed, during a period in which CPU speeds increased by about 32 times, hard drive speeds increased only 4 times (Figure 1). Compared to everything else in the computer, hard drives have gotten significantly slower. Figure 1 Trends in CPU, memory and disk
Contemporary databases are all about managing data stored on disks, so in order to maintain acceptable performance and leverage increases in CPU and memory capacity, modern databases like Oracle go to enormous lengths to avoid disk IO. The primary method for avoiding IO is – of course – to cache data in memory. The area in memory that Oracle uses to cache data from disk files is known as the buffer cache.
We’re not concerned with the effectiveness of caching in particular just now - though I promise we will get to it eventually. In this article I want to examine how contention within the buffer cache can reduce the effectiveness of caching. Reading from the buffer cache involves latch operations, and we saw last month how latch contention can result. This month, I want to examine other forms of contention within the cache.
As DML statements execute against the database, changes are initially made only to memory. In fact, the only SQL statement that requires an immediate write to disk is the COMMIT statement – and that requires only a write to the redo log, rather than the database files. However, these changes must eventually find their way to disk, and this is the job of various background processes. In particular:
Figure 2 illustrates how these writer processes interact.
Figure 2 The role of background processes in writing to disk
These writers are sometimes referred to as “lazy” writers, because they do not immediately write changes to disk, but instead do so at some later time. This mechanism works wonderfully when these processes can keep up with demand. However, should they fall behind, then various forms of buffer cache contention can result.
The Database Writer (DBWR) writes modified blocks from the buffer cache to the database files. Sessions are adding blocks to the cache all the time, so it’s important that the DBWR not allow modified (“dirty”) blocks to fill the cache. If a session wants to add a new block to the cache, the least recently used block in the cache is discarded and the new block takes its place. However, if all blocks in the cache are dirty, then the session must wait for the DBWR to write the blocks to disk and a “free buffer wait” will occur (Figure 3).
Figure 3 Free buffer waits occur when the DBWR cannot keep up with demand
Another sign that the DBWR might not be keeping up are “write complete” waits, which occur when a session wants to modify a buffer, but the buffer is currently being written out by the DBWR. This will of course happen from time to time, but should be rare unless the DBWR is struggling to complete batches.
There’s a number of possible causes and solutions for DBWR related buffer contention:
The flashback database functionality introduced in 10g allows us to “rollback” a database to recover from a logical corruption rather than “rolling forward” using redo logs. The reduction in MTR (Mean Time to Recover) can be substantial and consequently flashback database was one of the most popular features of Oracle 10g. However, nothing comes for free, and flashback technology requires that we periodically write what essentially amounts to rollback information to the flashback logs. The Recovery Writer process (RVWR) is responsible for writing these log records. If it can’t keep up, then sessions trying to add to the buffer cache will receive a “flashback buf free by RVWR” while the RVWR catches up (Figure 4).
Figure 4 Recovery writer (flashback log) bottleneck
Recovery Writer waits are usually the result of insufficient IO bandwidth for the Flash Recovery Area (FRA). Quite often, the FRA is placed on a smaller number of disks than host the database files which increases the chance that the RVWR will become a bottleneck. The problem can be exacerbated if the archive destination is set to the FRA, since now the RVWR and the redo log writer (LGWR) will compete for IO.
The DBWR and RVWR problems above occur when the entire buffer cache is full with modified blocks. The “buffer busy” wait occurs when a session wants to access a block but the block currently being subjected to an incompatible operation by another session. Some of the major causes of buffer busy are:
The first step in resolving buffer busy contention is to identify the segments involved in buffer busy waits. Prior to the introduction of the V$SEGMENT_STATISTICS view in 9i, we could only do this by running a 10046 trace, or by catching the waits as they appear in V$SESSION_WAIT. From 9i onwards we can simply examine V$SEGMENT_STATISTICS to see which segments are most heavily associated with buffer busy waits.
Figure 5 Spotlight buffer busy drilldown
Figure 5 shows the Spotlight buffer busy drilldown which is available for any version of Oracle. We show the segments most heavily associated with buffer busy waits, as well as the block types encountered, tablespaces and users involved.
If your database is version 10g (or higher) we can show more information including the SQL statements associated with buffer busy (Figure 6).
Figure 6 Identifying SQLs associated with buffer busy in 10g
Remedial action for buffer busy waits could include the following:
A properly configured buffer cache can be the key to avoiding a disk-bound database and delivering peak database performance. However, contention for blocks in the data cache can prevent the buffer cache from reaching it's full potential. To avoid buffer cache contention, follow these guidelines:
This post was originally from 04 December 2012. It has been administratively moved from the wiki to Blog posts based on content criteria.
>Consider using reverse key indexes for primary key indexes that are showing buffer busy waits.
This is quite new info to me...will investigate that.
However pretty nice and complete artcle