Базы данныхИнтернетКомпьютерыОперационные системыПрограммированиеСетиСвязьРазное
Поиск по сайту:
Подпишись на рассылку:

Назад в раздел


Teach Yourself Oracle 8 In 21 Days

Previous chapterNext chapterContents


- Day 20 -
Effectively Tuning and Optimizing the Database

Performance can be one of the most fun or most frustrating aspects of administering a database. It is almost always extremely challenging, and forces you to use all your experience to isolate the bottleneck and determine a response. I find performance tuning and system optimization to be extremely rewarding and challenging; I enjoy it, and I hope you do too.

I hate to refer to performance bottlenecks as bugs or problems because those terms suggest something is broken. Sometimes bottlenecks occur because you have reached the capacity of your system, and no tuning or changes can alter that fact; other solutions must be found. The capacity planner must try to anticipate both when the system will reach capacity and what new resources are needed before this happens.


NOTE: Oracle instance and OS tuning should be performed in conjunction with application tuning. Because much of the instance tuning is based on the data-access patterns and memory usage of the application, changes to the application might result in the opportunity to retune the instance for greater optimization. A perfectly optimized Oracle instance cannot make up for a poorly tuned application. Unfortunately, this book does not cover application tuning in detail, but several good books on the subject are available. In fact, I have written just such a book: Oracle Performance Tuning and Optimization (Sams Publishing).

Performance tuning and capacity planning are closely related. They rely on and play off of each other. The capacity-planning stage assumes that you have a well-tuned system, and bases most of its recommendations on this. The system-tuning stage assumes you have a system that can support the number of users and the number of tasks being performed. The capacity-planning and system-performance engineers often work together to determine where the bottleneck is and how to resolve it.

What Is a Bottleneck?

New Term: The term bottleneck refers to the component or components that have a limiting factor on the system. Some bottlenecks result from inherent limitations of your system, but others can be fixed via parameter tuning. Many times, additional components are needed to fix bottlenecks. System tuning typically involves finding system bottlenecks and eliminating them. Examples of bottlenecks that are typically found in an Oracle system include

  • Disk drive bottleneck--A disk drive can handle only so many I/Os per second before latencies increase to unacceptable levels. The systemwide I/Os per second is fixed, so if your CPU and memory system requests I/Os more quickly than the I/O subsystem can deliver, a disk bottleneck will occur.

  • Memory bottleneck--If you have inadequate memory, you won't have a good cache-hit rate. A high cache-hit rate indicates that most of the data you need is in the Oracle buffer cache. This reduces the number of needed I/Os.

  • CPU bottleneck--In the case of a CPU bottleneck, you are running your system as quickly as the CPUs can run.

  • Network bottleneck--A network bottleneck occurs when so much traffic is over the network, you can't transfer data any faster. This is important to backup and recovery.

  • Bus bottleneck--A bus bottleneck occurs when your hardware cannot transfer data across the CPU to memory or I/O bus. This is a hardware limitation.

  • Other bottlenecks--Numerous other bottlenecks, either hardware, software, or a combination of the two, can occur.

Some bottlenecks can be fixed easily, and others are difficult to overcome. Later today you will see how to determine the location of a bottleneck and receive some hints on how to fix bottlenecks.


NOTE: Every system is different. There is no magic diagnostic tool that can tell you how to tune your system. It is an acquired skill that you will develop with practice.

Performance Tuning Methodology

Determining what, if anything, is a bottleneck in your system can be difficult. I have found that following a tuning methodology or plan can help you to reach the correct conclusion in the shortest amount of time. By tuning methodology, I mean a way of looking at performance issues and determining of the root of those issues. Here is the performance tuning and optimization methodology I use:

1. Analyze the system--Determine whether a performance issue actually exists. You might determine this by running transactions and checking response times or by monitoring the system statistics. You should look at OS stats as well as Oracle stats.

2. Determine the cause--Here you form a hypothesis about where the bottleneck is by looking at the analysis from step 1.

3. Develop a solution or a test--If you are certain you have determined where the bottleneck is, develop a solution for it. If not, develop a test to verify your hypothesis. When you develop the test, you should know what the results of the test will tell you.

4. Run the test or implement the solution--If you have hit upon a solution, implement it. If not, run your test under carefully controlled conditions. This test will be used to determine whether you are right or wrong.

5. Analyze the results--After the test has been run, you must analyze the result. Always view your results with skepticism. If something is too good to be believed, perhaps you should not believe it.


NOTE: You should know what you expect from the test you have developed. Many times a test result can provide good data if the result is positive, but no useful information if the result is negative. For example, adding another disk drive will tell you that you had a disk bottleneck if this addition increases performance, but does not tell you what the problem is if performance does not increase. These tests are still valuable, even though only one result gives you good data.

Although you don't need to use my system, it helps if you develop some system for Troubleshooting.

Analyze the System

Analyzing the system gives you a good place to start. The first step in analyzing a system is to determine whether you have a problem. You should periodically monitor the following:

  • Response times--Do users complain that the system seems slower? Periodically ask your user community how the performance of the system seems to them; some-times end users are reluctant to tell you about prospective problems until they reach a critical state. It is a good idea to have a test query that you can run to compare current response times with previous response times. If your test query takes significantly longer to run than previously, you are seeing some kind of system slowdown.

  • Throughput--Look at the system as a whole by examining the system performance monitor (perfmon for NT, sar for UNIX). Compare current and past results. Is your CPU utilization significantly higher than in previous samples? What about disk usage? You will look at these in more detail later today.

  • Your intuition--Sometimes the system just feels wrong. If you think something isn't quite right, check it out. You know your system better than anyone.

After you analyze the system, you might determine that there are no performance issues, that everything is running fine. Great! But if you do come across performance problems, you can start determining the cause and possible solutions.

Determine the Cause

The next stage is to form a hypothesis of what you think the problem is. This should be based on analysis of the system and what you know of the system's design and capacity. For example, if you have a system with five 4.3GB SCSI disk drives for datafiles, the system should be able to support up to 350 I/Os per second before you see large disk latencies. If you are driving those disk drives at 450 I/Os per second, you could hypothesize that you are having a disk I/O problem. This hypothesis should be specific enough to help lead you to a solution to the bottleneck.

Develop a Solution or a Test

Develop a solution to the bottleneck or some sort of test to confirm your hypothesis. Whether you need to develop a solution or a test depends on how certain you are about having discovered the bottleneck and how expensive the solution is. If you are certain you have a disk bottleneck and have more disk drives available, it is not necessary to test your theory. But if you are not certain or think that other issues might be involved, it is a good idea to develop a test. Be sure your test will give you meaningful information.


WARNING: Remember the difference between cause and effect. You might see something that looks like a disk bottleneck when in fact a lack of memory is causing unnecessary I/O.

Anticipate Your Results

The test you devise to confirm or rule out a certain bottleneck should lead to expected results. If you are unsure what positive and negative results would indicate, the test is not worth running. I often run tests where a positive result is meaningful but a negative tells me nothing. This test is still worth running because that positive result might either solve the problem or rule out a cause. Some examples of the kinds of testing you can do are shown later today.

Run the Test or Implement the Solution

Either run a test that will give you valuable information about the bottleneck or try to fix the problem. Carefully monitor the system in both cases. Even if it seems that the performance of the system has immediately and dramatically improved, you should still monitor the system and collect some data. If you run into a performance bottleneck in the future, this data can be valuable.

Analyze the Results

Monitor the system after a change has been made or a test has been run to determine whether the bottleneck has been eliminated. In the case of a capacity issue, the bottleneck might appear to have been eliminated, but the fix might be only temporary. Carefully analyze your system so you can anticipate when you will run out of capacity again.

You will sometimes find that the results of your test prove that your hypothesis was incorrect. If this is the case, return to step 2 to determine the cause of the bottleneck.

Question Your Results

Sometimes testing does not afford a valid result. This typically occurs when you make a change that you expect to either increase or decrease performance, and neither occurs. This indicates that one of two things has occurred:

  • The bottleneck is somewhere else--If you have a bus bandwidth problem and add more memory, there will be no improvement because you are limited by the speed of the bus.

  • The test was flawed--This could occur for any number of reasons. Perhaps you modified the wrong parameter file, or perhaps the change requires a system reboot.

If your results are flawed, be skeptical. Check the results. It's okay if you don't believe the results and want to try another test. Only rarely does a test turn out to be flawed or do other problems occur.

Operating System Overview

If you understand the basic concepts of how the operating system works, you will have a better feel for how to monitor and tune the operating system.

Windows NT

Microsoft Windows NT, a relatively new operating system, has quickly gained popularity. Windows NT has file and print services similar to those offered by NetWare, but Windows NT is used primarily as an application server or client operating system. Windows NT comes in two varieties:

  • NT Workstation--NT Workstation is designed for the client users and does not contain many of the management pieces that come standard with NT Server.
  • NT Server--NT Server includes the management tools necessary for maintaining a server.

Windows NT, a multiprocessor operating system, can take advantage of scaleable performance increases via the addition of CPUs. Windows NT is also a server operating system; there is no facility within Windows NT to provide login functionality from terminals. From the beginning, Windows NT was designed to support clients over network connections. The Windows NT operating system provides functionality such as 16-bit Windows application support and a GUI, but because of this functionality, some overhead is associated with Windows NT.

Architectural Overview of Windows NT

New Term: Windows NT is based on a microkernel architecture. In a microkernel architecture, the core kernel is small because most of the OS functionality is removed from the kernel. Because of this, the system is modular; large pieces of the OS can be easily replaced. Although a microkernel does not provide additional performance, it does provide a flex-ibility that OS and device-driver developers can exploit. Because of the flexible subsystem architecture, OS changes such as file systems, hardware architecture, and memory subsystems can easily be replaced.

In Windows NT, hardware support is provided through the Hardware Abstraction Layer (HAL). The HAL is employed by hardware vendors to support their systems' base architectures and to provide device-driver support. Because of the microkernel and HAL, Windows NT can support different architectures such as Intel, DEC Alpha, MIPS, PowerPC, and so on. The microkernel is the common code base in which each of these architectures is rooted.

Most 16-bit applications written for Windows can run on non-Intel architectures through the use of a compatibility mode driver. Applications written specifically for Windows NT can be ported to non-Intel platforms with minimal effort. However, applications that have not been written to take advantage of a particular architecture might not run optimally. Oracle is currently supported only on the Intel architecture. As with all Oracle ports, Oracle has been optimized for Windows NT and specifically optimized for Windows NT on Intel platforms.

The Windows NT architecture provides for the use of threads, sometimes known as lightweight processes. Through the use of threads instead of processes, much of the overhead associated with process switching is reduced. Threads are automatically created when the Oracle instance is started (see Table 20.1).

Table 20.1. Oracle service threads.

Thread number Oracle process
0, 1 Oracle Service
2 PMON
3 DBWR
4 LGWR
5 SMON
6 RECO

A shadow thread is created on behalf of each user accessing the Oracle database. Remember, The shadow processes communicate with the user and interact with Oracle to carry out the user's requests. For example, if the user process requests a piece of data not already in the SGA, the shadow process is responsible for reading the data blocks from the datafiles into the SGA. Under Windows NT, these processes are invoked as kernel threads. Each of these threads has its own thread number.

UNIX

UNIX, born in 1969, predates NetWare, Windows NT, and OS/2 by many years. UNIX is different from NetWare, Windows NT, and OS/2 in that it was not designed as a client/server operating system. That is not to say that UNIX is not now used as a network server, but that network service was not the original intent of the operating system.

Even though UNIX is fairly standard, a divergence in UNIX has occurred over the years. Because almost every major computer company develops and sells a version of UNIX, there are differences in the UNIX offerings from the different vendors.

The PC server market boasts three main UNIX versions: SCO UNIX, SCO UnixWare, and Solaris from SunSoft. IBM calls its UNIX offering AIX. Hewlett-Packard has a UNIX operating system called HP-UX; AT&T Global Information Solutions simply calls its product UNIX System V. SunSoft produces versions of UNIX for both Intel and Sun SPARC processors; its products are called Solaris and Solaris X86. Currently, most vendors (with the exception of SCO) base their versions of UNIX on one common core: UNIX System V Release 4, which is sometimes referred to as UNIX SVR4.

In many cases, applications are binarily compatible between operating systems on the same platform, but it is not uncommon for the vendor to add special enhancements in the operating system for performance. Oracle always takes advantage of OS-specific features, even though it would be simpler to ship just one binary.

Architectural Overview of UNIX

UNIX consists of a core piece called the kernel surrounded by applications and tools. The kernel contains all hardware support, device drivers, scheduling routines, and the network stack. Unlike the microkernel architecture used in Windows NT, the UNIX kernel contains all the core operating system functionality.

UNIX is a virtual-memory operating system but is very configurable. In the UNIX operating system, not all memory is allocated and used in the same manner. Shared memory used for the Oracle SGA is treated differently from normal memory. Shared memory is allocated at boot time, is not available to general user processes, and must be allocated through shared memory system calls in the operating system.

The fact that shared memory is treated differently allows certain functions to occur. Some varieties of UNIX allow you to allocate shared memory using a 4MB memory page size. This arrangement cuts down on page entries that must be maintained by the operating system and guarantees larger chunks of contiguous memory. Shared memory is also unique in that it is nonpageable. This means that if you have enough shared memory to create the SGA at instance startup, you don't ever have to worry about the SGA being paged out.

Other features that have been added over the years to the UNIX operating system to improve database performance include realtime processing, high-speed semaphores, and asynchronous I/O. Not all these features are in every implementation of UNIX. Check with your OS vendor to see what is available for your environment.

Finding a Performance Bottleneck

Finding a performance bottleneck can be difficult. There is no proven formula or list of things to look for that immediately tells you what the problem is. Finding a performance bottleneck is an investigative task, and the more experience you get, the better at it you will be.

Knowing Your System's Limitations

To determine whether a system component is prompting a performance bottleneck, ascertain the limitations of that component when possible. Doing so allows you to determine whether you are exceeding that limitation. Unfortunately, these limitations are difficult to gauge in most cases, but it is possible to characterize the component in some cases. The limitations of the I/O subsystem and in the networking components are ascertainable.

The I/O Subsystem

Some pieces of the I/O subsystem, such as the disk drive, have been well tested and documented over the last few years; limitations of the controller and bus are not as well known. The disk drive is a physical piece of hardware that usually comes with a specification. This specification tells you certain things about the disk drive, such as

  • Average seek time--This is the average time it takes for the disk arm to move from one location to another. This relates to the number of random I/Os per second that you can perform.

  • Average track-to-track seek time--This is the time it takes to move to the next adjacent track on the disk drive. This relates to the number of sequential I/Os per second that you can perform.

  • Transfer time--This is the time it takes to move the data from the disk to the system bus.

  • Rotations per second--This is how quickly the disk spins. This relates to how quickly you can rotate to the next bit of data on the disk.

A typical disk drive might have an average seek time of 9 milliseconds; for random I/Os, it takes 9 milliseconds to move to the next data. Add 1 millisecond for data transfer to make 10 milliseconds per random I/O. Divide 1 second by 10 milliseconds to see how many times per second you can perform a random seek:

Seeks = 1 Sec / 0.01 Sec/Seek = 100 Seeks

On Day 4, "Properly Sizing Your Database and Planning for Growth," you saw the "knee of the curve" theory. According to that theory, you don't want to exceed 60-70% of your available capacity; otherwise, times increase. This is why you should not exceed 60-70 seeks per second for random I/O.

The Network Subsystem

The network subsystem is another area where you can compare your throughput against a known maximum. Ethernet networks are available in a number of different throughput levels. This is usually indicated in the name of the network hardware. A few of the network hardware specifications include

  • 10BaseT--This is a 10Mb/second ethernet. With this hardware you can have a peak throughput of 10Mb, or approximately 1.25MB/second.

  • 100BaseT--This is a 100Mb/second ethernet. With this hardware you can have a peak throughput of 100Mb, or approximately 12.5MB/second.

Monitor your system to you determine whether you are close to meeting or exceeding the available throughput. Remember that smaller packets are not as efficient, so if you send many small requests (common in database applications), you might not realize the full potential of the network.

A good way to determine how much network bandwidth you are using is by employing a network monitor. Network monitors can be hardware or software devices. Most operating systems do not provide any way to monitor the network from your system because your network card normally passes only packets addressed to that machine. Other packets are ignored.

Memory

Memory problems sometimes manifest themselves as other problems. For example, a lack of memory might cause excessive I/O, which will appear as an I/O problem. Of prime importance is the fact that the system is not paging or swapping. Paging and swapping occur when no physical memory is available for code processing. The CPU pages, or moves data out of memory, onto a paging area on disk. This operation can be quite expensive. If you are seeing significant paging, add more memory to the system or reduce the Oracle DB block buffer cache to free more memory.


WARNING: If your system is paging or swapping, you have a severe performance problem. No amount of tuning will make up for the performance degradation caused by this paging or swapping. If you are seeing excessive paging, get more memory. If you are at the limit of memory that your system can support, start looking for a new system.

Monitoring the System

After you know your limitations, you can determine whether you are exceeding them by monitoring the system. There are several ways to monitor the system; one of the best ways under Windows NT is with the Performance Monitor, or perfmon. On a UNIX system, you can monitor many objects with the sar utility. Most operating systems also offer a graphical monitoring tool. These tools are good, but are specific to the brand of hardware you purchase. Look in your OS documentation for instructions on how to use these tools.

Oracle provides a set of useful scripts for monitoring the performance of the RDBMS. These scripts are called UTLBSTAT (begin) and UTLESTAT (end). UTLBSTAT should be run first, followed shortly thereafter by UTLESTAT. UTLESTAT displays a large number of statistic and counter values based on the time since UTLBSTAT was run.

Using perfmon

Select Programs | Administrative Tools (Common) | Performance Monitor to invoke perfmon. The Performance Monitor appears in chart mode, which is fine for small numbers of items (such as a trend over an extended period of time). If you are looking at a large amount of data, use report mode; in this mode, values are displayed onscreen as numerical values. Invoke report mode by selecting it via the View drop-down menu or by clicking the Report button.


WARNING: perfmon adds some CPU overhead to the system. If you extend the report inverval, perfmon will run less often, thus reducing CPU overhead. Increasing the report interval also increases the accuracy of the report. I recommend setting the report interval to 15 or 30 seconds.

Looking for I/O Bottlenecks with perfmon

To use perfmon to view events in the system, you must select counters. Each of these counters represents a different event that is monitored. To view counter information, do the following:

1. Click the + button to add a new counter.

2. From the Add to Report screen, choose the PhysicalDisk object from the Object drop-down menu, as shown in Figure 20.1.

Figure 20.1.

Choose the PhysicalDisk object.

3. You are presented with a number of objects from which you can select. You can monitor many useful objects, but my opinion is that the following items are most important in monitoring the disk I/O subsystem:
  • Disk Reads/Sec--The number of reads per second to that disk or volume.

  • Disk Writes/Sec--The number of writes per second to that disk or volume.

  • Disk Transfers/Sec--The total number of reads and writes per second to that disk or volume.

  • Avg. Disk Sec/Read--The average time it takes for the read operation to occur.

  • Avg. Disk Sec/Write--The average time it takes for the read operation to occur.

  • Avg. Disk Sec/Transfer--The average time it takes for the read and write operations to occur.

  • Avg. Disk Queue Length--The average number of I/Os in the disk I/O subsystem.


NOTE: A hardware disk array will appear to the NT system as a single disk. Divide the number of I/Os per second by the number of disk drives to get the I/Os per second per disk drive. This is not 100% accurate because the I/O balance may not be perfect, but it is the best that you can do.

Of particular note should be the number of I/Os per second per disk drive and the disk latency. A typical disk drive or disk array should complete I/Os in 20-30 milliseconds (0.020-0.030 seconds). If any volume is much higher than that (more than 0.060 seconds), you are probably seeing some sort of disk bottleneck. Consider adding more disk drives.

If you are running on UNIX, you can get this information using sar or other monitoring tools.

Looking for Memory Bottlenecks with perfmon

In Windows NT, the Pages/Sec counter under the Memory objects can be used to indicate that the system is paging. If you see significant activity with this counter, you are using too much memory. If you cannot add more memory, you should reduce the memory used by Oracle and other processes or reduce the number of users on the system.

Using UTLBSTAT and UTLESTAT

Within the directory ORANTRDBMS80ADMIN are a number of scripts used internally within Oracle. You ran CATPROC.ORA and CATALOG.ORA from this directory when you created a database. UTLBSTAT and UTLESTAT are also run from here.

UTLBSTAT sets up some internal tables and takes a snapshot of some of the internal Oracle counters. After you run under load for a time, UTLESTAT takes another snapshot and compares the results. UTLESTAT displays a large number of statistic and counter values based on the time since UTLBSTAT was run.


NOTE: The UTLBSTAT.SQL and UTLESTAT.SQL scripts each have a CONNECT INTERNAL string at the top. Most likely, this will not work under NT. Simply comment out this line and run the script from the INTERNAL user account. Place the word Rem at the beginning of those lines.

The UTLBSTAT and UTLESTAT scripts can be run from the SVRMGR30 utility. After you connect in as INTERNAL or SYS, you can run the UTLBSTAT script with the following syntax:


@D:ORANTRDBMS80ADMINUTLBSTAT;

After you run for a while under load, you can finish monitoring Oracle and view the results by running the UTLESTAT script with the following syntax:


@D:ORANTRDBMS80ADMINUTLESTAT;


NOTE: The usefulness of the data retrieved from UTLBSTAT and UTLESTAT might depend on the length of time for which the data is collected. When looking at specific events, five minutes might be enough. When looking at long-term performance, you might need to run for hours or more. If you are interested in a specific event such as a long-running query, the runtime of the event will be sufficient.

The result of UTLESTAT is, in its entirety, quite long. Following is the resulting data from UTLESTAT only; the echo of the SQL statements has been removed:

OUTPUT:

LIBRARY      GETS       GETHITRATI PINS       PINHITRATI RELOADS    INVALIDATI
------------ ---------- ---------- ---------- ---------- ---------- ----------
BODY                  0          1          0          1          0          0
CLUSTER              97          1         50          1          0          0
INDEX                 0          1          0          1          0          0
OBJECT                0          1          0          1          0          0
PIPE                  0          1          0          1          0          0
SQL AREA            120       .808        364       .874          0          0
TABLE/PROCED        235       .923        265       .879          0          0
TRIGGER               0          1          0          1          0          0
Statistic                    Total       Per Transact  Per Logon    Per Second
---------------------------- ----------- ------------  -----------  -----------
DBWR buffers scanned                 601          601       150.25         2.86
DBWR free buffers found              581          581       145.25         2.77
DBWR lru scans                        35           35         8.75          .17
DBWR make free requests               35           35         8.75          .17
DBWR summed scan depth               601          601       150.25         2.86
DBWR timeouts                         68           68           17          .32
SQL*Net roundtrips to/from           129          129        32.25          .61
background timeouts                  210          210         52.5            1
buffer is not pinned count          2175         2175       543.75        10.36
buffer is pinned count              6343         6343      1585.75         30.2
bytes received via SQL*Net         11212        11212         2803        53.39
bytes sent via SQL*Net to c         8270         8270       2067.5        39.38
calls to get snapshot scn:           145          145        36.25          .69
calls to kcmgas                        1            1          .25            0
calls to kcmgcs                       10           10          2.5          .05
calls to kcmgrs                      181          181        45.25          .86
cleanouts only - consistent            5            5         1.25          .02
cluster key scan block gets          801          801       200.25         3.81
cluster key scans                    506          506        126.5         2.41
commit cleanouts                      16           16            4          .08
commit cleanouts successful           16           16            4          .08
consistent gets                     4557         4557      1139.25         21.7
cursor authentications                 6            6          1.5          .03
db block changes                     104          104           26           .5
db block gets                        141          141        35.25          .67
enqueue releases                      26           26          6.5          .12
enqueue requests                      20           20            5           .1
execute count                        144          144           36          .69
free buffer inspected                  7            7         1.75          .03
free buffer requested               1383         1383       345.75         6.59
immediate (CR) block cleano            5            5         1.25          .02
logons cumulative                      4            4            1          .02
messages received                     38           38          9.5          .18
messages sent                         39           39         9.75          .19
no work - consistent read g         4063         4063      1015.75        19.35
opened cursors cumulative            109          109        27.25          .52
parse count (hard)                    23           23         5.75          .11
parse count (total)                  120          120           30          .57
physical reads                      1369         1369       342.25         6.52
physical writes                       45           45        11.25          .21
recursive calls                     2030         2030        507.5         9.67
redo blocks written                   49           49        12.25          .23
redo entries                          67           67        16.75          .32
redo size                          25360        25360         6340       120.76
redo small copies                     67           67        16.75          .32
redo synch writes                      1            1          .25            0
redo wastage                         644          644          161         3.07
redo writes                            3            3          .75          .01
session logical reads               4698         4698       1174.5        22.37
session pga memory                645168       645168       161292      3072.23
session pga memory max            645168       645168       161292      3072.23
session uga memory                 13772        13772         3443        65.58
session uga memory max            144296       144296        36074       687.12
sorts (memory)                        85           85        21.25           .4
sorts (rows)                         783          783       195.75         3.73
table fetch by rowid                2148         2148          537        10.23
table fetch continued row              2            2           .5          .01
table scan blocks gotten            1085         1085       271.25         5.17
table scan rows gotten               456          456          114         2.17
table scans (long tables)              2            2           .5          .01
table scans (short tables)            17           17         4.25          .08
total file opens                       5            5         1.25          .02
user calls                           139          139        34.75          .66
user commits                           1            1          .25            0
write requests                         6            6          1.5          .03
Average Write Queue Length
--------------------------
                          0
Event Name                        Count        Total Time    Avg Time
--------------------------------- ------------ ------------- -------------
SQL*Net message from client                150             0             0
SQL*Net message to client                  150             0             0
control file sequential read                68             0             0
db file scattered read                     137             0             0
db file sequential read                    413             0             0
file identify                                3             0             0
file open                                    9             0             0
log file sync                                1             0             0
Event Name                        Count        Total Time    Avg Time
--------------------------------- ------------ ------------- -------------
control file parallel write                  2             0             0
control file sequential read                 2             0             0
db file parallel write                       6             0             0
db file scattered read                       1             0             0
db file sequential read                      5             0             0
db file single write                         2             0             0
log file parallel write                      3             0             0
log file sync                                1             0             0
pmon timer                                  70             0             0
rdbms ipc message                          250             0             0
smon timer                                   1             0             0
LATCH_NAME          GETS       MISSES      HIT_RATIO   SLEEPS      SLEEPS/MISS
------------------- ---------- ----------- ----------- ----------- -----------
Active checkpoint           69           0           1           0           0
Checkpoint queue l        1488           0           1           0           0
Token Manager               22           0           1           0           0
cache buffer handl          43           0           1           0           0
cache buffers chai       10389           0           1           0           0
cache buffers lru         1451           0           1           0           0
dml lock allocatio          11           0           1           0           0
enqueue hash chain          24           0           1           0           0
enqueues                    62           0           1           0           0
ktm global data              1           0           1           0           0
library cache             3099           0           1           0           0
library cache load          64           0           1           0           0
list of block allo           1           0           1           0           0
messages                   536           0           1           0           0
modify parameter v           4           0           1           0           0
multiblock read ob         288           0           1           0           0
ncodef allocation            3           0           1           0           0
process allocation           1           0           1           0           0
redo allocation            115           0           1           0           0
row cache objects         1318           0           1           0           0
session allocation          56           0           1           0           0
session idle bit           288           0           1           0           0
session switching            3           0           1           0           0
shared pool               1063           0           1           0           0
sort extent pool             1           0           1           0           0
system commit numb         193           0           1           0           0
transaction alloca           7           0           1           0           0
undo global data           101           0           1           0           0
LATCH_NAME          NOWAIT_GETS     NOWAIT_MISSES    NOWAIT_HIT_RATIO
------------------- --------------- ---------------- ----------------
Token Manager                     3                0                1
cache buffers chai             8643                0                1
cache buffers lru              1465                0                1
library cache                     3                0                1
multiblock read ob                1                0                1
process allocation                1                0                1
row cache objects                 3                0                1
vecio buf des                     3                0                1
CLASS              COUNT            TIME
------------------ ---------------- ----------------
0 rows selected.
UNDO_SEGMENT        TRANS_TBL_GETS      TRANS_TBL_WAITS     
UNDO_BYTES_WRITTEN  SEGMENT_SIZE_BYTES  XACTS               
SHRINKS             WRAPS
------------------ ------------------ ------------------ 
------------------ ------------------ ------------------
- ------------------ ------------------
                  0                   8                   
0                2718             1226752                   
0                   0                   0
1 row selected.
NAME                                    VALUE
-------------------------------------- --------------------------------------
background_dump_dest                    %RDBMS80%trace
compatible                              8.0.0.0.0
control_files                           D:DATABASEctl1dogs.ora
db_block_size                           2048
db_files                                1024
db_name                                 dogs
dml_locks                               100
log_buffer                              2048, 8192
log_checkpoint_interval                 10000
max_dump_file_size                      10240
processes                               20, 50
remote_login_passwordfile               SHARED
sequence_cache_hash_buckets             10
shared_pool_size                        3500000
snapshot_refresh_processes              1
user_dump_dest                          %RDBMS80%trace
NAME            GET_REQS GET_MISS SCAN_REQ SCAN_MIS MOD_REQS COUNT    CUR_USAG
--------------- -------- -------- -------- -------- -------- -------- --------
dc_free_extents        9        9        0        0        0       27       10
dc_rollback_seg        2        0        0        0        0        7        2
dc_users              28        0        0        0        0       14        1
dc_objects           169       16        0        0        0      144      138
dc_synonyms            2        1        0        0        0       12        1
dc_usernames         128        0        0        0        0       20        2
dc_object_ids        108        5        0        0        0      104       97
TABLE_SPACE                                                                     
READS      BLKS_READ  READ_TIME  
WRITES     BLKS_WRT   WRITE_TIME MEGABYTES
------------------------------------------------------------------------------
 ---------- ---------- ---------- ----
---- ---------- ---------- ----------
DOGS                                                                            
         0          0          0          
0          0          0         10
SYSTEMB                                                                         
        555       1378          0         
53         53          0         26
TABLE_SPACE                    FILE_NAME                                        
READS      BLKS_READ  READ_TIME  
WRITES     BLKS_WRT   WRITE_TIME MEGABYTES
------------------------------ ------------------------------------------------
" WIDTH="14" HEIGHT="9" ALIGN="BOTTOM"
BORDER="0">; ---------- ---------- ---------- ----
---- ---------- ---------- ----------
DOGS                           D:DATABASEDOGS1.DBF                            
         0          0          0          
0          0          0         10
SYSTEMB                         D:DATABASESYSDOGS.ORA                         
        555       1378          0         
53         53          0         26
START_TIME                END_TIME
------------------------ ------------------------
15-jun-97 16:39:39        15-jun-97 16:43:09
BANNER
----------------------------------------------------------------
Oracle8 Server Release 8.0.2.0.2 - Beta
PL/SQL Release 3.0.2.0.2 - Beta
CORE Version 4.0.2.0.1 - Production
TNS for 32-bit Windows: Version 3.0.2.0.0 - Beta
NLSRTL Version 3.3.0.0.1 - Beta

ANLYSIS: As you can see, a wealth of data collected within Oracle can be retrieved via the UTLBSTAT and UTLESTAT SQL scripts. This data is retrieved mainly from the Oracle V$ views. These views are the window into the Oracle internal performance tables.


NOTE: When using a script such as UTLESTAT where the output data is useful, employ the spool option under SVRMGR30. This option copies the onscreen data into a file. If you specify a filename, that name will be used. Use the following syntax:

SVRMGR> SPOOL stats;



System Tuning

I like to divide tuning into the following components:

  • Software tuning--This involves both the OS and the RDBMS.

  • Hardware tuning--This includes proper system design and disk layout.

  • Application tuning--This involves properly designing the application and database to take advantage of performance features and planning.

Oracle Tuning

This section focuses on things that can be tuned in the Oracle RDBMS. A complete list of Oracle tuning parameters is given in Appendix B, "Oracle Tuning Parameters."

The SGA

The System Global Area (SGA) contains the shared pool, the redo log buffer, and the database block buffers.

The Shared Pool

The shared pool contains the library cache, the data-dictionary cache, and the shared session area (with the multithreaded server).

The Library Cache

The library cache contains the shared SQL and PL/SQL areas. You can improve performance by increasing the cache-hit rate in the library cache and by speeding access to the library cache by holding infrequently used SQL statements in cache longer.

The V$LIBRARYCACHE table contains statistics about how well you are using the library cache. The important columns to view in this table are PINS and RELOADS:

  • PINS--The number of times the item in the library cache was executed.
  • RELOADS--The number of times the library cache missed and the library object was reloaded.

A small number of reloads relative to the number of executions indicates a high cache-hit rate.

The Data-Dictionary Cache

The data-dictionary cache contains a set of tables and views that Oracle uses as a reference to the database. Here Oracle stores information about the logical and physical structure of the database.

To check the efficiency of the data-dictionary cache, check the cache-hit rate. Statistics for the data-dictionary cache are stored in the dynamic performance table V$ROWCACHE (the data-dictionary cache is sometimes known as the row cache). The important columns to view in this table are GETS and GETMISSES:

  • GETS--The total number of requests for the particular item.
  • GETMISSES--The total number of requests resulting in cache misses.

To determine your cache-hit ratio, use the following SQL statement or look at PINHITRATIO in the output of UTLESTAT:

SELECT SUM(reloads) "Cache Misses",
SUM(pints) "Executions",
100 * (SUM(reloads) / SUM(pins) ) "Cache Miss Percent"
FROMB V$librarycache;

This will show you the cache-miss percent:

Cache Hit % = 100% - Cache Miss %

A low number of cache misses is expected, especially during startup when the cache has not been populated.

Shared Session Information

In a multithreaded server configuration, the session information is also stored in the shared pool. This information includes the private SQL areas as well as sort areas. Ensure that you do not run out of memory for this shared session information.

To determine whether you need to increase space for these shared sessions, you can extract the sum of memory allocated for all sessions and the maximum amount of memory allocated for sessions from the dynamic performance table V$SESSTAT. If the maximum amount of memory used is high, it might be necessary to increase the size of the shared pool. Because the shared pool is used for other functions as well, such as the library cache and the data-dictionary cache, it is a good idea to increase the size of the shared pool to accommodate this additional memory usage. If you have enough memory in your system, increase the shared pool by the maximum amount of memory used by the shared server processes; if you have a limited amount of memory, use the sum of memory allocated to sessions that was obtained when an average number of users was connected and running as a basis for the amount of memory by which to increase the shared pool.

Database Block Buffer Cache

The most important Oracle cache in the system is probably the buffer cache. The buffer cache makes up the majority of the Oracle SGA and is used for every query and update in the system. The statistics for the buffer cache are kept in the dynamic performance table V$SYSSTAT and are included in the output from UTLESTAT. The important columns to view in this table are listed here:

  • PHYSICAL READS--This indicates the total number of requests that result in a disk access. This is a cache miss.

  • DB BLOCK GET--This indicates the number of requests for blocks in current mode. Buffers are typically retrieved in current mode for INSERT, UPDATE, and DELETE statements.

  • CONSISTENT GET--This indicates the number of requests for blocks in consistent mode. Buffers are typically retrieved in consistent mode for queries.

The sum of the values in DB BLOCK GETS and CONSISTENT GETS represents the total number of requests for data. The cache-hit ratio is determined using this formula:

Cache-hit ratio = 1 - (PHYSICAL READS / (DB BLOCK GETS + CONSISTENT GETS))

The block buffers are the most important area of the SGA and must be tuned because of the large effect they have on the system and the number of resources they consume.

Performance Enhancements

I prefer to separate the performance-enhancement options from the general tuning of Oracle. Performance enhancements tend to be things that might or might not help your configuration and application; in fact, they might hurt. On the other hand, tuning parameters always help, based on the correct interpretation of Oracle statistics. The following sections review a few of the enhancements you have seen throughout the book.

Block Size

Depending on your configuration and data-access patterns, you might be able to benefit from using a larger block size. With a larger block size, you get the benefit of less wasted space and more efficient I/O in certain conditions. Here are a few guidelines that might help you decide whether changing the size of DB_BLOCK_SIZE can help you:

  • OLTP systems benefit from smaller blocks--If your application is OLTP in nature, you will not benefit from larger blocks. OLTP data typically fits well in the default block size; larger blocks unnecessarily eject blocks from the SGA.

  • DSS systems benefit from larger blocks--In the DSS system in which table scans are common, retrieving more data at a time results in a performance increase.

  • Larger databases benefit from larger blocks--Larger databases see a space benefit from less wastage per block.

  • Databases with large rows benefit from larger blocks--If your rows are extremely large (as is sometimes the case with images or text) and don't fit in the default block, you will see a definite benefit from a larger block size.

Because unnecessarily changing the block size increases I/O overhead, this change does carry some risk. Change the block size with caution.

Clusters

New Term: A cluster, sometimes called an index cluster, is an optional method of storing tables in an Oracle database. Within a cluster, multiple related tables are stored together to improve access time to the related items. Clusters are useful in cases where related data is often accessed together. The existence of a cluster is transparent to users and applications; the cluster affects only how data is stored.

A cluster can be useful for tables in which data is primarily accessed together in a join. In such situations, the reduced I/O needed to bring the additional data into the SGA and the fact that the data is already cached can be a big advantage. However, for situations in which the tables have a large number of INSERT statements or the data is not frequently accessed together, a cluster is not useful and should not be used. Do not cluster tables if full-table scans are often performed on only one of the tables in the cluster. The additional space required by the cluster and the additional I/O reduces performance.

Reduction of Fragmentation

New Term: Fragmentation occurs when pieces of the database are no longer contiguous. Fragmentation can consist of disk fragmentation or tablespace fragmentation. Both of these types of fragmentation usually affect performance. Disk fragmentation usually causes multiple I/Os to occur when one I/O would have been sufficient (for example, with chained or migrated rows). Disk fragmentation can occur when the extents that comprise the database segments are noncontiguous; this is sometimes caused by excessive dynamic growth.

Tablespace fragmentation is caused by the dropping and creating of segments. This can produce large free areas between segments, which results in the inefficient use of space and excessive disk seeks over the empty areas. Tablespace fragmentation can also prevent Oracle from taking advantage of multiblock reads.

One way to eliminate fragmentation is to export the table or tablespace data, remove and re-create the table or tablespace, and import the data. By eliminating fragmentation, you can reduce excessive I/Os and CPU usage, streamlining data access. Any overhead and unnecessary I/Os you can reduce will improve system performance.

Hash Clusters

New Term: A hash cluster is similar to a cluster except that it uses a hash function rather than an index to reference the cluster key. A hash cluster stores the data based on the result of a hash function. The hash function is a numeric function that determines the data block in the cluster based on the value of the cluster key. To achieve good performance from a hash cluster, you must meet the following criteria:

  • The cluster key value must be unique.

  • The majority of queries must be equality queries on the cluster key.

  • The size of the table must be static (little growth occurs).

  • The value of the cluster key must not change.

If you can take advantage of hashing by meeting this strict criteria, you will see good performance. Hashing is extremely efficient under the right conditions; however, having a hash cluster under the wrong conditions can degrade performance.

Indexes

New Term: An index, like the index in this book, is an optional structure designed to help you achieve faster access to your data. When optimally configured and used, indexes can significantly reduce I/O to the datafiles and greatly improve performance. You must first decide whether an index is appropriate for the data and access patterns in your particular system. Having decided to use an index, you must decide which columns to index. Indexing appropriately can greatly improve performance by reducing I/Os and speeding access times.

Careful planning and periodic testing with SQL Trace can lead to the effective use of indexes, with optimal performance being the outcome. Even though indexes have been built on tables in your database, they might not necessarily be used. If the SQL statement is not correctly formed, the index might be bypassed and a full-table scan might be the result. This is why the application developer and the system designer must communicate constantly throughout the development and deployment cycle. The application should be tested and the SQL statements analyzed to ensure that indexes are being used where intended. Indexes are covered in detail on Day 13, "Using Indexes and Sequences."

Multiblock Reads

When performing table scans, Oracle can read more than one block at a time, thus speeding I/Os. Consequently, a larger chunk of data can be read from the disk, thus eliminating some disk seeks. The reduction of disk seeks and the reading of larger blocks reduces both I/O and CPU overhead.

The amount of data read in a multiblock read is specified by the Oracle initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT. The value for this parameter should always be set high because there is rarely any disadvantage in doing so. The size of the individual I/O requests depends on DB_FILE_MULTIBLOCK_READ_COUNT and DB_BLOCK_SIZE. A good value for multiblock reads is 64KB.

Multiblock Writes

Multiblock writes are similar to multiblock reads and have many of the same requirements. Multiblock writes are available through the direct path loader as well as through sorts and index creations. As with multiblock reads, the multiblock writes reduce I/O and CPU overhead by writing multiple database blocks in one larger I/O operation.

The amount of data written in a multiblock write is specified by the Oracle initialization parameter DB_FILE_MULTIBLOCK_WRITE_COUNT. The size of the individual I/O requests depends on both DB_FILE_MULTIBLOCK_WRITE_COUNT and DB_BLOCK_SIZE. As with multiblock reads, a good value is 64KB.

The Oracle Parallel Query Option

The Oracle Parallel Query option makes it possible for some Oracle functions to be processed by multiple server processes. The functions affected are queries, index creation, data loading, and recovery. For each of these functions, the general principle is the same: Keep processing while Oracle waits for I/O.

For most queries, the time spent waiting for the data to be retrieved from disk usually overshadows the amount of time actually spent processing the results. With the Parallel Query option, you can compensate for this wasted time by using several server processes to execute the query. While one process is waiting for I/Os to complete, other processes can execute. If you are running on a Symmetric Multiprocessor (SMP) computer, a cluster, or a Massively Parallel Processing (MPP) machine, you can take maximum advantage of the Parallel Query option. The amount of parallelism can be tuned with several of the Oracle initialization parameters:

  • PARALLEL_DEFAULT_MAX_SCANS--This parameter specifies the maximum number of query servers to used by default for a query. This valued is used only if no value is specified in a PARALLEL hint or in the PARALLEL definition clause. This parameter limits the number of query servers used by default when the value of the PARALLEL_DEFAULT_SCANSIZE is used by the query coordinator.

  • PARALLEL_DEFAULT_SCANSIZ--This parameter specifies the number of query servers to be used for a particular table. The size of the table divided by PARALLEL_DEFAULT_SCANSIZE determines the number of query servers, up to PARALLEL_DEFAULT_MAX_SCANS.

  • PARALLEL_MAX_SERVER--This parameter specifies the maximum number of query servers or parallel recovery processes available for this instance.

  • RECOVERY_PARALLELIS--This parameter specifies the number of processes to be used for instance or media recovery. A large value can greatly reduce instance recovery time. A value of 0 or 1 indicates that parallel recovery will not be performed and that recovery will be serial. A good value for this parameter is in the range of the number of disks you have (up to 50).

I am a big fan of the Parallel Query option. I have seen great improvements from the use of parallel queries as well as dramatic reductions in recovery time when the parallel recovery feature is used. You saw this in yesterday's lesson, "Advanced Oracle Options."

The Oracle Parallel Server Option

The Oracle Parallel Server option is one of the most innovative and impressive options available from Oracle. With the Parallel Server option, you can cluster several computers using a shared-disk subsystem and have multiple Oracle instances access the same database. If your application is suitable, you can see good scalability from adding computers.

The Oracle Parallel Server option uses a sophisticated locking mechanism in conjunction with a shared-disk subsystem to allow multiple instances to access the same data. If you have an application that can take advantage of the Oracle parallel server architecture, you should see some good performance improvements.

The two areas that can most influence the performance of your parallel server system are data partitioning and PCM lock management. Both of these can make a huge difference in the performance of your system.

  • Partitioning--By properly partitioning your data to reduce lock traffic and contention for blocks between servers, you can enhance performance. Try to balance your users so that the users accessing the same tables are on the same machine; doing so can reduce contention for locks.
  • PCM locks--By carefully managing the number of locks on each table, you can enhance performance. Tables with a lot of traffic between nodes should have more locks than tables with less contention. By balancing the number of locks, you can reduce overhead.


TIP: By taking advantage of read-only tablespaces where applicable, you can reduce the number of PCM locks in your system. Because read-only tablespaces do not allow updates, no locking is necessary.

Spin Counts

Multiprocessor environments might benefit from tuning of the parameter SPIN_COUNT. Under normal circumstances, if a latch is not available, the process sleeps and then wakes up to try the latch again. If you are on a multiprocessor system, it is likely that the process holding the latch is currently processing on another CPU and will be finished in a short time. If you set SPIN_COUNT to a value greater than 0, the process spins while counting down from SPIN_COUNT to 0. If the latch is still not available, the process goes to sleep.

Setting SPIN_COUNT can hurt performance if you're not careful. This parameter should be set only for multiprocessor computers and should be monitored for effectiveness. A good value to try is 2,000. The value of SPIN_COUNT specifies how many times the process will spin before putting itself to sleep. Because the speed of processors varies, the time it takes to spin also varies; however, the speed of the process holding the desired resource will also vary with the speed of the processor.

OS Tuning

OS tuning is specific and depends on the OS you are running. I will present some of the basics on tuning Windows NT and UNIX here.

NT Tuning Considerations

With Windows NT, probably the most important tuning consideration is ensuring that sufficient physical memory is available to run Oracle and the required number of server processes. Windows NT uses virtual memory, which means that Oracle and user processes can allocate an almost unlimited amount of memory through paging. If you are not careful, you might overconfigure the amount of virtual memory that you are using and exceed the amount of physical memory in the system. If this occurs, the system will begin paging and performance will be severely degraded.

In addition to the concern about memory, other areas of importance in tuning the Windows NT system include the network and I/O subsystems and the reduction in OS overhead.

Memory

Windows NT uses only virtual memory. Because of this, all memory in the system is treated equally, limited only by the size of the paging file; you must be wary of user processes consuming large amounts of memory and causing paging out the SGA. In Windows NT, there are no limitations on memory used for user processes, disk caching, print caching, and so on; thus, it is best to dedicate the Windows NT server to either file and print services or to application services.


NOTE: The amount of virtual memory with Windows NT is limited by the size of the paging file. If the paging file is too small, you will get warning messages and be forced to resize it and reboot.

The size of one process's virtual memory, also known as the working set size, is limited to 2GB under Windows NT 4.0 and earlier, and 3GB with the Windows NT Enterprise Edition.

If you use asynchronous I/O (AIO) and Oracle can lock down memory for AIO, the database block buffers are not swappable. Monitor the system on a regular basis to ensure that no paging is occurring in the system.

Reduce Unnecessary Memory Usage

You can take some measures to reduce the amount of memory used by Windows NT. Invoke the Control Panel's Network Settings screen and choose the Maximize Throughput for Network Applications option. This optimizes the server memory for network applications, reducing some of the file-system caching and overhead in memory management. Also remove any network protocols not needed by the system to cut down on system overhead and memory usage via the Control Panel.

Also use the Control Panel to turn off any services you are not using. Doing so reduces memory usage and CPU overhead. By reducing all the unnecessary services, you can increase the performance of the system.

SGA Tuning

To maximize performance, allocate as much memory as possible to the SGA. Use the techniques discussed on Days 9, "Managing Data," and 10, "Administering User Accounts," to determine whether the memory can best be used for the shared pool or for database block buffers. Because the SGA resides in virtual memory, it is important that you not allocate so much memory for the SGA that the system pages. The overhead of paging and swapping overshadows any benefit you might gain from a larger SGA. Remember to save memory for the user processes. You should frequently monitor your system to ensure that no paging occurs at any time.

User Capacity

You can easily determine the amount of memory necessary for your application on a per-user basis. Start Oracle and note the amount of available memory by using the Windows NT Performance Monitor. Monitor the Available Bytes option under Memory in the Performance Monitor. After users begin accessing the application in a typical manner, record the amount of memory again. Take the difference and divide this result by the number of users accessing the application. Multiply this per-user memory usage value by the maximum number of users who might be connected to the application to determine the amount of memory you must reserve for user connections. Be sure to add a little extra memory just in case. Be careful that a user's PGA does not consume too much system memory.

To increase the number of Oracle connections, you might also have to adjust the Oracle initialization parameter PROCESSES. This parameter should reflect the maximum number of user connections you expect to have plus the Oracle background processes. You should also include some extra processes for administrative tasks.

Network

Minimal network tuning is necessary with Windows NT. However, you can enhance performance by prioritizing the network bindings. Do so the with the Control Panel's Network Configuration utility. If you put the network protocol you use most frequently first on the list, that protocol gets highest priority. Place other network protocols in the list in the order in which you use them. Removing any protocols you do not use can also boost performance by reducing memory consumption and CPU overhead. Also be sure that the system is configured as a server in the Network Configuration screen.

I/O Subsystem

It is important that performance not be bound by physical I/O rates. If you ensure that the system disks are not driven past their limits, you can avoid this. This is covered in detail on Days 14, "Using Oracle Clusters, Stored Procedures, and Database Links," and 15, "Managing Job Queues and Using Oracle Auditing."

With Windows NT, asynchronous I/O (AIO) is always enabled. There is no need to adjust any tuning parameters to ensure that AIO is enabled.

The default block size for Oracle on Windows NT is 2,048. This might be sufficient. If you have a large database or if the data access is primarily sequential, you might want to increase DB_BLOCK_SIZE to 4,096. Although the data blocks take up more space in the SGA, the number of I/Os performed is significantly reduced. It doesn't take much more overhead and time to retrieve 4KB of data from the disks than it does to get 2KB of data.

If data access is primarily sequential, you might benefit by setting the DB_BLOCK_SIZE parameter to 8,192. Because sequential access to the database will read the next block anyway, larger block sizes will read that data into the SGA before you need it. If your data access is random, you will waste space in the SGA. The value you choose for the block size affects performance either for better or worse. If you are unsure, leave the parameter at the default of 2,048.

When creating Windows NT file systems, you have several choices:

  • FAT

  • HPFS

  • NTFS

Although each of these performs well in certain situations, I recommend using NTFS. NTFS provides you with the best level of overall performance and is the file system that Microsoft endorses.

UNIX Tuning Considerations

As is true for NT and UNIX, the most important tuning consideration is ensuring that sufficient memory is available to run Oracle and the required number of server processes. Because UNIX is a virtual memory operating system, you can always start more server processes; but if you are not careful, you might begin paging.

Many varieties of UNIX have made available extended features such as the Post-Wait Semaphore and asynchronous I/O to enhance Oracle performance. Other areas of importance in tuning the UNIX system include the network and I/O subsystem.

Memory

Memory is important in the UNIX operating system. As you have seen in earlier lessons, the way to increase performance is to maximize the use of the SGA to avoid costly disk I/Os. In UNIX as in all operating systems, it is important to avoid paging. Because the SGA is in shared memory in UNIX, the SGA is guaranteed not to page--but the server processes can page if too much memory is being used.

It is important to allocate enough shared memory to accommodate the SGA. If you do not have enough memory to create the SGA, the instance does not start. If this happens, you must either configure the operating system to allow more shared memory or reduce the size of the SGA. If you are using enhanced 4MB shared memory pages, you can allocate the amount of shared memory only in 4MB units.

Be careful not to allocate so much shared memory for the SGA that user and server processes page. Be sure to periodically monitor the system to ensure that no paging is occurring. To maximize memory used by the SGA, reduce all unnecessary memory used by the operating system and limit the amount of memory that can be allocated by users.

Reduce Unnecessary Memory Usage

One of the best ways to free memory for Oracle is to siphon from memory used by the file-system buffers. These buffers are used by the OS to cache data. A significant performance increase can incur when files that have been cached are accessed. But Oracle uses a feature called synchronous I/O to ensure that writes to the disk are not returned until the data has actually been written. Because Oracle must guarantee that I/Os have been written to the disk to ensure data integrity, OS disk write caching cannot be performed.

By default, the number of file-system buffers is determined by the amount of memory in the system. Because Oracle bypasses the disk cache on writes and uses the SGA for reads, you don't really need a large number of file-system buffers. By reducing this number, you might see a slight performance decrease with OS operations, but any additional memory allocated to the SGA increases Oracle performance. The parameters used to tune the file-system buffers is fairly vendor dependent; this should be documented in your OS manuals.

If you reduce the unnecessary memory used for disk caching, more memory can be allocated to the SGA. Don't reduce the disk cache so far that it is difficult to run OS commands and access Oracle parameter files. Do not set the disk cache buffers to less than 600. These values represent the number of 512 byte blocks.

SGA Tuning

To maximize performance, allocate as much memory as possible to the SGA. Use the techniques discussed earlier today to determine whether memory is best used for the shared pool or for databases block buffers.

In UNIX, the shared memory area used by Oracle for the SGA is usually contiguous. However, if you have multiple instances of Oracle that have started and stopped several times, the shared memory area might no longer be contiguous. If you use 4MB pages, you are guaranteed to have at least 4MB of contiguous memory.

The amount of memory allocated for shared memory is the product of two OS-tunable parameters: SHMMAX and SHMSEG. SHMMAX specifies the maximum size of a shared memory segment; SHMSEG specifies the maximum number of shared memory segments available in the system. Applications are responsible for allocating only the amount of shared memory they need and so do not waste space. In SCO UNIX and UnixWare, the parameters SHMMAX and SHMSEG are located in the UNIX parameter file /ETC/CONF/CF.D/STUNE. In Solaris, the shared memory parameters are set in the file /ETC/SYSTEM. Your OS administrator's manual should have more information about setting system parameters.


NOTE: It is much more efficient for Oracle to have one large shared memory segment than several smaller ones. Therefore, SHMMAX should have a value larger than the size of the SGA.

Most varieties of UNIX that run on the Intel platform have a variety of 4MB pages. By referencing your OS vendor-specific documentation, you can determine how to tune this.

The use of 4MB pages can improve performance dramatically, especially if you are running on an Intel Pentium processor.


NOTE: In UNIX implementations that use 4MB pages, there is usually a threshold at which 4MB pages are used. This threshold is approximately 3MB. If you allocate a 2.5MB shared memory segment, you do not get a 4MB page.

The amount of memory allocated to a single user must also be tuned in the UNIX operating system. Because Oracle is treated the same as any other user, you must allocate enough memory for Oracle to use for the SGA and the server processes. By referencing your OS-specific documentation, you can see how to configure this.

Ensure that sufficient memory is available so that Oracle can allocate for the SGA. Remember to save memory for the user processes as well. You should frequently monitor your system to ensure that no paging is occurring at any time.

User Capacity

You can easily determine the amount of memory necessary for your application on a per-user basis. Start Oracle and note the amount of available memory available with the UNIX utility sar -r. The output from sar -r consists of freemem (free memory pages) and freeswp (free swap pages). The value given in the freemem column is the number of 4KB pages available. When users begin accessing the application in a typical manner, record the amount of memory again. Take the difference and divide this result by the number of users accessing the application. This value is the per-user memory usage. Multiply this value by the maximum number of users who might connect to the application to determine the amount of memory you must reserve for user connections. Be sure to leave a little extra memory just in case.

The size of a user's PGA is not bound by any initialization parameters. Because of this, be careful that a user's PGA does not consume too much system memory.

The UNIX operating system parameters MAXUP and NPROC must also be set to allow a sufficient number of users to connect. Remember that when users connect to Oracle, an Oracle shadow process is created under the Oracle user ID. Therefore, you must increase not only the number of processes systemwide, but also the per-user process limits.

The per-user process limit is set with the OS parameter MAXUP. The maximum number of processes systemwide is set by the OS parameter NPROC. Both values are in the stune file. NPROC should be at least 50 greater than MAXUP to account for OS processes.

To increase the number of Oracle connections, you might also have to adjust the Oracle initialization parameter PROCESSES. The PROCESSES parameter should reflect the maximum number of user connections you expect to have plus the Oracle background processes. You should also include some extra processes for administrative tasks.

Network

With UNIX, the amount of network tuning is usually minimal. Typically, no network tuning is required, and the network tuning that is required is vendor specific. Refer to your OS documentation for any networking tuning that is required.

I/O Subsystem

As with all other operating systems described today, it is important to ensure that performance is not bound by physical I/O rates. Be sure that random I/Os do not exceed the physical limitations of the disk drives. Refer to Days 14 and 15 for details.

With UNIX, you have the choice of using the UNIX file system for your data storage or the RAW device interface. This choice is not always an easy one. The RAW device interface is more difficult to manage but provides a higher level of performance. File-system files are much easier to use but have more overhead associated with them.

File System

Using the UNIX file system is easier than using RAW devices. Using file-system files, Oracle simply creates the file. However, when using the file system, Oracle must contend with the UNIX disk-caching system and use synchronous writes to ensure that the write request does not return to the DBWR or LGWR before it has actually written the data to disk. With the UNIX file system, there is also the overhead of the data being read into the UNIX disk cache and then being copied to the SGA. This arrangement causes additional overhead on reads. Finally, when you use the file system, you are not guaranteed to have contiguous blocks on the disk--in fact, you are almost guaranteed not to have contiguous blocks.

RAW Device Interface

The RAW device interface allows for the least amount of overhead you can achieve with UNIX I/Os. When using the RAW device interface, UNIX simply assigns a section of the disk to each RAW device. This portion of the disk is contiguous; accesses to it bypass all disk caching and file-system overhead.

RAW devices are not as easy to manage because each device is considered one big chunk of data for the operating system. Backup and recovery must be handled slightly differently because file copies do not work, and the size of the raw device cannot be changed after it is created. Backup operations must be done via the UNIX DD command or by a third-party backup utility that supports raw devices.

RAW devices give greater performance with less overhead and are fully supported by Oracle. Whether you use RAW devices is a decision you must make based on ease of use and increased performance.

Asynchronous I/O

With UNIX, AIO is not always enabled. It is necessary to enable AIO in both the OS and in Oracle. By using AIO, the DBWR can manage many I/Os at once, eliminating the need for multiple DBWR processes. List I/O allows the DBWR to pass to the OS a list of AIO commands, reducing the number of calls it must make.


NOTE: If asynchronous I/O is not available to you, you can compensate by adding DBWR processes. You should have one or two DBWR processes per data disk. Use the parameter DB_WRITERS to increase the number of DBWR processes.

For some UNIX implementations, the following Oracle initialization parameters for asynchronous I/O should also be set:

  • USE_ASYNC_IO--This parameter tells Oracle that the DBWR should use asynchronous I/O. Set it to TRUE.
  • LGWR_USE_ASYNC_IO--This parameter tells Oracle that the LGWR should use asynchronous I/O. Set it to TRUE.

You should always use asynchronous I/O (if possible). When you use asynchronous I/O, you can keep the number of DBWR processes to one and therefore reduce process overhead.

Hardware Tuning

The I/O system should be designed and implemented with the following goals in mind:

  • Isolate sequential I/O--By isolating sequential I/O so that it is purely sequential to the disk, you can greatly enhance throughput. Any random I/O to these disks degrades performance. Writes to the redo log are sequential.
  • Spread out random I/O--Random I/O performance can be increased by adding more disk drives to the system and spreading out the I/Os among the disks. I/Os to the datafiles are typically random (especially in OLTP systems).

If you follow these guidelines and plan your system so that your disk drives can support the amount of disk I/O demanded of them, I/O should not be a problem.

Application Tuning

One of the most common areas that performance can be enhanced is in the application itself. SQL statements can often be redesigned to dramatically improve performance. I recently worked on a project where one of the engineers redesigned the SQL statements of a set of queries that took more than four hours to run to more effectively use indexes; after the engineer made the changes, the queries took less than 10 minutes to run.

Here are a few things that can be done to improve the application itself. My other book, Oracle Performance Tuning and Optimization (Sams Publishing) covers these topics in great detail.

  • Use EXPLAIN PLAN--By using EXPLAIN PLAN, you can analyze the execution plan the optimizer has chosen for the SQL statement.

  • Use SQL*Trace--By using SQL*Trace, you can analyze the execution of the SQL statement and determine any potential bottlenecks.

  • Tune SQL statements--You can improve performance by configuring SQL statements to take advantage of such things as indexes, clusters, and hash clusters.

  • Use the Oracle optimizer--Take advantage of the optimizer to improve performance.

  • Use procedures and packages--Use procedures, functions, and packages to improve performance.

  • Provide for data integrity--Consider the importance of data integrity and how to optimally provide for it.

  • Use hints--Use hints to take advantage of information you know about the data and application.

If you tune the SQL statements and application to take advantage of indexes, clusters, and other features, you should see significant improvement in performance.

Summary

Today you looked at various ways of determining whether some sort of performance bottleneck exists, and if one does, how to isolate that bottleneck and provide solutions. The lesson began by defining the term bottleneck. You were also presented with my performance tuning methodology. If you follow these steps (or steps of your own), you should be able to find the bottleneck or at least find a way to work around it.

Today's lesson also provided a brief overview into the operation and function of your OS. Next, you saw a way of looking at the system in terms of performance. How you monitor the system was also shown today. Much can be learned from the NT Performance Monitor and sar, and much can be learned from Oracle itself using UTLBSTAT and UTLESTAT. With these tools, you can narrow the focus of your performance investigations and determine potential changes to your system.

What's Next?

Tomorrow's lesson is the last of the 21 days of instruction on the ins and outs of Oracle8. This book finishes up by teaching you the specifics of the Oracle8 networking system. You will learn how to configure SQL*Net and use the Oracle Name Server. Also, you will learn how to configure and use the Oracle Web server. Other miscellaneous topics will also be covered in the final lesson of the book.

Q&A

Q What is a bottleneck?

A The term bottleneck refers to a component in the system that is slowing down the whole system.

Q What type of operating systems are NT and UNIX?

A Windows NT and UNIX are virtual memory operating systems. This means that the memory used by a single process can be larger than the physical memory in the system. When more memory is needed than is available, some of the memory is paged out.

Q What sort of things are important when Troubleshooting a performance bottleneck?

A There are many important things to keep in mind, but I feel that of primary importance is that you follow a methodology. By using some sort of method of Troubleshooting, you can develop tests that can help you resolve the issues involved.

Q Should you always believe your results?

A No. Be skeptical. If something doesn't seem right, try another test.

Workshop

The workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to provide you with experience in using what you've learned. The answers to the quiz questions can be found in Appendix A, "Answers."

Quiz

1. Name some areas that could be potential bottlenecks.

2. What is the best tool to use to look at the Oracle shared pool, cache-hit ratio?

3. How many random I/Os per second per disk drive are recommended before latencies start to increase?

4. How many sequential I/Os per second per disk drive are recommended before latencies start to increase?

5. What is disk latency?

6. How do you calculate the buffer cache-hit ratio?

7. How do you know whether the shared pool is large enough?

8. What two factors make an index effective?

9. Which is run first: UTLBSTAT.SQL or UTLESTAT.SQL?

10. Why might you use a large block size?

Exercises

1. Turn on diskperf and reboot your system. Use the Performance Monitor to look at physical disk statistics.

2. Use UTLBSTAT.SQL and UTLESTAT.SQL and analyze the results.

3. Check out your system's block buffer cache-hit ratio.

4. Determine the cache-hit ratio on your data-dictionary cache and your library cache.


Previous chapterNext chapterContents


Macmillan Computer Publishing USA

© Copyright, Macmillan Computer Publishing. All rights reserved.



  • Главная
  • Новости
  • Новинки
  • Скрипты
  • Форум
  • Ссылки
  • О сайте




  • Emanual.ru – это сайт, посвящённый всем значимым событиям в IT-индустрии: новейшие разработки, уникальные методы и горячие новости! Тонны информации, полезной как для обычных пользователей, так и для самых продвинутых программистов! Интересные обсуждения на актуальные темы и огромная аудитория, которая может быть интересна широкому кругу рекламодателей. У нас вы узнаете всё о компьютерах, базах данных, операционных системах, сетях, инфраструктурах, связях и программированию на популярных языках!
     Copyright © 2001-2024
    Реклама на сайте