Teach Yourself Oracle 8 In 21 Days
- Day 19 -
Advanced Oracle Options
Today you will learn about some of the advanced options available to Oracle, including
the Oracle Parallel Server option (OPS) and the Oracle Parallel Query option. Although
these options sound as though they would be similar to each other, they have no correlation.
New Term: The Oracle Parallel Server
option, an add-on feature, allows more than one instance of Oracle to share the same
database; two or more systems can simultaneously access the same data, providing
a performance boost and failover capabilities where one node can take over for another
in the event of a failure. This linking of systems to form one larger system has
traditionally been called a cluster.
NOTE: The term cluster has historically
been used to identify a large system that performs a single task and is made up of
two or more smaller systems working in tandem.
Numerous vendors have recently introduced "clustering" systems that
consist of an offline standby node or another node running a completely different
application that can take over some tasks in the event the primary node fails. In
my opinion, this is not a clustering solution. Because vendors are piggybacking on
cluster terminology, the end-user is being deceived. Vendors such as Oracle, Tandem,
DEC, and others have had true clusters for many years and are being cheated by these
standby systems.
My rule of thumb is that if you cannot access all data from all nodes all the time,
it is not a cluster.
The Oracle parallel server cluster has been available for many years, but has
recently been introduced on the NT platform. If you use the features of the Parallel
Server option, both performance and system uptime can be improved.
The Oracle Parallel Query option is not a clustering option; it has nothing to
do with clustering (except that you can use it on a cluster). This option allows
certain SQL operations to be parallelized. By parallelizing these operations, you
can improve performance by utilizing the time the system waits for I/O operations
to complete.
The Parallel Query option can provide a good-to-outstanding performance improvement
over traditional query operations. Today you will learn how to use the Parallel Query
option as well as how to tune your application to better perform with the Parallel
Query option.
The Parallel Server Option
The Parallel Server option is one of the most innovative and impressive options
available from Oracle. With this option you can cluster several computers using a
shared-disk subsystem and have multiple Oracle instances access the same database,
as shown in Figure 19.1. If your application is suitable, you can see very good scalability
by adding extra computers.
New Term: The Oracle Parallel Server
option uses a sophisticated locking mechanism in conjunction with a shared-disk subsystem
to allow multiple instances to access the same data. Communication between the computers
occurs through a server interconnect, which usually consists of high-speed
network accesses at a very low level. Using the traditional network stack does not
provide the performance required for a server interconnect.
Figure 19.1.
A parallel-server configuration.
Server Interconnect
The server interconnect provides two functions: to communicate locking information
and to act as a system heartbeat. The system heartbeat communicates to other systems
in the cluster that the server is still operational. If the heartbeat message does
not arrive, other servers in the cluster assume that the system is nonfunctional
and roll back transactions that have not been committed.
The shared-disk subsystem allows all members of the cluster to access all shared
data in the cluster. This disk subsystem must be shared and should be available at
all times, regardless of the state of any particular server. This subsystem can use
RAID and data caching as long as both operations are performed on the disk side of
the shared channel rather than on the controller side. Because the release of a lock
sometimes depends on data being written out to the shared disk, the more performance
you have, the better off you are.
Locking
Locking is performed with a process called the Distributed Lock Manager (DLM).
The DLM is responsible for locking data that is being modified so that the data cannot
be modified in another instance. Locking ensures data integrity across the entire
cluster. A data block or group of blocks is locked until another instance needs that
data.
If you can partition your users so that users accessing data in a particular table
all use the same instance to access that data, you will have reduced lock contention.
You can enhance performance by carefully partitioning the data and the users. If
you partition the data into update-intensive and read-intensive tables, you will
also benefit.
At instance startup, a number of Parallel Cache Management (PCM) locks are created.
PCM locks lock data blocks being accessed within each instance to guarantee that
multiple instances do not alter the same data.
You can use PCM locks to lock data blocks for reading or for updating. If a PCM
lock is used as a read-lock, other instances can acquire read-locks on the same data
blocks. It is only when updating that an exclusive lock must be acquired.
PCM locks are allocated to datafiles; as such, they give you some flexibility
over the configuration of the locks. A PCM lock locks one or more data blocks, depending
on the number of PCM locks allocated to the datafile and the size of the datafile.
Because an inherent overhead is associated with PCM locks, it is not beneficial to
overconfigure the locks.
If you know your data-access patterns, you can configure your system based on
these general rules:
- Partition work between servers. Try to balance the systems so that users accessing
the same table reside on the same computer. This arrangement reduces lock contention
between machines. By segmenting the work, you can reduce the amount of lock traffic.
Remember that once a lock is acquired, it is released only when another system needs
to lock that data.
- Put lots of PCM locks on tables with heavy update traffic. If you have lots of
updates, you can benefit from lowering the blocks-per-lock ratio. By increasing the
number of locks, you increase overhead--but by having fewer blocks per lock, you
can cut down on the percentage of locks with contention.
- Use PCTFREE and PCTUSED to specify fewer rows per block on
high-contention tables. By doing this and decreasing the number of blocks per lock,
you reduce the lock contention--at the cost of more locks and more space required.
- Put fewer locks on read tables. If you have tables that are mostly read, use
fewer PCM locks. Read locks are not exclusive; the reduction in locks cuts down on
interconnect traffic.
- Partition indexes to separate tablespaces. Because indexes are mostly read, you
can benefit by requiring fewer PCM locks. By segmenting the tables, you can put fewer
PCM locks on the index tables and more on the data tables.
The dynamic performance tables V$BH, V$CACHE, and V$PING
contain information about the frequency of PCM lock contention. By looking at the
FREQUENCY column in these tables, you can get an idea of the number of times
lock conversions took place because of contention between instances.
The dynamic performance table, V$LOCK_ACTIVITY, provides information
about all types of PCM lock conversions. From this information you can determine
whether a particular instance is seeing a dramatic change in lock activity. An increase
in lock activity might indicate that you have an insufficient number of PCM locks
on that instance. With this information, you can use the V$BH, V$CACHE,
and V$PING tables to identify the problem area.
The Parallel Server option can be effective if your application is partitionable.
If all the users in your system must access the same data, a parallel server might
not be for you. But if you can partition your workload into divisions based on table
access or if you need a fault-tolerant configuration, the Parallel Server option
may work.
If you use the Parallel Server option, you must take special care to properly
configure the system. By designing the system properly, you can take maximum advantage
of the parallel server features.
The Parallel Query Option
The Oracle Parallel Query option makes it possible for some Oracle functions to
be processed by multiple server processes. These functions are queries, index creation,
data loading, and recovery. In each of these functions, the general principle is
the same: Keep processing while Oracle waits for I/O.
For most queries, the time spent waiting for the data to be retrieved from disk
usually overshadows the amount of time actually spent processing the results. With
the Parallel Query option, you can compensate for this by using several server processes
to execute the query. While one process is waiting for I/Os to complete, other processes
can be executing. If you are running on a Symmetric Multiprocessor (SMP) computer,
a cluster, or a Massively Parallel Processing (MPP) machine, you can take maximum
advantage of the Parallel Query option.
Many different types of SQL functions can take advantage of the Parallel Query
option, including
- Aggregatations
- The CREATE INDEX function
- The CREATE TABLE AS SELECT function
- The DELETE function
- The GROUP BY function
- Hash joins
- The instance recovery operation
- Nested loop joins
- The ORDER BY function
- Sort merge joins
- Table scans
- Union operations
- The UPDATE function
These operations and several others will automatically be parallelized based on
the database-initialization parameters and table definitions.
Parallel Query Processing
Parallel query processing allows certain Oracle statements to be run in parallel
by multiple server processes. The Oracle server can process the following statements
in parallel:
- SELECT statements
- Subqueries in UPDATE and DELETE statements
- CREATE TABLEtablename AS SELECT statements
- CREATE INDEX statements
Parallel queries are effective on large operations such as table scans and sorts.
Parallel Query Operation
With traditional queries such as table scans, the server process reads the data
sequentially, as shown in Figure 19.2. Much of the time spent in this query is spent
waiting for I/Os to complete.
Figure 19.2.
A table scan performed without the use of the Parallel Query option.
New Term: A parallel query splits the
query into several pieces, each handled by a different server process. These processes,
called query servers, are dispatched by a process known as the query coordinator.
The query coordinator dispatches the query servers and coordinates the results from
all the servers to send back to the user. The result of this arrangement is that
many smaller table scans take place under the hood (transparent to the user). From
the user's standpoint, it is simply a much faster table scan. Figure 19.3 shows a
parallel query.
Figure 19.3.
A table scan performed with the use of the Parallel Query option.
The query coordinator is given a SQL statement and a degree of parallelism, and
is responsible for dividing the query among the query servers and integrating the
individual results into one result. The degree of parallelism is the number of query
servers assigned to the particular query.
The Oracle server can make parallel the following operations:
Each of these operations has requirements that determine how the query is parallelized.
The performance achieved by the parallel query is determined both by the size of
the data to be accessed and the degree of parallelism achieved.
The query coordinator determines how the query is parallelized (if at all). The
decision is made in this order:
- 1. The optimizer determines the execution plan of the statement.
2. The query coordinator determines which operations can be performed in
parallel.
3. The query coordinator determines how many query servers to enlist.
4. The query coordinator enlists query servers that perform the query.
5. The query coordinator reassembles the resulting data and passes it back
to the user.
The degree of parallelism is determined using the following precedence:
- 1. Query hints--User-defined hints included in the SQL statement have
the highest precedence.
2. Table definition--The default degree of parallelism as defined for the
table has second precedence.
3. Initialization parameters--Finally, the Oracle initialization parameters
are used.
Regardless of what these values are set to, the number of query servers cannot
exceed the number of query servers available in the query-server pool. This number
is specified by the Oracle initialization parameter PARALLEL_MAX_SERVERS.
Hints for the degree of parallelism are set within a comment string in the SQL
statement. The syntax of this comment is as follows:
PARALLEL (alias_or_tablename , [integer/DEFAULT] [, integer/DEFAULT])
The PARALLEL hint specifies the table or alias being scanned, followed
by a value for the number of query servers to be used (or DEFAULT). The
final optional value specifies how the table is to be split among different instances
of a parallel server. Here is an example using the dogs table that you have
seen throughout this book:
SELECT /*+ FULL(dogs) PARALLEL(dogs, 4) */
dogname
FROM dogs;
Because of the addition of the FULL and PARALLEL hints to this
statement, the Oracle optimizer will create an execution plan that uses a full-table
scan. Furthermore, this table scan will be executed with a parallel degree of 4
if the query servers are available. This statement overrides both the degree of parallelism
specified in the table definition and the default Oracle initialization parameters.
The hint NOPARALLEL disables parallel scanning of a table and overrides
the specified degree of parallelism. The NOPARALLEL hint has the following
syntax:
NOPARALLEL (alias_or_tablename)
Parallel Query Tuning
Parallel query operations can be very effective on multiprocessor or parallel-processing
computers; they can also be effective on uniprocessor systems where much of the time
is spent waiting for I/O operations to complete. Systems with sufficient I/O bandwidth--and
especially systems with disk arrays--benefit from parallel query operations.
If your system is typically processing at 100% of your CPU utilization and you
have a small number of disk drives, you probably won't benefit from parallel query
operations. The same is true if your system is extremely limited in memory.
The two areas that can be tuned for parallel queries are I/O and parallel servers.
By properly configuring your datafiles, you can help parallel queries be more effective.
I/O Configuration
The function of a parallel query is to split up query operations so that they
more effectively take advantage of the system. One of the ways a parallel query does
this is by allowing the processing of the query to continue while pieces of the query
operation are stalled, waiting for I/Os to complete. Parallel queries are not effective
if the entire table is limited to one disk drive.
If you stripe the table across many drives, I/Os can be distributed and a higher
level of parallelism can occur. Striping can be performed with OS striping, with
Oracle striping, or (better yet) with a hardware disk array. See Day 20, "Effectively
Tuning and Optimizing the Database," for more detailed information about tuning
I/O.
Large contiguous extents can also help performance in parallel query operations.
During scan operations, the query coordinator splits contiguous ranges of blocks
into large, medium, and small groups of blocks. Each query server is given a large
group of blocks to start with, and each server progressively works its way down to
the small group of blocks until the scan is completed. This is done in an attempt
to balance the load handled by each query server. If several large extents are in
a table, the query coordinator can find blocks to dispatch to the query servers much
more easily.
TIP: Remember, your temporary tablespace
should consist of several large extents on a striped volume. This arrangement helps
sorting performance.
Degree of Parallelism
The proper distribution of I/Os and the degree of parallelism are the two most
important things to tune in the Parallel Query option. Tuning the degree of parallelism
is partially trial and error and partially analysis. It is very important to take
notes when you are experimenting with the degree of parallelism. Your first guess
should be based on the following factors:
- The CPU capacity of your system--The number and capacity of CPUs affects the
number of query processes you should run.
- The capacity of the system to handle large numbers of processes--Some operating
systems can handle many simultaneous threads; others are more limited.
- The system load--If the system is already running at 100% capacity, the degree
of parallelism doesn't have much effect. If you are running at 90%, too many query
processes can overload the system.
- The amount of query processing on the system--If most operations are updates
but there are a few critical queries, you might want many query processes.
- The I/O capacity of the system--If your disks are striped or if you are using
a disk array, you should be able to handle a large number of parallel queries.
- The types of operations--Are you performing many full-table scans or sorts? These
operations benefit greatly from parallel query servers.
All these parameters should have some influence on the degree of parallelism you
set for your system. Remember that the preceding points are simply guidelines to
help you guess a starting point. Here are a few other suggestions:
- CPU-intensive operations such as sorts should indicate a lower degree of parallelism.
CPU-bound tasks are already taking advantage of the CPUs and tend not to be waiting
for I/O.
- Disk-intensive operations such as full-table scans should indicate a higher degree
of parallelism. The more operations waiting for I/O, the more the system can benefit
from another query server.
- Many concurrent processes should indicate a lower degree of parallelism. Too
many processes can overload the system.
Once you determine your starting point, you can monitor your system by querying
the dynamic performance table, V$PQ_SYSSTAT. This can be performed with
the query shown in Listing 19.1.
INPUT:
Listing 19.1. Viewing parallel query performance data
from V$PQ_SYSSTAT.
SQL> select * from v$pq_sysstat;
OUTPUT:
STATISTIC VALUE
------------------------------ ----------
Servers Busy 0
Servers Idle 12
Servers Highwater 16
Server Sessions 380
Servers Started 4
Servers Shutdown 4
Servers Cleaned Up 0
Queries Initiated 21
DFO Trees 77
Local Msgs Sent 2459361
Distr Msgs Sent 0
Local Msgs Recv'd 2459318
Distr Msgs Recv'd 0
13 rows selected.
ANLYSIS: When looking at the output from this query, the following statistics
are quite useful:
- Servers Busy--This indicates the number of servers busy at any one time.
Check this statistic several times to get a good idea of the average value. If the
value is equal to the initialization parameter PARALLEL_MIN_SERVERS, you
have probably configured too many query servers.
- Servers IdlE--This indicates the number of servers idle at any one time.
If you always have many idle servers, consider reducing PARALLEL_MIN_SERVERS.
- Servers Starter--This indicates the number of query servers that have
started in this instance. If the value for Servers Busy is low but you see
a large number for Servers Started, you might be using query servers sporadically.
- Servers Shutdown--This indicates the number of query servers that have
been shut down because they are idle. This value is most likely similar to the Servers
Started value.
After you determine your degree of parallelism, begin testing; evaluate the information
you get from V$PQ_SYSSTAT and from your operating system's monitoring facilities.
Keep an eye out for CPU usage and excessive waiting for I/O. If the CPU usage is
too high, try reducing the degree of parallelism. If the CPU usage is too low and
there is significant waiting for I/O, try increasing the degree of parallelism.
Remember that the degree of parallelism is determined by SQL hints, table definitions,
and initialization parameters. The total number of query servers is determined by
the initialization parameter PARALLEL_MAX_SERVERS; the number started initially
is determined by the initialization parameter PARALLEL_MIN_SERVERS.
The total number of query servers in use is the number of queries executed in
parallel multiplied by their degree of parallelism. If you try to use more than PARALLEL_MAX_SERVERS,
you will not be able to parallelize your query.
Direct-Write Sorts
You can use the Direct Write Sort option with the Parallel Query option and have
each query server perform its own direct writes.
As you saw earlier, using direct writes causes the server processes to write the
output of sort operations directly to disk, bypassing the buffer cache. The effect
of direct writes is that for sort operations, large amounts of block buffers are
not ejected from the buffer cache. This leaves the buffer cache available for normal
queries and updates. When you use direct-write sorts with the Parallel Query option,
each query server gets its own set of direct-write buffers.
Remember, direct-write sorts require more memory than normal sorts. The amount
of memory that these sorts use with the Parallel Query option can be determined with
the following formula:
- Direct write sort memory = (number of query servers) * SORT_WRITE_BUFFERS
* SORT_WRITE_BUFFER_SIZE
Use direct-write sorts only if you have sufficient memory and temporary disk space.
The temporary disk space should have a sufficient I/O bandwidth to handle the load.
Parallel Index Creation
Another feature of the Parallel Query option is its ability to create indexes
in parallel. With the parallel index creation feature, the time it takes to create
an index can be greatly reduced.
As with parallel query processing, a coordinator process dispatches two sets of
query servers. One set of query servers scans the table to be indexed to obtain the
ROWIDs and column values needed for the index. Another set performs the
sorting on those values and passes the results to the coordinator process. The coordinator
process then assembles the B*-tree index from these sorted items.
When creating an index, the degree of parallelism follows the same precedence
as it does in parallel query processing. The first value used is an optional PARALLEL
clause in the CREATE INDEX statement, followed by the table definition,
and finally the initialization parameters.
Creating an index in parallel can be several times faster than creating an index
by normal means. The same conditions apply for index creation as were given for parallel
query processing. A system that has been configured to take advantage of parallel
query processing will also see good performance from parallel index creation.
Parallel Loading
You can load in parallel by having multiple concurrent sessions perform a direct
path load into the same table. Depending on the configuration of the system, you
can see excellent load performance by loading in parallel. Because loading is both
CPU and I/O intensive, you should see good results in an SMP or MPP environment with
a high-bandwidth I/O subsystem.
Parallel loads are performed by multiple direct loader processes, each using the
PARALLEL=TRUE and DIRECT=TRUE options. When you specify PARALLEL=TRUE,
the loader does not place an exclusive lock on the table being loaded as it would
otherwise. During the parallel load, the loader creates temporary segments for each
of the concurrent processes and merges them on completion.
Although parallel loading performs best when each temporary file is located on
a separate disk, the increased performance of the load does not usually justify the
complexity of the manual striping needed to do this. I still recommend striping the
tables on an OS level--or preferably on a hardware disk. You can improve performance
by putting each of the input files on a separate volume to take advantage of the
sequential nature of the reads.
Parallel loading can be beneficial, especially if load time is critical in your
environment. By putting each of the input files on separate disk volumes, you can
increase performance. Overall, the general tuning principles used in parallel query
processing are also valid in parallel loading.
Parallel Recovery
Parallel recovery is probably my favorite feature of the Parallel Query option.
When benchmarking Oracle and testing hardware and software, it is often necessary
to intentionally crash the system to prove recoverability. With the Parallel Recovery
option, the time it takes to perform an instance recovery can be dramatically reduced.
Recovery time is significantly reduced when the system being recovered has many
disks and supports asynchronous I/O. For a small system that has few drives or for
an operating system that does not support asynchronous I/O, it might not be wise
to enable parallel recovery.
In traditional recovery, one process reads from the redo log files and applies
changes to the datafiles, as shown in Figure 19.4. This operation can take a significant
amount of time because the recovery process must wait for disk I/Os to complete.
Figure 19.4.
Traditional instance recovery without parallel query.
With the Parallel Recovery option, one process is responsible for reading and
dispatching redo entries from the redo log files and passing those entries to the
recovery processes that apply the changes to the datafiles, as shown in Figure 19.5.
Because the dispatcher process reads sequentially from the redo log files, the
I/O performance is much higher than that of the recovery processes that are writing
random data throughout the datafiles. Because writing the data is very seek intensive,
it is a good idea to have one or two recovery processes for each data disk in the
system.
By having more recovery processes, you can have more outstanding I/Os and thus
use all the data drives simultaneously. Because recovery occurs at instance startup,
this arrangement reduces dead time when no other database processing can be performed.
Figure 19.5.
Instance recovery with parallel recovery.
The number of concurrent recovery processes is set with the initialization parameter
RECOVERY_PARALLEL. The value of this parameter cannot exceed the value specified
in the initialization parameter PARALLEL_MAX_SERVERS.
If you specify a sufficient number of recovery servers, you will see an immediate
improvement in instance recovery time. Do not use parallel recovery if your system
does not support asynchronous I/O or if you are limited to a few disk drives. If
your I/O subsystem is high bandwidth and your data is properly striped (either through
software or hardware), you should see very good improvement.
In summary, the Parallel Query option is useful for distributing processing loads
so that CPUs process while other processes wait for I/Os to complete. With multiprocessor
machines, the Parallel Query option can be quite beneficial; this is not to say that
the option is not beneficial on uniprocessor machines as well.
NOTE: Probably the biggest performance
problem I have come across is a lack of disk drives. As larger and larger disks are
produced at lower and lower prices, many installations end up with I/O problems because
a few large disks can provide the space of many smaller disks. The larger disks provide
more disk space but not much more performance (per disk). The Parallel Query option
can help only in systems where I/O is not a bottleneck. When I/O is not a problem,
you will see significant gains from parallel queries.
If you have processes waiting for queries to complete and a sufficient number of
disk drives, you will see an improvement with parallel queries, regardless of whether
you are on a multiprocessor or uniprocessor system.
Summary
Today you learned about several features available to Oracle, including the Oracle
Parallel Query and Oracle Parallel Server options. These features are quite different
from each other and serve completely different purposes.
The Oracle Parallel Server option (OPS), a clustering option, can be used as a
performance option and as a fault-tolerant option. Because multiple systems can access
the same database, a significant performance enhancement can be seen in some situations.
NOTE: Under some conditions, the Oracle
Parallel Server option can provide a significant performance increase. The performance
improvement you see is very dependent on your application and data layout. Without
careful planning, you might not see any performance improvement, and you might even
see performance degradation.
The Oracle Parallel Query option allows certain SQL operations to be parallelized,
thus reducing idle time waiting for I/Os to complete. If you take advantage of the
Parallel Query option, you can see very good performance enhancements under the right
conditions. The Parallel Query option is probably my favorite feature of the Oracle
RDBMS because such incredible results can be seen from the right parallelism.
What's Next?
In tomorrow's lesson, "Effectively Tuning and Optimizing the Database,"
you will learn to tune the Oracle RDBMS, and how to use OS and RDBMS information
to make changes that will improve your system's performance. You will learn how to
use the Oracle administration scripts UTLBSTAT.SQL and UTLESTAT.SQL
to obtain Oracle performance information, and you will learn how to interpret some
of that information.
Q&A
- Q What is the Oracle Parallel Server option?
A The Oracle Parallel Server option is a feature of Oracle that allows
you to cluster two or more systems sharing the same database.
Q Does the Parallel Server option require any special hardware?
A Yes. The Parallel Server option requires a server interconnect that might
be as simple as a fast network card and a shared-disk subsystem. The shared-disk
subsystem is provided by your hardware vendor.
Q What is the Parallel Query option?
A The Parallel Query option allows certain SQL operations to be parallelized.
If you parallelize the operation, time spent waiting for I/O by one thread of operation
can be spent processing another thread of operation.
Q Do single-processor systems benefit from the Parallel Query option?
A Definitely. Because the greatest benefit of the Parallel Query option
is to keep the system busy while it is waiting for I/Os to complete, this will work
well even with a single-CPU system.
Workshop
The workshop provides quiz questions to help you solidify your understanding of
the material covered and exercises to provide you with experience in using what you've
learned. Find answers to the quiz questions in Appendix A, "Answers."
Quiz
- 1. What is the Oracle Parallel Server option used for?
2. What is the server interconnect for?
3. What is a shared-disk subsystem?
4. What does DLM stand for?
5. What does PCM stand for?
6. What is the Parallel Query option used for?
7. How does the Parallel Query option improve performance?
8. Name the primary way to set the degree of parallelism.
9. Name the secondary way to set the degree of parallelism.
10. Name some operations that can be parallelized.
Exercises
- 1. Run a query using traditional methods while timing it.
2. Run the same query using a parallel degree of 2 set in a hint.
Compare the time.
3. Run the same query using a parallel degree of 4 set in a hint.
Compare the time.
4. Run the same query using a parallel degree of 8 set in a hint.
Compare the time.
© Copyright, Macmillan Computer Publishing. All
rights reserved.
|