Teach Yourself SQL in 21 Days, Second Edition
- Day 6 -
Joining Tables
Objectives
Today you will learn about joins. This information will enable you to gather and
manipulate data across several tables. By the end of the day, you will understand
and be able to do the following:
- Perform an outer join
- Perform a left join
- Perform a right join
- Perform an equi-join
- Perform a non-equi-join
- Join a table to itself
Introduction
One of the most powerful features of SQL is its capability to gather and manipulate
data from across several tables. Without this feature you would have to store all
the data elements necessary for each application in one table. Without common tables
you would need to store the same data in several tables. Imagine having to redesign,
rebuild, and repopulate your tables and databases every time your user needed a query
with a new piece of information. The JOIN statement of SQL enables you to
design smaller, more specific tables that are easier to maintain than larger tables.
Multiple Tables in a Single SELECT Statement
Like Dorothy in The Wizard of Oz, you have had the power to join tables since
Day 2, "Introduction to the Query: The SELECT Statement," when
you learned about SELECT and FROM. Unlike Dorothy, you don't have
to click you heels together three times to perform a join. Use the following two
tables, named, cleverly enough, TABLE1 and TABLE2.
NOTE: The queries in today's examples
were produced using Borland's ISQL tool. You will notice some differences between
these queries and the ones that we used earlier in the book. For example, these queries
do not begin with an SQL prompt. Another difference is that ISQL does not require
a semicolon at the end of the statement. (The semicolon is optional in ISQL.) But
the SQL basics are still the same.
INPUT:
SELECT *
FROM TABLE1
OUTPUT:
ROW REMARKS
========== =======
row 1 Table 1
row 2 Table 1
row 3 Table 1
row 4 Table 1
row 5 Table 1
row 6 Table 1
INPUT:
SELECT *
FROM TABLE2
OUTPUT:
ROW REMARKS
========== ========
row 1 table 2
row 2 table 2
row 3 table 2
row 4 table 2
row 5 table 2
row 6 table 2
To join these two tables, type this:
INPUT:
SELECT *
FROM TABLE1,TABLE2
OUTPUT:
ROW REMARKS ROW REMARKS
========== ========== ========== ========
row 1 Table 1 row 1 table 2
row 1 Table 1 row 2 table 2
row 1 Table 1 row 3 table 2
row 1 Table 1 row 4 table 2
row 1 Table 1 row 5 table 2
row 1 Table 1 row 6 table 2
row 2 Table 1 row 1 table 2
row 2 Table 1 row 2 table 2
row 2 Table 1 row 3 table 2
row 2 Table 1 row 4 table 2
row 2 Table 1 row 5 table 2
row 2 Table 1 row 6 table 2
row 3 Table 1 row 1 table 2
row 3 Table 1 row 2 table 2
row 3 Table 1 row 3 table 2
row 3 Table 1 row 4 table 2
row 3 Table 1 row 5 table 2
row 3 Table 1 row 6 table 2
row 4 Table 1 row 1 table 2
row 4 Table 1 row 2 table 2
row 4 Table 1 row 3 table 2
row 4 Table 1 row 4 table 2
row 4 Table 1 row 5 table 2
row 4 Table 1 row 6 table 2
row 5 Table 1 row 1 table 2
row 5 Table 1 row 2 table 2
row 5 Table 1 row 3 table 2
row 5 Table 1 row 4 table 2
row 5 Table 1 row 5 table 2
row 5 Table 1 row 6 table 2
row 6 Table 1 row 1 table 2
row 6 Table 1 row 2 table 2
row 6 Table 1 row 3 table 2
row 6 Table 1 row 4 table 2
row 6 Table 1 row 5 table 2
row 6 Table 1 row 6 table 2
Thirty-six rows! Where did they come from? And what kind of join is this?
ANALYSIS:
A close examination of the result of your first join shows that each row from
TABLE1 was added to each row from TABLE2. An extract from this
join shows what happened:
OUTPUT:
ROW REMARKS ROW REMARKS
===== ========== ========= ========
row 1 Table 1 row 1 table 2
row 1 Table 1 row 2 table 2
row 1 Table 1 row 3 table 2
row 1 Table 1 row 4 table 2
row 1 Table 1 row 5 table 2
row 1 Table 1 row 6 table 2
Notice how each row in TABLE2 was combined with row 1 in TABLE1.
Congratulations! You have performed your first join. But what kind of join? An inner
join? an outer join? or what? Well, actually this type of join is called a cross-join.
A cross-join is not normally as useful as the other joins covered today, but this
join does illustrate the basic combining property of all joins: Joins bring tables
together.
Suppose you sold parts to bike shops for a living. When you designed your database,
you built one big table with all the pertinent columns. Every time you had a new
requirement, you added a new column or started a new table with all the old data
plus the new data required to create a specific query. Eventually, your database
would collapse from its own weight--not a pretty sight. An alternative design, based
on a relational model, would have you put all related data into one table. Here's
how your customer table would look:
INPUT:
SELECT *
FROM CUSTOMER
OUTPUT:
NAME ADDRESS STATE ZIP PHONE REMARKS
========== ========== ====== ========== ========= ==========
TRUE WHEEL 55O HUSKER NE 58702 555-4545 NONE
BIKE SPEC CPT SHRIVE LA 45678 555-1234 NONE
LE SHOPPE HOMETOWN KS 54678 555-1278 NONE
AAA BIKE 10 OLDTOWN NE 56784 555-3421 JOHN-MGR
JACKS BIKE 24 EGLIN FL 34567 555-2314 NONE
ANALYSIS:
This table contains all the information you need to describe your customers. The
items you sold would go into another table:
INPUT:
SELECT *
FROM PART
OUTPUT:
PARTNUM DESCRIPTION PRICE
=========== ==================== ===========
54 PEDALS 54.25
42 SEATS 24.50
46 TIRES 15.25
23 MOUNTAIN BIKE 350.45
76 ROAD BIKE 530.00
10 TANDEM 1200.00
And the orders you take would have their own table:
INPUT:
SELECT *
FROM ORDERS
OUTPUT:
ORDEREDON NAME PARTNUM QUANTITY REMARKS
=========== ========== =========== =========== =======
15-MAY-1996 TRUE WHEEL 23 6 PAID
19-MAY-1996 TRUE WHEEL 76 3 PAID
2-SEP-1996 TRUE WHEEL 10 1 PAID
30-JUN-1996 TRUE WHEEL 42 8 PAID
30-JUN-1996 BIKE SPEC 54 10 PAID
30-MAY-1996 BIKE SPEC 10 2 PAID
30-MAY-1996 BIKE SPEC 23 8 PAID
17-JAN-1996 BIKE SPEC 76 11 PAID
17-JAN-1996 LE SHOPPE 76 5 PAID
1-JUN-1996 LE SHOPPE 10 3 PAID
1-JUN-1996 AAA BIKE 10 1 PAID
1-JUL-1996 AAA BIKE 76 4 PAID
1-JUL-1996 AAA BIKE 46 14 PAID
11-JUL-1996 JACKS BIKE 76 14 PAID
One advantage of this approach is that you can have three specialized people or
departments responsible for maintaining their own data. You don't need a database
administrator who is conversant with all aspects of your project to shepherd one
gigantic, multidepartmental database. Another advantage is that in the age of networks,
each table could reside on a different machine. People who understand the data could
maintain it, and it could reside on an appropriate machine (rather than that nasty
corporate mainframe protected by legions of system administrators).
Now join PARTS and ORDERS:
INPUT/OUTPUT:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM,
P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P
ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION
=========== ========== =========== ========= ============
15-MAY-1996 TRUE WHEEL 23 54 PEDALS
19-MAY-1996 TRUE WHEEL 76 54 PEDALS
2-SEP-1996 TRUE WHEEL 10 54 PEDALS
30-JUN-1996 TRUE WHEEL 42 54 PEDALS
30-JUN-1996 BIKE SPEC 54 54 PEDALS
30-MAY-1996 BIKE SPEC 10 54 PEDALS
30-MAY-1996 BIKE SPEC 23 54 PEDALS
17-JAN-1996 BIKE SPEC 76 54 PEDALS
17-JAN-1996 LE SHOPPE 76 54 PEDALS
1-JUN-1996 LE SHOPPE 10 54 PEDALS
1-JUN-1996 AAA BIKE 10 54 PEDALS
1-JUL-1996 AAA BIKE 76 54 PEDALS
1-JUL-1996 AAA BIKE 46 54 PEDALS
11-JUL-1996 JACKS BIKE 76 54 PEDALS
...
ANALYSIS:
The preceding code is just a portion of the result set. The actual set is 14 (number
of rows in ORDERS) x 6 (number of rows in PART), or 84 rows. It
is similar to the result from joining TABLE1 and TABLE2 earlier
today, and it is still one statement shy of being useful. Before we reveal that statement,
we need to regress a little and talk about another use for the alias.
Finding the Correct Column
When you joined TABLE1 and TABLE2, you used SELECT *,
which returned all the columns in both tables. In joining ORDERS to PART,
the SELECT statement is a bit more complicated:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM,
P.PARTNUM, P.DESCRIPTION
SQL is smart enough to know that ORDEREDON and NAME exist only
in ORDERS and that DESCRIPTION exists only in PART, but
what about PARTNUM, which exists in both? If you have a column that has
the same name in two tables, you must use an alias in your SELECT clause
to specify which column you want to display. A common technique is to assign a single
character to each table, as you did in the FROM clause:
FROM ORDERS O, PART P
You use that character with each column name, as you did in the preceding SELECT
clause. The SELECT clause could also be written like this:
SELECT ORDEREDON, NAME, O.PARTNUM, P.PARTNUM, DESCRIPTION
But remember, someday you might have to come back and maintain this query. It
doesn't hurt to make it more readable. Now back to the missing statement.
Equi-Joins
An extract from the PART/ORDERS join provides a clue as to what is missing:
30-JUN-1996 TRUE WHEEL 42 54 PEDALS
30-JUN-1996 BIKE SPEC 54 54 PEDALS
30-MAY-1996 BIKE SPEC 10 54 PEDALS
Notice the PARTNUM fields that are common to both tables. What if you
wrote the following?
INPUT:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM,
P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P
WHERE O.PARTNUM = P.PARTNUM
OUTPUT:
ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION
=========== ========== =========== ========= ==============
1-JUN-1996 AAA BIKE 10 10 TANDEM
30-MAY-1996 BIKE SPEC 10 10 TANDEM
2-SEP-1996 TRUE WHEEL 10 10 TANDEM
1-JUN-1996 LE SHOPPE 10 10 TANDEM
30-MAY-1996 BIKE SPEC 23 23 MOUNTAIN BIKE
15-MAY-1996 TRUE WHEEL 23 23 MOUNTAIN BIKE
30-JUN-1996 TRUE WHEEL 42 42 SEATS
1-JUL-1996 AAA BIKE 46 46 TIRES
30-JUN-1996 BIKE SPEC 54 54 PEDALS
1-JUL-1996 AAA BIKE 76 76 ROAD BIKE
17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE
19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE
11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE
17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
ANALYSIS:
Using the column PARTNUM that exists in both of the preceding tables,
you have just combined the information you had stored in the ORDERS table
with information from the PART table to show a description of the parts
the bike shops have ordered from you. The join that was used is called an equi-join
because the goal is to match the values of a column in one table to the corresponding
values in the second table.
You can further qualify this query by adding more conditions in the WHERE
clause. For example:
INPUT/OUTPUT:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM,
P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P
WHERE O.PARTNUM = P.PARTNUM
AND O.PARTNUM = 76
ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION
=========== ========== =========== ========== ============
1-JUL-1996 AAA BIKE 76 76 ROAD BIKE
17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE
19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE
11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE
17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
The number 76 is not very descriptive, and you wouldn't want your sales
people to have to memorize a part number. (We have had the misfortune to see many
data information systems in the field that require the end user to know some obscure
code for something that had a perfectly good name. Please don't write one of those!)
Here's another way to write the query:
INPUT/OUTPUT:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM,
P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P
WHERE O.PARTNUM = P.PARTNUM
AND P.DESCRIPTION = 'ROAD BIKE'
ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION
=========== ========== =========== ========== ============
1-JUL-1996 AAA BIKE 76 76 ROAD BIKE
17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE
19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE
11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE
17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
Along the same line, take a look at two more tables to see how they can be joined.
In this example the employee_id column should obviously be unique. You could
have employees with the same name, they could work in the same department, and earn
the same salary. However, each employee would have his or her own employee_id.
To join these two tables, you would use the employee_id column.
EMPLOYEE_TABLE |
EMPLOYEE_PAY_TABLE |
employee_id |
employee_id |
last_name |
salary |
first_name |
department |
middle_name |
supervisor |
|
marital_status |
INPUT:
SELECT E.EMPLOYEE_ID, E.LAST_NAME, EP.SALARY
FROM EMPLOYEE_TBL E,
EMPLOYEE_PAY_TBL EP
WHERE E.EMPLOYEE_ID = EP.EMPLOYEE_ID
AND E.LAST_NAME = 'SMITH';
OUTPUT:
E.EMPLOYEE_ID E.LAST_NAME EP.SALARY
============= =========== =========
13245 SMITH 35000.00
TIP: When you join two tables without
the use of a WHERE clause, you are performing a Cartesian join. This join
combines all rows from all the tables in the FROM clause. If each table
has 200 rows, then you will end up with 40,000 rows in your results (200 x 200).
Always join your tables in the WHERE clause unless you have a real need
to join all the rows of all the selected tables.
Back to the original tables. Now you are ready to use all this information about
joins to do something really useful: finding out how much money you have made from
selling road bikes:
INPUT/OUTPUT:
SELECT SUM(O.QUANTITY * P.PRICE) TOTAL
FROM ORDERS O, PART P
WHERE O.PARTNUM = P.PARTNUM
AND P.DESCRIPTION = 'ROAD BIKE'
TOTAL
===========
19610.00
ANALYSIS:
With this setup, the sales people can keep the ORDERS table updated,
the production department can keep the PART table current, and you can find
your bottom line without redesigning your database.
NOTE: Notice the consistent use of table
and column aliases in the SQL statement examples. You will save many, many keystrokes
by using aliases. They also help to make your statement more readable.
Can you join more than one table? For example, to generate information to send
out an invoice, you could type this statement:
INPUT/OUTPUT:
SELECT C.NAME, C.ADDRESS, (O.QUANTITY * P.PRICE) TOTAL
FROM ORDER O, PART P, CUSTOMER C
WHERE O.PARTNUM = P.PARTNUM
AND O.NAME = C.NAME
NAME ADDRESS TOTAL
========== ========== ===========
TRUE WHEEL 55O HUSKER 1200.00
BIKE SPEC CPT SHRIVE 2400.00
LE SHOPPE HOMETOWN 3600.00
AAA BIKE 10 OLDTOWN 1200.00
TRUE WHEEL 55O HUSKER 2102.70
BIKE SPEC CPT SHRIVE 2803.60
TRUE WHEEL 55O HUSKER 196.00
AAA BIKE 10 OLDTOWN 213.50
BIKE SPEC CPT SHRIVE 542.50
TRUE WHEEL 55O HUSKER 1590.00
BIKE SPEC CPT SHRIVE 5830.00
JACKS BIKE 24 EGLIN 7420.00
LE SHOPPE HOMETOWN 2650.00
AAA BIKE 10 OLDTOWN 2120.00
You could make the output more readable by writing the statement like this:
INPUT/OUTPUT:
SELECT C.NAME, C.ADDRESS,
O.QUANTITY * P.PRICE TOTAL
FROM ORDERS O, PART P, CUSTOMER C
WHERE O.PARTNUM = P.PARTNUM
AND O.NAME = C.NAME
ORDER BY C.NAME
NAME ADDRESS TOTAL
========== ========== ===========
AAA BIKE 10 OLDTOWN 213.50
AAA BIKE 10 OLDTOWN 2120.00
AAA BIKE 10 OLDTOWN 1200.00
BIKE SPEC CPT SHRIVE 542.50
BIKE SPEC CPT SHRIVE 2803.60
BIKE SPEC CPT SHRIVE 5830.00
BIKE SPEC CPT SHRIVE 2400.00
JACKS BIKE 24 EGLIN 7420.00
LE SHOPPE HOMETOWN 2650.00
LE SHOPPE HOMETOWN 3600.00
TRUE WHEEL 55O HUSKER 196.00
TRUE WHEEL 55O HUSKER 2102.70
TRUE WHEEL 55O HUSKER 1590.00
TRUE WHEEL 55O HUSKER 1200.00
NOTE: Notice that when joining the three
tables (ORDERS, PART, and CUSTOMER) that the ORDERS
table was used in two joins and the other tables were used only once. Tables that
will return the fewest rows with the given conditions are commonly referred to as
driving tables, or base tables. Tables other than the base table in a query are usually
joined to the base table for more efficient data retrieval. Consequently, the ORDERS
table is the base table in this example. In most databases a few base tables join
(either directly or indirectly) all the other tables. (See Day 15, "Streamlining
SQL Statements for Improved Performance," for more on base tables.)
You can make the previous query more specific, thus more useful, by adding the
DESCRIPTION column as in the following example:
INPUT/OUTPUT:
SELECT C.NAME, C.ADDRESS,
O.QUANTITY * P.PRICE TOTAL,
P.DESCRIPTION
FROM ORDERS O, PART P, CUSTOMER C
WHERE O.PARTNUM = P.PARTNUM
AND O.NAME = C.NAME
ORDER BY C.NAME
NAME ADDRESS TOTAL DESCRIPTION
========== ========== =========== ==============
AAA BIKE 10 OLDTOWN 213.50 TIRES
AAA BIKE 10 OLDTOWN 2120.00 ROAD BIKE
AAA BIKE 10 OLDTOWN 1200.00 TANDEM
BIKE SPEC CPT SHRIVE 542.50 PEDALS
BIKE SPEC CPT SHRIVE 2803.60 MOUNTAIN BIKE
BIKE SPEC CPT SHRIVE 5830.00 ROAD BIKE
BIKE SPEC CPT SHRIVE 2400.00 TANDEM
JACKS BIKE 24 EGLIN 7420.00 ROAD BIKE
LE SHOPPE HOMETOWN 2650.00 ROAD BIKE
LE SHOPPE HOMETOWN 3600.00 TANDEM
TRUE WHEEL 55O HUSKER 196.00 SEATS
TRUE WHEEL 55O HUSKER 2102.70 MOUNTAIN BIKE
TRUE WHEEL 55O HUSKER 1590.00 ROAD BIKE
TRUE WHEEL 55O HUSKER 1200.00 TANDEM
ANALYSIS:
This information is a result of joining three tables. You can now use this information
to create an invoice.
NOTE: In the example at the beginning
of the day, SQL grouped TABLE1 and TABLE2 to create a new table
with X (rows in TABLE1) x Y (rows in TABLE2) number of rows. A
physical table is not created by the join, but rather in a virtual sense. The join
between the two tables produces a new set that meets all conditions in the WHERE
clause, including the join itself. The SELECT statement has reduced the
number of rows displayed, but to evaluate the WHERE clause SQL still creates
all the possible rows. The sample tables in today's examples have only a handful
of rows. Your actual data may have thousands of rows. If you are working on a platform
with lots of horsepower, using a multiple-table join might not visibly affect performance.
However, if you are working in a slower environment, joins could cause a significant
slowdown.
We aren't telling you not to use joins, because you have seen the advantages to
be gained from a relational design. Just be aware of the platform you are using and
your customer's requirements for speed versus reliability.
Non-Equi-Joins
Because SQL supports an equi-join, you might assume that SQL also has a non-equi-join.
You would be right! Whereas the equi-join uses an = sign in the WHERE
statement, the non-equi-join uses everything but an = sign. For example:
INPUT:
SELECT O.NAME, O.PARTNUM, P.PARTNUM,
O.QUANTITY * P.PRICE TOTAL
FROM ORDERS O, PART P
WHERE O.PARTNUM > P.PARTNUM
OUTPUT:
NAME PARTNUM PARTNUM TOTAL
========== =========== =========== ===========
TRUE WHEEL 76 54 162.75
BIKE SPEC 76 54 596.75
LE SHOPPE 76 54 271.25
AAA BIKE 76 54 217.00
JACKS BIKE 76 54 759.50
TRUE WHEEL 76 42 73.50
BIKE SPEC 54 42 245.00
BIKE SPEC 76 42 269.50
LE SHOPPE 76 42 122.50
AAA BIKE 76 42 98.00
AAA BIKE 46 42 343.00
JACKS BIKE 76 42 343.00
TRUE WHEEL 76 46 45.75
BIKE SPEC 54 46 152.50
BIKE SPEC 76 46 167.75
LE SHOPPE 76 46 76.25
AAA BIKE 76 46 61.00
JACKS BIKE 76 46 213.50
TRUE WHEEL 76 23 1051.35
TRUE WHEEL 42 23 2803.60
...
ANALYSIS:
This listing goes on to describe all the rows in the join WHERE O.PARTNUM
> P.PARTNUM. In the context of your bicycle shop, this information
doesn't have much meaning, and in the real world the equi-join is far more common
than the non-equi-join. However, you may encounter an application in which a non-equi-join
produces the perfect result.
Outer Joins versus Inner Joins
Just as the non-equi-join balances the equi-join, an outer join complements the
inner join. An inner join is where the rows of the tables are combined with each
other, producing a number of new rows equal to the product of the number of rows
in each table. Also, the inner join uses these rows to determine the result of the
WHERE clause. An outer join groups the two tables in a slightly different
way. Using the PART and ORDERS tables from the previous examples,
perform the following inner join:
INPUT:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE,
O.NAME, O.PARTNUM
FROM PART P
JOIN ORDERS O ON ORDERS.PARTNUM = 54
OUTPUT:
PARTNUM DESCRIPTION PRICE NAME PARTNUM
======= ==================== =========== ========== ===========
54 PEDALS 54.25 BIKE SPEC 54
42 SEATS 24.50 BIKE SPEC 54
46 TIRES 15.25 BIKE SPEC 54
23 MOUNTAIN BIKE 350.45 BIKE SPEC 54
76 ROAD BIKE 530.00 BIKE SPEC 54
10 TANDEM 1200.00 BIKE SPEC 54
NOTE: The syntax you used to get this
join--JOIN ON--is not ANSI standard. The implementation you used
for this example has additional syntax. You are using it here to specify an inner
and an outer join. Most implementations of SQL have similar extensions. Notice the
absence of the WHERE clause in this type of join.
ANALYSIS:
The result is that all the rows in PART are spliced on to specific rows
in ORDERS where the column PARTNUM is 54. Here's a RIGHT
OUTER JOIN statement:
INPUT/OUTPUT:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE,
O.NAME, O.PARTNUM
FROM PART P
RIGHT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54
PARTNUM DESCRIPTION PRICE NAME PARTNUM
======= ==================== ======= ============== =======
<null> <null> <null> TRUE WHEEL 23
<null> <null> <null> TRUE WHEEL 76
<null> <null> <null> TRUE WHEEL 10
<null> <null> <null> TRUE WHEEL 42
54 PEDALS 54.25 BIKE SPEC 54
42 SEATS 24.50 BIKE SPEC 54
46 TIRES 15.25 BIKE SPEC 54
23 MOUNTAIN BIKE 350.45 BIKE SPEC 54
76 ROAD BIKE 530.00 BIKE SPEC 54
10 TANDEM 1200.00 BIKE SPEC 54
<null> <null> <null> BIKE SPEC 10
<null> <null> <null> BIKE SPEC 23
<null> <null> <null> BIKE SPEC 76
<null> <null> <null> LE SHOPPE 76
<null> <null> <null> LE SHOPPE 10
<null> <null> <null> AAA BIKE 10
<null> <null> <null> AAA BIKE 76
<null> <null> <null> AAA BIKE 46
<null> <null> <null> JACKS BIKE 76
ANALYSIS:
This type of query is new. First you specified a RIGHT OUTER JOIN, which
caused SQL to return a full set of the right table, ORDERS, and to place
nulls in the fields where ORDERS.PARTNUM <> 54. Following is a LEFT
OUTER JOIN statement:
INPUT/OUTPUT:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE,
O.NAME, O.PARTNUM
FROM PART P
LEFT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54
PARTNUM DESCRIPTION PRICE NAME PARTNUM
======= ================== =========== ========== ===========
54 PEDALS 54.25 BIKE SPEC 54
42 SEATS 24.50 BIKE SPEC 54
46 TIRES 15.25 BIKE SPEC 54
23 MOUNTAIN BIKE 350.45 BIKE SPEC 54
76 ROAD BIKE 530.00 BIKE SPEC 54
10 TANDEM 1200.00 BIKE SPEC 54
ANALYSIS:
You get the same six rows as the INNER JOIN. Because you specified LEFT
(the LEFT table), PART determined the number of rows you would
return. Because PART is smaller than ORDERS, SQL saw no need to
pad those other fields with blanks.
Don't worry too much about inner and outer joins. Most SQL products determine
the optimum JOIN for your query. In fact, if you are placing your query
into a stored procedure (or using it inside a program (both stored procedures and
Embedded SQL covered on Day 13, "Advanced SQL Topics"), you should not
specify a join type even if your SQL implementation provides the proper syntax. If
you do specify a join type, the optimizer chooses your way instead of the optimum
way.
Some implementations of SQL use the + sign instead of an OUTER
JOIN statement. The + simply means "Show me everything even
if something is missing." Here's the syntax:
SYNTAX:
SQL> select e.name, e.employee_id, ep.salary,
ep.marital_status
from e,ployee_tbl e,
employee_pay_tbl ep
where e.employee_id = ep.employee_id(+)
and e.name like '%MITH';
ANALYSIS:
This statement is joining the two tables. The + sign on the ep.employee_id
column will return all rows even if they are empty.
Joining a Table to Itself
Today's final topic is the often-used technique of joining a table to itself.
The syntax of this operation is similar to joining two tables. For example, to join
table TABLE1 to itself, type this:
INPUT:
SELECT *
FROM TABLE1, TABLE1
OUTPUT:
ROW REMARKS ROW REMARKS
========== ========== ========== ========
row 1 Table 1 row 1 Table 1
row 1 Table 1 row 2 Table 1
row 1 Table 1 row 3 Table 1
row 1 Table 1 row 4 Table 1
row 1 Table 1 row 5 Table 1
row 1 Table 1 row 6 Table 1
row 2 Table 1 row 1 Table 1
row 2 Table 1 row 2 Table 1
row 2 Table 1 row 3 Table 1
row 2 Table 1 row 4 Table 1
row 2 Table 1 row 5 Table 1
row 2 Table 1 row 6 Table 1
row 3 Table 1 row 1 Table 1
row 3 Table 1 row 2 Table 1
row 3 Table 1 row 3 Table 1
row 3 Table 1 row 4 Table 1
row 3 Table 1 row 5 Table 1
row 3 Table 1 row 6 Table 1
row 4 Table 1 row 1 Table 1
row 4 Table 1 row 2 Table 1
...
ANALYSIS:
In its complete form, this join produces the same number of combinations as joining
two 6-row tables. This type of join could be useful to check the internal consistency
of data. What would happen if someone fell asleep in the production department and
entered a new part with a PARTNUM that already existed? That would be bad
news for everybody: Invoices would be wrong; your application would probably blow
up; and in general you would be in for a very bad time. And the cause of all your
problems would be the duplicate PARTNUM in the following table:
INPUT/OUTPUT:
SELECT * FROM PART
PARTNUM DESCRIPTION PRICE
=========== ==================== ===========
54 PEDALS 54.25
42 SEATS 24.50
46 TIRES 15.25
23 MOUNTAIN BIKE 350.45
76 ROAD BIKE 530.00
10 TANDEM 1200.00
76 CLIPPLESS SHOE 65.00 <-NOTE SAME #
You saved your company from this bad situation by checking PART before
anyone used it:
INPUT/OUTPUT:
SELECT F.PARTNUM, F.DESCRIPTION,
S.PARTNUM,S.DESCRIPTION
FROM PART F, PART S
WHERE F.PARTNUM = S.PARTNUM
AND F.DESCRIPTION <> S.DESCRIPTION
PARTNUM DESCRIPTION PARTNUM DESCRIPTION
========== ======================== ======= ============
76 ROAD BIKE 76 CLIPPLESS SHOE
76 CLIPPLESS SHOE 76 ROAD BIKE
ANALYSIS:
Now you are a hero until someone asks why the table has only two entries. You,
remembering what you have learned about JOINs, retain your hero status by
explaining how the join produced two rows that satisfied the condition WHERE
F.PARTNUM = S.PARTNUM AND F.DESCRIPTION <> S.DESCRIPTION. Of course, at
some point, the row of data containing the duplicate PARTNUM would have
to be corrected.
Summary
Today you learned that a join combines all possible combinations of rows present
in the selected tables. These new rows are then available for selection based on
the information that you want.
Congratulations--you have learned almost everything there is to know about the
SELECT clause. The one remaining item, subqueries, is covered tomorrow (Day
7, "Subqueries: The Embedded SELECT Statement").
Q&A
- Q Why cover outer, inner, left, and right joins when I probably won't ever
use them?
A A little knowledge is a dangerous thing, and no knowledge can be expensive.
You now know enough to understand the basics of what your SQL engine might try while
optimizing you queries.
Q How many tables can you join on?
A That depends on the implementation. Some implementations have a 25-table
limit, whereas others have no limit. Just remember, the more tables you join on,
the slower the response time will be. To be safe, check your implementation to find
out the maximum number of tables allowed in a query.
Q Would it be fair to say that when tables are joined, they actually become
one table?
A Very simply put, that is just about what happens. When you join the tables,
you can select from any of the columns in either table.
Workshop
The Workshop provides quiz questions to help solidify your understanding of the
material covered, as well as exercises to provide you with experience in using what
you have learned. Try to answer the quiz and exercise questions before checking the
answers in Appendix F, "Answers to Quizzes and Exercises."
Quiz
- 1. How many rows would a two-table join produce if one table had 50,000
rows and the other had 100,000?
2. What type of join appears in the following SELECT statement?
select e.name, e.employee_id, ep.salary
from employee_tbl e,
employee_pay_tbl ep
where e.employee_id = ep.employee_id;
- 3. Will the following SELECT statements work?
a. select name, employee_id,
salary
from employee_tbl e,
employee_pay_tbl ep
where employee_id = employee_id
and name like '%MITH';
b. select e.name, e.employee_id,
ep.salary
from employee_tbl e,
employee_pay_tbl ep
where name like '%MITH';
c. select e.name, e.employee_id,
ep.salary
from employee_tbl e,
employee_pay_tbl ep
where e.employee_id = ep.employee_id
and e.name like '%MITH';
- 4. In the WHERE clause, when joining the tables, should you do
the join first or the conditions?
5. In joining tables are you limited to one-column joins, or can you join
on more than one column?
Exercises
- 1. In the section on joining tables to themselves, the last example returned
two combinations. Rewrite the query so only one entry comes up for each redundant
part number.
2. Rewrite the following query to make it more readable and shorter.
INPUT:
select orders.orderedon, orders.name, part.partnum,
part.price, part.description from orders, part
where orders.partnum = part.partnum and orders.orderedon
between '1-SEP-96' and '30-SEP-96'
order by part.partnum;
- 3. From the PART table and the ORDERS table, make up
a query that will return the following:
OUTPUT:
ORDEREDON NAME PARTNUM QUANTITY
================== ================== ======= ========
2-SEP-96 TRUE WHEEL 10 1
© Copyright, Macmillan Computer Publishing. All
rights reserved.
|