Teach Yourself SQL in 21 Days, Second Edition
- Day 4 -
Functions: Molding the Data You Retrieve
Objectives
Today we talk about functions. Functions in SQL enable you to perform feats such
as determining the sum of a column or converting all the characters of a string to
uppercase. By the end of the day, you will understand and be able to use all the
following:
- Aggregate functions
- Date and time functions
- Arithmetic functions
- Character functions
- Conversion functions
- Miscellaneous functions
These functions greatly increase your ability to manipulate the information you
retrieved using the basic functions of SQL that were described earlier this week.
The first five aggregate functions, COUNT, SUM, AVG, MAX,
and MIN, are defined in the ANSI standard. Most implementations of SQL have
extensions to these aggregate functions, some of which are covered today. Some implementations
may use different names for these functions.
Aggregate Functions
These functions are also referred to as group functions. They return a value based
on the values in a column. (After all, you wouldn't ask for the average of a single
field.) The examples in this section use the table TEAMSTATS:
INPUT:
SQL> SELECT * FROM TEAMSTATS;
OUTPUT:
NAME POS AB HITS WALKS SINGLES DOUBLES TRIPLES HR SO
--------- --- --- ---- ----- ------- ------- ------- -- --
JONES 1B 145 45 34 31 8 1 5 10
DONKNOW 3B 175 65 23 50 10 1 4 15
WORLEY LF 157 49 15 35 8 3 3 16
DAVID OF 187 70 24 48 4 0 17 42
HAMHOCKER 3B 50 12 10 10 2 0 0 13
CASEY DH 1 0 0 0 0 0 0 1
6 rows selected.
COUNT
The function COUNT returns the number of rows that satisfy the condition
in the WHERE clause. Say you wanted to know how many ball players were hitting
under 350. You would type
INPUT/OUTPUT:
SQL> SELECT COUNT(*)
2 FROM TEAMSTATS
3 WHERE HITS/AB < .35;
COUNT(*)
--------
4
To make the code more readable, try an alias:
INPUT/OUTPUT:
SQL> SELECT COUNT(*) NUM_BELOW_350
2 FROM TEAMSTATS
3 WHERE HITS/AB < .35;
NUM_BELOW_350
-------------
4
Would it make any difference if you tried a column name instead of the asterisk?
(Notice the use of parentheses around the column names.) Try this:
INPUT/OUTPUT:
SQL> SELECT COUNT(NAME) NUM_BELOW_350
2 FROM TEAMSTATS
3 WHERE HITS/AB < .35;
NUM_BELOW_350
-------------
4
The answer is no. The NAME column that you selected was not involved
in the WHERE statement. If you use COUNT without a WHERE
clause, it returns the number of records in the table.
INPUT/OUTPUT:
SQL> SELECT COUNT(*)
2 FROM TEAMSTATS;
COUNT(*)
---------
6
SUM
SUM does just that. It returns the sum of all values in a column. To
find out how many singles have been hit, type
INPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES
2 FROM TEAMSTATS;
OUTPUT:
TOTAL_SINGLES
-------------
174
To get several sums, use
INPUT/OUTPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR
2 FROM TEAMSTATS;
TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
174 32 5 29
To collect similar information on all 300 or better players, type
INPUT/OUTPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR
2 FROM TEAMSTATS
3 WHERE HITS/AB >= .300;
TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
164 30 5 29
To compute a team batting average, type
INPUT/OUTPUT:
SQL> SELECT SUM(HITS)/SUM(AB) TEAM_AVERAGE
2 FROM TEAMSTATS;
TEAM_AVERAGE
------------
.33706294
SUM works only with numbers. If you try it on a nonnumerical field, you
get
INPUT/OUTPUT:
SQL> SELECT SUM(NAME)
2 FROM TEAMSTATS;
ERROR:
ORA-01722: invalid number
no rows selected
This error message is logical because you cannot sum a group of names.
AVG
The AVG function computes the average of a column. To find the average
number of strike outs, use this:
INPUT:
SQL> SELECT AVG(SO) AVE_STRIKE_OUTS
2 FROM TEAMSTATS;
OUTPUT:
AVE_STRIKE_OUTS
---------------
16.166667
The following example illustrates the difference between SUM and AVG:
INPUT/OUTPUT:
SQL> SELECT AVG(HITS/AB) TEAM_AVERAGE
2 FROM TEAMSTATS;
TEAM_AVERAGE
------------
.26803448
ANALYSIS:
The team was batting over 300 in the previous example! What happened? AVG
computed the average of the combined column hits divided by at bats, whereas the
example with SUM divided the total number of hits by the number of at bats.
For example, player A gets 50 hits in 100 at bats for a .500 average. Player B gets
0 hits in 1 at bat for a 0.0 average. The average of 0.0 and 0.5 is .250. If you
compute the combined average of 50 hits in 101 at bats, the answer is a respectable
.495. The following statement returns the correct batting average:
INPUT/OUTPUT:
SQL> SELECT AVG(HITS)/AVG(AB) TEAM_AVERAGE
2 FROM TEAMSTATS;
TEAM_AVERAGE
------------
.33706294
Like the SUM function, AVG works only with numbers.
MAX
If you want to find the largest value in a column, use MAX. For example,
what is the highest number of hits?
INPUT:
SQL> SELECT MAX(HITS)
2 FROM TEAMSTATS;
OUTPUT:
MAX(HITS)
---------
70
Can you find out who has the most hits?
INPUT/OUTPUT:
SQL> SELECT NAME
2 FROM TEAMSTATS
3 WHERE HITS = MAX(HITS);
ERROR at line 3:
ORA-00934: group function is not allowed here
Unfortunately, you can't. The error message is a reminder that this group function
(remember that aggregate functions are also called group functions)
does not work in the WHERE clause. Don't despair, Day 7, "Subqueries:
The Embedded SELECT Statement," covers the concept of subqueries and
explains a way to find who has the MAX hits.
What happens if you try a nonnumerical column?
INPUT/OUTPUT:
SQL> SELECT MAX(NAME)
2 FROM TEAMSTATS;
MAX(NAME)
---------------
WORLEY
Here's something new. MAX returns the highest (closest to Z) string.
Finally, a function that works with both characters and numbers.
MIN
MIN does the expected thing and works like MAX except it returns
the lowest member of a column. To find out the fewest at bats, type
INPUT:
SQL> SELECT MIN(AB)
2 FROM TEAMSTATS;
OUTPUT:
MIN(AB)
---------
1
The following statement returns the name closest to the beginning of the alphabet:
INPUT/OUTPUT:
SQL> SELECT MIN(NAME)
2 FROM TEAMSTATS;
MIN(NAME)
---------------
CASEY
You can combine MIN with MAX to give a range of values. For
example:
INPUT/OUTPUT:
SQL> SELECT MIN(AB), MAX(AB)
2 FROM TEAMSTATS;
MIN(AB) MAX(AB)
-------- --------
1 187
This sort of information can be useful when using statistical functions.
NOTE: As we mentioned in the introduction,
the first five aggregate functions are described in the ANSI standard. The remaining
aggregate functions have become de facto standards, present in all important implementations
of SQL. We use the Oracle7 names for these functions. Other implementations may use
different names.
VARIANCE
VARIANCE produces the square of the standard deviation, a number vital
to many statistical calculations. It works like this:
INPUT:
SQL> SELECT VARIANCE(HITS)
2 FROM TEAMSTATS;
OUTPUT:
VARIANCE(HITS)
--------------
802.96667
If you try a string
INPUT/OUTPUT:
SQL> SELECT VARIANCE(NAME)
2 FROM TEAMSTATS;
ERROR:
ORA-01722: invalid number
no rows selected
you find that VARIANCE is another function that works exclusively with
numbers.
STDDEV
The final group function, STDDEV, finds the standard deviation of a column
of numbers, as demonstrated by this example:
INPUT:
SQL> SELECT STDDEV(HITS)
2 FROM TEAMSTATS;
OUTPUT:
STDDEV(HITS)
------------
28.336666
It also returns an error when confronted by a string:
INPUT/OUTPUT:
SQL> SELECT STDDEV(NAME)
2 FROM TEAMSTATS;
ERROR:
ORA-01722: invalid number
no rows selected
These aggregate functions can also be used in various combinations:
INPUT/OUTPUT:
SQL> SELECT COUNT(AB),
2 AVG(AB),
3 MIN(AB),
4 MAX(AB),
5 STDDEV(AB),
6 VARIANCE(AB),
7 SUM(AB)
8 FROM TEAMSTATS;
COUNT(AB) AVG(AB) MIN(AB) MAX(AB) STDDEV(AB) VARIANCE(AB) SUM(AB)
--------- ------- ------- ------- ---------- ------------ -------
6 119.167 1 187 75.589 5712.97 715
The next time you hear a sportscaster use statistics to fill the time between
plays, you will know that SQL is at work somewhere behind the scenes.
Date and Time Functions
We live in a civilization governed by times and dates, and most major implementations
of SQL have functions to cope with these concepts. This section uses the table PROJECT
to demonstrate the time and date functions.
INPUT:
SQL> SELECT * FROM PROJECT;
OUTPUT:
TASK STARTDATE ENDDATE
-------------- --------- ---------
KICKOFF MTG 01-APR-95 01-APR-95
TECH SURVEY 02-APR-95 01-MAY-95
USER MTGS 15-MAY-95 30-MAY-95
DESIGN WIDGET 01-JUN-95 30-JUN-95
CODE WIDGET 01-JUL-95 02-SEP-95
TESTING 03-SEP-95 17-JAN-96
6 rows selected.
NOTE: This table used the Date data type.
Most implementations of SQL have a Date data type, but the exact syntax may vary.
ADD_MONTHS
This function adds a number of months to a specified date. For example, say something
extraordinary happened, and the preceding project slipped to the right by two months.
You could make a new schedule by typing
INPUT:
SQL> SELECT TASK,
2 STARTDATE,
3 ENDDATE ORIGINAL_END,
4 ADD_MONTHS(ENDDATE,2)
5 FROM PROJECT;
OUTPUT:
TASK STARTDATE ORIGINAL_ ADD_MONTH
-------------- --------- --------- ---------
KICKOFF MTG 01-APR-95 01-APR-95 01-JUN-95
TECH SURVEY 02-APR-95 01-MAY-95 01-JUL-95
USER MTGS 15-MAY-95 30-MAY-95 30-JUL-95
DESIGN WIDGET 01-JUN-95 30-JUN-95 31-AUG-95
CODE WIDGET 01-JUL-95 02-SEP-95 02-NOV-95
TESTING 03-SEP-95 17-JAN-96 17-MAR-96
6 rows selected.
Not that a slip like this is possible, but it's nice to have a function that makes
it so easy. ADD_MONTHS also works outside the SELECT clause. Typing
INPUT:
SQL> SELECT TASK TASKS_SHORTER_THAN_ONE_MONTH
2 FROM PROJECT
3 WHERE ADD_MONTHS(STARTDATE,1) > ENDDATE;
produces the following result:
OUTPUT:
TASKS_SHORTER_THAN_ONE_MONTH
----------------------------
KICKOFF MTG
TECH SURVEY
USER MTGS
DESIGN WIDGET
ANALYSIS:
You will find that all the functions in this section work in more than one place.
However, ADD MONTHS does not work with other data types like character or
number without the help of functions TO_CHAR and TO_DATE, which
are discussed later today.
LAST_DAY
LAST_DAY returns the last day of a specified month. It is for those of
us who haven't mastered the "Thirty days has September..." rhyme--or at
least those of us who have not yet taught it to our computers. If, for example, you
need to know what the last day of the month is in the column ENDDATE, you
would type
INPUT:
SQL> SELECT ENDDATE, LAST_DAY(ENDDATE)
2 FROM PROJECT;
Here's the result:
OUTPUT:
ENDDATE LAST_DAY(ENDDATE)
--------- -----------------
01-APR-95 30-APR-95
01-MAY-95 31-MAY-95
30-MAY-95 31-MAY-95
30-JUN-95 30-JUN-95
02-SEP-95 30-SEP-95
17-JAN-96 31-JAN-96
6 rows selected.
How does LAST DAY handle leap years?
INPUT/OUTPUT:
SQL> SELECT LAST_DAY('1-FEB-95') NON_LEAP,
2 LAST_DAY('1-FEB-96') LEAP
3 FROM PROJECT;
NON_LEAP LEAP
--------- ---------
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
6 rows selected.
ANALYSIS:
You got the right result, but why were so many rows returned? Because you didn't
specify an existing column or any conditions, the SQL engine applied the date functions
in the statement to each existing row. Let's get something less redundant by using
the following:
INPUT:
SQL> SELECT DISTINCT LAST_DAY('1-FEB-95') NON_LEAP,
2 LAST_DAY('1-FEB-96') LEAP
3 FROM PROJECT;
This statement uses the word DISTINCT (see Day 2, "Introduction
to the Query: The SELECT Statement") to produce the singular result
OUTPUT:
NON_LEAP LEAP
--------- ---------
28-FEB-95 29-FEB-96
Unlike me, this function knows which years are leap years. But before you trust
your own or your company's financial future to this or any other function, check
your implementation!
MONTHS_BETWEEN
If you need to know how many months fall between month x and month y, use MONTHS_BETWEEN
like this:
INPUT:
SQL> SELECT TASK, STARTDATE, ENDDATE,MONTHS_BETWEEN(STARTDATE,ENDDATE)
DURATION
2 FROM PROJECT;
OUTPUT:
TASK STARTDATE ENDDATE DURATION
-------------- --------- --------- ---------
KICKOFF MTG 01-APR-95 01-APR-95 0
TECH SURVEY 02-APR-95 01-MAY-95 -.9677419
USER MTGS 15-MAY-95 30-MAY-95 -.483871
DESIGN WIDGET 01-JUN-95 30-JUN-95 -.9354839
CODE WIDGET 01-JUL-95 02-SEP-95 -2.032258
TESTING 03-SEP-95 17-JAN-96 -4.451613
6 rows selected.
Wait a minute--that doesn't look right. Try this:
INPUT/OUTPUT:
SQL> SELECT TASK, STARTDATE, ENDDATE,
2 MONTHS_BETWEEN(ENDDATE,STARTDATE) DURATION
3 FROM PROJECT;
TASK STARTDATE ENDDATE DURATION
-------------- --------- --------- ---------
KICKOFF MTG 01-APR-95 01-APR-95 0
TECH SURVEY 02-APR-95 01-MAY-95 .96774194
USER MTGS 15-MAY-95 30-MAY-95 .48387097
DESIGN WIDGET 01-JUN-95 30-JUN-95 .93548387
CODE WIDGET 01-JUL-95 02-SEP-95 2.0322581
TESTING 03-SEP-95 17-JAN-96 4.4516129
6 rows selected.
ANALYSIS:
That's better. You see that MONTHS_BETWEEN is sensitive to the way you
order the months. Negative months might not be bad. For example, you could use a
negative result to determine whether one date happened before another. For example,
the following statement shows all the tasks that started before May 19, 1995:
INPUT:
SQL> SELECT *
2 FROM PROJECT
3 WHERE MONTHS_BETWEEN('19 MAY 95', STARTDATE) > 0;
OUTPUT:
TASK STARTDATE ENDDATE
-------------- --------- ---------
KICKOFF MTG 01-APR-95 01-APR-95
TECH SURVEY 02-APR-95 01-MAY-95
USER MTGS 15-MAY-95 30-MAY-95
NEW_TIME
If you need to adjust the time according to the time zone you are in, the New_TIME
function is for you. Here are the time zones you can use with this function:
Abbreviation |
Time Zone |
AST or ADT |
Atlantic standard or daylight time |
BST or BDT |
Bering standard or daylight time |
CST or CDT |
Central standard or daylight time |
EST or EDT |
Eastern standard or daylight time |
GMT |
Greenwich mean time |
HST or HDT |
Alaska-Hawaii standard or daylight time |
MST or MDT |
Mountain standard or daylight time |
NST |
Newfoundland standard time |
PST or PDT |
Pacific standard or daylight time |
YST or YDT |
Yukon standard or daylight time |
You can adjust your time like this:
INPUT:
SQL> SELECT ENDDATE EDT,
2 NEW_TIME(ENDDATE, 'EDT','PDT')
3 FROM PROJECT;
OUTPUT:
EDT NEW_TIME(ENDDATE
---------------- ----------------
01-APR-95 1200AM 31-MAR-95 0900PM
01-MAY-95 1200AM 30-APR-95 0900PM
30-MAY-95 1200AM 29-MAY-95 0900PM
30-JUN-95 1200AM 29-JUN-95 0900PM
02-SEP-95 1200AM 01-SEP-95 0900PM
17-JAN-96 1200AM 16-JAN-96 0900PM
6 rows selected.
Like magic, all the times are in the new time zone and the dates are adjusted.
NEXT_DAY
NEXT_DAY finds the name of the first day of the week that is equal to
or later than another specified date. For example, to send a report on the Friday
following the first day of each event, you would type
INPUT:
SQL> SELECT STARTDATE,
2 NEXT_DAY(STARTDATE, 'FRIDAY')
3 FROM PROJECT;
which would return
OUTPUT:
STARTDATE NEXT_DAY(
--------- ---------
01-APR-95 07-APR-95
02-APR-95 07-APR-95
15-MAY-95 19-MAY-95
01-JUN-95 02-JUN-95
01-JUL-95 07-JUL-95
03-SEP-95 08-SEP-95
6 rows selected.
ANALYSIS:
The output tells you the date of the first Friday that occurs after your STARTDATE.
SYSDATE
SYSDATE returns the system time and date:
INPUT:
SQL> SELECT DISTINCT SYSDATE
2 FROM PROJECT;
OUTPUT:
SYSDATE
----------------
18-JUN-95 1020PM
If you wanted to see where you stood today in a certain project, you could type
INPUT/OUTPUT:
SQL> SELECT *
2 FROM PROJECT
3 WHERE STARTDATE > SYSDATE;
TASK STARTDATE ENDDATE
-------------- --------- ---------
CODE WIDGET 01-JUL-95 02-SEP-95
TESTING 03-SEP-95 17-JAN-96
Now you can see what parts of the project start after today.
Arithmetic Functions
Many of the uses you have for the data you retrieve involve mathematics. Most
implementations of SQL provide arithmetic functions similar to the functions covered
here. The examples in this section use the NUMBERS table:
INPUT:
SQL> SELECT *
2 FROM NUMBERS;
OUTPUT:
A B
--------- ---------
3.1415 4
-45 .707
5 9
-57.667 42
15 55
-7.2 5.3
6 rows selected.
ABS
The ABS function returns the absolute value of the number you point to.
For example:
INPUT:
SQL> SELECT ABS(A) ABSOLUTE_VALUE
2 FROM NUMBERS;
OUTPUT:
ABSOLUTE_VALUE
--------------
3.1415
45
5
57.667
15
7.2
6 rows selected.
ABS changes all the negative numbers to positive and leaves positive
numbers alone.
CEIL and FLOOR
CEIL returns the smallest integer greater than or equal to its argument.
FLOOR does just the reverse, returning the largest integer equal to or less
than its argument. For example:
INPUT:
SQL> SELECT B, CEIL(B) CEILING
2 FROM NUMBERS;
OUTPUT:
B CEILING
--------- ---------
4 4
.707 1
9 9
42 42
55 55
5.3 6
6 rows selected.
And
INPUT/OUTPUT:
SQL> SELECT A, FLOOR(A) FLOOR
2 FROM NUMBERS;
A FLOOR
--------- ---------
3.1415 3
-45 -45
5 5
-57.667 -58
15 15
-7.2 -8
6 rows selected.
COS, COSH, SIN, SINH, TAN, and TANH
The COS, SIN, and TAN functions provide support for
various trigonometric concepts. They all work on the assumption that n is in radians.
The following statement returns some unexpected values if you don't realize COS
expects A to be in radians.
INPUT:
SQL> SELECT A, COS(A)
2 FROM NUMBERS;
OUTPUT:
A COS(A)
--------- ---------
3.1415 -1
-45 .52532199
5 .28366219
-57.667 .437183
15 -.7596879
-7.2 .60835131
ANALYSIS:
You would expect the COS of 45 degrees to be in the neighborhood
of .707, not .525. To make this function work the way you would
expect it to in a degree-oriented world, you need to convert degrees to radians.
(When was the last time you heard a news broadcast report that a politician had done
a pi-radian turn? You hear about a 180-degree turn.) Because 360 degrees = 2 pi radians,
you can write
INPUT/OUTPUT:
SQL> SELECT A, COS(A* 0.01745329251994)
2 FROM NUMBERS;
A COS(A*0.01745329251994)
--------- -----------------------
3.1415 .99849724
-45 .70710678
5 .9961947
-57.667 .5348391
15 .96592583
-7.2 .9921147
ANALYSIS:
Note that the number 0.01745329251994 is radians divided by degrees.
The trigonometric functions work as follows:
INPUT/OUTPUT:
SQL> SELECT A, COS(A*0.017453), COSH(A*0.017453)
2 FROM NUMBERS;
A COS(A*0.017453) COSH(A*0.017453)
--------- --------------- ----------------
3.1415 .99849729 1.0015035
-45 .70711609 1.3245977
5 .99619483 1.00381
-57.667 .53485335 1.5507072
15 .96592696 1.0344645
-7.2 .99211497 1.0079058
6 rows selected.
And
INPUT/OUTPUT:
SQL> SELECT A, SIN(A*0.017453), SINH(A*0.017453)
2 FROM NUMBERS;
A SIN(A*0.017453) SINH(A*0.017453)
--------- --------------- ----------------
3.1415 .05480113 .05485607
-45 -.7070975 -.8686535
5 .08715429 .0873758
-57.667 -.8449449 -1.185197
15 .25881481 .26479569
-7.2 -.1253311 -.1259926
6 rows selected.
And
INPUT/OUTPUT:
SQL> SELECT A, TAN(A*0.017453), TANH(A*0.017453)
2 FROM NUMBERS;
A TAN(A*0.017453) TANH(A*0.017453)
--------- --------------- ----------------
3.1415 .05488361 .05477372
-45 -.9999737 -.6557867
5 .08748719 .08704416
-57.667 -1.579769 -.7642948
15 .26794449 .25597369
-7.2 -.1263272 -.1250043
6 rows selected.
EXP
EXP enables you to raise e (e is a mathematical constant
used in various formulas) to a power. Here's how EXP raises e by
the values in column A:
INPUT:
SQL> SELECT A, EXP(A)
2 FROM NUMBERS;
OUTPUT:
A EXP(A)
--------- ---------
3.1415 23.138549
-45 2.863E-20
5 148.41316
-57.667 9.027E-26
15 3269017.4
-7.2 .00074659
6 rows selected.
LN and LOG
These two functions center on logarithms. LN returns the natural logarithm
of its argument. For example:
INPUT:
SQL> SELECT A, LN(A)
2 FROM NUMBERS;
OUTPUT:
ERROR:
ORA-01428: argument '-45' is out of range
Did we neglect to mention that the argument had to be positive? Write
INPUT/OUTPUT:
SQL> SELECT A, LN(ABS(A))
2 FROM NUMBERS;
A LN(ABS(A))
--------- ----------
3.1415 1.1447004
-45 3.8066625
5 1.6094379
-57.667 4.0546851
15 2.7080502
-7.2 1.974081
6 rows selected.
ANALYSIS:
Notice how you can embed the function ABS inside the LN call.
The other logarith-mic function, LOG, takes two arguments, returning the
logarithm of the first argument in the base of the second. The following query returns
the logarithms of column B in base 10.
INPUT/OUTPUT:
SQL> SELECT B, LOG(B, 10)
2 FROM NUMBERS;
B LOG(B,10)
----------- ---------
4 1.660964
.707 -6.640962
9 1.0479516
42 .61604832
55 .57459287
5.3 1.3806894
6 rows selected.
MOD
You have encountered MOD before. On Day 3, "Expressions, Conditions,
and Operators," you saw that the ANSI standard for the modulo operator %
is sometimes implemented as the function MOD. Here's a query that returns
a table showing the remainder of A divided by B:
INPUT:
SQL> SELECT A, B, MOD(A,B)
2 FROM NUMBERS;
OUTPUT:
A B MOD(A,B)
--------- --------- ---------
3.1415 4 3.1415
-45 .707 -.459
5 9 5
-57.667 42 -15.667
15 55 15
-7.2 5.3 -1.9
6 rows selected.
POWER
To raise one number to the power of another, use POWER. In this function
the first argument is raised to the power of the second:
INPUT:
SQL> SELECT A, B, POWER(A,B)
2 FROM NUMBERS;
OUTPUT:
ERROR:
ORA-01428: argument '-45' is out of range
ANALYSIS:
At first glance you are likely to think that the first argument can't be negative.
But that impression can't be true, because a number like -4 can be raised to a power.
Therefore, if the first number in the POWER function is negative, the second
must be an integer. You can work around this problem by using CEIL (or FLOOR):
INPUT:
SQL> SELECT A, CEIL(B), POWER(A,CEIL(B))
2 FROM NUMBERS;
OUTPUT:
A CEIL(B) POWER(A,CEIL(B))
--------- --------- ----------------
3.1415 4 97.3976
-45 1 -45
5 9 1953125
-57.667 42 9.098E+73
15 55 4.842E+64
-7.2 6 139314.07
6 rows selected.
That's better!
SIGN
SIGN returns -1 if its argument is less than 0, 0
if its argument is equal to 0, and 1 if its argument is greater
than 0, as shown in the following example:
INPUT:
SQL> SELECT A, SIGN(A)
2 FROM NUMBERS;
OUTPUT:
A SIGN(A)
--------- ---------
3.1415 1
-45 -1
5 1
-57.667 -1
15 1
-7.2 -1
0 0
7 rows selected.
You could also use SIGN in a SELECT WHERE clause like this:
INPUT:
SQL> SELECT A
2 FROM NUMBERS
3 WHERE SIGN(A) = 1;
OUTPUT:
A
---------
3.1415
5
15
SQRT
The function SQRT returns the square root of an argument. Because the
square root of a negative number is undefined, you cannot use SQRT on negative
numbers.
INPUT/OUTPUT:
SQL> SELECT A, SQRT(A)
2 FROM NUMBERS;
ERROR:
ORA-01428: argument '-45' is out of range
However, you can fix this limitation with ABS:
INPUT/OUTPUT:
SQL> SELECT ABS(A), SQRT(ABS(A))
2 FROM NUMBERS;
ABS(A) SQRT(ABS(A))
--------- ------------
3.1415 1.7724277
45 6.7082039
5 2.236068
57.667 7.5938791
15 3.8729833
7.2 2.6832816
0 0
7 rows selected.
Character Functions
Many implementations of SQL provide functions to manipulate characters and strings
of characters. This section covers the most common character functions. The examples
in this section use the table CHARACTERS.
INPUT/OUTPUT:
SQL> SELECT * FROM CHARACTERS;
LASTNAME FIRSTNAME M CODE
--------------- --------------- - ---------
PURVIS KELLY A 32
TAYLOR CHUCK J 67
CHRISTINE LAURA C 65
ADAMS FESTER M 87
COSTALES ARMANDO A 77
KONG MAJOR G 52
6 rows selected.
CHR
CHR returns the character equivalent of the number it uses as an argument.
The character it returns depends on the character set of the database. For this example
the database is set to ASCII. The column CODE includes numbers.
INPUT:
SQL> SELECT CODE, CHR(CODE)
2 FROM CHARACTERS;
OUTPUT:
CODE CH
--------- --
32
67 C
65 A
87 W
77 M
52 4
6 rows selected.
The space opposite the 32 shows that 32 is a space in the ASCII
character set.
CONCAT
You used the equivalent of this function on Day 3, when you learned about operators.
The || symbol splices two strings together, as does CONCAT. It
works like this:
INPUT:
SQL> SELECT CONCAT(FIRSTNAME, LASTNAME) "FIRST AND LAST NAMES"
2 FROM CHARACTERS;
OUTPUT:
FIRST AND LAST NAMES
------------------------
KELLY PURVIS
CHUCK TAYLOR
LAURA CHRISTINE
FESTER ADAMS
ARMANDO COSTALES
MAJOR KONG
6 rows selected.
ANALYSIS:
Quotation marks surround the multiple-word alias FIRST AND LAST NAMES.
Again, it is safest to check your implementation to see if it allows multiple-word
aliases.
Also notice that even though the table looks like two separate columns, what you
are seeing is one column. The first value you concatenated, FIRSTNAME, is
15 characters wide. This operation retained all the characters in the field.
INITCAP
INITCAP capitalizes the first letter of a word and makes all other characters
lowercase.
INPUT:
SQL> SELECT FIRSTNAME BEFORE, INITCAP(FIRSTNAME) AFTER
2 FROM CHARACTERS;
OUTPUT:
BEFORE AFTER
-------------- ----------
KELLY Kelly
CHUCK Chuck
LAURA Laura
FESTER Fester
ARMANDO Armando
MAJOR Major
6 rows selected.
LOWER and UPPER
As you might expect, LOWER changes all the characters to lowercase; UPPER
does just the reverse.
The following example starts by doing a little magic with the UPDATE
function (you learn more about this next week) to change one of the values to lowercase:
INPUT:
SQL> UPDATE CHARACTERS
2 SET FIRSTNAME = 'kelly'
3 WHERE FIRSTNAME = 'KELLY';
OUTPUT:
1 row updated.
INPUT:
SQL> SELECT FIRSTNAME
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME
---------------
kelly
CHUCK
LAURA
FESTER
ARMANDO
MAJOR
6 rows selected.
Then you write
INPUT:
SQL> SELECT FIRSTNAME, UPPER(FIRSTNAME), LOWER(FIRSTNAME)
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME UPPER(FIRSTNAME LOWER(FIRSTNAME
--------------- --------------- ---------------
kelly KELLY kelly
CHUCK CHUCK chuck
LAURA LAURA laura
FESTER FESTER fester
ARMANDO ARMANDO armando
MAJOR MAJOR major
6 rows selected.
Now you see the desired behavior.
LPAD and
RPAD
LPAD and RPAD take a minimum of two and a maximum of three arguments.
The first argument is the character string to be operated on. The second is the number
of characters to pad it with, and the optional third argument is the character to
pad it with. The third argument defaults to a blank, or it can be a single character
or a character string. The following statement adds five pad characters, assuming
that the field LASTNAME is defined as a 15-character field:
INPUT:
SQL> SELECT LASTNAME, LPAD(LASTNAME,20,'*')
2 FROM CHARACTERS;
OUTPUT:
LASTNAME LPAD(LASTNAME,20,'*'
-------------- --------------------
PURVIS *****PURVIS
TAYLOR *****TAYLOR
CHRISTINE *****CHRISTINE
ADAMS *****ADAMS
COSTALES *****COSTALES
KONG *****KONG
6 rows selected.
ANALYSIS:
Why were only five pad characters added? Remember that the LASTNAME column
is 15 characters wide and that LASTNAME includes the blanks to the right
of the characters that make up the name. Some column data types eliminate padding
characters if the width of the column value is less than the total width allocated
for the column. Check your implementation. Now try the right side:
INPUT:
SQL> SELECT LASTNAME, RPAD(LASTNAME,20,'*')
2 FROM CHARACTERS;
OUTPUT:
LASTNAME RPAD(LASTNAME,20,'*'
--------------- --------------------
PURVIS PURVIS *****
TAYLOR TAYLOR *****
CHRISTINE CHRISTINE *****
ADAMS ADAMS *****
COSTALES COSTALES *****
KONG KONG *****
6 rows selected.
ANALYSIS:
Here you see that the blanks are considered part of the field name for these operations.
The next two functions come in handy in this type of situation.
LTRIM and
RTRIM
LTRIM and RTRIM take at least one and at most two arguments.
The first argument, like LPAD and RPAD, is a character string.
The optional second element is either a character or character string or defaults
to a blank. If you use a second argument that is not a blank, these trim functions
will trim that character the same way they trim the blanks in the following examples.
INPUT:
SQL> SELECT LASTNAME, RTRIM(LASTNAME)
2 FROM CHARACTERS;
OUTPUT:
LASTNAME RTRIM(LASTNAME)
--------------- ---------------
PURVIS PURVIS
TAYLOR TAYLOR
CHRISTINE CHRISTINE
ADAMS ADAMS
COSTALES COSTALES
KONG KONG
6 rows selected.
You can make sure that the characters have been trimmed with the following statement:
INPUT:
SQL> SELECT LASTNAME, RPAD(RTRIM(LASTNAME),20,'*')
2 FROM CHARACTERS;
OUTPUT:
LASTNAME RPAD(RTRIM(LASTNAME)
--------------- --------------------
PURVIS PURVIS**************
TAYLOR TAYLOR**************
CHRISTINE CHRISTINE***********
ADAMS ADAMS***************
COSTALES COSTALES************
KONG KONG****************
6 rows selected.
The output proves that trim is working. Now try LTRIM:
INPUT:
SQL> SELECT LASTNAME, LTRIM(LASTNAME, 'C')
2 FROM CHARACTERS;
OUTPUT:
LASTNAME LTRIM(LASTNAME,
--------------- ---------------
PURVIS PURVIS
TAYLOR TAYLOR
CHRISTINE HRISTINE
ADAMS ADAMS
COSTALES OSTALES
KONG KONG
6 rows selected.
Note the missing Cs in the third and fifth rows.
REPLACE
REPLACE does just that. Of its three arguments, the first is the string
to be searched. The second is the search key. The last is the optional replacement
string. If the third argument is left out or NULL, each occurrence of the
search key on the string to be searched is removed and is not replaced with anything.
INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST') REPLACEMENT
2 FROM CHARACTERS;
OUTPUT:
LASTNAME REPLACEMENT
--------------- ---------------
PURVIS PURVIS
TAYLOR TAYLOR
CHRISTINE CHRIINE
ADAMS ADAMS
COSTALES COALES
KONG KONG
6 rows selected.
If you have a third argument, it is substituted for each occurrence of the search
key in the target string. For example:
INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST','**') REPLACEMENT
2 FROM CHARACTERS;
OUTPUT:
LASTNAME REPLACEMENT
--------------- ------------
PURVIS PURVIS
TAYLOR TAYLOR
CHRISTINE CHRI**INE
ADAMS ADAMS
COSTALES CO**ALES
KONG KONG
6 rows selected.
If the second argument is NULL, the target string is returned with no
changes.
INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, NULL) REPLACEMENT
2 FROM CHARACTERS;
OUTPUT:
LASTNAME REPLACEMENT
--------------- ---------------
PURVIS PURVIS
TAYLOR TAYLOR
CHRISTINE CHRISTINE
ADAMS ADAMS
COSTALES COSTALES
KONG KONG
6 rows selected.
SUBSTR
This three-argument function enables you to take a piece out of a target string.
The first argument is the target string. The second argument is the position of the
first character to be output. The third argument is the number of characters to show.
INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,2,3)
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME SUB
--------------- ---
kelly ell
CHUCK HUC
LAURA AUR
FESTER EST
ARMANDO RMA
MAJOR AJO
6 rows selected.
If you use a negative number as the second argument, the starting point is determined
by counting backwards from the end, like this:
INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,-13,2)
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME SU
--------------- --
kelly ll
CHUCK UC
LAURA UR
FESTER ST
ARMANDO MA
MAJOR JO
6 rows selected.
ANALYSIS:
Remember the character field FIRSTNAME in this example is 15 characters
long. That is why you used a -13 to start at the third character. Counting
back from 15 puts you at the start of the third character, not at the start of the
second. If you don't have a third argument, use the following statement instead:
INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,3)
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME SUBSTR(FIRSTN
--------------- -------------
kelly lly
CHUCK UCK
LAURA URA
FESTER STER
ARMANDO MANDO
MAJOR JOR
6 rows selected.
The rest of the target string is returned.
INPUT:
SQL> SELECT * FROM SSN_TABLE;
OUTPUT:
SSN__________
300541117
301457111
459789998
3 rows selected.
ANALYSIS:
Reading the results of the preceding output is difficult--Social Security numbers
usually have dashes. Now try something fancy and see whether you like the results:
INPUT:
SQL> SELECT SUBSTR(SSN,1,3)||'-'||SUBSTR(SSN,4,2)||'-'||SUBSTR(SSN,6,4) SSN
2 FROM SSN_TABLE;
OUTPUT:
SSN_________
300-54-1117
301-45-7111
459-78-9998
3 rows selected.
NOTE: This particular use of the substr
function could come in very handy with large numbers using commas such as 1,343,178,128
and in area codes and phone numbers such as 317-787-2915 using dashes.
Here is another good use of the SUBSTR function. Suppose you are writing
a report and a few columns are more than 50 characters wide. You can use the SUBSTR
function to reduce the width of the columns to a more manageable size if you know
the nature of the actual data. Consider the following two examples:
INPUT:
SQL> SELECT NAME, JOB, DEPARTMENT FROM JOB_TBL;
OUTPUT:
NAME______________________________________________________________
JOB_______________________________DEPARTMENT______________________
ALVIN SMITH
VICEPRESIDENT MARKETING
1 ROW SELECTED.
ANALYSIS:
Notice how the columns wrapped around, which makes reading the results a little
too difficult. Now try this select:
INPUT:
SQL> SELECT SUBSTR(NAME, 1,15) NAME, SUBSTR(JOB,1,15) JOB,
DEPARTMENT
2 FROM JOB_TBL;
OUTPUT:
NAME________________JOB_______________DEPARTMENT_____
ALVIN SMITH VICEPRESIDENT MARKETING
Much better!
TRANSLATE
The function TRANSLATE takes three arguments: the target string, the
FROM string, and the TO string. Elements of the target string that
occur in the FROM string are translated to the corresponding element in
the TO string.
INPUT:
SQL> SELECT FIRSTNAME, TRANSLATE(FIRSTNAME
2 '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
3 'NNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAAAA)
4 FROM CHARACTERS;
OUTPUT:
FIRSTNAME TRANSLATE(FIRST
--------------- ---------------
kelly kelly
CHUCK AAAAA
LAURA AAAAA
FESTER AAAAAA
ARMANDO AAAAAAA
MAJOR AAAAA
6 rows selected.
Notice that the function is case sensitive.
INSTR
To find out where in a string a particular pattern occurs, use INSTR.
Its first argument is the target string. The second argument is the pattern to match.
The third and forth are numbers representing where to start looking and which match
to report. This example returns a number representing the first occurrence of O
starting with the second character:
INPUT:
SQL> SELECT LASTNAME, INSTR(LASTNAME, 'O', 2, 1)
2 FROM CHARACTERS;
OUTPUT:
LASTNAME INSTR(LASTNAME,'O',2,1)
--------------- -----------------------
PURVIS 0
TAYLOR 5
CHRISTINE 0
ADAMS 0
COSTALES 2
KONG 2
6 rows selected.
ANALYSIS:
The default for the third and fourth arguments is 1. If the third argument
is negative, the search starts at a position determined from the end of the string,
instead of from the beginning.
LENGTH
LENGTH returns the length of its lone character argument. For example:
INPUT:
SQL> SELECT FIRSTNAME, LENGTH(RTRIM(FIRSTNAME))
2 FROM CHARACTERS;
OUTPUT:
FIRSTNAME LENGTH(RTRIM(FIRSTNAME))
--------------- ------------------------
kelly 5
CHUCK 5
LAURA 5
FESTER 6
ARMANDO 7
MAJOR 5
6 rows selected.
ANALYSIS:
Note the use of the RTRIM function. Otherwise, LENGTH would
return 15 for every value.
Conversion Functions
These three conversion functions provide a handy way of converting one type of
data to another. These examples use the table CONVERSIONS.
INPUT:
SQL> SELECT * FROM CONVERSIONS;
OUTPUT:
NAME TESTNUM
--------------- ---------
40 95
13 23
74 68
The NAME column is a character string 15 characters wide, and TESTNUM
is a number.
TO_CHAR
The primary use of TO_CHAR is to convert a number into a character. Different
implementations may also use it to convert other data types, like Date, into a character,
or to include different formatting arguments. The next example illustrates the primary
use of TO_CHAR:
INPUT:
SQL> SELECT TESTNUM, TO_CHAR(TESTNUM)
2 FROM CONVERT;
OUTPUT:
TESTNUM TO_CHAR(TESTNUM)
--------- ----------------
95 95
23 23
68 68
Not very exciting, or convincing. Here's how to verify that the function returned
a character string:
INPUT:
SQL> SELECT TESTNUM, LENGTH(TO_CHAR(TESTNUM))
2 FROM CONVERT;
OUTPUT:
TESTNUM LENGTH(TO_CHAR(TESTNUM))
--------- ------------------------
95 2
23 2
68 2
ANALYSIS:
LENGTH of a number would have returned an error. Notice the difference
between TO CHAR and the CHR function discussed earlier. CHR
would have turned this number into a character or a symbol, depending on the character
set.
TO_NUMBER
TO_NUMBER is the companion function to TO_CHAR, and of course,
it converts a string into a number. For example:
INPUT:
SQL> SELECT NAME, TESTNUM, TESTNUM*TO_NUMBER(NAME)
2 FROM CONVERT;
OUTPUT:
NAME TESTNUM TESTNUM*TO_NUMBER(NAME)
--------------- -------- -----------------------
40 95 3800
13 23 299
74 68 5032
ANALYSIS:
This test would have returned an error if TO_NUMBER had returned a character.
Miscellaneous Functions
Here are three miscellaneous functions you may find useful.
GREATEST
and LEAST
These functions find the GREATEST or the LEAST member from a
series of expressions. For example:
INPUT:
SQL> SELECT GREATEST('ALPHA', 'BRAVO','FOXTROT', 'DELTA')
2 FROM CONVERT;
OUTPUT:
GREATEST
-------
FOXTROT
FOXTROT
FOXTROT
ANALYSIS:
Notice GREATEST found the word closest to the end of the alphabet. Notice
also a seemingly unnecessary FROM and three occurrences of FOXTROT.
If FROM is missing, you will get an error. Every SELECT needs a
FROM. The particular table used in the FROM has three rows, so
the function in the SELECT clause is performed for each of them.
INPUT:
SQL> SELECT LEAST(34, 567, 3, 45, 1090)
2 FROM CONVERT;
OUTPUT:
LEAST(34,567,3,45,1090)
-----------------------
3
3
3
As you can see, GREATEST and LEAST also work with numbers.
USER
USER returns the character name of the current user of the database.
INPUT:
SQL> SELECT USER FROM CONVERT;
OUTPUT:
USER
------------------------------
PERKINS
PERKINS
PERKINS
There really is only one of me. Again, the echo occurs because of the number of
rows in the table. USER is similar to the date functions explained earlier
today. Even though USER is not an actual column in the table, it is selected
for each row that is contained in the table.
Summary
It has been a long day. We covered 47 functions--from aggregates to conversions.
You don't have to remember every function--just knowing the general types (aggregate
functions, date and time functions, arithmetic functions, character functions, conversion
functions, and miscellaneous functions) is enough to point you in the right direction
when you build a query that requires a function.
Q&A
- Q Why are so few functions defined in the ANSI standard and so many
defined by the individual implementations?
A ANSI standards are broad strokes and are not meant to drive companies
into bankruptcy by forcing all implementations to have dozens of functions. On the
other hand, when company X adds a statistical package to its SQL and it sells well,
you can bet company Y and Z will follow suit.
Q I thought you said SQL was simple. Will I really use all of these
functions?
A The answer to this question is similar to the way a trigonometry teacher
might respond to the question, Will I ever need to know how to figure the area of
an isosceles triangle in real life? The answer, of course, depends on your profession.
The same concept applies with the functions and all the other options available with
SQL. How you use functions in SQL depends mostly on you company's needs. As long
as you understand how functions work as a whole, you can apply the same concepts
to your own queries.
Workshop
The Workshop provides quiz questions to help solidify your understanding of the
material covered, as well as exercises to provide you with experience in using what
you have learned. Try to answer the quiz and exercise questions before checking the
answers in Appendix F, "Answers to Quizzes and Exercises."
Quiz
- 1. Which function capitalizes the first letter of a character string and
makes the rest lowercase?
2. Which functions are also known by the name group functions?
3. Will this query work?
SQL> SELECT COUNT(LASTNAME) FROM CHARACTERS;
- 4. How about this one?
SQL> SELECT SUM(LASTNAME) FROM CHARACTERS;
- 5. Assuming that they are separate columns, which function(s) would splice
together FIRSTNAME and LASTNAME?
6. What does the answer 6 mean from the following SELECT?
INPUT:
SQL> SELECT COUNT(*) FROM TEAMSTATS;
OUTPUT:
COUNT(*)
- 7. Will the following statement work?
SQL> SELECT SUBSTR LASTNAME,1,5 FROM NAME_TBL;
Exercises
- 1. Using today's TEAMSTATS table, write a query to determine
who is batting under .25. (For the baseball-challenged reader, batting average is
hits/ab.)
2. Using today's CHARACTERS table, write a query that will return
the following:
INITIALS__________CODE
K.A.P. 32
1 row selected.
© Copyright, Macmillan Computer Publishing. All
rights reserved.
|