Monday, 28 March 2011

Oracle Function Based Indexes

Oracle’s function based indexes are a powerful feature that, from experience, is rarely seen or used by Java developers. I guess that this is because if a Java developer has anything to do with SQL at all, then they’ll usually write their SQL, test it and deliver it to the source control system. At that point, the project’s DBA (if there is one and if he has the time) takes a look at the new SQL and does a bit of tuning if necessary. This makes me think that function based indexes are really a ‘DBA thing’. So, for Java programmers, in this blog I’m going to demonstrate function based indexes using a simple EMPLOYEE table, which we’ll first need to create and populate, so bare with me as this is the boring bit.


Create the test table...

create table employee (
emp_num number not null,
surname varchar2(32) not null,
firstname varchar2(32) not null,
dob date,
constraint employee_pk primary key(emp_num)
);

Insert some test data...

INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (42, 'Williams', 'Sammy');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (16, 'Williams', 'Shane');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (50, 'Williams', 'Paul');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (1, 'Jones', 'William');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (2, 'Smith', 'Fred');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (3, 'Walters', 'Steve');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (4, 'Baker', 'Ted');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (5, 'Jones', 'Jack');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (6, 'Chatterly', 'David');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (7, 'Aston', 'Callum');
INSERT INTO EMPLOYEE (emp_num, surname, firstname) VALUES (8, 'Kirk', 'James');

In this example, the big idea is to be able to find employees by surname and, without any indexes on these columns, our search would initiate a slow full table scan and therefore the obvious thing to do would be to index the surname column:

create index emp_idx on employee(surname);

Enabling the explain plan functionality, we can now do search on the employee’s surname:

set autotrace on explain
select surname, firstname, emp_num, dob from employee where surname = 'Williams';

and demonstrate that our simple index is being used:

Execution Plan
----------------------------------------------------------
Plan hash value: 1245964252
----------------------------------------------------------------------------------------
| Id | Operation                  | Name     | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT           |          | 3    | 174   | 2 (0)      | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID| EMPLOYEE | 3    | 174   | 2 (0)      | 00:00:01 |
|* 2 | INDEX RANGE SCAN           | EMP_IDX  | 3    |       | 1 (0)      | 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SURNAME"='Williams')
Note
-----
- dynamic sampling used for this statement

At this point, all is well, but as usual the requirements guy comes into your office and says that the customer now wants case insensitive employee surname searches, and you spend a little time redeveloping your SQL and come up with:

select surname, firstname, emp_num, dob from employee where upper(surname) = 'WILLIAMS';

You run this and discover that your search is now taking a lot longer because the use of the upper function is forcing Oracle to do a full table scan:

Execution Plan
----------------------------------------------------------
Plan hash value: 2119105728
---------------------------------------------------------------------------
| Id | Operation        | Name     | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
| 0  | SELECT STATEMENT |          | 3    | 174   | 3 (0)      | 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMPLOYEE | 3    | 174   | 3 (0)      | 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(UPPER("SURNAME")='WILLIAMS')
Note
-----
- dynamic sampling used for this statement

and this is where function based indexes come in. You now need to create an index for your new query optimising for a case insensitive search

create index emp_upper_idx on employee(upper(surname));

Run the query again:

select surname, firstname, emp_num, dob from employee where upper(surname) = 'WILLIAMS';

and the execution plan shows that the index is used...

Execution Plan
----------------------------------------------------------
Plan hash value: 642203055
--------------------------------------------------------------------------------
| Id | Operation                  | Name          | Rows  | Bytes | Cost (%CPU)| 
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT           |               | 3     | 174   | 2 (0)      |
| 1  | TABLE ACCESS BY INDEX ROWID| EMPLOYEE      | 3     | 174   | 2 (0)      |
|* 2 | INDEX RANGE SCAN           | EMP_UPPER_IDX | 1     |       | 1 (0)      |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(UPPER("SURNAME")='WILLIAMS')
Note
-----
- dynamic sampling used for this statement

One thing to note is that a function based index is highly tuned to a specific where clause(s); for example, if the requirement changed again and you had to search on full names, you’d have to both write a new piece of SQL and, in order to maintain performance, add a new specific index:

select surname, firstname, emp_num, dob from employee
    where upper(firstname || ' ' || surname) = upper('Callum Aston');


create index fullname_upper_idx on employee(upper(firstname || ' ' || surname));

There you have it, this is a simple outline of function based indexes, there is a lot more to this subject; such as using your own PL/SQL functions as indexes and getting hold of the correct user privileges, so more may follow...

No comments: