SQL
Tuning/SQL Optimization Techniques:
1) The sql query becomes faster
if you use the actual columns names in SELECT statement instead of than '*'.
For
Example: Write
the query as
SELECT id,
first_name, last_name, age, subject FROM student_details;
Instead
of:
SELECT *
FROM student_details;
2) HAVING clause is used to filter the
rows after all the rows are selected. It is just like a filter. Do not use
HAVING clause for any other purposes.
For Example: Write the query as
For Example: Write the query as
SELECT
subject, count(subject) FROM student_details WHERE subject !=
'Science'
AND subject != 'Maths' GROUP BY subject;
AND subject != 'Maths' GROUP BY subject;
Instead
of:
SELECT
subject, count(subject) FROM student_details GROUP BY subject
HAVING subject!= 'Vancouver' AND subject!= 'Toronto';
HAVING subject!= 'Vancouver' AND subject!= 'Toronto';
3) Sometimes you may have more than
one subqueries in your main query. Try to minimize the number of subquery block
in your query.
For Example: Write the query as
For Example: Write the query as
SELECT
name FROM employee WHERE (salary, age ) = (SELECT MAX (salary), MAX
(age)
FROM employee_details) AND dept = 'Electronics';
FROM employee_details) AND dept = 'Electronics';
Instead
of:
SELECT
name FROM employee WHERE salary = (SELECT MAX(salary) FROM
employee_details)
AND age = (SELECT MAX(age) FROM employee_details)
AND emp_dept = 'Electronics';
AND age = (SELECT MAX(age) FROM employee_details)
AND emp_dept = 'Electronics';
4) Use operator EXISTS, IN and table
joins appropriately in your query.
a) Usually IN has the slowest performance.
b) IN is efficient when most of the filter criteria is in the sub-query.
c) EXISTS is efficient when most of the filter criteria is in the main query.
a) Usually IN has the slowest performance.
b) IN is efficient when most of the filter criteria is in the sub-query.
c) EXISTS is efficient when most of the filter criteria is in the main query.
For
Example: Write
the query as
Select *
from product p where EXISTS (select * from order_items o where
o.product_id = p.product_id)
Instead
of:
Select *
from product p where product_id IN (select product_id from
order_items
5) Use EXISTS instead of DISTINCT when
using joins which involves tables having one-to-many relationship.
For Example: Write the query as
For Example: Write the query as
SELECT
d.dept_id, d.dept FROM dept d WHERE EXISTS ( SELECT 'X' FROM
employee e WHERE e.dept = d.dept);
Instead
of:
SELECT
DISTINCT d.dept_id, d.dept FROM dept d,employee e WHERE e.dept =
e.dept;
6) Try to use UNION ALL in place of
UNION.
For Example: Write the query as
For Example: Write the query as
SELECT id,
first_name FROM student_details_class10
UNION ALL
SELECT id, first_name FROM sports_team;
UNION ALL
SELECT id, first_name FROM sports_team;
Instead
of:
SELECT id,
first_name, subject FROM student_details_class10
UNION
SELECT id, first_name FROM sports_team;
UNION
SELECT id, first_name FROM sports_team;
7) Be careful while using conditions
in WHERE clause.
For Example: Write the query as
For Example: Write the query as
SELECT id,
first_name, age FROM student_details WHERE age > 10;
Instead
of:
SELECT id,
first_name, age FROM student_details WHERE age != 10;
Write the
query as
SELECT id,
first_name, age FROM student_details WHERE first_name LIKE 'Chan%';
Instead
of:
SELECT id,
first_name, age FROM student_details WHERE SUBSTR(first_name,1,3) =
'Cha';
Write the
query as
SELECT id,
first_name, age FROM student_details WHERE first_name LIKE NVL (
:name, '%');
Instead
of:
SELECT id,
first_name, age FROM student_details WHERE first_name = NVL (
:name, first_name);
Write the
query as
SELECT
product_id, product_name FROM product WHERE unit_price BETWEEN
MAX(unit_price) and MIN(unit_price)
Instead
of:
SELECT
product_id, product_name FROM product WHERE unit_price >=
MAX(unit_price) and unit_price <= MIN(unit_price)
Write the
query as
SELECT id,
name, salary FROM employee WHERE dept = 'Electronics' AND
location = 'Bangalore';
Instead
of:
SELECT id,
name, salary FROM employee WHERE dept || location=
'ElectronicsBangalore';
Use
non-column expression on one side of the query because it will be processed
earlier.
Write the
query as
SELECT id,
name, salary FROM employee WHERE salary < 25000;
Instead
of:
SELECT id,
name, salary FROM employee WHERE salary + 10000 < 35000;
Write the
query as
SELECT id,
first_name, age FROM student_details WHERE age > 10;
Instead
of:
SELECT id,
first_name, age FROM student_details WHERE age NOT = 10;
8) Use DECODE to avoid the scanning of
same rows or joining the same table repetitively. DECODE can also be made used
in place of GROUP BY or ORDER BY clause.
For Example: Write the query as
For Example: Write the query as
SELECT id
FROM employee WHERE name LIKE 'Ramesh%' and location = 'Bangalore';
Instead
of:
SELECT
DECODE(location,'Bangalore',id,NULL) id FROM employee WHERE name LIKE
'Ramesh%';
9) To store large binary objects,
first place them in the file system and add the file path in the database.
10) To write queries which provide
efficient performance follow the general SQL standard rules.
a) Use single case for all SQL verbs
b) Begin all SQL verbs on a new line
c) Separate all words with a single space
d) Right or left aligning verbs within the initial SQL verb
b) Begin all SQL verbs on a new line
c) Separate all words with a single space
d) Right or left aligning verbs within the initial SQL verb
SQL
Tutorial Tips:
SQL Tip
1
SQL
SELECT only
the columns needed, avoid using SELECT *. First, for each column that you do
not need every SQL Server performs additional work to retrieve and return them
to the client, and the second volume of data exchanged between the client and
SQL Server increases unnecessary.
SQL Tip
2
SQL
SELECT only
the rows needed. The less rows retrieved, the faster the SQL query will run.
SQL Tip
3
Prune
SELECT lists. Every column that is SELECTed consumes resources for processing.
There are several areas that can be examined to determine if column selection
is really necessary.
Example:
WHERE
(COL8 = ‘X’)
If a SQL
SELECT contains
a predicate where a column is equal to one value, that column should not have
to be retrieved for each row, the value will always be ‘X’.
SQL Tip
4
When you
create a new table always create a unique clustered index belong to it,
possibly it is a numeric type.
SQL Tip
5
Use JOIN
instead of subqueries. As a programmer, subqueries are something that you can
be tempted to use and abuse. Subqueries, as show below, can be very useful:
SELECT
a.id, (SELECT MAX(created) FROM posts WHERE author_id =
a.id)
AS latest_post FROM authors a
AS latest_post FROM authors a
Although
subqueries are useful, they often can be replaced by a join, which is
definitely faster to execute.
SELECT
a.id, MAX(p.created) AS latest_post FROM authors a
INNER JOIN posts p
ON (a.id = p.author_id) GROUP BY a.id
INNER JOIN posts p
ON (a.id = p.author_id) GROUP BY a.id
SQL Tip
The
following example use the OR statement to get the result:
SELECT *
FROM a, b WHERE a.p = b.q OR a.x = b.y;
The UNION
statement allows you to combine the result sets of 2 or more select queries.
The following example will return the same result that the above query gets,
but it will be faster
SELECT *
FROM a, b WHERE a.p = b.q
UNION
SELECT * FROM a, b WHERE a.x = b.y;
UNION
SELECT * FROM a, b WHERE a.x = b.y;
SQL Tip
Keep your
clustered index small. One thing you need to consider when determining where to
put your clustered index is how big the key for that index will be. The problem
here is that the key to the clustered index is also used as the key for every
non-clustered index in the table. So if you have a large clustered index on a
table with a decent number of rows, the size could blow out significantly. In
the case where there is no clustered index on a table, this could be just as
bad, because it will use the row pointer, which is 8 bytes per row.
SQL Tip
8. Avoid
cursors. A bit of a no-brainer. Cursors are less performing because every FETCH
statement executed is equivalent to another SELECT statement execution that
returns a single row. The optimizer can’t optimize a CURSOR statement, instead
optimizing the queries within each execution of the cursor loop, which is
undesirable. Given that most CURSOR statements can be re-written using set
logic, they should generally be avoided.
SQL Tip
9. Use
computed columns Computed columns are derived from other columns in a table. By
creating and indexing a computed column, you can turn what would otherwise be a
scan into a seek. For example, if you needed to calculate SalesPrice and you
had a Quantity and UnitPrice column, multiplying them in the SQL inline would
cause a table scan as it multiplied the two columns together for every single
row. Create a computed column called SalesPrice, then index it, and the query
optimiser will no longer need to retrieve the UnitPrice and Quantity data and
do a calculation – it’s already done.
Hints
Hints are comments embedded in SQL that can help influnce the behaviour of the Cost Based Optimizer.Hints are always specified immediately after the first word of a SQL statement. eg.
SELECT
/*+
place your hint here
*/ column_name ...
FROM table_name
- Hints for Access Methods
- Hints for Join Orders
- Hints for Join Operations
- Hints for Parallel Execution
- Additional Hints
Hint
|
Purpose
|
Use when...
|
Force a
Full Table Scan on tab.
|
Used to
stop Oracle from performing an index scan.
|
|
Force a
table access by Rowid on tab
|
Given an
equals condition on a rowid, Oracle will alwayse use it. This hint is used to
force a Rowid Rangescan on tab.
|
|
Force a
cluster scan on tab
|
This
would be rare. A cluster scan is pretty good, so Oracle will normally select
it automatically. If it doesn't, this hint will force a cluster scan.
|
|
Force a
hash access on tab if tabis
hash clustered.
|
Typically
an equals predicate on a hash clustered table will always use hash access,
unless the table is very small indeed. This hint may be required if accessing
a hash clustered table via an IN list, or an IN subquery
|
|
Force an
index scan on table tab
|
Specifying
just the table name (or alias) is the preferred method of stopping a Full
Table Scan. If the statistics are calculated against the tables and indexes,
Oracle should choose the best available index. The second form is dangerous,
as it assumes the name of the index to be used will not change. Only use it
if there are many indexes and Oracle will not choose the right one. Better
yet, use NO_INDEX to disable the index you want to avoid.
If you supply multiple indexes, Oracle will usually choose the best one from the list specified. Beware though that you don't fall into the AND-EQUAL trap. |
|
Forces a
bitmap index access path on tab
|
Primarily
this hint just tells Oracle to use the bitmap indexes on table tab. Otherwise Oracle will
choose the best combination of indexes it can think of based on the
statistics. If it is ignoring a bitmap index that you think would be helpful,
you may specify that index plus all of the others taht you want to be used.
Note that this does not force the use of those indexes, Oracle will still
make cost based choices.
|
|
All
columns in your SQL for a given table are contained in two or more indexes.
Oracle can merge the indexes to avoid a table lookup. If there are different
possible combinations of indexes that could be used, specify the index names
as well if there is a particular combination that would be faster.
|
||
Same as
the INDEX hint, except process range scans in descending order
|
Use this
hint if you are using an index to sort rows instead of an ORDER BY.
|
|
Forces a
Fast Full Scan on one of tab's
indexes
|
If all
columns required for a SQL reside in one index, then a Fast Full Scan may be
used instead of a Full Table Scan to avoid a table access.
|
|
Forces
Oracle to ignore indexes
|
Used
with just the table name (or alias), Oracle will ignore all indexes on that
table. This is equivalent to a FULL hint unless the table is clustered. If
index names are specified, they will not be used. If Oracle has two indexes
to choose from, this could be used to disable an index, instead of using the
INDEX hint to force the use of the other index.
|
|
Forces
Oracle to scan all nominated single column indexes used in AND col = ...
predicates
|
Don't
use this. You will probably never come across a good implementation of this
technique. See theAND-EQUAL trap.
|
|
Expand
OR predicates or IN lists into UNIONs
|
Each
predicate in the list of ORs can individually use and index, and collectively
the ORs return less than 4% of the table. Also useful in a join query where
each of the OR predicates is indexed and on a different table.
|
|
Stops
Oracle from expanding ORs and IN lists into UNIONs. See USE_CONCAT.
|
If in
Explain Plan you see that Oracle is expanding ORs or IN lists into UNIONs,
and you think a full table scan would be faster because the UNIONs
collectively return more than 4% of the table, then use this hint to check it
out.
|
|
Forces
Oracle to resolve the query using a meterialized view instead of the tables
in the FROM clause.
|
Use when
the materialized view resolves the same joins or aggregates as are used in
the query.
|
|
Forces
Oracle to stop using query rewrite.
|
Use when
the session or database parameter QUERY_REWRITE_ENABLED is set to true, but
you want to avoid using the materiazed view because it may be out of date.
|
|
Join the
tables in the FROM clause in the order they are specified
|
Use if
Oracle is joining table in the wrong order. Can also be used to encourage
Oracle to use a non-correlated WHERE col IN sub-query as the driving table in
a SELECT and then join back to the outer table. If you just want to suggest
the best table to lead the join, try the LEADING hint instead.
|
|
Forces
Oracle to use a star query plan.
|
Avoid
using this. Star queries are deprecated in favour of STAR_TRANSFORMATION which uses bitmap indexes in
favour of cartesian joins. See Star Query.
|
|
Use a
Nested Loops join
|
Use when
Oracle is using a Hash or Sort Merge join (high volume SQLs), and you want it
to use a Nested Loops join (low volume SQLs). Older versions of Oracle
required this hint to be used in conjunction with the ORDERED hint. This is
still advisable to avoid unexpected results.
|
|
Use a
Sort-Merge join on tab
|
Use when
Oracle is using a Nested Loops join, and you have a high volume join using
range predicates. Older versions of Oracle required this hint to be used in
conjunction with the ORDERED hint. This is still advisable to avoid
unexpected results.
|
|
Use a
Hash join on tab
|
Use when
Oracle is using a Nested Loops or Merge join, and you have a high volume join
using equals predicates. Older versions of Oracle required this hint to be
used in conjunction with the ORDERED hint. This is still advisable to avoid
unexpected results.
|
|
Forces
Oracle to evaluate a join involving a remote table on the remote table's
database.
|
Firstly,
try not to join to remote tables. If you must, use this hint when you are
joining a local table to a remote table, and the local table is smaller.
|
|
Forces tab to be the leading table in a join
|
Use instead
of the ORDERED hint if you only want to suggest the best starting table.
Oracle can have trouble choosing a leading table if there a two of more in
the SQL with non-indexed WHERE clauses.
|
|
Use a
Hash Anti-Join to evaluate a NOT IN sun-query.
|
Use this
when your high volume NOT IN sub-query is using a FILTER or NESTED LOOPS
join. SeeHigh Volumne Nested Loops Joins. Check Explain Plan to ensure
that it shows HASH JOIN (ANTI). Try MERGE_AJ if HASH_AJ refuses to work.
The HASH_AJ hint is sepcified from within the sub-query, not in the main SQL statement. |
|
Use a
Merge Anti-Join to evaluate a NOT IN sun-query.
|
Use this
when HASH_AJ does not work. MERGE_AJ will probably not work either, but it's
worth a try.
|
|
Use a
Hash Semi-Join to evaluate a correlated EXISTS sub-query.
|
Use this
when you have a high volume outer query, and a correlated single table
sub-query with equals joins back to the outer query, and no DISTINCT / GROUP
BY clause. Check Explain Plan to ensure that it shows HASH JOIN (SEMI). Try
MERGE_SJ if HASH_SJ refuses to work.
The HASH_SJ hint is sepcified from within the sub-query, not in the main SQL statement. |
|
MERGE_SJ
|
Use a
Merge Semi-Join to evaluate a correlated EXISTS sub-query.
|
Use this
when HASH_SJ does not work. MERGE_SJ will probably not work either, but it's
worth a try.
|
Parallel
Query hints have been deliberately omitted because they are a lazy way to
tune and wreak havoc for DBAs if over-used. Speak to your DBA about using
parallel query.
|
||
Direct
Path Insert
|
Use Direct Path data load to append inserted rows
to the end of the table, rather than searching for free space in previously
used data blocks.
|
|
Cache
blocks from Full Table Scan
|
Usually
Full Table Scans will not bump other blocks out of cache, the theory being
that they probably won't be used again. Use this hint if you are going to
perform another Full Table Scan on the same table straight away.
|
|
Do not
cache blocks from a Full Table Scan
|
This is
the default behaviour, so you should never need it. Perhaps if the CACHE hint
were hard coded into a view, the NO_CACHE hint on a select from the view
would override it. Just guessing.
|
|
Enables
Complex View Merging
|
Use when
you join to a view that contains a GROUP BY or DISTINCT. See Selecting from Views
|
|
Disable
Complex View Merging
|
Complex
View Merging is a good thing. Don't use this hint unless you are curious to
see how much faster complex view merging can be.
|
|
A global
panacea for badly written sub-queries. Can be used in place of Anti-joins and
Semi-joins if you are not really sure what you're doing.
|
If you
can't get your sub-query to stop using a FILTER step, try UNNEST. It uses
internal cleverness to rewrite your query.
|
|
Forces
Oracle not to Unnest sub-queries.
|
If
UNEST_SUBQUERY initialisation parameter is set, Oracle will automatically try
to unnest sub-queries. Use this hint to stop it from doing that for a
particular sub-query.
|
|
Push a
join predicate between a view (or inline view) and a table into the view.
|
Use with
a Nested Loop join to a view when the view is the outer (2nd) table in the
join. The join condition will be pushed into the view, potentially enabling
an index use. See Selecting from Views.
|
|
Stop
Oracle from pushing join predicates.
|
Pushing
Join Predicates is a good thing - don't use this hint.
|
|
Force
Oracle to evaluate sub-query before other non-indexed predicates.
|
Use this
if you have lots of non-indexed predicates, most of which almost always come
out true, and a non-merged sub-query that reduces the number of rows
significantly. The performance benefit will only be noticeable over larger
data volumes. Over those volumes you will probably be better off merging the
sub-query (see the UNNEST hint).
|
|
Use
bitmap indexes for a Star Transformation execution path.
|
Use this
when joining a fact table with bitmap indexes to dimension tables keyed by
those bitmap indexed columns. See Star Query.
|
|
Execute
the non-indexed non-join predicates in the order in which they are supplied.
|
If one
predicate eliminates a row for a query, Oracle does not evaluate the others.
If you order your predicates with the ones most likely to fail first, then
this hint can reduce the total number of predicates evaluated. Also see
PUSH_SUBQ.
|
TK*Prof
TK*Prof is an Oracle tool used to display the statistics generated during a trace. When an Oracle session is traced (by SQL*Trace, Oracle Trace, or Database Trace), a trace file is generated. This trace file is barely human-readable; TK*Prof collates and formats the data into a a more meaningful form.Finding the trace file
All trace files are written to the same location: a directory that is defined when the database is booted. To find out the location of this directory, run the following SQL.SELECT
value
FROM sys.v_$parameter
WHERE name = 'user_dump_dest'
sys.v_$parameter
to everybody. Go to the directory
shown, and list the files in date order; on Unix, this is ls -ltr
.
If the trace files are not readable, ask the DBA to change the privileges.
There is a database initialisation parameter that the DBA can set so that all
future trace files are created readable.Running TK*Prof
Running TK*Prof is simple:
tkprof trace_file output_file [ explain=userid/password@database ]
trace_file
is the name of the trace file you found in
the previous step, and output_file
is the file to which TK*Prof will send
the output. The optional explain
argument will display an Explain Plan
for all SQLs in the trace file. There are other optional arguments to tkprof
,
see the Oracle Utilities manual for more detail.TK*Prof output
The output of TK*Prof is very well described in the Oracle Utilities manual, so it will not be described again here. The sort of things you should be looking for are:- For each SQL, check the
Elapsed
statistic. This shows the elapsed time for each SQL. High values obviously indicate long-running SQL - Note the
Disk
andQuery
columns. These indicate data retrieval from disk and data retrieval from memory respectively. If theDisk
column is relatively low compared to theQuery
column, then it could mean that the SQL has been run several times and the data has been cached. This might not give a true indication of the performance when the data is not cached. Either have the database bounced by the DBA, or try the trace again another day. - The first row of statistics for each SQL is for the Parse step. If a SQL is run many times, it usually does not need to be re-parsed unless Oracle needs the memory it is taking up, and swaps it out of the shared pool. If you have SQLs parsed more than once, get the DBA to check whether the database can be tuned to reduce this.
- A special feature of the Explain Plan used in TK*Prof is that it shows the number of rows read for each step of the execution plan. This can be useful to track down Range Scan problems where thousands of rows are read from an index and table, but only a few are returned after the bulk are filtered out.
- In order to run SQL statements, Oracle must perform its own SQL statements to query the data dictionary, looking at indexes, statistics etc. This is called Recursive SQL. The last two entries in the TK*Prof output are summaries of the Recursive and Non-Recursive (ie. "normal") SQL. If the recursive SQL is taking up more than a few seconds, then it is a likely sign that the Shared Pool is too small. Show the TK*Prof output to the DBA to see if the database can be tuned.
- Make sure you turn tracing off or exit your traced session before running TK*Prof. Some statistics are only written at the end.
- Have you run any ALTER SESSION commands that affect the optimizer? If so, then the plan shown may differ from the real plan. Note that the real plan is not shown: TK*Prof re-evaluates the plan when you run TK*Prof. Make sure that you turn SQL_TRACE on before you ALTER SESSION. TK*Prof is clever enough to see the ALTER SESSION command in the trace file and evaluate plans accordingly. It will probably display two plans: the default plan, and the new plan taking the ALTER SESSION into account.
Explain Plan
Explain Plan is the name Oracle (and most other RDBMSs) gives to a tool that describes the algorithm the database will use to run a SQL statement. With Oracle, there are a number of ways of running Explain Plan. Some of these are:Oracle Enterprise Manager
Oracle enterprise manager provides a facility called SQL Scratch Pad. Scratch Pad has a series of buttons down the left hand side to perform functions such as Execute SQL, Save SQL, and – of course – Explain Plan. Type (or paste) your SQL statement into Scratch Pad and hit the Explain Plan button.TOAD
TOAD is great. If your site uses it, count your blessings. The "Ambulance" button in TOAD runs Explain Plan. Just position your cursor on the SQL you want to explain, and hit the ambulance button.SQL*Plus EXPLAIN
This is the low-tech method for those with no access to a nice GUI tool. This is a two step process:- Add the following line BEFORE your SQL statement
EXPLAIN
PLAN SET STATEMENT_ID = ‘YOUR USERID’ FOR
Eg.
EXPLAIN PLAN SET
STATEMENT_ID = ‘JOEBLOGGS’ FOR
SELECT *
FROM EMP;
SELECT *
FROM EMP;
Now run
the SQL. Note that it will not execute the SQL itself, it will only generate
the plan. The only output of running the SQL is
Explained.
If you
get an error telling you that ‘PLAN_TABLE does not exist’, see your DBA; you
don’t have the privileges to run Explain Plan.
- Now you have to retrieve the plan. This involves running a lengthy SQL (see Attachment 1). Download the atached rplan.sql and run:
@rplan
The plan
will be displayed on the screen. Depending on the width of your screen, the
output may wrap over 2 or more lines. If so, you will need to add COLUMN
commands to rplan.sql to format the output.
An
alternative for Unix users with telnet sessions is plan.sql. plan.sql assumes that you have your SQL
saved as a file in the current directory. From the SQL*Plus prompt, type:
@plan filename
plan.sql
will pick up your file, run it through Explain Plan, and display the plan to
the screen.
SQL*Plus AUTOTRACE
The is the easy low-tech method. In SQL*Plus, type:
SET
AUTOTRACE ON EXPLAIN
Now, every SQL you run will
generate and display a plan. NOTE however that using this method, the SQL
actually executes, whereas with the previous method it did not. If your SQL
takes ages to run, then you will have to wait for it to finish for the plan to
display. For this reason, the previous low-tech method is relatively superior.