Saturday, May 28, 2011

Query and It's Explain Plan ..



Was working on report extract and one of the table was having 13Cr + records. My whole query was taking hell lot of time, when I checked the explain plan -- table with 13Cr of records was going for Full Table Scan (FTS) and another table was going for FTS twice.

The question came suddenly in my mind, FTS is OK but why another table is getting executed for 2 times? Another team mate helped me and changed the OR clause to UNION and explain plan became fine.

Thought to replicate same issue at Home, but un-fortunately here even putting or condition table is not going for FTS 2 times. At a single go only its fetching the data. Need to check the query at office again...

Till now, what every query formation and its corresponding explain plan I have, I am putting the same here.

1. Normal Join
select emp.first_name || ' ' || emp.last_name, emp.
employee_id,

dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,
loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE, hop.MAX_SALARY
from
hr.employees emp,
hr.departments dep,

hr.locations loc,
hr.countries coun,

hr.regions reg,
hr.jobs hop

where emp.department_id = dep.department_id

and dep.location_id = loc.location_id

and coun.country_id = loc.country_id

and coun.region_id = reg.region_id

and emp.job_id = hop.job_id

and (dep.department_id = 10 or region_name = 'Europe')


In TOAD its an advantage that, it will number the steps which Oracle Optimizer has followed to fetch the rows from db. As in above image,
1. first LOCATIONS table got scanned, which was having all country_id
2. Using those country id got from location table, it went for index scan on country table - so that it will get rowids corresponding to country names
3. As Oracle Optimizer was having information about department_id from location table, it went for index scan in departments table using those ids.
4. From index scan in departments table, it got rowids for other information in departments table at Step No 5.
5. Resultset of Departments, Locations and Country table got merged at Step 6. using nested loop join.
6. As O.P. has already having Region_Id from Country table, using those Region_Ids it went for index scan in Region Table ; to fetch rowids from Region Table related to other information.
7. As department_ids information was already with Optimizer, using that information it went for index scan in Employees table and later on Table access using those rowids got from index scan.
8. Employees, Regions and Dep, Country, Location information got merged using Nested Loop join.
9. Using locations table, new optimizer got rowids related to region. Using those rowids it went for index scan
10. At last it went for Region table scan using the rowids got from Index Scan of Region table.

If you observe the explain plan here, below some key points we observe
1. Index Unique Scan - Index unique scan happens when Index is defined as a Unique or Primary Key. Generally in case of Primary - Foreign key relationship Index Unique Scan happens and which is very cost effective than other index scan.
2. Index Range Scan - When index is defined as a normal index, Index Range Scan happens. Which is slight costly that Index Unique Scan, but cost effective that Index Full Scan. Index Range Scan happens when there is a range defined in filter condition and that having the cardinality very high. [ I need to verify this again regarding cardinality.]
3. Index Full Scan - If cardinality is less, then Optimizer has to scan whole index which ultimately leads to Index Full Scans. If there is a table scan happening after the Index Full Scan, one should avoid the Index Full Scan and went for Full Table scan either using No Index or Full Hints.
4. Nested Loops - When indexes are defined or Primary Key - Foreign Keys are defined, generally Optimizer went for Nested Loops. If you drop any index on these tables, Optimizer goes for Hash Join. In hash join, Optimizer creates its on Has Function and combines 2 different data sets. For Hash Join, its not required to have indexes on tables whereas to perform Nested Loop Join, there has to have indexes defined on base tables.


Inner Query
select emp.first_name || ' ' || emp.last_name,
emp.employee_id
/*,

dep.DEPARTMENT_NAME,

dep.DEPARTMENT_ID,

loc.STATE_PROVINCE
,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY*/
from hr.employees emp,
hr.jobs hop

where emp.department_id in(select distinct department_id
from hr.departments dep,
hr.locations loc,

hr.countries coUn,
hr.regions reg

where dep.location_id = loc.location_id

and coun.country_id = loc.country_id

and coun.region_id = reg.region_id
and (dep.department_id = 10 or region_name = 'Europe') ) AND EMP.JOB_ID = HOP.JOB_ID


With Clause
with
dd as
(select distinct department_id

from hr.departments dep,
hr.locations loc,

hr.countries coUn,
hr.regions reg

where dep.location_id = loc.location_id

and coun.country_id = loc.country_id

and coun.region_id = reg.region_id

and (dep.department_id = 10 or region_name = 'Europe'))
select emp.first_name || ' ' || emp.last_name,
emp.employee_id
/*,
dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,

loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,

reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY*/

from hr.employees emp,
hr.jobs hop , dd

where emp.department_id = dd.department_ID

AND EMP.JOB_ID = HOP.JOB_ID




Because of the with clause, Optimizer went for execution of whatever defined inside it first. And later on with rest of the statement.

We can generate same explain plan using the Ordered hint without the with clause. Forcing optimizer to execute the tables which we want, we can get similar results as using with clause. You can see the example below of Ordered Hint which gave same results and explain plan as we got from With Clause.

Sub Table


select emp.first_name || ' ' || emp.last_name,
emp.employee_id/*,
dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,
loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY*/
from hr.employees emp, hr.jobs hop , (select distinct department_id
from hr.departments dep, hr.locations loc,
hr.countries coUn, hr.regions reg
where dep.location_id = loc.location_id
and coun.country_id = loc.country_id
and coun.region_id = reg.region_id
and (dep.department_id = 10 or region_name = 'Europe'))dd
where emp.department_id = dd.department_ID
AND EMP.JOB_ID = HOP.JOB_ID



Ordered Hint - To get same explain plan as with Clause


select /*+ ordered */ emp.first_name || ' ' || emp.last_name,
emp.employee_id/*,
dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,
loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY*/
from (select distinct department_id
from hr.departments dep, hr.locations loc,
hr.countries coUn, hr.regions reg
where dep.location_id = loc.location_id
and coun.country_id = loc.country_id
and coun.region_id = reg.region_id
and (dep.department_id = 10 or region_name = 'Europe'))dd, hr.employees emp, hr.jobs hop
where emp.department_id = dd.department_ID
AND EMP.JOB_ID = HOP.JOB_ID



Incorrect Query

select emp.first_name || ' ' || emp.last_name,
emp.employee_id,
dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,
loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY
from hr.employees emp, hr.departments dep,
hr.locations loc, hr.countries coun,
hr.regions reg, hr.jobs hop
where emp.department_id = dep.department_id
and dep.location_id = loc.location_id
and coun.country_id = loc.country_id
and coun.region_id = reg.region_id
and emp.job_id = hop.job_id
and dep.department_id = 10 or region_name = 'Europe'



If we check at above statement as there is no "(" defined in "Or" condition, single statement got considered as below statement

select emp.first_name || ' ' || emp.last_name,
emp.employee_id,
dep.DEPARTMENT_NAME,
dep.DEPARTMENT_ID,
loc.STATE_PROVINCE,
loc.STREET_ADDRESS,
coun.COUNTRY_NAME,
reg.REGION_NAME,
hop.JOB_TITLE,
hop.MAX_SALARY
from hr.employees emp, hr.departments dep,
hr.locations loc, hr.countries coun,
hr.regions reg, hr.jobs hop
where (emp.department_id = dep.department_id
and dep.location_id = loc.location_id
and coun.country_id = loc.country_id
and coun.region_id = reg.region_id
and emp.job_id = hop.job_id
and dep.department_id = 10)
or region_name = 'Europe'

So, if you check explain plan here, for department_id = 10 there are where conditions defined and as because of indexes are exists it went for Nested Loop Join,
whereas for Region_Name = 'Europe' there are no conditions and it went for Merge Join Cartesian. When ever in sql ( except very rare cases) you got to see Merge Join Cartesian, you really need to look at your query, and it will definitely going to end up with incorrect result.

No comments:

Post a Comment

All about CSS

From book HTML & CSS - Design and Build Websites - Jon Duckett CSS works by associating rules with HTML elements. These rules govern how...