Skip to main content

Analytical Functions in Oracle and it’s benefits #DOTC_Mdb

         Analytical  Functions in Oracle and it’s benefits

 

Hello Everyone,

Today we will discuss about Analytical Functions in Oracle. Analytic functions in Oracle calculate an aggregate value based on a group of rows and return multiple rows for each group. In other word we can say Analytical functions are used to do analyze data over multiple rows and return the result in the current row. Also with  analytic functions, however, you can group the data that is used to calculate the function’s result, but show a value one each record in the result set, without grouping as well.

 

Advantage of Analytical Function is as below:

Improve the Performance of Query

Analtical Functions  are easier to write than sub queries and joins to get the same result.

Easy to maintain.

Before Starting of analytical function we need to understand below two points:

OVER:

The scope of an analytical function is defined in an OVER clause. OVER is mandatory for all analytical functions. If you do not give any arguments after OVER, it will assume the scope as ALL.

PARTITION BY:

We know aggregate functions use GROUP BY to group rows whereas analytical functions use PARTITION BY clause .

Oracle had around 26 analytical functions , but commonly used Analytical Functions are as below:

ROW_NUMBER

RANK

DENSE_RANK

LEAD AND LAG

FIRST_VALUE AND LAST_VALUE

WINDOW CLAUSE

 

Let’s start with ROW_NUMBER analytical Function in Oracle : 

ROW_NUMBER() is an analytic function in Oracle that assigns a sequential unique integer to each row to which it is applied, either each row in the partition or each row in the result set. ROW_NUMBER gives a running serial number to a set of records. ORDER BY clause plays a crucial role in Analytical Function.

Syntax for ROW_number :

ROW_NUMBER() OVER (

   [query_partition_clause]

   order_by_clause

)

Here  order_by_clause is required. It specifies the order of rows in each partition or in the whole result set.

Note : Here  Query_partition_clause is optional. It distributes the rows into partitions to which the function is applied. If you omit the query_partition_clause, the function will treat the whole result set as a single partition.

Below is the simple example where i can see the row_number for each department by using partition clause.

SELECT

ename,

hiredate,

deptno,

row_number() over (partition by deptno order by hiredate) r1

FROM

Emp;

Output of above query is as below:


 

Another Simple example of ROW_NUMBER Analytical Function by using sub query is as below:

select

ename, deptno,hiredate

from

(

SELECT

ename,

hiredate,

deptno,

row_number() over (partition by deptno order by hiredate) r1

FROM emp

) where r1=5;

 

Here we are getting the records for row number 5 for each department. Below is the output of above query.



 

 

RANK :  RANK analytical function also gives a serial number to a set of rows ordered in a specific way like the ROW_NUMBER. The RANK() function is an analytic function that calculates the rank of a value in a set of values. The RANK() function returns the same rank for the rows with the same values. It adds the number of gap to the next rank.

The difference is: ROW_NUMBER gives a unique number to each row even if the rows are duplicate. But RANK gives the same number to the duplicate rows.

Syntax for RANK Function is similar to ROW_NUMBER

RANK()

            OVER ([ query_partition_clause ] order_by_clause)

Example of Rank : In below query we will see the output as both row_number and rank for better understanding.

SELECT

ename,

hiredate,

deptno,

row_number() over (order by hiredate) r1,

rank() over( order by hiredate) r2

 FROM emp

 

Below is the output of above query:

 


Here we can easily understand that row_number for Jones is 10 but rank is 9 only. As we put condition against the order by hiredate .

 

Another example of RANK by using subquery  where we can see the multiple records as we put condition for RANK.

select * from (

SELECT

ename,

hiredate,

deptno,

row_number() over (order by hiredate) r1,

rank() over( order by hiredate) r2

 FROM emp)

    where r2=9;

 

Below is the output of above query :

 


 

DENSE_RANK : DENSE_RANK works in a similar way as the RANK. Main difference is that it does not skip numbers while assigning to duplicate rows. In other word it’s not providing gap and provide the real rank value.

Note : DENSE_RANK() is an analytic function that calculates the rank of a row in an ordered set of rows. The returned rank is an integer starting from 1.

Syntax is Similar to RANK only .

DENSE_RANK( ) OVER([ query_partition_clause ] order_by_clause)

Example of DENSE_RANK  function is as below to get the exact RANK :

SELECT

ename,

hiredate,

deptno,

rank() over(order by hiredate) r1,

dense_rank() over( order by hiredate) r2

FROM emp          

Below is the output of above query :

 


Here you can easily understand the dense rank of Scott coming as 11 and RANK is 12.

Below is the another DENSE_RANK function by using Subquery where we will able to see both RANK and DENSE_RANK for Scott only .

select * from(

SELECT

ename,

hiredate,

deptno,

rank() over(order by hiredate) r1,

dense_rank() over( order by hiredate) r2

FROM emp)

where r2=11;

 

Below is the output of above query

 


LEAD AND LAG :

LEAD() is an analytic function that allows you to access the following row from the current row . In another word LEAD function  allows us to apply any computation on the next rows(rows after the current row).

Syntax of LEAD is as below :

LEAD(<expression>,<offset>,<default>) over(<analytic clause>)

Here <expression> is the expression to apply on the leading row

<offset> is the number of the leading row relative to the current rows.

<default> is the value to be returned if the <offset> points to a row outside the partition range

 

Let’s write a query with using LEAD and LAG where  we find out the previous lower salary and the next higher salary for department 30,  the query is as below:

 

SELECT

ename,

sal,

lead(sal,1,0) over(partition by deptno order by sal desc) next_lowsal,

lag(sal,1,0) over(partition by deptno order by sal desc) prev_highsal

FROM emp

 where deptno=30;

Note : LAG use the same syntax. The <offset> in LAG points to the previous rows related to the current row.

Output of above query is as below :

 

 


 

FIRST_VALUE AND LAST_VALUE Analytical Function : The FIRST_VALUE() is an analytic function that allows you to get the first value in an ordered set of value. Whereas The LAST_VALUE()  allows us to obtain the last value in an ordered set of values.

Let's take a simple example where we have to find the number of days between the first hire and current hiredate of each employee of department 30, the query is as below:

SELECT

ename,

sal,

hiredate-first_value(hiredate) over(order by hiredate) first_value_days

FROM emp

where deptno=30

 

Output of above query is as below:



 

LAST_VALUE Query we can write in similar way .

SELECT

ename,

sal,

hiredate-last_value(hiredate) over(order by hiredate) last_value_days

FROM emp

where deptno=30

 

Output of Above query is as below:

 


WINDOW CLAUSE :

Partition clause is not the only method of limiting the scope of an analytical function. We can use ROWS BETWEEN or RANGE BETWEEN clause to define an additional scope in Analytical Function.

Syntax:

[ROW or RANGE] BETWEEN <start_expr> AND <end_expr>

Note : <start_expr> can be any one of the following UNBOUNDED PRECEDING

CURRENT ROW

<sql_expr> PRECEDING or FOLLOWING.

<end_expr> can be any one of the following UNBOUNDED FOLLOWING or

CURRENT ROW or  <sql_expr> PRECEDING or FOLLOWING.

Below is a query which gives the highest salary in the previous rows relative to the current row.

SELECT

ename,

hiredate,

sal,

max(sal) over(order by hiredate,ename

 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) maximumbefore_salary

FROM emp;

Output of above query is as below:

 


 

Here, for each row, the highest salary before the current row is returned.  ORDER BY clause here is not used for ranking but for specifying a window. UNBOUNDED PRECEDING is the lowest possible bound i.e. the first row.

Note :  Difference between Aggregate and Analytical function is Aggregate functions group the rows and return the output for the entire result set. whereas analytical functions return the aggregated output but they do not group the result set. They return the value for each row in the result set.

 

Please go through this blog carefully as Analytical Functions in Oracle are very powerful tools to aggregate and analyze the data across multiple dimensions. The  execution speed is also much better than the normal aggregate functions , Also improve the query performance as compared to aggregate functions. Please go through each topic and for any doubts put it in comment.

Thanks.

 

Comments

Post a Comment

Popular posts from this blog

SQL and Classification of SQL in Oracle Database #Diksha Oracle Training Centre

  SQL and Classification of SQL in Oracle Database.   SQL is Structured Query Language , which is used for storing, manipulating and retrieving data stored in a relational database .SQL is the standard language for RDBMS. All the Relational Database Management Systems (RDMS) like Oracle, MySQL, Sybase, Informix, IBM DB2 and Microsoft SQL Server use SQL as their standard database language. Oracle is one of the more secured database as compared to other databases. Importance of   SQL : SQL and PL/SQL is a backend process where all data is stored and retrieved in GUI which created either by any programming languages like Java, C++, PHP etc. so we need to have very secure database so that there will be no impact for users. SQL allows users to access data in the relational database management systems. SQL is used to communicate with a database.SQL and PL/SQL allows users to create and drop databases tables , views , stored procedures , functions , packages , trigger et...

Materialized View uses and Benefits in Database #DOTC_Mdb

Materialized View uses and Benefits in Database. Hello Everyone, Today we will discuss about Materialized view as it’s play important role in database. We already discussed about Simple Views and complex views in my previous blog. Before Materialized view go through my previous blog which related to simple view. https://dheeraj60.blogspot.com/2020/05/benefits-of-creating-and-using-view-in.html As we know View is not a database object and not like table which is stored in database, but view can be created from base table a view is an SQL statement that’s stored in the database. This statement, or view, has a name.A view looks and acts a lot like a table. It has columns and rows, and can be included in SELECT queries just like a table. In other word we can say View is a virtual/logical table which is basically used for security purpose. Let’s understand   about   Materialized view : A materialized view is a view that stores the results of the view’s query. Whenever you que...

Top 50 Interview Questions On SQL and PL/SQL #DOTC_Mdb

                    Top 50 Interview Questions On SQL and PL/SQL. Today we will Discuss Top 50 interview questions and answers of SQL and PL/SQL which is frequently asked in interview.     Question 1: What is SQL and Classification of SQL? Answer SQL is a Structure Query Language which is vastly used in RDBMS database like Oracle, Sybase, DB2 , Microsoft SQL server etc.   Classification of SQL is as below: DDL (Data Definition Language):  Commands are  create , alter , drop , truncate etc DML (Data Manipulation Language) : Commands are  insert,update and delete . TCL (Transaction Control Language ) : Commands are  Commit , Rollback and Save point. DCL (Data Control Language) : Commands are Grant , Revoke Question 2:    What is meant by Joins? What are the types of join? Answer Joins are basically used to extract/get data from multiple tables using s...