What is Window functions in SQL?
Window functions in SQL provide a way to perform calculations across a set of rows that are related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, window functions do not cause rows to become grouped into a single output row. The rows retain their separate identities.
A window function uses values from the rows in a window defined by the function. For a regular aggregate function, the window would consist of all input rows. For a window function, the window is defined by an OVER
clause that follows the window function's name.
Here are some common window functions:
ROW_NUMBER(): Returns the number of a current row within its partition, counting from 1.
RANK(): Returns the rank of each row within the partition of a result set.
DENSE_RANK(): Returns the rank of rows within the window partition, without any gaps.
LEAD(): Provides access to a row at a given physical offset that follows the current row.
LAG(): Provides access to a row at a given physical offset prior to the current row.
Here are a few examples:
1. ROW_NUMBER()
SELECT
SalesOrderID,
ProductID,
OrderQty,
ROW_NUMBER() OVER(PARTITION BY SalesOrderID ORDER BY SalesOrderID) as RowNum
FROM SalesOrderDetail;
In this example, the ROW_NUMBER()
function is used to assign a unique number to each row within each partition (in this case, each unique SalesOrderID
).
2. RANK()
SELECT
SalesOrderID,
ProductID,
OrderQty,
RANK() OVER(PARTITION BY SalesOrderID ORDER BY OrderQty DESC) as Rank
FROM SalesOrderDetail;
This example uses the RANK()
function to assign a rank to each row within each SalesOrderID
partition, ordered by OrderQty
in descending order.
3. LAG()
SELECT
SalesOrderID,
ProductID,
OrderQty,
LAG(OrderQty) OVER(PARTITION BY SalesOrderID ORDER BY SalesOrderID) as PrevOrderQty
FROM SalesOrderDetail;
In this example, the LAG()
function is used to access the OrderQty
of the previous row within each SalesOrderID
partition.
Remember, these window functions are extremely powerful as they allow you to perform calculations across related sets of rows. They are particularly useful when dealing with time-series data, among other things.