How to use PARTITION BY vs GROUP BY in SQL Server?
1. What is PARTITION BY?
PARTITION BY
normally goes within an OVER clause as a part of a window function in MS SQL Server.
PARTITION BY
clause is used to divide the result set into partitions and perform computation on each subset of partitioned data.
To make it easy to understand, table partition is the action of dividing a table into several smaller tables with more manageable sets of data without having to create more tables manually. The window function will then process data through each partition one by one, separately, and computation restarts for each partition.
Without the PARTITION BY
clause, the entire result set will be treated as a single partition.
2. Syntax:
Here is the generic syntax for PARTITION BY
within a window function:
window_function ( expression ) OVER (
PARTITION BY expression1, expression2, ...
order_clause
frame_clause
)
3. How is it different from GROUP BY
?
The GROUP BY
clause basically is a serial action of combining multiple rows into a single row using the given criteria, thus reduces the number of rows in the query results.
The PARTITION BY
clause groups table data into multiple partitions using the given criteria. It does not change the number of returned rows in the query results but changes the way a window function’s result is calculated.
In the surface, it appears more like reordering the rows for the window function to process by sets.
4. Example
Let’s take an example to demonstrate the use of PARTITION BY
and its difference from GROUP BY
.
We have this table Course:
CourseID | Title | Credits | DepartmentID |
---|---|---|---|
1045 | Calculus | 4 | 7 |
1050 | Chemistry | 4 | 1 |
1061 | Physics | 4 | 1 |
2021 | Composition | 3 | 2 |
2030 | Poetry | 2 | 2 |
2042 | Literature | 4 | 2 |
3141 | Tri nometry | 4 | 7 |
4022 | Microeconomics | 3 | 4 |
4041 | Macroeconomics | 3 | 4 |
4061 | Quantitative | 2 | 4 |
Now we calculate the average credits of each department.
Using GROUP BY
clause:
SELECT
DepartmentID,
AVG(Credits) average_credits
FROM
Course
GROUP BY
DepartmentID
ORDER BY
DepartmentID;
And the result will be:
DepartmentID | average_credits |
---|---|
1 | 4 |
2 | 3 |
4 | 2 |
7 | 4 |
And when you use PARTITION BY
clause:
SELECT
CourseID,
Title,
Credits,
DepartmentID,
AVG(Credits) OVER (PARTITION BY DepartmentID) average_credits
FROM
Course;
The result will be:
CourseID | Title | Credits | DepartmentID | average_credits |
---|---|---|---|---|
1050 | Chemistry | 4 | 1 | 4 |
1061 | Physics | 4 | 1 | 4 |
2021 | Composition | 3 | 2 | 3 |
2030 | Poetry | 2 | 2 | 3 |
2042 | Literature | 4 | 2 | 3 |
4022 | Microeconomics | 3 | 4 | 2 |
4041 | Macroeconomics | 3 | 4 | 2 |
4061 | Quantitative | 2 | 4 | 2 |
1045 | Calculus | 4 | 7 | 4 |
3141 | Tri nometry | 4 | 7 | 4 |
Need a good GUI tool for databases? TablePlus provides a native client that allows you to access and manage Oracle, MySQL, SQL Server, PostgreSQL and many other databases simultaneously using an intuitive and powerful graphical interface.
Not on Mac? Download TablePlus for Windows.
Need a quick edit on the go? Download for iOS