Difference between GROUP BY and ORDER BY in Simple Words

For someone who is learning SQL, one of the most common places to get stuck is when learning the GROUP BY command. GROUP BY and ORDER BY are two important keywords in SQL that we use to organize data. The difference between GROUP BY and ORDER BY is that ORDER BY is more simple than GROUP BY and is usually introduced early in an SQL course.

Sometimes, people are quite confused over these two concepts (SQL ORDER BY vs. GROUP BY), and the reason behind this confusion is a lack of understanding of both concepts. While both of them are used to arrange data based on their values, their use cases vastly differ from each other. If we learn both these concepts properly, the confusion will naturally go away.

In this article, I used an interesting database published here. The data set contains more than 2000 craft beers and 500 breweries used in the United States. The data is included in CSV files, which you can easily import to any of your databases using a GUI tool. One of my colleagues has shown how to do so in an example of LearnSQL’s survey data.

There are two tables in this dataset: beers and breweries. Let’s see the sample dataset from the beers table to get a better picture of it.

beers table

ORDER BY

Our first topic is ORDER BY. Before trying to understand the difference between GROUP BY and ORDER BY, let’s see how the SELECT command works.

Let’s SELECT the name, city, and state from every record in the breweries table.

SELECT name, 
 city, 
 state 
FROM breweries

When you execute the above command, you will get something like the following:

SELECT the name, city, and state from breweries table

You can see that these results are not ordered according to their names. The default order for a SELECT statement is non-deterministic, which means the order of results can differ anytime you run your code.

However, a non-deterministic order is usually not what we want. For example, suppose you’d like to order results by ascending or descending order of their state column; in that case, you would need the ORDER BY command. All you have to do is use the ORDER BY clause like this:

SELECT name, 
city, 
state 
FROM 	breweries 
ORDER BY state ASC

ORDER BY

Similarly, you can get results in the descending order of the state as well.

SELECT name, 
city, 
state 
FROM breweries
ORDER BY state DESC

descending order

You can see the records that are organized by descending order of the state. The default order of ORDER BY command is ascending order. So, if you didn’t use ASC or DESC, results will come in ascending order.

You can use ORDER BY in two or more columns. Check the following query:

SELECT name, 
 city, 
 state 
FROM breweries
ORDER BY state,
         city DESC

This query first orders results in the ascending order of the state, followed by the descending order of the city.

ORDER BY

Now, the interesting thing is that someone who doesn’t know how to use GROUP BY much could use ORDER BY to analyze data. For example, suppose that you want to find out how many breweries are there in a given state. What you could do is get the results ordered by their states. This will give you a nice grouped result from which you can manually count the number of breweries from a given state.

Therefore, someone who doesn’t have a strong grasp of GROUP BY might think that ORDER BY is an excellent way to group the same data. That idea is not wrong. If you want to display data in a nice grouped way, ORDER BY is a good bet.

GROUP BY is not a way to display data in groups, but it is more of a way to analyze data in groups. To further comprehend SQL ORDER BY vs. GROUP BY, let’s have a look at GROUP BY in more detail.

GROUP BY

In most texts, GROUP BY is defined as a way of aggregating records by the specified columns which allow you to perform aggregation functions on non-grouped columns (such as SUM, COUNT, AVG, etc). In other words, the GROUP BY clause’s purpose is to summarize unique combinations of columns values.

A few examples will further clarify:

Let’s group beers table based on style column.

SELECT style 
FROM beers 
GROUP BY style

The above SQL query will generate the following output

ORDER BY

So, let’s see what has happened here. This query has returned one result from each style value. That means SQL first makes groups from the same style values and returns one row representing the group.

And what is the exact use of GROUP BY? Of course, you can use GROUP BY to find distinct values. But SQL has a DISTINCT keyword specifically for that. The real importance of GROUP BY can be seen when you use it with aggregate functions like SUM(), COUNT(). To understand it better, execute the following SQL query:

SELECT style, 
 COUNT(Name) 
FROM beers 
GROUP BY style

It will generate the following result:

ORDER BY

Here, SQL first groups results based on the style column. Then, it checks how many names are there in each group and returns the style values and the count of names for each style.

If you look closely, you may notice that the results are already ordered in ascending order. This situation occasionally tricks people into believing that GROUP BY sorts results. In actuality, there is no guarantee that GROUP BY will display results in ascending order. If you need results in a specific order, you have to do it yourself like below:

SELECT style, 
COUNT(Name) 
FROM beers 
GROUP BY style 
ORDER BY style

Therefore, GROUP BY works nicely along with ORDER BY.

So, now you know how to use GROUP BY to count a set of values belongs to a certain group. I will give you two more real-world examples to understand the use of GROUP BY.

  1. Think of an employee table where you store information about employees, such as their name, job role, and salary. Each employee belongs to a different department, like finance, IT, or transport. Now, suppose you want to get the sum of salaries you paid for employees based on their department. You will need to execute an SQL command like below:

    SELECT department,
     SUM(salary) 
    FROM employee 
    GROUP BY department
    
  2. Suppose you have an eCommerce website that sells several product types. In your database, you have a table to store information about stocks. If you want to find the count of each product type, you can use the GROUP BY with COUNT aggregate function:

    SELECT product_type, 
     COUNT(product_id) 
    FROM stock
    GROUP BY product_type
    

Let’s take this lesson a bit further to increase your knowledge of GROUP BY. You can also group results using two columns. For example, let’s group the beers table based on style and brewery_id.

SELECT style,
 brewery_id,
 COUNT(Name) 
FROM beers 
GROUP BY style,
   brewery_id

This SQL query will divide style groups further using brewery_id. When you execute it, you will get the following result:

ORDER BY

Conclusion

In this article, I explained the difference between ORDER BY and GROUP BY. ORDER BY sorts data based on data of a column. Of course, as a result of sorting, data from the same values will be brought together as a group, which will make it easy for you to manually analyze later. But GROUP BY is the SQL way to analyze similar data.

There is no real use of GROUP BY without aggregation functions. Sometimes, GROUP BY can return results in sorted order, but you shouldn’t rely on it. The order in which results are returned is non-deterministic and depends on how the db engine executes the query.

After reading this article, you should be able to use GROUP BY and ORDER BY with more confidence. If you want to refine your knowledge of SQL ORDER BY vs. GROUP BY, check this practice set–it contains a special section dedicated to this subject.

Alternate Text Gọi ngay