Mean by Group in R (2 Example Codes) | dplyr Package vs. Base R


 

In this tutorial you’ll learn how to compute the mean by group in the R programming language.

I’ll show two alternatives, each with reproducible R code.

Let’s dig into it!

 

Example Data

For the following examples, I’m going to use the Iris Flower data set. Let’s load the data into R:

data(iris) # Load Iris data
head(iris) # First rows of Iris

 


Table 1: The Iris Data Matrix.

 

As you can see in Table 1, the Iris Flower data contains four numeric columns as well as the grouping factor column Species.
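Before aggregating, it can help to check which groups the Species column actually contains. The following lines are a minimal sketch using only base R; the object name group_counts is my own choice for illustration:

```r
data(iris)                             # Load Iris data

str(iris$Species)                      # Species is stored as a factor
levels(iris$Species)                   # Names of the groups

group_counts <- table(iris$Species)    # Number of rows per group
group_counts
```

Each factor level corresponds to one group for which a mean will be computed.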

Next, I’ll show you how to calculate the average for each of these groups. Keep on reading!

 

Example 1: Compute Mean by Group in R with aggregate Function

The first example shows how to calculate the mean per group with the aggregate function.

We can compute the mean for each species factor level of the Iris Flower data by applying the aggregate function as follows:

aggregate(x = iris$Sepal.Length,         # Specify data column
          by = list(iris$Species),       # Specify group indicator
          FUN = mean)                    # Specify function (i.e. mean)

#    Group.1     x
#     setosa 5.006
# versicolor 5.936
#  virginica 6.588

The RStudio console output shows the mean by group: The setosa group has a mean of 5.006, the versicolor group has a mean of 5.936, and the virginica group has a mean of 6.588.
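As a side note, aggregate also accepts a formula, which keeps the original column names instead of Group.1 and x. This is only a sketch of an alternative call, not a replacement for the code above; the object names mean_by_species and all_means are chosen for illustration:

```r
# Formula interface: value column ~ grouping column
mean_by_species <- aggregate(Sepal.Length ~ Species,
                             data = iris,
                             FUN = mean)
mean_by_species

# The dot stands for all remaining columns, i.e. all four numeric columns
all_means <- aggregate(. ~ Species,
                       data = iris,
                       FUN = mean)
all_means
```

The second call returns one row per species and one mean per numeric column.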

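Note: By replacing the FUN argument of the aggregate function, we can also compute other metrics such as the median, the variance, or the standard deviation. The statistical mode requires a user-defined function, because the base R function mode returns the storage mode of an object rather than its most frequent value.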
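The following sketch shows how swapping the FUN argument could look. The functions median and sd are part of base R; stat_mode is a hypothetical helper that I define here myself, since base R's mode function reports the storage type and not the most frequent value:

```r
# Median and standard deviation by group
median_by_species <- aggregate(x = iris$Sepal.Length,
                               by = list(iris$Species),
                               FUN = median)
sd_by_species <- aggregate(x = iris$Sepal.Length,
                           by = list(iris$Species),
                           FUN = sd)

# Hypothetical helper: most frequent value of a numeric vector
stat_mode <- function(v) {
  counts <- table(v)                             # Frequency of each value
  as.numeric(names(counts)[which.max(counts)])   # Smallest value wins on ties
}

mode_by_species <- aggregate(x = iris$Sepal.Length,
                             by = list(iris$Species),
                             FUN = stat_mode)
mode_by_species
```

Any function that takes a vector and returns a single value can be passed to FUN in the same way.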

 

Example 2: Compute Mean by Group with dplyr Package

It’s definitely a matter of taste, but many people prefer to use the dplyr package to compute descriptive statistics such as the mean. This example shows how to get the mean by group using the functions of the dplyr package.

Let’s install and load the dplyr package to R:

install.packages("dplyr")                # Install dplyr package
library("dplyr")                         # Load dplyr package

Now, we can use all the functions of the dplyr package – in our case group_by and summarise_at:

iris %>%                                 # Specify data frame
  group_by(Species) %>%                  # Specify group indicator
  summarise_at(vars(Sepal.Length),       # Specify column
               mean)                     # Specify function

# A tibble: 3 x 2
#   Species    Sepal.Length
#   <fct>             <dbl>
#   setosa             5.01
#   versicolor         5.94
#   virginica          6.59

The output of the previous R syntax is a tibble instead of a data.frame. The printed values are rounded to three significant digits, but the underlying means are the same as in Example 1.
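In current dplyr versions, summarise_at is marked as superseded, so you may prefer the plain summarise function or across. The lines below are a sketch assuming dplyr 1.0.0 or later is installed; the object names mean_tbl and mean_tbl_across are my own:

```r
library("dplyr")                                   # Load dplyr package

# summarise() with an explicitly named output column
mean_tbl <- iris %>%
  group_by(Species) %>%
  summarise(Sepal.Length = mean(Sepal.Length))

# across() variant, assuming dplyr >= 1.0.0
mean_tbl_across <- iris %>%
  group_by(Species) %>%
  summarise(across(Sepal.Length, mean))

# Convert the tibble to a data.frame to print the unrounded values
as.data.frame(mean_tbl)
```

Converting to a data.frame is one way to confirm that the unrounded means match the aggregate output.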

 

Video, Further Resources & Summary

On the Statistics Globe YouTube channel, you can also find a tutorial video, where I explain the content of this topic in some more detail:

 


 

This tutorial illustrated how to compute the mean for subsets of a data frame (i.e. groups) in the R programming language. In case you want to learn more about the theoretical concept of the mean, I can recommend the following video of the mathantics YouTube channel:

 


 

Furthermore, you could also have a look at the related R tutorials that I have published on my website.

I hope you found the tutorial helpful. If you have any questions or comments, don’t hesitate to let me know below.

 
