A short description of the post as per Session 3 of ISSS608 Visual Analytics module.
In this article, we show how to plot a figure with multiple histograms using the ggplot2 and ggpubr packages.
Before you get started, you are required to insert a graph.
Next, use the code chunk below to install (if they are not yet installed) and load the ggpubr and tidyverse packages in RStudio.
packages = c('ggpubr', 'tidyverse')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  # Load the package into the current session
  library(p, character.only = TRUE)
}
In this hands-on exercise, the Wine Quality Data Set from the UCI Machine Learning Repository is used. The data set consists of 13 variables and 6497 observations. For the purpose of this exercise, we have combined the red wine and white wine data into one data file. The file is called wine_quality and is in CSV format.
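The post does not show how the red and white wine files were combined. The sketch below is one possible way to do it with bind_rows() of the dplyr package (part of tidyverse). The objects red_demo and white_demo and all their values are made up for illustration and are not rows from the real data set.

```r
library(dplyr)

# Made-up rows standing in for the two original files (not real data).
red_demo <- tibble(
  `fixed acidity` = c(7.4, 7.8),
  pH              = c(3.51, 3.20),
  quality         = c(5, 5)
)
white_demo <- tibble(
  `fixed acidity` = c(7.0, 6.3, 8.1),
  pH              = c(3.00, 3.30, 3.26),
  quality         = c(6, 6, 6)
)

# Stack the rows of both tables. .id = "type" adds a column that records
# which input each row came from, using the argument names as labels.
wine_demo <- bind_rows(red = red_demo, white = white_demo, .id = "type")

print(wine_demo)
```

The type column created this way plays the same role as the type variable in wine_quality: it lets you tell red and white wines apart after the two tables are merged.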
First, let us import the data into R by using read_csv() of the readr package.
wine <- read_csv("data/wine_quality.csv")
Notice that, apart from quality and type, all the variables are continuous numerical variables.
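If you would like to check this yourself, the sketch below shows one way to list the continuous variables using base R only. It runs on a small made-up data frame called wine_demo (an assumption for illustration, not real data); on the imported data you would apply the same two steps to wine instead.

```r
# Made-up rows that mimic a few wine_quality columns (not real data).
wine_demo <- data.frame(
  `fixed acidity` = c(7.4, 7.0, 6.3),
  alcohol         = c(9.4, 8.8, 9.5),
  quality         = c(5, 6, 6),
  type            = c("red", "white", "white"),
  check.names     = FALSE  # keep the space in `fixed acidity`
)

# Step 1: flag the columns that are stored as numbers.
is_num <- vapply(wine_demo, is.numeric, logical(1))

# Step 2: quality is stored as a number but is a discrete score, and type
# is a label, so both are excluded by name.
continuous_vars <- setdiff(names(wine_demo)[is_num], c("quality", "type"))

print(continuous_vars)
```

The names left in continuous_vars are the candidates for histograms.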
In the figure below, multiple histograms are plotted by using ggplot() and geom_histogram() of the ggplot2 package. Then, ggarrange() of the ggpubr package is used to combine these histograms into a single figure that reveals the distribution of the selected variables in the wine quality data set.
fa <- ggplot(data=wine, aes(x= `fixed acidity`)) +
geom_histogram(bins=20, color="black", fill="light blue")
va <- ggplot(data=wine, aes(x= `volatile acidity`)) +
geom_histogram(bins=20, color="black", fill="light blue")
ca <- ggplot(data=wine, aes(x= `citric acid`)) +
geom_histogram(bins=20, color="black", fill="light blue")
rs <- ggplot(data=wine, aes(x= `residual sugar`)) +
geom_histogram(bins=20, color="black", fill="light blue")
ch <- ggplot(data=wine, aes(x= `chlorides`)) +
geom_histogram(bins=20, color="black", fill="light blue")
fSO2 <- ggplot(data=wine, aes(x= `free sulfur dioxide`)) +
geom_histogram(bins=20, color="black", fill="light blue")
tSO2 <- ggplot(data=wine, aes(x= `total sulfur dioxide`)) +
geom_histogram(bins=20, color="black", fill="light blue")
density <- ggplot(data=wine, aes(x= density)) +
geom_histogram(bins=20, color="black", fill="light blue")
pH <- ggplot(data=wine, aes(x= pH)) +
geom_histogram(bins=20, color="black", fill="light blue")
sulphates <- ggplot(data=wine, aes(x= sulphates)) +
geom_histogram(bins=20, color="black", fill="light blue")
alcohol <- ggplot(data=wine, aes(x= alcohol)) +
geom_histogram(bins=20, color="black", fill="light blue")
ggarrange(fa, va, ca, rs, ch, fSO2, tSO2, density, pH, sulphates, alcohol,
ncol = 4, nrow = 3)
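The code chunk above repeats the same two lines for every variable. As an alternative, the sketch below builds the histograms in a loop and passes them to ggarrange() through its plotlist argument. This is not the approach used in the post: plot_hist() is a hypothetical helper written for this example, and wine_demo contains randomly generated values rather than the real wine data.

```r
library(ggplot2)
library(ggpubr)

# Randomly generated values standing in for three wine_quality columns
# (not real data).
set.seed(1)
wine_demo <- data.frame(
  `fixed acidity` = rnorm(200, mean = 7, sd = 1),
  `citric acid`   = runif(200, min = 0, max = 1),
  alcohol         = rnorm(200, mean = 10, sd = 1),
  check.names     = FALSE  # keep the spaces in the column names
)

# Hypothetical helper: draw one histogram for the column named `var`.
# .data[[var]] looks the column up by its name given as a string.
plot_hist <- function(data, var) {
  ggplot(data, aes(x = .data[[var]])) +
    geom_histogram(bins = 20, color = "black", fill = "lightblue") +
    labs(x = var)
}

# One plot per column, collected in a list.
plots <- lapply(names(wine_demo), plot_hist, data = wine_demo)

# Arrange all plots from the list in a single row of three panels.
combined <- ggarrange(plotlist = plots, ncol = 3, nrow = 1)
```

Printing combined displays the arranged figure. For the full data set, the same pattern should work by replacing wine_demo with wine, restricting the names to the eleven continuous variables, and setting ncol = 4 and nrow = 3 as in the original chunk.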
For attribution, please cite this work as
Lim (2021, May 17). Yong Kai: My 1st Post. Retrieved from https://limyongkai.netlify.app/posts/2021-05-22-my-1st-post/
BibTeX citation
@misc{lim2021my,
  author = {Lim, Yong Kai},
  title = {Yong Kai: My 1st Post},
  url = {https://limyongkai.netlify.app/posts/2021-05-22-my-1st-post/},
  year = {2021}
}