How to view scatter plots for more than 2 variables in R



Plotting two variables against each other is probably the most common way to look for a pattern between them. With more than two variables, though, you often want to see at a glance which pairs are correlated, and for that you can draw a scatter plot for every pair of variables at once. This is more or less a visual version of the cor() function in R.
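For comparison, here is a minimal sketch of the numeric version that the scatter plots visualise. It assumes the built-in iris data set used in the rest of this post; the names num_vars and cor_matrix are only illustrative.

```r
# Keep only the four numeric measurement columns of the built-in iris data.
num_vars <- iris[, 1:4]

# cor() returns a 4 x 4 matrix of pairwise correlations (Pearson by default).
cor_matrix <- cor(num_vars)

# Round for readability; values close to 1 or -1 indicate a strong linear relationship.
print(round(cor_matrix, 2))
```

Each off-diagonal cell of this matrix corresponds to one panel of the scatter plot matrix.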

For example, the iris data set has 4 numeric variables that can be used to identify the species. Suppose we want to see whether any of these variables are correlated with one another. The easiest way to do that is the pairs() function.

1. Using the pairs() function

# We only want to find out the correlation between the first 4 fields.
> pairs(iris[,1:4])

R then produces a convenient plot that draws every pair of fields against each other. If the graph is confusing to read, interpret it as follows.

For example, the plot in the red box is equivalent to

> library(GGally)
> ggpairs(iris[1:4])

and the plot in the blue box is equivalent to

> plot(iris$Petal.Length ~ iris$Petal.Width)
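More generally, the panel in row i and column j of a pairs() plot puts the column-j variable on the x-axis and the row-i variable on the y-axis. The sketch below redraws any single panel on that basis; plot_panel and vars are hypothetical names introduced here for illustration, not part of base R.

```r
# Column names in the order pairs(iris[, 1:4]) lays them out.
vars <- names(iris)[1:4]

# Hypothetical helper: redraw the panel found at a given row and column.
# The row variable goes on the y-axis, the column variable on the x-axis.
plot_panel <- function(row, col, data = iris) {
  x_var <- vars[col]
  y_var <- vars[row]
  plot(data[[x_var]], data[[y_var]], xlab = x_var, ylab = y_var)
  invisible(c(x = x_var, y = y_var))
}

# Row 3, column 4: Petal.Length (y) against Petal.Width (x),
# the same pair as the plot() call above.
plot_panel(3, 4)
```

Swapping the two arguments, plot_panel(4, 3), gives the mirror-image panel on the other side of the diagonal.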

2. Using the ggpairs() function from the GGally package

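This one is like a turbo version of the pairs() function. In a single call, with its default settings for numeric columns, it gives you

  • scatter plots for every pair of variables
  • correlation coefficients for every pair
  • a density plot of each variable on the diagonal (a smoothed alternative to a histogram).

> library(GGally)
> ggpairs(iris[1:4])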

and the output looks something like this.
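Since the four measurements are meant to tell the species apart, one possible next step is to colour the panels by Species. The following is a sketch rather than part of the original example: it assumes the GGally and ggplot2 packages are installed, and it passes the whole iris data frame with the columns argument instead of subsetting first, so that Species is still available for the colour mapping.

```r
library(GGally)   # provides ggpairs(); install.packages("GGally") if it is missing
library(ggplot2)  # provides aes()

# Plot the same four columns, with points and densities split by Species.
p <- ggpairs(iris, columns = 1:4, mapping = aes(colour = Species))
print(p)
```

With the colour mapping in place, the panels show how well each pair of variables separates the three species.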
