How to visualize multi-dimensional data in R
Data with more than two or three variables is hard to show on a flat plot. Here are some methods for fitting extra variables into a single chart.
2D Scatter plots
Scatter plots are typically two-dimensional. You can use the following methods to include more variables in the plot.
Color can be used to map the third variable in a scatter plot. This is typically used when the third variable is categorical.
attach(iris)
plot(Sepal.Length, Sepal.Width, col = Species)
Shape can also be used to show the third variable. As with color, using shape for the third variable makes sense when it is categorical. You can also combine color and shape, as below.
plot(Sepal.Length, Sepal.Width, type = "p",
     pch = c(16, 17, 18)[as.numeric(Species)],
     col = c("red", "green", "blue")[as.numeric(Species)])
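A plot that encodes species by both shape and color is hard to read without a key. As a minimal sketch, the same shape and color vectors can be reused in legend() so the key always matches the points. The names species_shapes, species_colors and species_id are illustrative and not part of the original code.

```r
# Illustrative names: one shape and one color per species level
species_shapes <- c(16, 17, 18)
species_colors <- c("red", "green", "blue")
species_id <- as.numeric(iris$Species)  # 1 = setosa, 2 = versicolor, 3 = virginica

plot(iris$Sepal.Length, iris$Sepal.Width,
     pch = species_shapes[species_id],
     col = species_colors[species_id],
     xlab = "Sepal.Length", ylab = "Sepal.Width")

# Reuse the same vectors so the legend cannot drift from the plot
legend("topright", legend = levels(iris$Species),
       pch = species_shapes, col = species_colors)
```

Indexing by the factor's integer codes works because levels(iris$Species) is in the same order as the shape and color vectors.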
If the third variable is continuous, you can use point size to distinguish large values from small ones.
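plot(Sepal.Length, Sepal.Width, type = "p",
     cex = Petal.Length,  # point size scales with petal length
     bg = Species,        # background (fill) color
     pch = 21)            # filled circle, so that bg takes effect
# The legend shows the same fill colors: codes 1:3 of the current palette
legend("topleft", legend = levels(Species), pt.bg = 1:3, pch = 21)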
The parameter cex controls the scaling of the dots. In fact, we were able to plot four variables here:
- Sepal.Length (x-axis)
- Sepal.Width (y-axis)
- Species (with color)
- Petal.Length (with size)
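Because cex is a multiplier on the default symbol size, passing raw measurements can produce very large points (Petal.Length in iris runs from 1 to 6.9). One possible approach, sketched here, is to rescale the variable into a narrower range first. The helper rescale_cex and the target range of 0.5 to 3 are assumptions chosen for illustration.

```r
# Hypothetical helper: linearly map x onto [lo, hi] for use as cex
rescale_cex <- function(x, lo = 0.5, hi = 3) {
  lo + (x - min(x)) / (max(x) - min(x)) * (hi - lo)
}

sizes <- rescale_cex(iris$Petal.Length)

plot(iris$Sepal.Length, iris$Sepal.Width,
     pch = 21,                          # filled circle
     bg = as.integer(iris$Species),     # fill color by species code
     cex = sizes,                       # rescaled point size
     xlab = "Sepal.Length", ylab = "Sepal.Width")
legend("topleft", legend = levels(iris$Species), pt.bg = 1:3, pch = 21)
```

The ordering of the points by size is unchanged. Only the visual range is compressed, so overlapping points are easier to tell apart.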
One conclusion here could be that petal length is generally greater for the virginica species than for setosa.
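A visual impression like this is worth checking against the numbers. As a quick sketch, base R's tapply can compute the mean petal length per species. The variable name mean_petal is illustrative.

```r
# Mean Petal.Length for each species in the built-in iris data
mean_petal <- tapply(iris$Petal.Length, iris$Species, mean)
print(round(mean_petal, 2))
```

On the built-in data set this gives means of about 1.46 for setosa, 4.26 for versicolor and 5.55 for virginica, which agrees with what the point sizes suggest.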
3D Scatter plots
Three-dimensional plots take this to the next level. You can use the “z” dimension to map a third variable.
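library(scatterplot3d)  # CRAN package; install.packages("scatterplot3d") if needed
col = c("#FF0000", "#00FF00", "#0000FF")
col = col[as.numeric(Species)]  # one color per observation, by species
scatterplot3d(Sepal.Length, Sepal.Width, Petal.Length, color = col)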
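A 3D plot colored by species also benefits from a legend and labeled axes. The following is a minimal sketch, assuming the scatterplot3d package is installed. The viewing angle of 55 and the solid symbol are arbitrary choices, and the names species_colors and point_colors are illustrative.

```r
# One color per species level, then one color per observation
species_colors <- c("#FF0000", "#00FF00", "#0000FF")
point_colors <- species_colors[as.numeric(iris$Species)]

# scatterplot3d is a CRAN package, not part of base R, so check for it first
if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  scatterplot3d::scatterplot3d(
    iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length,
    color = point_colors,
    pch = 16,    # solid dots
    angle = 55,  # angle between the x and y axes (arbitrary choice)
    xlab = "Sepal.Length", ylab = "Sepal.Width", zlab = "Petal.Length"
  )
  legend("topleft", legend = levels(iris$Species),
         col = species_colors, pch = 16)
}
```

Trying a few values of angle can help when points hide one another from a given viewpoint.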