How to visualize multi-dimensional data in R

Data with more than two or three variables is hard to visualize directly. Here are some methods for showing additional variables in a single plot.

2D Scatter plots

Scatter plots are typically two-dimensional. You can use the following methods to include more variables in the plot.

Color

Color can be used to map the third variable in a scatter plot. This is typically used when the third variable is categorical.

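```
attach(iris)  # make the iris columns available by name
# Color each point by species (the factor codes index the color palette)
plot(Sepal.Length, Sepal.Width, col = Species)
```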
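A color-coded plot is easier to read with a legend. The following is a minimal sketch, assuming the built-in iris data and the default color palette. It refers to the columns through `iris$` so that it runs without `attach`, and the variable name `species_levels` is introduced here only for illustration.

```r
# Levels of the factor, in the order R maps them to palette colors 1, 2, 3
species_levels <- levels(iris$Species)

# Scatter plot colored by species (factor codes index the default palette)
plot(iris$Sepal.Length, iris$Sepal.Width,
     col = as.numeric(iris$Species),
     xlab = "Sepal length", ylab = "Sepal width")

# Legend that uses the same color indices as the points
legend("topright", legend = species_levels,
       col = seq_along(species_levels), pch = 1)
```

Using `seq_along(species_levels)` for the legend colors keeps the legend in step with the points, because both are derived from the same factor levels.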

Shape

Shape can also be used to show the third variable. As with color, using shape for the third variable makes sense when it is categorical. You can also combine color and shape, as below.

```
plot(Sepal.Length, Sepal.Width, type = "p",
     pch = c(16, 17, 18)[as.numeric(Species)],              # one symbol per species
     col = c("red", "green", "blue")[as.numeric(Species)])  # one color per species
```

Size

If the third variable is continuous, you can use point size to distinguish large values from small ones.

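```
plot(Sepal.Length, Sepal.Width, type = "p",
     cex = Petal.Length,          # point size scales with petal length
     bg = as.numeric(Species),    # fill color by species
     pch = 21)                    # filled circle, so that bg takes effect
legend("topleft", legend = levels(Species),
       pt.bg = seq_along(levels(Species)),  # same fill colors as the points
       pch = 21)
```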

The cex parameter controls the scaling of the points. In fact, this plot shows four variables:

• Sepal.Length (x-axis)
• Sepal.Width (y-axis)
• Species (color)
• Petal.Length (size)
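Because cex is a multiplier on the default symbol size, passing raw values such as Petal.Length (roughly 1 to 7 in iris) can produce very large, overlapping points. One possible way to tame this is to rescale the variable to a narrower range first. The sketch below assumes a hypothetical helper, `rescale_cex`, which is not part of base R, and an arbitrary target range of 0.5 to 3.

```r
# Hypothetical helper: linearly map x onto the range [lo, hi]
rescale_cex <- function(x, lo = 0.5, hi = 3) {
  rng <- range(x, na.rm = TRUE)
  # Constant input has no spread to rescale, so use the midpoint
  if (rng[1] == rng[2]) return(rep((lo + hi) / 2, length(x)))
  lo + (x - rng[1]) / (rng[2] - rng[1]) * (hi - lo)
}

sizes <- rescale_cex(iris$Petal.Length)

plot(iris$Sepal.Length, iris$Sepal.Width,
     cex = sizes,                      # rescaled point sizes
     bg = as.numeric(iris$Species),    # fill color by species
     pch = 21)
```

The smallest petal length is drawn at half the default size and the largest at three times the default, so the ordering is preserved while the points stay readable.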

One conclusion from the size-coded plot could be that petal length is generally greater for the virginica species than for setosa.
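A visual impression like this can be checked against the numbers. As a minimal sketch using only base R and the built-in iris data (the variable name `mean_petal` is chosen here for illustration):

```r
# Mean petal length for each species
mean_petal <- tapply(iris$Petal.Length, iris$Species, mean)
print(round(mean_petal, 2))

# Side-by-side distributions, one box per species
boxplot(Petal.Length ~ Species, data = iris, ylab = "Petal length")
```

On the built-in data the means come out at about 1.46 for setosa, 4.26 for versicolor, and 5.55 for virginica, which agrees with what the point sizes suggest.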

3D Scatter plots

Three-dimensional plots take this to the next level. You can use the “z” dimension to map a third variable.

```
library(scatterplot3d)  # install.packages("scatterplot3d") if needed
cols = c("#FF0000", "#00FF00", "#0000FF")  # one color per species
cols = cols[as.numeric(Species)]
scatterplot3d(Sepal.Length, Sepal.Width, Petal.Length,
              color = cols)
```
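Depth can be hard to judge in a static 3D plot. The sketch below shows a few scatterplot3d arguments that may help: `type = "h"` draws a vertical line from each point down to the base plane, `angle` changes the angle between the x and y axes, and `pch = 16` uses solid points. It assumes the scatterplot3d package is installed; the angle of 55 and the name `species_cols` are arbitrary choices for illustration.

```r
library(scatterplot3d)  # assumes the package is installed

# One color per observation, chosen by species
species_cols <- c("#FF0000", "#00FF00", "#0000FF")[as.numeric(iris$Species)]

scatterplot3d(iris$Sepal.Length, iris$Sepal.Width, iris$Petal.Length,
              color = species_cols,
              pch = 16,     # solid points so the colors are easy to see
              type = "h",   # vertical line from each point to the base plane
              angle = 55,   # angle between the x and y axes
              xlab = "Sepal length", ylab = "Sepal width",
              zlab = "Petal length")
```

The vertical lines anchor each point to its (x, y) position, which makes it easier to tell whether a point is high on the z axis or simply far back in the plot.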
