# What is the difference between a data frame and a matrix in R?

Let’s create an employee table.

```
install.packages("randomNames")
library(randomNames)

# Get 100 random names
names = randomNames(100)

# Get 100 random ages
age = round(rnorm(100, mean = 30, sd = 10))
```

Now let’s create a data frame with just two columns, names and age:

```
> employees = data.frame(names, age, stringsAsFactors = FALSE)
> str(employees)
'data.frame': 100 obs. of  2 variables:
 $ names: chr  "Persons, Shelby" "Taylor, Chukwuma" "Jarvis, Destiny" "Rape, Zachery" ...
 $ age  : num  13 20 31 42 23 37 27 27 20 22 ...
```

This works because a data frame can hold columns of different types. Here, names is a character column and age is numeric.

Can you do this with a matrix? No. A matrix can hold only one type of data, so mixing strings and numbers forces every element to a single common type.
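A minimal sketch of that coercion, using made-up three-element vectors (`nm` and `ag` are illustrative names, not part of the employee table). `cbind()` builds a matrix from the two vectors, and because one of them is character, the numbers are stored as strings too:

```r
# Hypothetical toy data, kept tiny so the result is easy to inspect
nm <- c("Ann", "Bob", "Cy")
ag <- c(31, 42, 23)

# cbind() on plain vectors returns a matrix; mixed inputs are coerced to character
m <- cbind(nm, ag)
typeof(m)                  # "character"
is.character(m[, "ag"])    # TRUE: the ages are now strings

# The same vectors in a data frame keep their own types
df <- data.frame(nm, ag, stringsAsFactors = FALSE)
sapply(df, class)
```

The data frame stores each column as its own vector, which is why the types can differ; the matrix is a single vector with dimensions, so only one type is possible.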

### Convert a data frame to a matrix

Look at what happens if you try to convert this data frame to a matrix with `data.matrix()`.

```
> employees_m = data.matrix(employees)
Warning message:
In data.matrix(employees) : NAs introduced by coercion
```

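What does the matrix contain? The names (character) column was coerced to NAs. This is the behavior of older versions of R. In R 4.0.0 and later, `data.matrix()` first converts character columns to factors and then to their integer codes, so you would see integers instead of NAs. Either way, the original strings are lost.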

```
> str(employees_m)
 num [1:100, 1:2] NA NA NA NA NA NA NA NA NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "names" "age"
```

As you can see, the data in the names column is gone.

```
> head(employees_m)
     names age
[1,]    NA  13
[2,]    NA  20
[3,]    NA  31
[4,]    NA  42
[5,]    NA  23
[6,]    NA  37
```
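If the goal is a numeric matrix, one option is to convert only the numeric columns. If the strings must survive, `as.matrix()` on a mixed data frame returns a character matrix instead. Here is a rough sketch of both, assuming a small stand-in data frame (`emp` is hypothetical, not the 100-row table above):

```r
# Hypothetical stand-in for the employees table
emp <- data.frame(names = c("Ann", "Bob", "Cy"),
                  age   = c(31, 42, 23),
                  stringsAsFactors = FALSE)

# Option 1: keep only the numeric columns, then convert
is_num <- sapply(emp, is.numeric)
num_m  <- as.matrix(emp[, is_num, drop = FALSE])
typeof(num_m)    # "double"

# Option 2: as.matrix() on mixed columns keeps the names,
# but every element, including age, becomes a character string
chr_m <- as.matrix(emp)
typeof(chr_m)    # "character"
```

Which option fits depends on what the matrix is for: numeric routines need the first, while the second is mainly useful for display or string processing.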

Here are the differences:

1. A matrix is homogeneous: every element has the same type. A data frame can be heterogeneous, with a different type in each column.

2. A data frame can have factor columns, but a matrix cannot hold factors.
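The second point can be checked with a short sketch. The `dept` factor and the `df` data frame below are made up for illustration. `matrix()` accepts the factor but drops its class and keeps only the labels as strings:

```r
# Hypothetical example: a factor column in a data frame
dept <- factor(c("HR", "IT", "HR"))
df <- data.frame(id = 1:3, dept = dept)
is.factor(df$dept)    # TRUE
levels(df$dept)       # "HR" "IT"

# matrix() drops the factor class; only the labels remain, as character strings
m <- matrix(dept, ncol = 1)
is.factor(m)          # FALSE
typeof(m)             # "character"
```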