How to remove all rows in a data frame with NAs in R
Removing rows with NAs from a data frame in R is easier to understand when you build it up by hand, even though base R has shortcuts such as na.omit(). Let’s do this step by step.
> persons
age names cities zip
1 NA Ajay San Francisco 94000
2 32 Adam <NA> 40101
3 21 Mary Sunnyvale 94010
4 60 Aishu San Jose 94001
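To follow along, a data frame like the one printed above could be built as shown here. This is only a sketch: the original post does not show how persons was created, so the column types (numeric age and zip, character names and cities) are assumptions.

```r
# Assumed reconstruction of the example data; the column types are a guess
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE  # keep names and cities as plain character columns
)

persons
```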
Step 1 – Identify all the elements that are NA
> is.na(persons)
age names cities zip
[1,] TRUE FALSE FALSE FALSE
[2,] FALSE FALSE TRUE FALSE
[3,] FALSE FALSE FALSE FALSE
[4,] FALSE FALSE FALSE FALSE
Step 2 – Invert this so that NAs show as FALSE
> !is.na(persons)
age names cities zip
[1,] FALSE TRUE TRUE TRUE
[2,] TRUE TRUE FALSE TRUE
[3,] TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE
Step 3 – Find out if all elements in a row are true
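For a single row, this check can be sketched with all(), which returns TRUE only when every value it is given is TRUE. The data frame is rebuilt here so the snippet runs on its own (column types assumed), and the names not_na, row1_complete and row3_complete are just illustrative.

```r
# Example data rebuilt so the snippet is self-contained (types are assumed)
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

not_na <- !is.na(persons)          # logical matrix, TRUE where a value is present

row1_complete <- all(not_na[1, ])  # row 1 is missing its age, so this is FALSE
row3_complete <- all(not_na[3, ])  # row 3 has every value, so this is TRUE
```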
Step 4 – Apply this function across all rows of the data frame
# This gives TRUE for each row that has no NAs and FALSE for each row with at least one NA
> apply(!is.na(persons),1,all)
[1] FALSE FALSE TRUE TRUE
The first argument of apply() is the logical matrix !is.na(persons), the second argument is 1 (for a row-wise operation), and the third, all, is the function called on each row.
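To see what the second argument does, here is a small sketch on a made-up logical matrix (m is not part of the persons example). With 1 the function runs once per row, with 2 once per column.

```r
# A small made-up logical matrix with 2 rows and 3 columns
m <- rbind(c(TRUE, TRUE,  TRUE),
           c(TRUE, FALSE, TRUE))

by_row <- apply(m, 1, all)  # one result per row:    TRUE FALSE
by_col <- apply(m, 2, all)  # one result per column: TRUE FALSE TRUE
```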
Step 5 – Now let’s put it all together.
> persons = persons[apply(!is.na(persons),1,all),]
> persons
age names cities zip
3 21 Mary Sunnyvale 94010
4 60 Aishu San Jose 94001
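For comparison, base R's complete.cases() and na.omit() should give the same two rows without the intermediate steps. The sketch below rebuilds the example data (column types assumed) and uses illustrative variable names; note that na.omit() also attaches an na.action attribute recording which rows were dropped.

```r
# Example data rebuilt so the snippet is self-contained (types are assumed)
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

# The step-by-step version from above
by_hand <- persons[apply(!is.na(persons), 1, all), ]

# complete.cases() returns TRUE for rows with no missing values
with_cc <- persons[complete.cases(persons), ]

# na.omit() drops incomplete rows directly
with_na_omit <- na.omit(persons)
```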