How to remove all rows in a data frame with NAs in R
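R ships with helpers for this, such as na.omit() and complete.cases(), but it is instructive to build the same result from basic functions. Let’s do this step by step, starting with the following data frame.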
> persons
  age names        cities   zip
1  NA  Ajay San Francisco 94000
2  32  Adam          <NA> 40101
3  21  Mary     Sunnyvale 94010
4  60 Aishu      San Jose 94001
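To follow along, the data frame can be rebuilt roughly as below. This is a minimal sketch: the post does not show how persons was created, so the column types (numeric age and zip, character names and cities) are assumptions.

```r
# Rebuild the example data frame shown above.
# Column types are assumed; the original post only shows the printed result.
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

print(persons)
```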
Step 1 – Identify all the elements that are NA
> is.na(persons)
       age names cities   zip
[1,]  TRUE FALSE  FALSE FALSE
[2,] FALSE FALSE   TRUE FALSE
[3,] FALSE FALSE  FALSE FALSE
[4,] FALSE FALSE  FALSE FALSE
Step 2 – Negate this so that only the NAs show as FALSE
> !is.na(persons)
       age names cities   zip
[1,] FALSE  TRUE   TRUE  TRUE
[2,]  TRUE  TRUE  FALSE  TRUE
[3,]  TRUE  TRUE   TRUE  TRUE
[4,]  TRUE  TRUE   TRUE  TRUE
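Step 3 – Find out if all elements in a row are TRUE
The all() function returns TRUE only when every element of a logical vector is TRUE, so a row that contains an NA yields FALSE.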
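As a small illustration of this step, all() can be tried on two rows of the negated matrix written out by hand. The vector names row_with_na and row_without_na are made up for this sketch.

```r
# Two rows of the negated matrix, typed in by hand (hypothetical names).
row_with_na    <- c(age = FALSE, names = TRUE, cities = TRUE, zip = TRUE)
row_without_na <- c(age = TRUE,  names = TRUE, cities = TRUE, zip = TRUE)

all(row_with_na)     # FALSE: at least one value in the row is missing
all(row_without_na)  # TRUE: the row is complete
```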
Step 4 – Apply this function across all rows of the data frame
# This gives a logical vector: TRUE for rows with no NA, FALSE for rows with at least one NA
> apply(!is.na(persons),1,all)
[1] FALSE FALSE  TRUE  TRUE
The first argument of apply() is the logical matrix from Step 2, the second argument is 1 (for row-wise operation), and the third argument, all, is the function to be called on each row.
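To see what the second argument does, it may help to run apply() both ways on a tiny matrix. The matrix m below is a made-up stand-in for the negated matrix, not part of the original example.

```r
# A small logical matrix standing in for !is.na(persons) (hypothetical data).
m <- matrix(c(FALSE, TRUE,
              TRUE,  TRUE),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("age", "names")))

by_row <- apply(m, 1, all)  # margin 1: one result per row    -> FALSE TRUE
by_col <- apply(m, 2, all)  # margin 2: one result per column -> age FALSE, names TRUE

print(by_row)
print(by_col)
```

With margin 1 the result has one entry per row, which is what is needed to index the rows of a data frame.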
Step 5 – Now let’s put it all together.
> persons = persons[apply(!is.na(persons),1,all),]
> persons
  age names    cities   zip
3  21  Mary Sunnyvale 94010
4  60 Aishu  San Jose 94001
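For comparison, base R's complete.cases() and na.omit() should keep the same rows as the step-by-step version. The sketch below rebuilds the data frame so it runs on its own; the column types and the variable names by_hand, with_cc and with_omit are assumptions made for illustration.

```r
# Rebuild the example data (column types are assumed).
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

# The step-by-step approach from this post.
by_hand <- persons[apply(!is.na(persons), 1, all), ]

# complete.cases() returns TRUE for rows without any NA.
with_cc <- persons[complete.cases(persons), ]

# na.omit() drops rows containing NA and records which rows were
# removed in an "na.action" attribute on the result.
with_omit <- na.omit(persons)

print(by_hand)
print(with_cc)
print(with_omit)
```

All three keep rows 3 and 4. The apply() version converts the data to a matrix and calls all() once per row, so on large data frames complete.cases() is likely the faster choice.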