How to remove all rows in a data frame with NAs in R


Base R has shortcuts for removing rows with NAs from a data frame, such as na.omit() and complete.cases(), but building the filter by hand shows what they do. Let’s do this step by step.

> persons
  age names        cities   zip
1  NA  Ajay San Francisco 94000
2  32  Adam          <NA> 40101
3  21  Mary     Sunnyvale 94010
4  60 Aishu      San Jose 94001
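
The post does not show how persons was created. The sketch below is one way to rebuild it so the steps can be run; the column types (numeric age and zip, character names and cities) are assumptions read off the printed output, not something the original states.

```r
# Rebuild the example data frame. Column types are assumed from the
# printed output: age and zip as numbers, names and cities as strings.
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

# Show the data frame; the two missing values appear in rows 1 and 2.
print(persons)
```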

Step 1 – Identify all the elements that are NA

> is.na(persons)
       age names cities   zip
[1,]  TRUE FALSE  FALSE FALSE
[2,] FALSE FALSE   TRUE FALSE
[3,] FALSE FALSE  FALSE FALSE
[4,] FALSE FALSE  FALSE FALSE

Step 2 – Invert this so that NAs show as FALSE

> !is.na(persons)
       age names cities  zip
[1,] FALSE  TRUE   TRUE TRUE
[2,]  TRUE  TRUE  FALSE TRUE
[3,]  TRUE  TRUE   TRUE TRUE
[4,]  TRUE  TRUE   TRUE TRUE

Step 3 – Find out if all elements in a row are true

Step 4 – Apply this function across all rows of the data frame

# This gives a logical vector that is TRUE for rows with no NAs
> apply(!is.na(persons),1,all)
[1] FALSE FALSE  TRUE  TRUE

The first argument to apply() is the logical matrix from Step 2, the second argument is 1 (for a row-wise operation), and the third, all, is the function called on each row.
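
To see that the row-wise all() really marks the complete rows, it can be compared with counting the NAs per row. This is a minimal sketch: the data frame is rebuilt with assumed column types, and keep_apply and keep_rowsums are names made up for illustration.

```r
# Example data, rebuilt with assumed column types.
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

# Approach from the text: a row is kept when every cell is non-NA.
keep_apply <- apply(!is.na(persons), 1, all)

# Cross-check: a row is kept when its count of NA cells is zero.
keep_rowsums <- rowSums(is.na(persons)) == 0

# Both vectors should agree element by element.
print(all(keep_apply == keep_rowsums))
```

rowSums() avoids calling an R function once per row, which may matter on large data frames, but the result is the same logical vector.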

Step 5 – Now let’s put it all together.

> persons = persons[apply(!is.na(persons),1,all),]
> persons
  age names    cities   zip
3  21  Mary Sunnyvale 94010
4  60 Aishu  San Jose 94001
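
For comparison, base R's complete.cases() and na.omit() should give the same rows as the hand-built filter. The sketch below rebuilds the data frame with assumed column types; by_hand, with_cc and with_omit are illustrative names only.

```r
# Example data, rebuilt with assumed column types.
persons <- data.frame(
  age    = c(NA, 32, 21, 60),
  names  = c("Ajay", "Adam", "Mary", "Aishu"),
  cities = c("San Francisco", NA, "Sunnyvale", "San Jose"),
  zip    = c(94000, 40101, 94010, 94001),
  stringsAsFactors = FALSE
)

# The step-by-step filter from this post.
by_hand <- persons[apply(!is.na(persons), 1, all), ]

# complete.cases() returns TRUE for rows that contain no NA.
with_cc <- persons[complete.cases(persons), ]

# na.omit() drops incomplete rows and records which ones it dropped
# in the "na.action" attribute of the result.
with_omit <- na.omit(persons)

print(with_cc)
print(with_omit)
```

In all three results the original row names (3 and 4) are kept, so the surviving rows can still be traced back to the input.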
