Wednesday, October 10, 2012

Looping through variable names in R

Ok, so most people probably know how to do this, but I hit a mental block every time I need it.  If you have a list of variable names and want to do something with each one, here are several ways to do it.

> set.seed(1234567)
> dat=data.frame(matrix(rbinom(100, 5,.3), ncol=5))
> head(dat)
  X1 X2 X3 X4 X5
1  2  3  1  2  1
2  2  1  1  3  0
3  3  3  2  1  2
4  0  1  2  2  2
5  2  1  0  1  1
6  1  1  3  2  2
> nms=names(dat)
> for(i in 1:length(nms)){
+   print(with(dat, eval(parse(text=paste("table(",nms[i],")")))))
+ }
X1
0 1 2 3 
3 7 6 4 
X2
0 1 2 3 4 5 
2 9 4 3 1 1 
X3
0 1 2 3 
4 8 6 2 
X4
 0  1  2  3 
 1 11  6  2 
X5
0 1 2 3 4 
3 7 8 1 1  
The innermost paste() call produces a character string containing the function call to be evaluated.
 
> i=5 
> paste("table(",nms[i],")")
[1] "table( X5 )"
It is then parsed as text and evaluated with the columns of dat in scope.  Finally, because results are not auto-printed inside a for loop, the output is displayed by wrapping print() around the whole expression.  This approach is proposed here.
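
To make the three steps above easier to see, here is a minimal sketch that runs them one at a time, followed by an alternative that skips parse() and indexes the column by name with [[ ]]. The object names cmd, expr, tab_parsed and tabs are my own and only for illustration; the alternative is not the approach from the linked post, just one common way to get the same counts.

```r
# Same toy data as above
set.seed(1234567)
dat <- data.frame(matrix(rbinom(100, 5, .3), ncol = 5))

# The eval(parse()) approach, one step at a time
cmd  <- paste("table(", "X5", ")")      # character string "table( X5 )"
expr <- parse(text = cmd)               # unevaluated expression
tab_parsed <- eval(expr, envir = dat)   # X5 is looked up among the columns of dat

# Alternative sketch: pull each column out by its name with [[ ]]
tabs <- list()
for (nm in names(dat)) {
  tabs[[nm]] <- table(dat[[nm]])
  cat(nm, "\n")        # print the variable name ourselves
  print(tabs[[nm]])    # print() is still needed inside the loop
}
```

The counts are the same either way; the [[ ]] version just avoids building code as text, at the cost of having to print the variable name by hand.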

Alternatively, one could use lapply() and avoid the loop, as proposed by UCLA ATS here.

> with(dat, lapply(names(dat), 
+                  function(x){
+                    table(eval(substitute(tmp, list(tmp=as.name(x)))))
+                  }))
[[1]]

0 1 2 3 
3 7 6 4 

[[2]]

0 1 2 3 4 5 
2 9 4 3 1 1 

[[3]]

0 1 2 3 
4 8 6 2 

[[4]]

 0  1  2  3 
 1 11  6  2 

[[5]]

0 1 2 3 4 
3 7 8 1 1

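Note that this method doesn't print a name for each table. This can be solved by using sapply(), the "user-friendly version and wrapper of lapply()", which uses the character vector passed to it as names for the result when USE.NAMES=TRUE. That is the default; it is spelled out below only to make it explicit.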

> with(dat, sapply(names(dat), 
+                  function(x){
+                    table(eval(substitute(tmp, list(tmp=as.name(x)))))
+                    }, 
+                  USE.NAMES=TRUE))
$X1

0 1 2 3 
3 7 6 4 

$X2

0 1 2 3 4 5 
2 9 4 3 1 1 

$X3

0 1 2 3 
4 8 6 2 

$X4

 0  1  2  3 
 1 11  6  2 

$X5

0 1 2 3 4 
3 7 8 1 1 
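
Since a data frame is itself a list of columns, a shorter route to a named list of tables may be to hand the data frame straight to lapply(). This is a sketch under that assumption, not something from the sources linked above; vars is a hypothetical subset of names chosen only for illustration, and setNames() is used to label the result in that case.

```r
# Same toy data as above
set.seed(1234567)
dat <- data.frame(matrix(rbinom(100, 5, .3), ncol = 5))

# lapply() visits each column and carries the column names over to the result
tabs_all <- lapply(dat, table)

# For a chosen subset of variable names, name the result explicitly
vars <- c("X1", "X3")   # hypothetical subset, for illustration only
tabs_some <- setNames(lapply(vars, function(v) table(dat[[v]])), vars)

names(tabs_all)
names(tabs_some)
```

Printing tabs_all should give the same $X1 ... $X5 layout as the sapply() call, without needing eval() or substitute().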


In Stata there is a dedicated command for this; one could simply use
foreach var of varlist X1-X5{
   tab `var'
}
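
For comparison, a rough R analogue of looping over a Stata-style varlist range might look like the sketch below. It relies on subset() accepting a range of column names in its select argument; the range X2:X4 is an arbitrary choice for illustration.

```r
# Same toy data as above
set.seed(1234567)
dat <- data.frame(matrix(rbinom(100, 5, .3), ncol = 5))

# subset() accepts a name range in `select`, a bit like a Stata varlist
vars <- names(subset(dat, select = X2:X4))

for (v in vars) {
  cat(v, "\n")
  print(table(dat[[v]]))
}
```

Like a Stata varlist, the range follows the order of the columns in the data frame, so it only works as expected when the variables of interest sit next to each other.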
