Thursday, November 1, 2012

Data manipulation in R: notes, chap. 9

Adding a new computed column:
somedata = transform(somedata, logval = log(val))
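A minimal sketch of the call above on made-up data. The data frame contents and the extra columns `log10val` and `half` are assumptions for illustration only; `val` and `logval` follow the note.

```r
# Hypothetical data; only the column name `val` comes from the note above.
somedata <- data.frame(id = 1:3, val = c(1, 10, 100))

# transform() returns a modified copy, so the result has to be assigned back.
somedata <- transform(somedata, logval = log(val))

# Several columns can be added in one call. A new column cannot refer to
# another column created in the same call, so dependent columns need a
# second transform().
somedata <- transform(somedata, log10val = log10(val), half = val / 2)

print(somedata)
```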

Binning a range of values into intervals:
sepal = cut(data$length, c(1,5,9,10), include.lowest=TRUE,right=FALSE)
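To see what the two flags do, here is a small sketch with the same breaks on made-up values (the vector `length_cm` is hypothetical). With `right=FALSE` the intervals are closed on the left, and `include.lowest=TRUE` then also closes the last interval on the right, so the top break itself is kept. Values outside the breaks become `NA`.

```r
# Hypothetical measurements; the breaks are the ones used in the note above.
length_cm <- c(1, 4.9, 5, 8.5, 10, 12)
breaks <- c(1, 5, 9, 10)

# right = FALSE        -> intervals are [a, b)
# include.lowest = TRUE -> the last interval becomes [9, 10], so 10 is kept
sepal <- cut(length_cm, breaks, include.lowest = TRUE, right = FALSE)

print(levels(sepal))
# 12 lies outside the breaks and is counted as NA
print(table(sepal, useNA = "ifany"))
```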

car package, recoding a variable:
newgroup = recode(group, 'c(1,5)=1; c(2,4)=2; else=3')
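For comparison, the same mapping can be sketched in base R with nested `ifelse()` calls. This is a swapped-in technique, not the car function, and the `group` vector is made up for illustration. It may treat missing values differently from `recode`, so check that case if it matters.

```r
# Hypothetical input vector.
group <- c(1, 2, 3, 4, 5, 6)

# Same mapping as 'c(1,5)=1; c(2,4)=2; else=3', written with base R only.
newgroup <- ifelse(group %in% c(1, 5), 1,
            ifelse(group %in% c(2, 4), 2, 3))

print(data.frame(group, newgroup))
```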

The stack function rearranges data from wide to long:
data1 data2 data3   =>  values ind (where ind takes the values data1/data2/data3)
sdata = stack(mydata)

unstack converts it back:
mydata = unstack(sdata, values~ind)
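A small round trip, assuming a data frame with three numeric columns of equal length (the values are made up; the names `data1`/`data2`/`data3`, `values` and `ind` follow the note):

```r
# Hypothetical wide data with the column names used in the note above.
mydata <- data.frame(data1 = c(1, 2), data2 = c(3, 4), data3 = c(5, 6))

# stack() puts all values in one column and records the source column in `ind`.
sdata <- stack(mydata)
print(sdata)

# unstack() splits `values` by `ind` again; because every group has the same
# length here, the result is a data frame with the original columns.
back <- unstack(sdata, values ~ ind)
print(back)
```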

reshape package:

# obs: subj, time, x, y

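# name the id columns explicitly so that numeric subj/time are not melted as measurements
mobs = melt(obs, id = c("subj", "time"))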

cast(subj ~ variable + time , data=mobs)

# subj  x_1 x_2 x_3 y_1 y_2 y_3
# the suffixes 1-3 are the values of time

cast(subj ~ variable | time, data=mobs)

# $`1` : subj x y
# $`2` : subj x y
# $`3` : subj x y
# a list with one table per value of time

cast(subj ~ variable + time , subset = variable == 'x', data=mobs)

# keeps only the x data, not y
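The three cast calls can be tried on a tiny data set. This is a sketch that assumes the original reshape package (not reshape2) is installed; the contents of `obs` are made up, and the id columns are named explicitly when melting.

```r
library(reshape)  # the original reshape package, not reshape2

# Hypothetical observations: 2 subjects, 3 time points, 2 measured variables.
obs <- data.frame(subj = rep(1:2, each = 3),
                  time = rep(1:3, times = 2),
                  x = 1:6,
                  y = 11:16)

# Long format: one row per (subj, time, variable) with the number in `value`.
mobs <- melt(obs, id = c("subj", "time"))

# One row per subject, one column per variable/time combination.
wide <- cast(mobs, subj ~ variable + time)

# A list of tables, one per value of time.
by_time <- cast(mobs, subj ~ variable | time)

# Same as `wide`, but restricted to the x measurements.
only_x <- cast(mobs, subj ~ variable + time, subset = variable == "x")

print(names(wide))
print(names(by_time))
print(names(only_x))
```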


 

Merging:
merge(x,y,all=TRUE)

merge(x,y,all.x=TRUE)

merge(x,y,by.x="cola",by.y="colb")
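The difference between these variants shows up on two small tables that only partly overlap. The data and the column names `id`, `key`, `a` and `b` below are made up for illustration.

```r
# Hypothetical tables sharing the key column `id`; ids 2 and 3 appear in both.
x <- data.frame(id = c(1, 2, 3), a = c("p", "q", "r"))
y <- data.frame(id = c(2, 3, 4), b = c("s", "t", "u"))

inner <- merge(x, y)                # only ids present in both tables
full  <- merge(x, y, all = TRUE)    # every id from either table, NA where missing
left  <- merge(x, y, all.x = TRUE)  # every row of x, NA in b where y has no match

# When the key has a different name in each table, name both sides.
y2 <- data.frame(key = c(2, 3, 4), b = c("s", "t", "u"))
renamed <- merge(x, y2, by.x = "id", by.y = "key")

print(inner)
print(full)
print(left)
print(renamed)
```

In the last call the key column of the result takes its name from x.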

Finding the indices of matching values:
indices = match(x$a, y$a, nomatch=0)

y$a[indices]
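A short sketch of why `nomatch=0` is useful here, on made-up data: an index of 0 selects nothing, so unmatched values simply drop out of the result instead of producing `NA`.

```r
# Hypothetical tables; "z" occurs in x$a but not in y$a.
x <- data.frame(a = c("b", "z", "c"), stringsAsFactors = FALSE)
y <- data.frame(a = c("a", "b", "c", "d"), stringsAsFactors = FALSE)

# Position in y$a of each element of x$a; 0 where there is no match.
indices <- match(x$a, y$a, nomatch = 0)
print(indices)

# Indexing with 0 returns nothing for that element, so only matches remain.
matched <- y$a[indices]
print(matched)
```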
