Assign row names to column names based on the highest percentage

Time: 2018-05-14 08:07:23

Tags: r cluster-analysis percentage rowname

voyages <- c(
  "VIC0016", "VIC0016", "VIC0016", "VIC0016", "VIC0016", "VIC0016",
  "Truck",
  "VIC0016", "VIC0016", "VIC0016",
  "JUL0983", "BB11356",
  "VIC0022", "VIC0022",
  "ISK1981", "ISK1981", "ISK1981", "ISK1981", "ISK1981", "ISK1981",
  "ISK1981", "ISK1981", "ISK1981", "ISK1981", "ISK1981", "ISK1981"
)

clusters <- c(
  5, 5, 5, 4, 4, 4, 1, 3, 4, 3, 5, 2, 4, 5,
  6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6
)

calculate.confusion <- function(voyages, clusters)
{
  d <- data.frame(voyages, clusters)
  td <- as.data.frame(table(d))
  # convert the raw counts into the share of each voyage number per cluster
  voyage.ids <- levels(td$voyages) # 11 different voyage numbers in the full data
  pc <- matrix(ncol = nlevels(td$clusters), nrow = 0)
  for (v in voyage.ids)
  {
    # column 3 (Freq) holds the frequencies
    freq <- td[td$voyages == v, 3]
    pc <- rbind(pc, freq / sum(freq))
  }
  rownames(pc) <- voyage.ids
  colnames(pc) <- levels(td$clusters) # cluster labels (1:11 in the full data)
  return(pc)
}
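
For reference, the same row-wise shares can probably be obtained without an explicit loop by using `prop.table` on the cross-tabulation. A minimal sketch, assuming small made-up vectors (the values below are hypothetical, not the data above):

```r
# Hypothetical toy data, for illustration only
voyages  <- c("A", "A", "A", "B", "B", "C")
clusters <- c(1, 1, 2, 2, 2, 3)

# Cross-tabulate voyages against clusters
counts <- table(voyages, clusters)

# margin = 1 divides every row by its own total,
# so each row holds the shares of one voyage across the clusters
pc <- prop.table(counts, margin = 1)

# Sanity check: every row should sum to 1
rowSums(pc)
```

The result is a table whose row names are the voyage numbers and whose column names are the cluster labels, so no hard-coded count of voyages or clusters is needed.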

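Given the data frame above (the numbers are percentages), how can I replace the column names [1:11] with the row names so that:

  • within each row, the column with the highest percentage is named after that row
  • each row name is used only once

I hope someone can help me.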

1 Answer:

Answer 0: (score: 0)

This should help:

# sample data
df <- data.frame(a = c(1,2,3), b = c(3,2,1), c = c(2,3,1))
colnames(df)
# [1] "a" "b" "c"
for(i in 1:nrow(df)) {colnames(df)[df[i, ] == max(df[i, ])] <- rownames(df)[i]}
colnames(df)
# [1] "3" "1" "2"
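
Note that the loop above renames whichever column holds each row's maximum, so two rows that peak in the same column would overwrite each other and the "each row name is used once" requirement is not guaranteed. One possible way to enforce it is a greedy pass over the cells from largest to smallest. This is only a sketch: the helper name `assign_rows_to_cols` and the toy matrix are hypothetical, and greedy matching is an assumption about what is wanted, not necessarily the best assignment.

```r
# Hypothetical helper: greedily pair rows with columns, largest value first,
# so that every row name and every column is used at most once.
assign_rows_to_cols <- function(m) {
  m <- as.matrix(m)
  if (is.null(rownames(m))) rownames(m) <- seq_len(nrow(m))
  new_names <- colnames(m)
  if (is.null(new_names)) new_names <- as.character(seq_len(ncol(m)))
  row_free <- rep(TRUE, nrow(m))
  col_free <- rep(TRUE, ncol(m))
  # (row, column) positions of all cells, ordered from largest to smallest value
  idx <- arrayInd(order(m, decreasing = TRUE), dim(m))
  for (r in seq_len(nrow(idx))) {
    i <- idx[r, 1]
    j <- idx[r, 2]
    # only pair a row and a column if neither has been used yet
    if (row_free[i] && col_free[j]) {
      new_names[j] <- rownames(m)[i]
      row_free[i] <- FALSE
      col_free[j] <- FALSE
    }
  }
  colnames(m) <- new_names
  m
}

# Toy example: rows "x" and "y" both peak in column 1;
# "x" has the larger share there, so "y" falls back to its next-best free column.
pc <- matrix(c(0.7, 0.6, 0.1,
               0.2, 0.3, 0.3,
               0.1, 0.1, 0.6),
             nrow = 3,
             dimnames = list(c("x", "y", "z"), c("1", "2", "3")))
colnames(assign_rows_to_cols(pc))
# [1] "x" "y" "z"
```

If there are more columns than rows, the columns left unmatched keep their original names in this sketch.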