Question

So I have an Excel file in the following format:

StudID      Score

2234          96
1056          20
9886          70
6542          65
4315          15
2234          40
6542          97
9886          56
4315          32
6542          54

and I am trying to get the frequency with which each StudID occurs, so that I would get:

StudID        Frequency

2234              2
1056              1
9886              2
4315              2
6542              3

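In addition, based on the above, I would like to get the StudID with the maximum frequency, which in this case would be StudID 6542. This is what I tried: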

stud <- read.csv("student.csv")
freq <- table(stud$StudID)
colnames(freq) <- c("StudID", "Frequency")
freq[which.max(freq)]

But it seems that I am getting an error message:

Error in `colnames<-`(`*tmp*`, value = c("StudID", "Frequency")) : attempt to set 'colnames' on an object with less than two dimensions

Answer 1

In base R, we can use aggregate to count the rows per StudID and then follow your which.max logic

freq <- aggregate(Score~StudID, df, length)
freq[which.max(freq$Score), ]

#  StudID Score
#4   6542     3

Or, if you only want the ID

freq$StudID[which.max(freq$Score)]
#[1] 6542

Or with table

names(which.max(table(df$StudID)))
#[1] "6542"
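One caveat with the snippets above: which.max returns only the first maximum, so if two IDs share the highest count, only one of them is reported. A minimal sketch of a tie-aware variant follows. The helper name most_frequent and the tied example vector are made up for illustration and are not part of the original answer; the StudID values are copied from the question.

```r
# Hypothetical helper (not from the original answers): return every ID
# whose count equals the maximum count, so ties are not silently dropped.
most_frequent <- function(ids) {
  counts <- table(ids)                                # one-dimensional table of counts
  names(counts)[as.vector(counts == max(counts))]     # all names at the maximum
}

# StudID column as listed in the question
stud_id <- c(2234, 1056, 9886, 6542, 4315, 2234, 6542, 9886, 4315, 6542)
most_frequent(stud_id)

# Made-up vector in which two IDs tie for the highest count
most_frequent(c(1, 1, 2, 2, 3))
```

As with names(which.max(table(...))), the IDs come back as character strings because they are taken from the names of the table.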

Answer 2

The error tells you that you are trying to assign colnames to an object with fewer than 2 dimensions. Indeed, if we inspect the structure of the result of table(), we can see that it is a 1-dimensional object, i.e.

str(table(df$V1))
 'table' int [1:5(1d)] 1 2 2 3 2 #(1d = 1 dimension)
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:5] "1056" "2234" "4315" "6542" ...

What you need to do is convert it to a data frame first and then assign the names, i.e.

dd <- setNames(as.data.frame(table(df$V1)), c('StudID', 'Freq'))

#  StudID Freq
#1   1056    1
#2   2234    2
#3   4315    2
#4   6542    3
#5   9886    2

To extract the maximum, you can do

dd$StudID[which.max(dd$Freq)]
#[1] 6542
#Levels: 1056 2234 4315 6542 9886

EDIT: To keep it from returning the Levels, as per your comment, we can simply convert the result to character with as.character().

DATA:

dput(df)
structure(list(V1 = c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L, 
9886L, 4315L, 6542L, 6542L), V2 = c(96L, 20L, 70L, 15L, 40L, 
97L, 56L, 32L, 54L, 13L)), class = "data.frame", row.names = c(NA, 
-10L))
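For completeness, here is a small self-contained sketch of the character conversion, using the data from the dput output. The variable name top_id is introduced here for illustration only, and the final numeric conversion is an optional extra that the answer does not ask for.

```r
# Data as given in the answer's dput() output
df <- data.frame(
  V1 = c(2234L, 1056L, 9886L, 4315L, 2234L, 6542L, 9886L, 4315L, 6542L, 6542L),
  V2 = c(96L, 20L, 70L, 15L, 40L, 97L, 56L, 32L, 54L, 13L)
)

# Frequency table as a two-column data frame
dd <- setNames(as.data.frame(table(df$V1)), c("StudID", "Freq"))

# StudID is a factor here; as.character() drops the "Levels:" line
top_id <- as.character(dd$StudID[which.max(dd$Freq)])
top_id

# Assumption: if a numeric ID is wanted, convert the character value,
# because as.numeric() applied directly to a factor returns level codes.
as.numeric(top_id)
```

Going through character first matters because the factor stores integer codes internally, so converting the factor directly would return the position of the level rather than the ID itself.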
