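To be honest, this is a fairly complex task. It is basically an extension of a question I asked earlier: Count unique values of a column by pairwise combinations of another column in R
This time, say I have the following data frame in R: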
data.frame(Reg.ID = c(1,1,2,2,2,3,3), Location = c("X","X","Y","Y","Y","X","X"), Product = c("A","B","A","B","C","B","A"))
The data looks like this:
  Reg.ID Location Product
1      1        X       A
2      1        X       B
3      2        Y       A
4      2        Y       B
5      2        Y       C
6      3        X       B
7      3        X       A
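I want to count the unique values of the "Reg.ID" column for each pairwise combination of values in the "Product" column, grouped by the "Location" column. The result should look like this: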
  Location Prod.Comb Count
1        X       A,B     2
2        Y       A,B     1
3        Y       A,C     1
4        Y       B,C     1
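I tried to get this output with base R functions but had no success. I'm guessing there is a fairly simple solution using the data.table package in R?
Any help is much appreciated. Thanks!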
Answer 0 (score: 6)
I haven't tested this idea much, but here is the first thing that comes to mind, using data.table:
Answer 1 (score: 2)
A dplyr solution, borrowed from the question you mentioned:
library(dplyr)
df <- data.frame(Reg.ID = c(1,1,2,2,2,3,3),
Location = c("X","X","Y","Y","Y","X","X"),
Product = c("A","B","A","B","C","B","A"),
stringsAsFactors = FALSE)
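df %>%
  # self-join on Location: pairs every row with every row in the same location
  full_join(df, by="Location") %>%
  # keep each unordered product pair once (A,B but not B,A or A,A)
  filter(Product.x < Product.y) %>%
  group_by(Location, Product.x, Product.y) %>%
  # count the distinct Reg.IDs on the left side of each pair
  summarise(Count = length(unique(Reg.ID.x))) %>%
  mutate(Prod.Comb = paste(Product.x, Product.y, sep=",")) %>%
  ungroup() %>%
  select(Location, Prod.Comb, Count) %>%
  arrange(Location, Prod.Comb)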
# # A tibble: 4 × 3
#   Location Prod.Comb Count
#      <chr>     <chr> <int>
# 1        X       A,B     2
# 2        Y       A,B     1
# 3        Y       A,C     1
# 4        Y       B,C     1
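
For comparison, here is a minimal base R sketch of the same count. It rests on an assumption that the question does not state outright: a combination is counted for a Reg.ID only when both products appear under that same Reg.ID. The join above pairs rows by Location alone, so the two approaches could differ on other data. The helper name pairs_for_group is hypothetical, introduced only for illustration.

```r
df <- data.frame(Reg.ID = c(1,1,2,2,2,3,3),
                 Location = c("X","X","Y","Y","Y","X","X"),
                 Product = c("A","B","A","B","C","B","A"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: all unordered product pairs for one (Location, Reg.ID) group
pairs_for_group <- function(g) {
  prods <- sort(unique(g$Product))
  if (length(prods) < 2) return(NULL)  # a single product cannot form a pair
  data.frame(Location = g$Location[1],
             Reg.ID = g$Reg.ID[1],
             Prod.Comb = combn(prods, 2, paste, collapse = ","),
             stringsAsFactors = FALSE)
}

# One data frame per (Location, Reg.ID) combination that actually occurs
groups <- split(df, list(df$Location, df$Reg.ID), drop = TRUE)
pairs <- do.call(rbind, lapply(groups, pairs_for_group))

# Each (Location, Reg.ID, Prod.Comb) row is unique, so counting rows counts Reg.IDs
result <- aggregate(Reg.ID ~ Location + Prod.Comb, data = pairs, FUN = length)
names(result)[3] <- "Count"
result <- result[order(result$Location, result$Prod.Comb), ]
rownames(result) <- NULL
print(result)
```

On the sample data this gives the same four rows as the expected output in the question (X A,B 2; Y A,B 1; Y A,C 1; Y B,C 1).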