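I have a problem that I can't seem to solve at the moment. I have a very large database of buyers and sellers. Each one is identified by an ID number, and I want to know whether having seller A as a supplier increases the likelihood of also having seller B as a supplier.
The end goal is to run a regression that estimates the probability of starting a relationship with A, given that B is known to be a previous supplier.
To do this, I thought of creating one column for each supplier in the database (here 98C, 99C, 25A), so that a 1 appears when that supplier has a contract with the buyer. But I have 400 different suppliers; does anyone know how to do this? (Would the resulting database end up too large?)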
library(data.table)
set.seed(1)
# Toy example: one row per buyer, seller and month.
# The last three columns are the indicator columns I would like to generate.
Data <- data.frame(
  Month = c(1,2,3,4,5,6,3,4,5,6,3,4,5,6,7),
  Code_ID_Buy = c("100D","100D","100D","100D","100D","100D","102D","102D","102D","102D","100D","100D","100D","100D","100D"),
  Code_ID_Sell = c("98C","98C","98C","98C","98C","98C","99C","99C","99C","99C","25A","25A","25A","25A","25A"),
  New = c(0,0,0,0,0,0,0,0,0,0,1,0,0,0,0),
  "98C" = c(1,1,1,1,1,1,0,0,0,0,1,1,1,1,0),
  "99C" = c(0,0,0,0,0,0,1,1,1,1,0,0,0,0,0),
  "25A" = c(0,0,1,1,1,1,0,0,0,0,1,1,1,1,1),
  check.names = FALSE)  # keep "98C", "99C", "25A" as names instead of X98C, X99C, X25A
View(Data)
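One possible way to build the indicator columns without typing them by hand is sketched below. It assumes that a 1 means "this buyer has a contract with that supplier in the same month", which is how the toy example reads. The object names `contracts`, `wide` and `result` are made up for illustration, and the approach (reshaping with `dcast` from data.table, then joining back) is only one option, not necessarily the best one for the real data.

```r
library(data.table)

# Long-format contracts from the toy example, without the hand-made columns
contracts <- data.table(
  Month        = c(1:6, 3:6, 3:7),
  Code_ID_Buy  = rep(c("100D", "102D", "100D"), times = c(6, 4, 5)),
  Code_ID_Sell = rep(c("98C", "99C", "25A"), times = c(6, 4, 5))
)

# One row per buyer-month and one 0/1 column per supplier.
# unique() removes duplicate rows so that length() can only return 0 or 1.
wide <- dcast(
  unique(contracts),
  Code_ID_Buy + Month ~ Code_ID_Sell,
  fun.aggregate = length,
  value.var = "Code_ID_Sell",
  fill = 0L
)

# Attach the indicators to every contract row (rows keep the order of `contracts`)
result <- wide[contracts, on = c("Code_ID_Buy", "Month")]
print(result)
```

The same call should work unchanged with 400 suppliers, since `dcast` creates one column per distinct value of `Code_ID_Sell`. On size: R stores each integer in 4 bytes, so the indicator columns take roughly (number of buyer-months) x 400 x 4 bytes. Whether that is too large depends on how many buyer-month rows the real database has.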
Thanks in advance.