New variable from another variable

Time: 2021-04-26 15:28:37

Tags: r

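I have a problem I can't seem to solve right now. I have a very large database of buyers and sellers. Each one is identified by an ID number, and I want to know whether having seller A as a supplier increases the likelihood of also having seller B as a supplier.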

The end goal is to run a regression that estimates the probability of forming a relationship with A, given that B was already a supplier.
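One way the regression described above might look is a logistic regression of "has A" on "had B before". This is only a minimal sketch: the data frame `buyers` and its columns `had_B` and `has_A` are hypothetical names with made-up values, not taken from the real database.

```r
# Hypothetical buyer-level data: one row per buyer.
# had_B = 1 if B was already a supplier, has_A = 1 if the buyer later contracts with A.
buyers <- data.frame(
  had_B = c(0, 0, 0, 0, 1, 1, 1, 1),
  has_A = c(0, 0, 0, 1, 0, 1, 1, 1)
)

# Logistic regression: log-odds of having A as a function of having had B
fit <- glm(has_A ~ had_B, data = buyers, family = binomial())

# Coefficient table (estimate, standard error, z value, p-value)
print(summary(fit)$coefficients)

# Predicted probability of a relationship with A, without and with B as prior supplier
p <- predict(fit, newdata = data.frame(had_B = c(0, 1)), type = "response")
print(p)
```

A positive coefficient on `had_B` would suggest that buyers who already had B are more likely to contract with A. With real data, other buyer characteristics would probably need to be added as controls.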

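To do this, I thought of creating one column for each supplier in the database (here 98C, 99C, 25A), holding a 1 whenever the buyer has a contract with that supplier. But I have 400 different suppliers. Does anyone know how to do this? (And won't the database end up too large?)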

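library(data.table)

# Example data: one row per buyer / seller / month.
# check.names = FALSE keeps the supplier columns named "98C", "99C", "25A";
# without it, data.frame() renames them to X98C, X99C, X25A.
Data <- data.frame(
  Month = c(1,2,3,4,5,6,3,4,5,6,3,4,5,6,7),
  Code_ID_Buy = c("100D","100D","100D","100D","100D","100D","102D","102D","102D","102D","100D","100D","100D","100D","100D"),
  Code_ID_Sell = c("98C","98C","98C","98C","98C","98C","99C","99C","99C","99C","25A","25A","25A","25A","25A"),
  New = c(0,0,0,0,0,0,0,0,0,0,1,0,0,0,0),
  # Desired output: 1 if the buyer has a contract with that supplier in that month
  "98C" = c(1,1,1,1,1,1,0,0,0,0,1,1,1,1,0),
  "99C" = c(0,0,0,0,0,0,1,1,1,1,0,0,0,0,0),
  "25A" = c(0,0,1,1,1,1,0,0,0,0,1,1,1,1,1),
  check.names = FALSE)

View(Data)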
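
The supplier columns shown above do not have to be typed by hand. As a minimal sketch, assuming the raw data is in long format with one row per buyer, seller and month, `dcast()` from data.table can generate one 0/1 column per supplier however many suppliers there are. The table `contracts` below is a hypothetical, shortened version of the example data.

```r
library(data.table)

# Hypothetical long-format contracts: one row per buyer / seller / month
contracts <- data.table(
  Month        = c(1, 2, 3, 3, 4, 3, 4),
  Code_ID_Buy  = c("100D", "100D", "100D", "100D", "100D", "102D", "102D"),
  Code_ID_Sell = c("98C", "98C", "98C", "25A", "25A", "99C", "99C")
)

# One row per buyer-month, one indicator column per supplier.
# unique() guarantees each cell counts at most one contract, so length() yields 0 or 1.
wide <- dcast(
  unique(contracts),
  Code_ID_Buy + Month ~ Code_ID_Sell,
  value.var = "Code_ID_Sell",
  fun.aggregate = length,
  fill = 0L
)
print(wide)

# Attach the indicators back onto the original long rows
long_with_flags <- merge(contracts, wide, by = c("Code_ID_Buy", "Month"))
print(long_with_flags)
```

On size: each integer column costs 4 bytes per row, so 400 supplier columns come to roughly 1.6 KB per row. Whether that is too large depends on the number of buyer-month rows. Keeping `wide` at the buyer-month level rather than merging it back onto every contract row is one way to limit the size.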

Thanks in advance,

0 answers:

No answers