Convert repeated date rows into columns?

Time: 2019-06-10 02:04:36

Tags: python r

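I am trying to reshape rows that share a single key but are repeated because the Change Period Start and Change Period End dates changed several times. I think turning them into columns on a single row per key would remove the duplicated values. I tried pivoting in Python, but because the values would be date columns, I could not get anything to work.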

This is what I have:

(image not available)

This is what I want to achieve:

(image not available)

PS: I have records for millions of orders, so I need a solution that can be automated.

1 answer:

Answer 0: (score: 1)

Python solution:

import pandas as pd
df = pd.DataFrame({"Change Period Start":["2/2/2019", "2/2/2019", "2/2/2019", "9/11/2019"], 
                   "Change Period End":["9/11/2019", "9/11/2019", "5/5/2019", "9/11/2019"], 
                   "Change Period Supplier":["1/1/2020", "1/1/2020", "1/1/2025", "9/11/2019"]})

df.drop_duplicates(subset=['Change Period Supplier'])

Change Period Start Change Period End   Change Period Supplier
            2/2/2019        9/11/2019                 1/1/2020
            2/2/2019         5/5/2019                 1/1/2025
           9/11/2019        9/11/2019                9/11/2019

R solution:

Change.Period.Start <- c("2/2/2019", "2/2/2019", "2/2/2019", "9/11/2019")
Change.Period.End <- c("9/11/2019", "9/11/2019", "5/5/2019", "9/11/2019")
Change.Period.Supplier <- c("1/1/2020", "1/1/2020", "1/1/2025", "9/11/2019")
df = data.frame(Change.Period.Start, Change.Period.End, Change.Period.Supplier)

df[!duplicated(df$Change.Period.Supplier), ]

  Change.Period.Start Change.Period.End Change.Period.Supplier
1            2/2/2019         9/11/2019               1/1/2020
3            2/2/2019          5/5/2019               1/1/2025
4           9/11/2019         9/11/2019              9/11/2019

Updated R version, based on the OP's comment:

GR.Key <- c("A", "A", "A", "B")
Change.Period.Start <- c("2/2/2019", "2/2/2019", "2/2/2019", "9/11/2019")
Change.Period.End <- c("9/11/2019", "9/11/2019", "5/5/2019", "9/11/2019")
Change.Period.Supplier <- c("1/1/2020", "1/1/2020", "1/1/2025", "9/11/2019")
df = data.frame(GR.Key, Change.Period.Start, Change.Period.End, Change.Period.Supplier)

library(data.table)
dcast(df, GR.Key ~ paste0("Change.Period.Start", rowid(GR.Key)), value.var = "Change.Period.Start")

  GR.Key Change.Period.Start1 Change.Period.Start2 Change.Period.Start3
1      A             2/2/2019             2/2/2019             2/2/2019
2      B            9/11/2019                 <NA>                 <NA>
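
For the Python side, a rough pandas counterpart of the dcast call above might look like the sketch below. It is not part of the original answer: the column name "GR Key" and the helper column "n" are assumptions made for illustration, and the sample data simply mirrors the R example. The idea is to number the rows within each key with groupby(...).cumcount() and then pivot on that number.

```python
import pandas as pd

# Same sample data as the updated R version; "GR Key" is an assumed column name.
df = pd.DataFrame({
    "GR Key": ["A", "A", "A", "B"],
    "Change Period Start": ["2/2/2019", "2/2/2019", "2/2/2019", "9/11/2019"],
    "Change Period End": ["9/11/2019", "9/11/2019", "5/5/2019", "9/11/2019"],
    "Change Period Supplier": ["1/1/2020", "1/1/2020", "1/1/2025", "9/11/2019"],
})

# Number the rows within each key (1, 2, 3, ...), similar to rowid(GR.Key).
# "n" is a hypothetical helper column used only for the reshape.
df["n"] = df.groupby("GR Key").cumcount() + 1

# Spread the start dates into one column per occurrence. pivot only rearranges
# values and does not aggregate them, so string or datetime values both work.
wide = df.pivot(index="GR Key", columns="n", values="Change Period Start")
wide.columns = [f"Change Period Start{n}" for n in wide.columns]
wide = wide.reset_index()

print(wide)
```

This should give one row per key, with missing values where a key has fewer rows than the largest group. Because cumcount makes each (key, n) pair unique, pivot does not hit the duplicate-entry error that a plain pivot on the dates would raise.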