How do I convert an Array of Arrays into a DataFrame in Julia?

Time: 2017-11-14 16:10:54

Tags: dataframe julia

I have a map call that produces one row of computed values per element, so I end up with an Array of Array{Any}, like this:

12-element Array{Array{Any,1},1}:
 Any[2015-09-01T00:00:00, 2016-09-01T00:00:00, 98, 53.1] 
 Any[2015-10-01T00:00:00, 2016-10-01T00:00:00, 92, 58.7] 
 Any[2015-11-01T00:00:00, 2016-11-01T00:00:00, 130, 64.6]
 Any[2015-12-01T00:00:00, 2016-12-01T00:00:00, 135, 67.4]
 Any[2016-01-01T00:00:00, 2017-01-01T00:00:00, 206, 59.2]
 Any[2016-02-01T00:00:00, 2017-02-01T00:00:00, 246, 54.1]
 Any[2016-03-01T00:00:00, 2017-03-01T00:00:00, 254, 53.9]
 Any[2016-04-01T00:00:00, 2017-04-01T00:00:00, 268, 65.7]
 Any[2016-05-01T00:00:00, 2017-05-01T00:00:00, 265, 61.5]
 Any[2016-06-01T00:00:00, 2017-06-01T00:00:00, 303, 52.8]
 Any[2016-07-01T00:00:00, 2017-07-01T00:00:00, 301, 59.1]
 Any[2016-08-01T00:00:00, 2017-08-01T00:00:00, 273, 54.6]

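Is there a simple way to convert this into a DataFrame, with column names and so on? If there is no simple way, I am open to harder ones :) I could rerun the map four times to extract each column and build the DataFrame from those, but that sounds like a lot of code for such a seemingly trivial operation...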

Edit: I can "transpose" the rows into columns like this

map(x -> map(y -> y[x], r), collect(1:4))

where r is the table above, so I guess the remaining step is to pass column names to the DataFrame constructor. My interim solution is therefore

DataFrame(map(x -> map(y -> y[x], r), collect(1:4)), [:a, :b, :c, :d])

1 answer:

Answer 0: (score: 2)

julia> df
12-element Array{Array{Any,1},1}:
 Any["2015-09-01T00:00:00", "2016-09-01T00:00:00", 98, 53.1] 
 Any["2015-10-01T00:00:00", "2016-10-01T00:00:00", 92, 58.7] 
 Any["2015-11-01T00:00:00", "2016-11-01T00:00:00", 130, 64.6]
 Any["2015-12-01T00:00:00", "2016-12-01T00:00:00", 135, 67.4]
 Any["2016-01-01T00:00:00", "2017-01-01T00:00:00", 206, 59.2]
 Any["2016-02-01T00:00:00", "2017-02-01T00:00:00", 246, 54.1]
 Any["2016-03-01T00:00:00", "2017-03-01T00:00:00", 254, 53.9]
 Any["2016-04-01T00:00:00", "2017-04-01T00:00:00", 268, 65.7]
 Any["2016-05-01T00:00:00", "2017-05-01T00:00:00", 265, 61.5]
 Any["2016-06-01T00:00:00", "2017-06-01T00:00:00", 303, 52.8]
 Any["2016-07-01T00:00:00", "2017-07-01T00:00:00", 301, 59.1]
 Any["2016-08-01T00:00:00", "2017-08-01T00:00:00", 273, 54.6]

julia> DataFrame(permutedims(Array(DataFrame(map(data,df))), [2, 1]))
12×4 DataFrames.DataFrame
│ Row │ x1                    │ x2                    │ x3  │ x4   │
├─────┼───────────────────────┼───────────────────────┼─────┼──────┤
│ 1   │ "2015-09-01T00:00:00" │ "2016-09-01T00:00:00" │ 98  │ 53.1 │
│ 2   │ "2015-10-01T00:00:00" │ "2016-10-01T00:00:00" │ 92  │ 58.7 │
│ 3   │ "2015-11-01T00:00:00" │ "2016-11-01T00:00:00" │ 130 │ 64.6 │
│ 4   │ "2015-12-01T00:00:00" │ "2016-12-01T00:00:00" │ 135 │ 67.4 │
│ 5   │ "2016-01-01T00:00:00" │ "2017-01-01T00:00:00" │ 206 │ 59.2 │
│ 6   │ "2016-02-01T00:00:00" │ "2017-02-01T00:00:00" │ 246 │ 54.1 │
│ 7   │ "2016-03-01T00:00:00" │ "2017-03-01T00:00:00" │ 254 │ 53.9 │
│ 8   │ "2016-04-01T00:00:00" │ "2017-04-01T00:00:00" │ 268 │ 65.7 │
│ 9   │ "2016-05-01T00:00:00" │ "2017-05-01T00:00:00" │ 265 │ 61.5 │
│ 10  │ "2016-06-01T00:00:00" │ "2017-06-01T00:00:00" │ 303 │ 52.8 │
│ 11  │ "2016-07-01T00:00:00" │ "2017-07-01T00:00:00" │ 301 │ 59.1 │
│ 12  │ "2016-08-01T00:00:00" │ "2017-08-01T00:00:00" │ 273 │ 54.6 │

I think your solution is much better...!
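
For reference, the "transpose the rows into columns" approach from the question can also be written with comprehensions. This is only a sketch, assuming a DataFrames.jl 1.x release; the three-row `r` below is a shortened stand-in for the question's data, and the column names `:a` through `:d` are the placeholder names from the question.

```julia
using DataFrames
using Dates

# Shortened stand-in for the question's `r`: each inner vector is one row.
r = [
    Any[DateTime(2015, 9, 1), DateTime(2016, 9, 1), 98, 53.1],
    Any[DateTime(2015, 10, 1), DateTime(2016, 10, 1), 92, 58.7],
    Any[DateTime(2015, 11, 1), DateTime(2016, 11, 1), 130, 64.6],
]

# "Transpose" rows into columns: column i collects the i-th entry of every row.
# A comprehension narrows the element type when all values in a column agree.
columns = [[row[i] for row in r] for i in 1:4]

# Build the DataFrame from the column vectors plus explicit column names.
df = DataFrame(columns, [:a, :b, :c, :d])
```

Because each column is collected separately, the columns end up with concrete element types (DateTime, Int, Float64) instead of Any.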
