Converting a nested list to a data.frame with differing column lengths

Time: 2017-10-18 20:26:35

Tags: r dataframe nested-lists

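I am trying to convert a nested list to a data.frame, without any luck. There are a few complications, mainly that the "results" element at position 1 does not line up with position 2, because position 2 has no results.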

The item lengths are inconsistent across positions.

I tried the following code, but it isn't working.

[[1]]
[[1]]$html_attributions
list()

[[1]]$results
  geometry.location.lat geometry.location.lng
1              25.66544             -100.4354
                                        id                    place_id
1 6ce0a030663144c8e992cbce51eb00479ef7db89 ChIJVy7b7FW9YoYRdaH2I_gOJIk
                                                                                                                                                                                       reference
1 CmRSAAAATdtVfB4Tz1aQ8GhGaw4-nRJ5lZlVNgiOR3ciF4QjmYC56bn6b7omWh1SJEWWqQQEFNXxGZndgEwSgl8sRCOtdF8aXpngUY878Q__yH4in8EMZMCIqSHLARqNgGlV4mKgEhDlvkHLXLiBW4F_KQVT83jIGhS5DJipk6PAnpPDXP2p-4X5NPuG9w

[[1]]$status
[1] "OK"

[[2]]
[[2]]$html_attributions
list()

[[2]]$results
list()

[[2]]$status
[1] "ZERO_RESULTS"

1 answer:

Answer 0: (score: 0)

I think I have enough of the original JSON to create a reproducible example:

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
jsons <- list(okjson, emptyjson, okjson)

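From here, I'll step (slowly) through the whole process. I've included most of the intermediate structures for reproducibility, and I apologize for the verbosity. The steps can easily be combined and/or put into a magrittr pipeline.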

lists <- lapply(jsons, jsonlite::fromJSON)
str(lists)
# List of 3
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          :'data.frame': 1 obs. of  1 variable:
#   .. ..$ geometry:'data.frame':   1 obs. of  3 variables:
#   .. .. ..$ location:'data.frame':    1 obs. of  2 variables:
#   .. .. .. ..$ lat: num 25.7
#   .. .. .. ..$ lon: num -100
#   .. .. ..$ id      : chr "foo"
#   .. .. ..$ place_id: chr "quux"
#   ..$ status           : chr "OK"
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          : list()
#   ..$ status           : chr "ZERO_RESULTS"
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          :'data.frame': 1 obs. of  1 variable:
#   .. ..$ geometry:'data.frame':   1 obs. of  3 variables:
#   .. .. ..$ location:'data.frame':    1 obs. of  2 variables:
#   .. .. .. ..$ lat: num 25.7
#   .. .. .. ..$ lon: num -100
#   .. .. ..$ id      : chr "foo"
#   .. .. ..$ place_id: chr "quux"
#   ..$ status           : chr "OK"


goodlists <- Filter(function(a) "results" %in% names(a) && length(a$results) > 0, lists)
goodresults <- lapply(goodlists, `[[`, "results")
str(goodresults)
# List of 2
#  $ :'data.frame': 1 obs. of  1 variable:
#   ..$ geometry:'data.frame':  1 obs. of  3 variables:
#   .. ..$ location:'data.frame':   1 obs. of  2 variables:
#   .. .. ..$ lat: num 25.7
#   .. .. ..$ lon: num -100
#   .. ..$ id      : chr "foo"
#   .. ..$ place_id: chr "quux"
#  $ :'data.frame': 1 obs. of  1 variable:
#   ..$ geometry:'data.frame':  1 obs. of  3 variables:
#   .. ..$ location:'data.frame':   1 obs. of  2 variables:
#   .. .. ..$ lat: num 25.7
#   .. .. ..$ lon: num -100
#   .. ..$ id      : chr "foo"
#   .. ..$ place_id: chr "quux"

goodresultsdf <- lapply(goodresults, function(a) jsonlite::flatten(as.data.frame(a)))
str(goodresultsdf)
# List of 2
#  $ :'data.frame': 1 obs. of  4 variables:
#   ..$ geometry.id          : chr "foo"
#   ..$ geometry.place_id    : chr "quux"
#   ..$ geometry.location.lat: num 25.7
#   ..$ geometry.location.lon: num -100
#  $ :'data.frame': 1 obs. of  4 variables:
#   ..$ geometry.id          : chr "foo"
#   ..$ geometry.place_id    : chr "quux"
#   ..$ geometry.location.lat: num 25.7
#   ..$ geometry.location.lon: num -100
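
To see what the flattening step does in isolation, here is a minimal sketch on a hand-built nested data.frame. The names `nested` and `flat` are made up for illustration; the only assumption is that the jsonlite package is installed.

```r
library(jsonlite)

# Build a data.frame whose "location" column is itself a data.frame,
# mimicking the nested structure that fromJSON() produces.
nested <- data.frame(id = "foo", stringsAsFactors = FALSE)
nested$location <- data.frame(lat = 25.66544, lon = -100.4354)

# flatten() promotes the nested columns to top-level columns and
# prefixes their names with the parent column's name.
flat <- jsonlite::flatten(nested)
str(flat)
```

After flattening, every column is an ordinary atomic vector, which is what makes the frames easy to combine later.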

We now have a list of data.frames, which is a good place to be.
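
One possible next step, not shown in the answer as captured here, is to bind those frames into a single data.frame while remembering which input position each row came from. This is only a sketch: it assumes every flattened frame has the same columns (true for the example JSON above, not guaranteed for real API responses), and the names `keep`, `frames`, `combined`, and the `source` column are illustrative.

```r
library(jsonlite)

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
lists <- lapply(list(okjson, emptyjson, okjson), jsonlite::fromJSON)

# Positions in the input that actually returned results
keep <- which(vapply(lists, function(a) length(a$results) > 0, logical(1)))

# Flatten each non-empty result and tag it with its original position
frames <- lapply(keep, function(i) {
  df <- jsonlite::flatten(as.data.frame(lists[[i]]$results))
  df$source <- i
  df
})

# Row-bind the frames; rbind() requires matching column names
combined <- do.call(rbind, frames)
print(combined)
```

Keeping the `source` index makes it possible to join the results back to the original queries, so positions with ZERO_RESULTS show up as missing rather than silently shifting later rows.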