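Subsetting a list via a logical index vector

Time: 2014-03-13 08:18:52

Tags: r indexing vectorization subset nested-lists

I have a complex list and need to select a subset of it based on the value of a boolean element (I need the records whose hidden value equals FALSE). I tried the following code, based on index vectors, but it fails (as shown at the end of this output):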

startups <- data$startups[data$startups$hidden == FALSE]

Or, alternatively:

startups <- data$startups[!as.logical(data$startups$hidden)]

An interactive R session shows that the data is there:

Browse[1]> str(data$startups, list.len=3)
List of 50
 $ :List of 23
  ..$ id               : num 357496
  ..$ hidden           : logi FALSE
  ..$ community_profile: logi FALSE
  .. [list output truncated]
 $ :List of 2
  ..$ id    : num 352159
  ..$ hidden: logi TRUE
 $ :List of 2
  ..$ id    : num 352157
  ..$ hidden: logi TRUE
  [list output truncated]

Browse[1]> data$startups[data$startups$hidden == FALSE]
list()

Browse[1]> data$startups[!as.logical(data$startups$hidden)]
list()

What is wrong with my code?

UPDATE (hopefully a reproducible example; sorry for the complex structure)

aa <- dput(head(data$startups, n=3))

produces the following output:

list(structure(list(id = 386938, hidden = FALSE, community_profile = FALSE, 
    name = "Pritunl", angellist_url = "https://angel.co/pritunl", 
    logo_url = "https://s3.amazonaws.com/photos.angel.co/startups/i/386938-fac0b8cba76c7e9252eee6646ec5b681-medium_jpg.jpg?buster=1398401450", 
    thumb_url = "https://s3.amazonaws.com/photos.angel.co/startups/i/386938-fac0b8cba76c7e9252eee6646ec5b681-thumb_jpg.jpg?buster=1398401450", 
    quality = 0, product_desc = "Enterprise VPN/cloud networking server", 
    high_concept = "Enterprise cloud networking", follower_count = 1, 
    company_url = "http://pritunl.com", created_at = "2014-04-25T04:50:57Z", 
    updated_at = "2014-04-25T06:02:05Z", crunchbase_url = NULL, 
    twitter_url = "http://twitter.com/pritunl", blog_url = "", 
    video_url = "", markets = list(structure(list(id = 12, tag_type = "MarketTag", 
        name = "enterprise software", display_name = "Enterprise Software", 
        angellist_url = "https://angel.co/enterprise-software"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 59, tag_type = "MarketTag", name = "open source", 
        display_name = "Open Source", angellist_url = "https://angel.co/open-source"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 123, tag_type = "MarketTag", name = "internet infrastructure", 
        display_name = "Internet Infrastructure", angellist_url = "https://angel.co/internet-infrastructure"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 306, tag_type = "MarketTag", name = "cloud management", 
        display_name = "Cloud Management", angellist_url = "https://angel.co/cloud-management"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url"))), locations = list(
        structure(list(id = 2071, tag_type = "LocationTag", name = "new york", 
            display_name = "New York", angellist_url = "https://angel.co/new-york"), .Names = c("id", 
        "tag_type", "name", "display_name", "angellist_url"))), 
    company_size = "1-10", company_type = list(structure(list(
        id = 94212, tag_type = "CompanyTypeTag", name = "startup", 
        display_name = "Startup", angellist_url = "https://angel.co/startup"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url"))), status = NULL, 
    screenshots = list(structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/5f7410543201d583eaba1975b931f3fd-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/5f7410543201d583eaba1975b931f3fd-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/006c4fb50d4b10df7caf7800ee482c6b-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/006c4fb50d4b10df7caf7800ee482c6b-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/741225c3de5021399c0cfc33cecb8830-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/741225c3de5021399c0cfc33cecb8830-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/969b60b6ccda577e77b7c9a5c169b2fd-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/969b60b6ccda577e77b7c9a5c169b2fd-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/2b2cc3a046c5a4d20b328045ca7f0254-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/2b2cc3a046c5a4d20b328045ca7f0254-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/053c3a1c74fc7f39de1117770f9debef-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/053c3a1c74fc7f39de1117770f9debef-original.png"), .Names = c("thumb", 
    "original")), structure(list(thumb = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/8adcf2d6a6cafc9c6b810f8359a3fedf-thumb_jpg.jpg", 
        original = "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/8adcf2d6a6cafc9c6b810f8359a3fedf-original.png"), .Names = c("thumb", 
    "original")))), .Names = c("id", "hidden", "community_profile", 
"name", "angellist_url", "logo_url", "thumb_url", "quality", 
"product_desc", "high_concept", "follower_count", "company_url", 
"created_at", "updated_at", "crunchbase_url", "twitter_url", 
"blog_url", "video_url", "markets", "locations", "company_size", 
"company_type", "status", "screenshots")), structure(list(id = 385596, 
    hidden = FALSE, community_profile = TRUE, name = "Lariat ", 
    angellist_url = "https://angel.co/lariat-1", logo_url = "https://s3.amazonaws.com/photos.angel.co/startups/i/385596-29de05d584176c3972da411aed5485f0-medium_jpg.jpg?buster=1398260121", 
    thumb_url = "https://s3.amazonaws.com/photos.angel.co/startups/i/385596-29de05d584176c3972da411aed5485f0-thumb_jpg.jpg?buster=1398260121", 
    quality = 0, product_desc = "Thus far, the internet has gone from discovery to search discovery, and then social discovery, but with little focus on recall. Remembering your digital footprint is difficult. We aim to solve that problem. Lariat is a cloud-based recall engine to securely recall information from any page in your search history instantly through intuitive keyword search, not just from page titles, but from the contents and context of the underlying pages.\r\n\r\nWrangle in the information you want, easier and faster.", 
    high_concept = "Recall your digital footprint on the web instantly", 
    follower_count = 1, company_url = "http://www.lariattech.com", 
    created_at = "2014-04-23T13:17:47Z", updated_at = "2014-04-23T13:48:38Z", 
    crunchbase_url = NULL, twitter_url = "", blog_url = "", video_url = NULL, 
    markets = list(structure(list(id = 4, tag_type = "MarketTag", 
        name = "digital media", display_name = "Digital Media", 
        angellist_url = "https://angel.co/digital-media"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 12, tag_type = "MarketTag", name = "enterprise software", 
        display_name = "Enterprise Software", angellist_url = "https://angel.co/enterprise-software"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 59, tag_type = "MarketTag", name = "open source", 
        display_name = "Open Source", angellist_url = "https://angel.co/open-source"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url")), structure(list(
        id = 282, tag_type = "MarketTag", name = "semantic search", 
        display_name = "Semantic Search", angellist_url = "https://angel.co/semantic-search"), .Names = c("id", 
    "tag_type", "name", "display_name", "angellist_url"))), locations = list(
        structure(list(id = 1620, tag_type = "LocationTag", name = "boston", 
            display_name = "Boston", angellist_url = "https://angel.co/boston"), .Names = c("id", 
        "tag_type", "name", "display_name", "angellist_url"))), 
    company_size = "1-10", company_type = structure(list(), class = "AsIs"), 
    status = NULL, screenshots = structure(list(), class = "AsIs")), .Names = c("id", 
"hidden", "community_profile", "name", "angellist_url", "logo_url", 
"thumb_url", "quality", "product_desc", "high_concept", "follower_count", 
"company_url", "created_at", "updated_at", "crunchbase_url", 
"twitter_url", "blog_url", "video_url", "markets", "locations", 
"company_size", "company_type", "status", "screenshots")), structure(list(
    id = 385595, hidden = TRUE), .Names = c("id", "hidden")))

In a more readable format (aa):

[[1]]
[[1]]$id
[1] 386938

[[1]]$hidden
[1] FALSE

[[1]]$community_profile
[1] FALSE

[[1]]$name
[1] "Pritunl"

[[1]]$angellist_url
[1] "https://angel.co/pritunl"

[[1]]$logo_url
[1] "https://s3.amazonaws.com/photos.angel.co/startups/i/386938-fac0b8cba76c7e9252eee6646ec5b681-medium_jpg.jpg?buster=1398401450"

[[1]]$thumb_url
[1] "https://s3.amazonaws.com/photos.angel.co/startups/i/386938-fac0b8cba76c7e9252eee6646ec5b681-thumb_jpg.jpg?buster=1398401450"

[[1]]$quality
[1] 0

[[1]]$product_desc
[1] "Enterprise VPN/cloud networking server"

[[1]]$high_concept
[1] "Enterprise cloud networking"

[[1]]$follower_count
[1] 1

[[1]]$company_url
[1] "http://pritunl.com"

[[1]]$created_at
[1] "2014-04-25T04:50:57Z"

[[1]]$updated_at
[1] "2014-04-25T06:02:05Z"

[[1]]$crunchbase_url
NULL

[[1]]$twitter_url
[1] "http://twitter.com/pritunl"

[[1]]$blog_url
[1] ""

[[1]]$video_url
[1] ""

[[1]]$markets
[[1]]$markets[[1]]
[[1]]$markets[[1]]$id
[1] 12

[[1]]$markets[[1]]$tag_type
[1] "MarketTag"

[[1]]$markets[[1]]$name
[1] "enterprise software"

[[1]]$markets[[1]]$display_name
[1] "Enterprise Software"

[[1]]$markets[[1]]$angellist_url
[1] "https://angel.co/enterprise-software"


[[1]]$markets[[2]]
[[1]]$markets[[2]]$id
[1] 59

[[1]]$markets[[2]]$tag_type
[1] "MarketTag"

[[1]]$markets[[2]]$name
[1] "open source"

[[1]]$markets[[2]]$display_name
[1] "Open Source"

[[1]]$markets[[2]]$angellist_url
[1] "https://angel.co/open-source"


[[1]]$markets[[3]]
[[1]]$markets[[3]]$id
[1] 123

[[1]]$markets[[3]]$tag_type
[1] "MarketTag"

[[1]]$markets[[3]]$name
[1] "internet infrastructure"

[[1]]$markets[[3]]$display_name
[1] "Internet Infrastructure"

[[1]]$markets[[3]]$angellist_url
[1] "https://angel.co/internet-infrastructure"


[[1]]$markets[[4]]
[[1]]$markets[[4]]$id
[1] 306

[[1]]$markets[[4]]$tag_type
[1] "MarketTag"

[[1]]$markets[[4]]$name
[1] "cloud management"

[[1]]$markets[[4]]$display_name
[1] "Cloud Management"

[[1]]$markets[[4]]$angellist_url
[1] "https://angel.co/cloud-management"



[[1]]$locations
[[1]]$locations[[1]]
[[1]]$locations[[1]]$id
[1] 2071

[[1]]$locations[[1]]$tag_type
[1] "LocationTag"

[[1]]$locations[[1]]$name
[1] "new york"

[[1]]$locations[[1]]$display_name
[1] "New York"

[[1]]$locations[[1]]$angellist_url
[1] "https://angel.co/new-york"



[[1]]$company_size
[1] "1-10"

[[1]]$company_type
[[1]]$company_type[[1]]
[[1]]$company_type[[1]]$id
[1] 94212

[[1]]$company_type[[1]]$tag_type
[1] "CompanyTypeTag"

[[1]]$company_type[[1]]$name
[1] "startup"

[[1]]$company_type[[1]]$display_name
[1] "Startup"

[[1]]$company_type[[1]]$angellist_url
[1] "https://angel.co/startup"



[[1]]$status
NULL

[[1]]$screenshots
[[1]]$screenshots[[1]]
[[1]]$screenshots[[1]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/5f7410543201d583eaba1975b931f3fd-thumb_jpg.jpg"

[[1]]$screenshots[[1]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/5f7410543201d583eaba1975b931f3fd-original.png"


[[1]]$screenshots[[2]]
[[1]]$screenshots[[2]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/006c4fb50d4b10df7caf7800ee482c6b-thumb_jpg.jpg"

[[1]]$screenshots[[2]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/006c4fb50d4b10df7caf7800ee482c6b-original.png"


[[1]]$screenshots[[3]]
[[1]]$screenshots[[3]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/741225c3de5021399c0cfc33cecb8830-thumb_jpg.jpg"

[[1]]$screenshots[[3]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/741225c3de5021399c0cfc33cecb8830-original.png"


[[1]]$screenshots[[4]]
[[1]]$screenshots[[4]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/969b60b6ccda577e77b7c9a5c169b2fd-thumb_jpg.jpg"

[[1]]$screenshots[[4]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/969b60b6ccda577e77b7c9a5c169b2fd-original.png"


[[1]]$screenshots[[5]]
[[1]]$screenshots[[5]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/2b2cc3a046c5a4d20b328045ca7f0254-thumb_jpg.jpg"

[[1]]$screenshots[[5]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/2b2cc3a046c5a4d20b328045ca7f0254-original.png"


[[1]]$screenshots[[6]]
[[1]]$screenshots[[6]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/053c3a1c74fc7f39de1117770f9debef-thumb_jpg.jpg"

[[1]]$screenshots[[6]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/053c3a1c74fc7f39de1117770f9debef-original.png"


[[1]]$screenshots[[7]]
[[1]]$screenshots[[7]]$thumb
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/8adcf2d6a6cafc9c6b810f8359a3fedf-thumb_jpg.jpg"

[[1]]$screenshots[[7]]$original
[1] "https://s3.amazonaws.com/screenshots.angel.co/ae/386938/8adcf2d6a6cafc9c6b810f8359a3fedf-original.png"




[[2]]
[[2]]$id
[1] 385596

[[2]]$hidden
[1] FALSE

[[2]]$community_profile
[1] TRUE

[[2]]$name
[1] "Lariat "

[[2]]$angellist_url
[1] "https://angel.co/lariat-1"

[[2]]$logo_url
[1] "https://s3.amazonaws.com/photos.angel.co/startups/i/385596-29de05d584176c3972da411aed5485f0-medium_jpg.jpg?buster=1398260121"

[[2]]$thumb_url
[1] "https://s3.amazonaws.com/photos.angel.co/startups/i/385596-29de05d584176c3972da411aed5485f0-thumb_jpg.jpg?buster=1398260121"

[[2]]$quality
[1] 0

[[2]]$product_desc
[1] "Thus far, the internet has gone from discovery to search discovery, and then social discovery, but with little focus on recall. Remembering your digital footprint is difficult. We aim to solve that problem. Lariat is a cloud-based recall engine to securely recall information from any page in your search history instantly through intuitive keyword search, not just from page titles, but from the contents and context of the underlying pages.\r\n\r\nWrangle in the information you want, easier and faster."

[[2]]$high_concept
[1] "Recall your digital footprint on the web instantly"

[[2]]$follower_count
[1] 1

[[2]]$company_url
[1] "http://www.lariattech.com"

[[2]]$created_at
[1] "2014-04-23T13:17:47Z"

[[2]]$updated_at
[1] "2014-04-23T13:48:38Z"

[[2]]$crunchbase_url
NULL

[[2]]$twitter_url
[1] ""

[[2]]$blog_url
[1] ""

[[2]]$video_url
NULL

[[2]]$markets
[[2]]$markets[[1]]
[[2]]$markets[[1]]$id
[1] 4

[[2]]$markets[[1]]$tag_type
[1] "MarketTag"

[[2]]$markets[[1]]$name
[1] "digital media"

[[2]]$markets[[1]]$display_name
[1] "Digital Media"

[[2]]$markets[[1]]$angellist_url
[1] "https://angel.co/digital-media"


[[2]]$markets[[2]]
[[2]]$markets[[2]]$id
[1] 12

[[2]]$markets[[2]]$tag_type
[1] "MarketTag"

[[2]]$markets[[2]]$name
[1] "enterprise software"

[[2]]$markets[[2]]$display_name
[1] "Enterprise Software"

[[2]]$markets[[2]]$angellist_url
[1] "https://angel.co/enterprise-software"


[[2]]$markets[[3]]
[[2]]$markets[[3]]$id
[1] 59

[[2]]$markets[[3]]$tag_type
[1] "MarketTag"

[[2]]$markets[[3]]$name
[1] "open source"

[[2]]$markets[[3]]$display_name
[1] "Open Source"

[[2]]$markets[[3]]$angellist_url
[1] "https://angel.co/open-source"


[[2]]$markets[[4]]
[[2]]$markets[[4]]$id
[1] 282

[[2]]$markets[[4]]$tag_type
[1] "MarketTag"

[[2]]$markets[[4]]$name
[1] "semantic search"

[[2]]$markets[[4]]$display_name
[1] "Semantic Search"

[[2]]$markets[[4]]$angellist_url
[1] "https://angel.co/semantic-search"



[[2]]$locations
[[2]]$locations[[1]]
[[2]]$locations[[1]]$id
[1] 1620

[[2]]$locations[[1]]$tag_type
[1] "LocationTag"

[[2]]$locations[[1]]$name
[1] "boston"

[[2]]$locations[[1]]$display_name
[1] "Boston"

[[2]]$locations[[1]]$angellist_url
[1] "https://angel.co/boston"



[[2]]$company_size
[1] "1-10"

[[2]]$company_type
list()

[[2]]$status
NULL

[[2]]$screenshots
list()


[[3]]
[[3]]$id
[1] 385595

[[3]]$hidden
[1] TRUE

Finally, applying the subset operation via a logical index vector:

aa[data$startups$hidden == FALSE]

The result is an empty list (even though the first and second elements have hidden = FALSE):

list()

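Again, sorry for the size of the output, but I have to preserve the structure of the list.

Considerations:

According to the R Project's "An Introduction to R" (http://cran.r-project.org/doc/manuals/R-intro.html#Index-vectors),

"Subsets of the elements of a vector may be selected by appending to the name of the vector an index vector in square brackets. More generally any expression that evaluates to a vector may have subsets of its elements similarly selected by appending an index vector in square brackets immediately after the expression."

At the same time, according to Hadley Wickham's "Advanced R" (http://adv-r.had.co.nz/Subsetting.html),

"Subsetting a list works in the same way as subsetting an atomic vector."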

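2 answers:

Answer 0: (score: 6)

The sample data in the question is a list of length 3, which we will call L. Each of its components is itself a list, and each of those sublists has a component named hidden. We can extract the hidden components of the sublists into a logical vector, also named hidden. Using that logical vector, we can subset the original list L, giving a new list that contains only those sublists whose hidden component is TRUE.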

hidden <- sapply(L, "[[", "hidden") # create logical vector hidden
L[hidden]

For the data provided, we get a list with one component:

> length(L[hidden])
[1] 1
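
The question asks for the records whose hidden value is FALSE, which is the negation of the vector above. A minimal sketch, assuming a small stand-in list L with only id and hidden fields (hypothetical data, not the full records from the question); vapply with isTRUE is swapped in for sapply here so that the result is always a logical vector, and a record with no hidden field is treated as not hidden:

```r
# Hypothetical stand-in for the nested list in the question
L <- list(
  list(id = 386938, hidden = FALSE),
  list(id = 385596, hidden = FALSE),
  list(id = 385595, hidden = TRUE)
)

# `$` on the outer list looks for a top-level element named "hidden".
# There is none, so it returns NULL, and NULL == FALSE is logical(0),
# which selects nothing.
L$hidden                 # NULL
L[L$hidden == FALSE]     # list()

# Collect one logical value per record, then negate it
hidden  <- vapply(L, function(x) isTRUE(x$hidden), logical(1))
visible <- L[!hidden]
length(visible)          # 2
```

This would also explain the empty list() in the question: the index expression is evaluated on the outer list, not on each record.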

If we know that there is only one such component, then L[hidden][[1]] or L[[which(hidden)]] will give that single component.
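
The two forms can be compared on a small example. This is a sketch with a hypothetical two-record list, not the data from the question:

```r
# Hypothetical two-record list with exactly one hidden record
L <- list(
  list(id = 1, hidden = FALSE),
  list(id = 2, hidden = TRUE)
)
hidden <- sapply(L, "[[", "hidden")

one_a <- L[hidden][[1]]       # subset to a list of length 1, then unwrap it
one_b <- L[[which(hidden)]]   # index directly by the position of the TRUE
identical(one_a, one_b)       # TRUE
one_a$id                      # 2
```

Note that `[[` with an index vector longer than one indexes recursively on a list, so the second form is only appropriate when exactly one element of hidden is TRUE.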

Answer 1: (score: 0)

Data frames are indexed with two numbers (row and column). To select rows only, you need to do the following:

data$startups[data$startups$hidden == FALSE, ]
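
The two-index form needs an object with dimensions, such as a data frame. The sketch below assumes a hypothetical stand-in list L and a data frame df built from its two scalar fields (neither name comes from the question); it shows the form failing on a plain list and working once the fields are flattened:

```r
# Hypothetical stand-in for the nested list in the question
L <- list(
  list(id = 386938, hidden = FALSE),
  list(id = 385596, hidden = FALSE),
  list(id = 385595, hidden = TRUE)
)

# A plain list has no dimensions, so row/column indexing is an error
res <- try(L[c(TRUE, TRUE, FALSE), ], silent = TRUE)
inherits(res, "try-error")   # TRUE

# Flatten the scalar fields into a data frame, then select rows
df <- data.frame(
  id     = vapply(L, function(x) x$id, numeric(1)),
  hidden = vapply(L, function(x) x$hidden, logical(1))
)
shown <- df[df$hidden == FALSE, ]
nrow(shown)                  # 2
```

Flattening like this drops the nested fields (markets, locations, screenshots), so it only fits cases where the list structure does not have to be preserved.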